Rubik's Cubes and Genetic Algorithms in Swift
Javier Soto at Swift Summit San Francisco, 2016
Hello, everybody. My name is Javi. That's my Twitter handle, too. I'm an iOS engineer in the Twitter for iOS team here in San Francisco. Before that I worked on the Fabric and Pebble apps, for iOS obviously. But today, I'm going to talk about something very different. There's this little, fun side project that I built during a weekend. It's a little Rubik's Cube library in Swift, in a genetic solver algorithm. I'm by no means an expert in anything related to this field. I probably know more about the biology side of things, and the computer science side of things, and that's not saying much. I found it incredibly fun, and really, really fascinating. I would love to share that with you today.
Let's start with an overview of where this comes from. The theory of ‘evolution by natural selection’ was put forward by Charles Darwin in a 1959 book called “On the Origin of Species”. The theory is able to explain how a very simple process, natural selection, is able to create organisms with such complexity, like us. Through random mutations that occur as organisms reproduce, the idea of survival of the fittest and a few billion years, we got from single celled organisms like the one on the left, to incredibly complex structures like our eyes. It's this process of evolution by natural selection itself that allowed us to evolve with brains that enabled us to understand the mechanism itself, and ultimately applied in the field of computer science.
Isn't it mind-blowing? If you think it is, I really recommend this show that's on Netflix called “Cosmos”. It's really, really fascinating. I obviously don't have time to go over a full biology class here, but these are the basic steps that this process takes. We don't know quite how it started, but once we have the chemical ingredients for life, all evolution needed was time. Random mutations that occur, essentially as errors during the duplication of DNA material when organisms reproduce, create variety and change the genome. With time some of those mutations will prove advantageous to some organisms, and so that will allow them to survive at a higher rate. They will able to pass on those mutations to further generations.
Why are genetic algorithms so useful in computer science? This is my take on it: I think a lot of the problems that we solve day-to-day with programming can be approached with some combination of mathematics, and divide and conquer. However, there are some problems for which it's just hard to come up with serious, clear set of steps to solve it, or to write those steps in code.
For the Rubik's cube, I know four or five different methods to solve it. In fact, they are relatively easy to learn. Anybody can pick up one of the simple ones on the internet in a few hours, believe it or not. I could've written a program that just followed those same steps that I take. It would've probably been challenging, but more importantly I think it would've been no fun. Instead, with a genetic algorithm, we can come up with a program that can solve a Rubik's cube, or at least attempt to do it, without having to encode any sort of rules for how to approach solving a cube.
This is an example of one of the coolest things I've seen people do with genetic algorithms. By using triangles, just triangles of different color and shape and size, this program was able to approximate a picture the Mona Lisa painting. The results are just incredibly accurate. There is a link in the last slide with more details about this project. Before getting into the Swift code for the Rubik's cube that I wrote, let's look at a much simpler problem that will help us get an idea for what these algorithms look like.
We're going to implement one that, given a number that we're going to give it, we'll try to find that number. Now of course, a genetic algorithm is just overkill for that simple problem, but I picked it because it simply illustrates the very simple steps a genetic algorithm needs to take. Don't worry, we can follow all the code on the slides. There's a full Playground that's fun to play with, that you can download and the link is also in the last slide. We're going to start with a number X, and then we're going to start with a bunch of numbers. They're all zero in binary, and then we're going to mutate randomly, bit by bit out of those, and calculate how close we are to the solution. That's going to be our fitness.
Let's look at this in code. Our genetic algorithm is going to work with a pool of individuals. Each of these in the population represents a potential solution to the problem, and they have a number in them. The next step is to be able to calculate the fitness, and this right here is probably the most important part that we need to be able to implement to get our genetic algorithm. We need to have a way to calculate the fitness of each individual to determine which ones are the fittest. You can think of this value as how close we are to what we're looking for. In this case, it's how close it is to the number that we're trying to find, and in the case of a Rubik's cube is how close we are to a solved cube.
I'm not showing some of the helper functions that I call from this code. In this case, there's this bit function that can convert a number to a binary representation, and we use it here to flip some of the bits. In this case, 5% of the time, just randomly. Here's the thing about random mutations: Whether this is in our genetic algorithm or just in the real world, these mutations are not guaranteed to take us anywhere closer to the solution, or to a more fit individual, or an organism more likely to survive. In fact, most mutations are either negative or they do nothing at all.
Now we can put it all together. This is how simple the core of the algorithm is. These are the steps that we talked about before that occur in nature. The least fit individuals are less likely to survive, so they just go away. We remove them from the array. The rest mutate randomly, potentially getting us closer to the solution, and then we store them all by their fitness, putting the fittest in the front. In this sample run of this program with population size of 10,000, we found the solution in 12 generations. We can tweak the parameters, like the population size or the rate of mutations, and we'll get slightly different results. It's actually super fun to spend time, and change those values, and see what different behaviors we get.
All right, so now we know the basics about genetic algorithms. Let's move onto a cooler problem, a Rubik's cube. I'm sure you've all heard about this puzzle, but here are some random trivia that you might not know. The Rubik's cube was invented in 1974 by a Hungarian architect named Ernő Rubik. He created it to help him explain three dimensional geometry. Since then, they've sold 350 million of these. This is my collection, and this is what a cube looks like inside. The center pieces don't move, because they're affixed to six axis around which the faces rotate. Then you have two different types of pieces, you have corners that have three stickers, and edges that have two.
Despite it's apparent simplicity on the inside, it has 43 quintillion, however you read that number, different arrangements that you can find a Rubik's cube in. Only one of them is this solved position, the solved state. We could try each and every one of these 43 quintillion, one per second, it would take you 100 times the age of the universe to go through all of them. This is a fun fact, and you may think, "Wow, that's cool, but whatever." The reason why it's really relevant to us is that, if you were to approach solving a Rubik's cube using just brute force ... Even at a much higher rate than one per second, obviously with our computers we can do much better than that, you easily see that it's just not going to take you anywhere, in any reasonable time anyway. Some people can solve them really fast, though. This is not even the current world record.
All right, before we can get started with our genetic algorithm, we're going to need some API to be able to represent Rubik's cubes in their current state in Swift, in code. Like we saw before, we have two types of pieces and we're going to represent them separately, because we will manipulate them separately as well. We have edges and corners, these are the edges. All pieces are characterized by both their location in the cube, but also their orientation. Edges have two orientations, the correct one, but also a flipped one. This is a picture that illustrates what I mean. This is an almost solved cube, where two of the edges have been flipped. All of the pieces are in the correct location, but those two have the wrong orientation.
Same for corners, except in this case, because a corner faces three sides it's going to have three different orientations. The correct one, and then rotated clockwise and counter-clockwise. With this, we can already assemble a cube such that all pieces are in the correct location with their correct orientation. From here the next step is going to be creating an API that will allow us to manipulate the cube in the same way that we do with our hands, moving each of the faces, and tracking what happens to each of the pieces in the cube. Everything we can do in a cube is move one of it's faces either clockwise, counter-clockwise, or half a turn. A move is going to be which face we're moving, and the magnitude. Here are three examples of what I mean by that magnitude. In these three picture, we've moved the top face clockwise, counter-clockwise, and half a turn. Half a turn is equivalent to two quarter turns in either direction.
Lastly, we need to implement this function to apply a move. It'll modify the current state of our cube struct by applying that move. I'm not going to show the implementation of this function, it's probably the trickiest part of all of it. It would just take out a few slides. If you're curious about it, all this code is open source in my GitHub account, and you can take a look at it in more detail.
Let's get started with the genetic algorithm. Just like before, we're going to have individuals in our pool. Our algorithm to solve the Rubik's cube is going to start with one random state of a Rubik's cube, and that's the one that's it's going to try to solve back to a solved state. Each of these individuals is going to apply a sequence of moves, on top of that random state to try to solve it. The fitness of the individuals is going to be determined by how many pieces are correctly solved after applying those series of moves. We'll call that number of solved pieces variable, that member, that I'm not showing the implementation for, but it simply iterates through each of the positions and checks that it has a right piece in the right orientation.
The individual with higher fitness, in this case it's just an integer, means that's it's closer to solving a cube than one with a lower fitness. This is the beauty right here of a genetic algorithm. If you look here, we haven't needed a way to determine to tell the algorithm how to move the cube to go towards the solution. We've only needed to tell it how close we are in any given state. Mutating can be done in a few different ways for our algorithm, and here I'm showing one simple answer. This basic approach simply appends random moves to the sequence of moves in hopes that that will take it closer to the solved state.
In my actual implementation, if you take a look at the GitHub repo, I ended up with something a little more sophisticated than this with a lot of ideas from my girlfriend, who actually works with bacterial DNA in a lab, and had really good tips for this. This solver becomes, once again, trivial. We remove the least fit individuals, we mutate the ones that we have remaining, and we sort by which ones are closer to the solution. It's essentially the same thing.
Now the question is, how well does this actually work? Was it really worth it? Well, bad news is that unfortunately I haven't gotten it to solve an entire cube, but it got it incredibly close, and I'm really proud of that. Let's look at it. This is a summarized output of one of the runs where I actually left my computer running for over 15 hours to see how far it would get. The letters in those sequences that you see is how we represent a sequence of moves. Each letter represents a face. Up, down, front, back, left, right. If there is nothing next to it, it means clockwise. Two means half a turn, the other ones are counter-clockwise.
The first line is the sequence of moves that we applied to a solved cube to scramble it. The last line is after 80,000 generations. It tried close to 500 million different sequences of moves. The solutions that you see here are abbreviated, because the best ones I found were actually around 200 moves. It wasn't efficient at all, but if you look at the second to last line at the end, you see the fitness. That's the number 16 right there. A cube has 12 edges, and eight corners. That means that it has 20 total pieces, so a fitness of 16 means that it was four pieces shy of solving the cube. That's what it looked like after that solution. It was so close. In fact, if you look at it more closely, every piece is actually in the correct location, it's just that those four pieces are just not in the correct orientation.
These are some useful links if you want to look at all this code, and my Twitter account again is @Javi, if you have any questions while looking at this code or you want to discuss any of this topic, really feel free to reach out to me. I love talking about this. I'll also be available later in the fireplace if you have any more questions. Thank you so much.
Q1: How did you come up with the approach in how to represent the cube in code?
Javier: That’s actually the part that blocked me for like a week. I had this idea like, "Oh, it would be cool if I did a genetic algorithm for a Rubik's cube!" And I was like, "Well, how would I represent ... I've never written code for a Rubik's cube." I went through it in my head while watching some boring Baseball or something like that. Then one day I was like, "You know, maybe there's a simple approach where I have a member for each of the locations, and there I have which piece is in that location." I tried it out, I had to refactor a couple times, but it ended up being fairly simple. I didn't show quite that code, but there’s a sort of simple struct for the cube, and the corners, and the edges. So, that wasn't too bad.
Implementing moves? That was trickier. In fact, I thought I had it, and then I started implementing the genetic algorithm. Then I realized that it was tracking the pieces wrong, and I had to write a lot of unit tests to make sure that the pieces were in ... Because I would apply the same move that I see on the screen on an actual cube, and the pieces would be in the wrong place compared to what the computer thinks that they are. That's the part that was a little tricky. Yeah, thank you. Anymore questions?
Absolutely. The second question is with the way that I calculate fitness. If I have a cube that's solved but one of the faces is rotated one way, all of those pieces are going to be considered in the wrong location, but their almost there, right?
That's true. Every time I had a thought like that, I started thinking, "Okay, I'm going to go and do it this way," but I was like, "Okay, if I end up doing that, then I'm going to end up including a lot of these rules for how to solve it." Then it makes it more complicated.
With genetic algorithms, there's this really interesting trade off where you're trying obviously a lot of generations, so if you make it faster then you can try a lot more. Maybe that's what takes you to the solution. If you make it smarter, maybe that will take it to a solution, but then you're also going to try less, because your algorithm takes more time, because it's doing more things, right? You're constantly balancing those two things.