Swift at Scale: Rewriting the Uber App

Alan Zeino at Playgrounds Conference 2017


Video transcript:

My name is Alan, I'm here from Uber. I just wanted to say, before I started, a huge thank you to Playgrounds Con for having me here. I do have a weird accent, that's because I moved to America about three years ago, but I'm trying to get my Australian accent back. The Playgrounds Con has been so lovely in having me here. I looked at the speaker list and I was so surprised at how many people that I admired, and open source projects that I'd used were also on this conference, so it's absolutely great, and I'm glad to be here.

You might remember this. This was the old Uber app. It had served us so well for very many years. When I joined Uber at the end of 2014 we had about 11 iOS engineers. Today we have over 200. This application, for a couple of years, is the thing that sustained our growth. Unfortunately though, it was time to start with something else. 


The re-write

We re-wrote and re-designed the Uber app from the ground up. Back in November you might recall, if you use the Uber app, that we replaced it with this. We didn't just replace the programming language and we didn't just replace the architecture but we replaced the design as well. We flipped the way that you request a ride. We changed the way that you select between different products, and we even added a feed on trip where we can show you cool things like Foursquare restaurants at your destination, or Snapchat filters if you want to do that.


3 changes

There were three things that we changed. The focus of this talk is really on architecture and programming language and actually it's mostly programming language since that's the most exciting part, but let's start with architecture. 


Architecture

We had a 5000 line view controller. I really wanted to have this slide in here because I had to work on that view controller. As you know, we have always been guided by Apple in this respect on how we do architecture in our apps. At Uber we did Apple MVC, but the problem with this was that we didn't separate out the important components of our application and that made them really hard to test and also really hard to read. When you came along and you needed to make a bug fix or you needed to add something new to the entire ‘Request a ride’ core flow, it actually took quite a long time to even understand how it worked.

So we came up with something new. An entire new effort which we called Presidio, which is a cool name and all new efforts need cool, internal code names. Presidio, if you haven't been to San Francisco, is the Golden Gate Bridge plus the park plus the entire Bay Area. This is my photo from a wonderful cloudy San Francisco day.

We had a number of architectural goals of Presidio. The first one was, we wanted to set ourselves a reliability target of 99.99%. I missed a couple of the 9's here. We wanted the features that everybody needs to take a ride such as signing up, logging in, requesting a driver, to be absolutely rock solid. 

We also wanted to provide rails for the design and the code. The old application wasn't really scaling with all the different things that we wanted to do. That little products slider was starting to get confusing, people were like, “What is the difference between uberX and Uber Black and uberWAV and all these other things?”

We also wanted to make monitoring a first class citizen. Over the years we'd built a lot of tools and systems around logging and analytics and tracing but a lot of it was manual. If you wanted that stuff you kind of had to opt in. We decided that we would just build something that was opt out from the get go. 

We wanted to support Uber's growth for years to come.

We wanted to de-risk experimentation. If you came to the talk I did on Tuesday you'll know that I talked a little bit about experimentation because that's an important part of how we ship apps every single week. We want to make it really, really easy for people to do A/B testing and experimentation in the application, so we came up with something called a plugin API and we separate our app into core flows and optional flows. The core flow is all the important stuff that we absolutely need every week, and the optional flow is just about everything else. We have a rule internally which is that if an optional plugin is crashing in production, we just turn it off, because if it's not core and it's crashing it doesn't need to be in the app.
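The core versus optional split can be sketched in a few lines of Swift. This is only an illustration of the rule described above, not Uber's plugin API: the `Plugin` protocol, the `KillSwitch` type and the plugin names are all hypothetical.

```swift
// Hypothetical names throughout; this is not Uber's plugin API.

/// A unit of functionality that can be attached to the app.
protocol Plugin {
    var name: String { get }
    /// Core plugins are always on; optional ones sit behind a remote switch.
    var isCore: Bool { get }
    func run() -> String
}

/// Stand-in for a remotely controlled list of disabled plugins.
struct KillSwitch {
    var disabled: Set<String> = []

    func allows(_ plugin: Plugin) -> Bool {
        // A core plugin can never be switched off, even if it is listed.
        return plugin.isCore || !disabled.contains(plugin.name)
    }
}

struct RequestRide: Plugin {
    let name = "request_ride"
    let isCore = true
    func run() -> String { return "requesting ride" }
}

struct DestinationFeed: Plugin {
    let name = "destination_feed"
    let isCore = false
    func run() -> String { return "showing feed" }
}

/// Runs every plugin the kill switch allows and collects the results.
func runAll(_ plugins: [Plugin], killSwitch: KillSwitch) -> [String] {
    return plugins.filter { killSwitch.allows($0) }.map { $0.run() }
}

let plugins: [Plugin] = [RequestRide(), DestinationFeed()]
// Suppose the optional feed is crashing in production, so it gets switched off.
let killSwitch = KillSwitch(disabled: ["destination_feed", "request_ride"])
let active = runAll(plugins, killSwitch: killSwitch)
```

In this sketch the switch is checked in one place, so turning an optional plugin off does not require touching the plugin itself.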

Finally, we had this one little thing internally that we just kept saying to ourselves, which was, make magic. We wanted to make magic with this application. We wanted it to be really fast for every single person in the world. As you know, not everybody has the latest devices. India is one of our most important markets and a lot of the users there are slowly transitioning to 3G right now. We wanted the app to be as fast as it could be for those people as well.


What design pattern did we pick?

This raises the question: what design pattern did we pick? We came up with something called "Riblets". It's a little bit like Viper, in fact it's mostly like Viper, just with a couple of small changes. The RIB comes from the words Router, Interactor, Builder, exactly the same definition as in Viper. The router is the thing that decides where things go, the interactor is the thing that sort of interacts with the data sources and any models, and the builder… actually I don't remember what the builder is. I'll read the Viper thing. And optionally, a Presenter and a View. With Viper you kind of say, okay, you usually need a View, but what we said to ourselves at Uber was, not every single part of our application tree actually needs a view. Sometimes when we're switching between some parts of our application there will be steps where we do a network request, where we do some business logic, and we don't actually have to show anything. That's really the biggest difference from Viper.
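A minimal sketch of a riblet tree with a viewless node might look like the following. The type names are illustrative rather than taken from Uber's code, and the builder's role here (wiring up one node's pieces) is an assumption, since the talk leaves it undefined.

```swift
// Hypothetical sketch; protocol and type names are illustrative, not Uber's code.

/// Holds business logic: talks to data sources and models.
protocol Interactor: AnyObject {
    func didBecomeActive()
}

/// Optional piece: something that can be put on screen.
protocol ViewControllable: AnyObject {}

/// Decides where things go by attaching child routers.
class Router {
    let interactor: Interactor
    let view: ViewControllable?   // nil for a viewless node
    private(set) var children: [Router] = []

    init(interactor: Interactor, view: ViewControllable? = nil) {
        self.interactor = interactor
        self.view = view
    }

    func attach(_ child: Router) {
        children.append(child)
        child.interactor.didBecomeActive()
    }
}

/// Assumed role of the builder: wire up one node's router, interactor and optional view.
protocol Builder {
    func build() -> Router
}

/// A step that only runs logic (for example a network request) and shows nothing.
final class LoginCheckInteractor: Interactor {
    private(set) var checked = false
    func didBecomeActive() { checked = true }
}

struct LoginCheckBuilder: Builder {
    func build() -> Router {
        return Router(interactor: LoginCheckInteractor())
    }
}

final class RootInteractor: Interactor {
    func didBecomeActive() {}
}

let root = Router(interactor: RootInteractor())
let loginCheck = LoginCheckBuilder().build()
root.attach(loginCheck)
```

The point of the sketch is that `view` is optional on the router, so a node in the tree can exist purely for business logic.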

I'm not going to go into too much detail. Thankfully we wrote a big long blog post that I can just send you all to. If you go to t.Uber.com/rider-app, you can see a lot more about riblets. Someone came up to me the other day and was like, "I read your blog post," and I was like, "Oh no, now I can't rely on it for this talk," but it needs some sample code so I'm going to go back to the team and ask them to do some kind of sample code or a sample project.

So, the timeline here. I apologize for the confusing nature of this slide. I had to take some things out at the last minute. Pretty much we started the project in February. In February we just worked on the core frameworks and the fundamental parts of the app and we tried out this riblets idea. We tried out a lot of different ideas until we got something that we thought worked. Then in June we brought a wider team to work on the core flow. Everything from the rider, take a trip and sign in and sign up and all that kind of stuff. Finally in August we let everybody on. About 150 engineers at the peak were working on this project. Then they were starting to build all those little features that everybody kind of needs. Things that are specific to markets and all that other stuff, which was in the old app, started to be re-written in August.


Programming language - Swift

Since this is the most exciting part and the reason why everyone's here, let's talk a little bit about what it was like for us to use Swift. It was actually not our first attempt at Swift. When I joined in September 2014, when we were small enough to all fit in the same room in our iOS meeting, we all were like, let's rewrite everything in Swift, and let's write all of our new code in Swift. We were so excited, we didn't know the language so well but we knew it was new and we knew we wanted to replace Objective-C, so we started to write some stuff in Swift. I was actually the person who set up the project to be able to be written in Swift, in November 2014. For some reason I thought that it was a good design pattern to have a file with all the global static functions you might need. Like I said, we had no idea how to write Swift at this point. Then I was also the person who a year later deleted the last piece of Swift 1.0 code from our code base.

Pretty much what had happened was, we were like, yes, let's write lots of stuff in Swift, so we started to do that and then we started to see some of the things that did not work so well. Like compile times, like Xcode crashing constantly, and we started to rewrite those parts back in Objective-C. I came along to do the Xcode 7 upgrade and I was like, I don't want to convert this code to Swift 2. Oh, it's not even being used, I'm just going to delete it.

Then we started again with Swift 2. In this new application we have 8000 Swift files. A small portion, maybe 10 to 15% of those, are code generated. What we do is, we take specifications and we code gen some of the model objects that our engineers need. We have 700,000 lines of Swift. That's a lot of Swift. We've been talking to a lot of other companies in the Bay, we think this might be one of the biggest ones. It's also quite slow to compile this much Swift.
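The talk does not say what the specifications look like or which generator is used. As a rough illustration of the idea only, here is a toy generator that turns a hypothetical in-memory spec (`ModelSpec`, `FieldSpec`) into Swift struct source.

```swift
// Toy illustration of model code generation; the spec format is an assumption.

struct FieldSpec {
    let name: String
    let type: String
}

struct ModelSpec {
    let name: String
    let fields: [FieldSpec]
}

/// Renders a spec as the source text of an immutable Swift struct.
func generateStruct(from spec: ModelSpec) -> String {
    var lines = ["struct \(spec.name) {"]
    for field in spec.fields {
        lines.append("    let \(field.name): \(field.type)")
    }
    lines.append("}")
    return lines.joined(separator: "\n")
}

let riderSpec = ModelSpec(name: "Rider", fields: [
    FieldSpec(name: "id", type: "String"),
    FieldSpec(name: "rating", type: "Double"),
])
let generated = generateStruct(from: riderSpec)
```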


Uber loves Swift

But we do love Swift. The funny thing is that I was talking to someone at work about this and they were like, “We love Swift? But we've had so many problems with it, I don't understand why we would say that.” On the spot I was like, well, my cat bites me all the time and wakes me up in the middle of the night and knocks things off tables but I still love him, right? There are so many features in Swift that we really do love. Sometimes when we're all complaining about some of the things that we don't like so much I'm like, yeah, but think about all the stuff that we get today that we did not have in Objective-C. A lot of the people at Uber were Objective-C people for a very long time, maybe six years, maybe a little bit more. Some people 12 years, a little bit longer. We got so much cool stuff in just three years of Swift. I'm always trying to remind people about all the good stuff as well.


Some pain points

There were some pain points. 

Compile Times

On the day that we launched the new application, this was how long it took to build it, and that on the left was how long it took to build the old one. Two minutes, and you can guess how long the new one took. We kind of lived with this for a little while. Since I was one of the people on the build time, well, build tooling and developer experience team, I got a lot of feedback constantly for months about these ever-growing compile times. Suddenly out of nowhere one day, we got an email. One of our engineers, his brother works at Airbnb. Airbnb had noticed that if they just fiddled with some of the settings they could actually squeeze a little bit of compiler performance out of their app. You might be thinking to yourself, wait, hold on, isn't whole module optimization supposed to slow down our build? I mean you usually turn that on for release builds, right? We actually turned it on for debug builds, and I'll show you how we did that.

If you go to Xcode and you go to the Swift compiler section and you set whole module optimization on for debug, what it actually does is add the fast optimization flag, -O, to the actual swiftc compile command that you can see in your build log. That's not actually what you want.

The trick here is, this one weird trick, is that you set your optimization level for debug to none, which I think is actually the default. Then you go down to your user-defined build settings and you add SWIFT_WHOLE_MODULE_OPTIMIZATION, set to YES for Debug.



Instead what you end up seeing inside your build log is the same sort of thing but you'll see -Onone, so no optimization, and the whole module optimization flag.
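Written out as an Xcode configuration file, the setup described above would look roughly like this. The file name is illustrative; the two build settings are the ones named in the talk.

```
// Debug.xcconfig (illustrative file name)

// Keep the optimizer off for debug builds.
SWIFT_OPTIMIZATION_LEVEL = -Onone

// User-defined setting: compile the module as a whole anyway.
SWIFT_WHOLE_MODULE_OPTIMIZATION = YES
```

Whether this helps depends on the project and the Xcode version, so it is worth measuring clean and incremental builds before and after.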

We found that by turning this on we actually reduced our clean build times by about 50%. We've been wary about talking about this because we don't really understand why. We have some theories, and we just hired some people to work on the Swift compiler, so hopefully soon we'll be able to write a blog post and tell you why this is. If you look at the whole module optimization manifesto inside the Swift repository, the way that it changes the compilation is that instead of triggering as many parallel compiles of Swift files as it can and generating object files directly from those, it ingests all of them, looks at the whole module and then starts spitting out object files. Our theory is that with no optimizations it's skipping some of the actual slow bits but it is still applying that parallel optimization, which we think is improving the compile time for us: it spends a little bit of time up front figuring out the best way to compile the files, and then it compiles way more Swift files than it otherwise could.

Another thing. We turned this on very recently. 



There's a front-end option, this is apparently an unsupported feature that they committed sometime late last year. If you turn this on you can set what your minimum acceptable compile time is for type checking. We turn this on for debug builds just as warnings, and we show engineers as they're typing their code whether or not it takes a little bit longer than expected to type check.
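The transcript does not preserve the flag shown on the slide. One Swift frontend flag that matches this description is `-warn-long-function-bodies`, which takes a threshold in milliseconds; treating it as the flag in question is an assumption, and the 200 ms threshold below is only an example.

```
// Debug-only: warn when a function body takes longer than 200 ms to type check.
// The flag choice and the threshold are assumptions, not taken from the talk.
OTHER_SWIFT_FLAGS[config=Debug] = $(inherited) -Xfrontend -warn-long-function-bodies=200
```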


ld

Another problem that we had was with linking. One of the issues is, you might have an engineer who's making a change to the code and they've decided to import another framework. We have 180, well, we have 80 internal frameworks that are just for this application but in total we have 180 dynamic frameworks. 



If you're importing another module, you might put this code up, and then it passes on CI but then fails on master. We were like, how is this possible? What we learned was that sometimes people would import something that they thought they could import, it passes green, and the reason why it passed green was because the build would nondeterministically build that thing before you needed to link it, so at link time it would go, oh, yeah, I have it, cool, I'll use it. Then for other builds, such as on master or someone else's pull request later, it would not build in the exact same order and then it would fail, because the import doesn't have anything in the project to say, hey, where is this framework?

Pretty much the fix there is to obviously link against that library. With Buck, we've written some tooling around that. Buck files specify dependencies, and we just rolled out Swift support for Buck internally at Uber, so what we do is look at all the imports, look at the declared dependencies, and fail pull requests where they don't match. That's how we got around that, but for a long time we were doing this manually. Something would land in master and it would break other people's pull requests and we would have to manually add this little thing in.
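The import check can be sketched as a small lint. This is not Uber's tooling: the function names are hypothetical, the parsing only handles the plain `import Module` form, and the declared dependency set is assumed to have been read from the target's build file already.

```swift
// Hypothetical sketch of an import-vs-dependency lint; not Uber's tooling.

/// Returns the module names imported by a Swift source file.
func importedModules(in source: String) -> Set<String> {
    var modules = Set<String>()
    for line in source.split(separator: "\n") {
        let words = line.split(separator: " ")
        // Matches the plain `import Module` form only.
        if words.count >= 2, words[0] == "import" {
            modules.insert(String(words[1]))
        }
    }
    return modules
}

/// Imports that the target's build file does not declare as dependencies.
func undeclaredImports(source: String,
                       declared: Set<String>,
                       system: Set<String> = ["Foundation", "UIKit"]) -> [String] {
    return importedModules(in: source)
        .subtracting(declared)
        .subtracting(system)
        .sorted()
}

let source = """
import Foundation
import Networking
import Payments

struct Fare {}
"""
// Only Networking is declared, so Payments should be reported.
let missing = undeclaredImports(source: source, declared: ["Networking"])
```

A CI step built on this idea would fail the pull request whenever `missing` is not empty, instead of relying on build order to expose the problem.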


Xcode

Who's seen this before? 



Maybe more often in the early days of Swift than now. We still see this all the time, and it's just because of the size of our code base and the amount of stuff that needs to be indexed. We don't have a good fix for that. The only thing we do is the same thing everyone else does: we file Radars and we attach sysdiagnose logs.

For us, indexing is a huge problem. When you've just checked out the repo fresh or you've done a pull from master for several days' changes, Xcode will open up and it'll just sit there indexing for quite a long time. The worst thing is, you kind of take for granted how much the indexing actually gives you. While you're typing, pretty much nothing works. Another problem for us is that while we're typing it takes half a second for every character to appear on screen. As you can imagine our engineers are very angry about this. They were so angry that some of them just turned off indexing entirely. However, there is some good news. In Xcode 8.3, this actually is about 50% faster. As you can imagine we're going to try and update to Xcode 8.3 as soon as we can and I recommend that you do that as well.


App Size

Let's talk a little bit about app size. As you know, there is a limit on how big an application can be when it comes to downloading it over cellular networks. For us this is even more important because sometimes somebody will be out in the middle of nowhere and they need to get a ride and they're downloading Uber for the first time. There's no wifi network near them. They need to be able to download this app. With app thinning obviously this gets a little easier, but we support iOS 8, and we really wanted the experience for iOS 8 users to be just as good as for everyone else.

So, there were a couple of things that we noticed. The first one is structs, large structs. We code gen our model objects, right, so we ended up with quite large structs. They're stored on the stack depending on how big, once they get to a certain size they start to be stored on the stack. The other thing also is when they get flattened, because they're big and nested, it also increases the binary size.

There's also the problem of the Swift Standard Libraries. We're not getting ABI stability in Swift 4, is what I heard recently, so this is going to continue to be a problem. This adds a couple of megabytes to our app binary that we absolutely cannot do anything about. 

Finally, generic specialization. I think someone gave a talk on this, maybe yesterday. When generics are expanded out to match all the different types that they're used for, this also increases your binary size because essentially they're copied in a couple of places.
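As a small illustration of what "expanded out" means, here is one generic function used at three concrete types. The function is made up for this example.

```swift
// Illustrative only: one generic function, used at three concrete types.
func largest<T: Comparable>(_ values: [T]) -> T? {
    var best: T? = nil
    for value in values {
        // Keep the current best unless this value is strictly larger.
        if let current = best, current >= value { continue }
        best = value
    }
    return best
}

// With optimization on, the compiler may emit a separate specialized copy of
// `largest` for Int, Double and String. That is faster to call, but each copy
// takes up space in the binary.
let maxInt = largest([3, 9, 4])
let maxDouble = largest([2.5, 1.0])
let maxString = largest(["ios", "swift"])
```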


Startup Time

Then the other problem for us was startup time. Here when I say pre-main, what I essentially mean is all of the bits that you can't super control, right? From the moment that somebody taps "open" on the application to where your execution starts, there's this period right here where the operating system is doing some funky things to get you all set up. One of those things that it does is it starts to load all of the dynamic libraries. There is a problem where at a certain point you have so many dynamic libraries that your application gets quit by the watchdog on slow devices.

There's a couple of things here to talk about. 

The first one is, what we did is, we re-linked the object files back into the main application library as part of our build step. This doesn't work for the Swift runtime libraries so there is a fixed cost of 250 milliseconds even on an iPhone 6S. 

The other thing also is, we learned early on that we just needed to start doing performance tests on every single commit that goes through our code base. We learned the hard way that the number of developer or enterprise provisioning profiles can affect the startup speed. Now when we do performance tests we have to completely wipe that device of all of its provisioning profiles before we install the app and run it to make sure that we're always getting the right values. We built a lot of tooling to graph these pre-main times so that we know that we're never going to get caught in the future with something unexpected.
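A per-commit check on pre-main time could be as simple as comparing medians against the previous commit. The sketch below is hypothetical: the function names, the 5% tolerance and the sample numbers are made up. It assumes the timings were collected separately, for example from the output dyld prints when the `DYLD_PRINT_STATISTICS` environment variable is set.

```swift
// Hypothetical per-commit startup check; names, tolerance and numbers are illustrative.

/// Median of a set of measurements, in milliseconds.
func median(_ samples: [Double]) -> Double? {
    guard !samples.isEmpty else { return nil }
    let sorted = samples.sorted()
    let mid = sorted.count / 2
    return sorted.count % 2 == 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid]
}

/// True when the new median is slower than the baseline by more than `tolerance`.
func isRegression(baseline: [Double], candidate: [Double], tolerance: Double = 0.05) -> Bool {
    guard let old = median(baseline), let new = median(candidate) else { return false }
    return new > old * (1 + tolerance)
}

let baselineRuns = [410.0, 402.0, 415.0]   // pre-main ms on the previous commit
let candidateRuns = [455.0, 460.0, 449.0]  // pre-main ms on the commit under test
let regressed = isRegression(baseline: baselineRuns, candidate: candidateRuns)
```

Using the median of several runs rather than a single run keeps one noisy launch from failing a commit.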

Also we learned that if we reorder the symbols in the application binary we could re-link the application with the order that they're actually used. You start up your application and the order that they're actually used might be different to the order that they were linked in. By reordering that we actually got a speed boost on the iPhone 4S of 20%.


Swift Versions

Finally, you've probably wondered at this point, like, what version of Swift were these guys using? We started off with Swift 2.2 when we began our re-write, and then we migrated to Swift 3. A lot of code got migrated to Swift 3. 6000 files changed, a lot of insertions, a lot of deletions. Obviously this is super hard. Anyone who's done this migration can attest to how horrible it might have been for you, and it was even worse for us. For a couple of weeks we just got some engineers to even understand the problem. To look at all the different changes, to predict how much time it might take, to attempt to run the migrator, which would crash every single time after waiting for an hour. Then finally we came up with a plan. A couple of engineers worked on some of the core frameworks that were pretty much depended on by everyone, some of our testing modules, some of our foundation modules. All of that kind of stuff. Then what we did was, for five days we threw 45 engineers at all of our modules and we broke master on a Friday and we said, master is broken, we've updated all of these bottom low level libraries, now everybody else have at it. Finally on Monday the app was compiling and not much else worked.


Uber’s Takeaways

I just wanted to end with some of the things that we learned and some of the things that you could take away from this talk. 

The first one is, keep an eye on compile times. I've worked on small teams before where we took compile times for granted. Now that I've seen an app like this, the only thing I can say is that, start with tooling from the get go on monitoring your compile times so that later you're not finding out from your engineers that they're bad, you're finding out because a computer is telling you that they're bad.

The other one is, no compiler setting is untouchable. We just thought that we could not turn on whole module optimization without increasing our build times, and it wasn't until we got a random email out of nowhere that we found out that yes we could. The mantra from now on at Uber is, you can touch any compiler setting you want. Most of them will break the app but as long as we're learning and we're measuring and we're profiling, we might find something there where we can squeeze a little bit more performance out.

Monitoring the binary size before you ship. We only learned about our binary size problems a few weeks before we shipped. The worst time to find out about that. My suggestion is, if this is a concern for you, if not being able to download on cellular might be a problem for your customers, start monitoring it from the get go when you start your application.

Too many dynamic libraries can affect your startup performance. I think that there is a great open source CocoaPods plugin that can fix this. I don't remember the name of it myself but I would recommend that you use that. I don't use an iPhone 4S every day, I don't use an iPhone 5, so in general the only way that we know that these things are slow is because we've set up a lot of monitoring around these things and we do performance tests. I would expect that not everybody is also testing their app on every possible device, but if you are using quite a number of dynamic frameworks you should consider actually using that plugin, that CocoaPods tool.

Aggressively adopt new versions of Xcode. Unfortunately I am the person that does Xcode upgrades at Uber and they take quite a long time. They usually take a couple of weeks to prep. The point updates are usually okay, the major updates break everything. Our motto recently has just been: how can we change our process so that we can test all the little things that we need to do when we manually test a new version of Xcode before it comes out? So that on the day that it's out we can just switch to it, especially if it gives us things like improved indexing time.

Consider using Buck. I actually didn't mention much about Buck in this talk because some of you may have been to the talk I gave at Sydney CocoaHeads last week. I think there will be a video out about that soon. We've also got a blog post that's going up on our engineering blog about how we migrated to a monolithic repository using Buck. We've been working very closely with the Facebook team that built Buck on improving the Swift support. Right now the Swift support in Buck can only support essentially targets with just Swift and no other code in them and no dependencies on other code, which doesn't really work very well. In the last couple of months we've actually been improving this. We've built out full Swift support for targets that have both Swift and Objective-C and dependencies on other targets that have that and we've also added support for dynamic frameworks into Buck. We're slowly merging that stuff in in the next few months so just check out buckbuild.com and the docs will soon start reflecting some of the changes we've made. If you want to learn a little bit more I'll be over in the speaker room.

The last one is, file radars. I will admit that there was a period where I didn't want to file radars ever because Apple kept closing them as duplicates or never looking at them or never telling me if they were fixed. Then I learned that once Apple went open source, this whole thing really changed. They look at every single radar. They need radars to be able to fix stuff. Now when we see a problem in Xcode or in the compiler or anywhere, the first thing we say is, "Have we filed a radar?" Because if we haven't filed a radar, why are we complaining about it? Let's file a radar, let's let Apple know, and then let's see what we can do.


What's next for Uber? 

Like I said, the Buck Swift support is going to get a little bit better. I know Buck is a very different tool and I don't expect that everyone is just going to go home and then switch everything to Buck. Obviously that takes a lot of effort, but if you are thinking of using a different build tool, we want to make it really good for Swift and there's a lot of things that you get out of using Buck, so we're going to work on making that better. 

Compiler contributions. We realized that we were benefiting a lot from the work that other teams in other companies had done on the Swift compiler since it had become open source, and now we want to start doing some of these things ourselves. We just started hiring some people for this, and if you're interested in that I'd love to talk to you.

And of course, more talks like this. We've only spoken a little bit about this new application and Swift but we're going to start doing a little bit more of that and a little bit more blogging as well.


Thank you

This is my last slide. Usually when I do talks I always put my email address in really small text because I'm like, please don't email me, but it's actually been a huge honor for me to come back to Australia. I love this place, I was born here, I've met so many of you over the last week, I want to help you, and not just about this tooling stuff. Not just about Swift. If you've got anything you want to talk about, working at Uber, working in Silicon Valley, what it's like to move over there, career stuff, if you're a graduate, I'd love to hear from you. Here's my email address, send me anything you want and I promise I'll respond. Thank you very much.


Links:

If you enjoyed this talk, you can find more info: