Why we decided to rewrite the Uber driver app

This is the first in a series of articles on how Uber’s client engineering team developed the latest version of the driver app, codenamed Carbon, a core component of our ride-sharing business. Among other new features, the app helps more than 3 million drivers earn and keep track of their income. In 2017, we began redesigning the driver app based on feedback from drivers, and we launched the new app in September 2018.

In early 2017, we decided to rewrite the driver app from scratch. It is a decision that Stack Overflow CEO Joel Spolsky once called “the single worst strategic mistake that any software company can make.”

A rewrite is extremely risky and consumes a great deal of time and resources before it delivers any benefit to users. This one involved hundreds of engineers, along with designers, product managers, data analysts, operations, legal, and marketing. In all, it took us a year and a half to complete and roll out globally.

Our situation is an extreme case of a problem that engineers in every organization face. If you are an engineer preparing to rewrite some code or a feature, you might ask, “How much of our development time will this consume?” If you are on a small team in a large organization, you might ask, “Is a change this large worth the features we won’t get to build?” A good engineer and a good team will weigh these broader questions carefully before taking on a rewrite.

Therefore, although a rewrite involves important technical decisions (to be covered in a future article), the decision to rewrite has to weigh technical considerations together with broader business concerns. These questions are hard to answer, but you need good answers to justify a rewrite to your team or organization.

Finally, these decisions do not come out of nowhere. Our decision to rewrite the app rested not on theoretical architectural musings (“Our code could be better if we…”), but on three months of intensive research, hundreds of pages of documentation, and broad cross-organizational support. In the following sections, we explore the decision to rewrite Uber’s driver app and what we discovered in the process.

Laying the foundation

The need for a rewrite rarely arises from nothing more than the appeal of a new architecture. Rewrites are enormously expensive, and although engineering organizations often want to rewrite code, engineers need to spend their time building new features rather than rebuilding the same functionality over and over. For the driver app, three trends helped drive the decision to rewrite:

Technical debt

First, the driver app itself carried technical debt. The debt resulted from Uber’s rapid growth and from changing product requirements (discussed in the next section). On top of that, some of the debt came from attempts to pay down earlier debt: the app was caught in several ongoing migrations at once, which made its functionality steadily more complex.

It’s worth pointing out that the driver app’s technical debt was not merely theoretical. We saw real business impact as developer productivity fell under the weight of ongoing outages and maintenance costs. At the end of 2016, we had to suspend feature development on the app to fix several regressions. Until those problems were solved, developing and launching new features was very difficult.

Any outage in the driver app is a serious problem for us because our users depend on it for their livelihood. We believe that anything less than 99.99% availability is unacceptable, yet we often shipped releases with serious problems in the app’s core flows.
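To give a sense of how strict a 99.99% target is, the short Python sketch below converts an availability fraction into a downtime budget. The function name and the chosen periods are illustrative assumptions and do not describe how Uber actually measures availability.

```python
def allowed_downtime_minutes(availability: float, period_days: float) -> float:
    """Return the minutes of downtime permitted over a period.

    `availability` is a fraction such as 0.9999 (that is, 99.99%).
    This is a hypothetical helper written for illustration only.
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - availability)


# Downtime budgets implied by a 99.99% target.
yearly = allowed_downtime_minutes(0.9999, 365)
monthly = allowed_downtime_minutes(0.9999, 30)

print(f"per year:  {yearly:.2f} minutes")   # roughly 52.56 minutes
print(f"per month: {monthly:.2f} minutes")  # roughly 4.32 minutes
```

At this target, a single bad release that breaks a core flow for an hour would use up more than the whole year’s budget.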

Product challenges

One of the biggest problems we faced was that the previous version of the driver app did not fit our newer lines of business. Early on, the app was designed and iterated around preferred-car rides, but our business has since grown to include carpooling, special offers, and markets that run on offline cash transactions, among others.

We also found that drivers need other features to manage their assets and their own businesses. Transparent income and ratings, for example, are crucial to the driver experience, yet early versions of the driver app underinvested in them. We needed a product that could scale to capabilities like these.

Figure 1: In the previous version of the driver app, the tab bar at the bottom had grown beyond what it was originally designed for (left), and the map was overloaded with business pins and route overlays (right).

We took some initial steps to ease these pain points in 2015 and 2016 by releasing an updated version. Unfortunately, we organized parts of the UI around our own teams rather than around drivers’ needs and workflows. If you look closely at our UI from that period, you will see four tabs: Income, Ratings, Settings, and Home. Each tab grew more and more bloated, and the Income and Ratings tabs were often repurposed for needs they were never designed to serve.

The lessons we learned from iterating on the app, together with our long-term product plan, led us to completely rethink how the driver app should serve our driver-partners. Even if a rewrite was not inevitable, a redesign was necessary.

Engineering alignment

Our engineering team had already made some investments in the new direction. In particular, alongside the rewrite of the rider app in 2016, we introduced a new mobile architecture called RIBs (an evolution of the VIPER architecture) to help us handle our growing scale. It addressed most of the driver app’s problems with an extensible framework, a robust application architecture, and a compelling memory management model. We open-sourced RIBs in 2017.

While RIBs improved our rider app, it also represented a new direction for our mobile organization. Going forward, our core platform would be focused on RIBs. Having each app maintain its own architecture would cost more than standardizing on one.

The decision-making process

With a new UI design and a new architecture in hand, we essentially had three options: redesign the driver app without adopting RIBs, migrate the existing driver app to RIBs, or rewrite the driver app from scratch on RIBs.

Redesign without RIBs

The first option was to redesign the app without adopting RIBs. Its appeal is that migrating to a new architecture is resource intensive. RIBs brings not only a large new code base but also a new way of building applications, one that decouples business logic from view logic. RIBs is compelling, but it is also very invasive.

First, we considered whether the existing app could handle the major product changes we had in mind. We found that much of the business logic was tightly coupled to the view layer, because some of that logic lived in view controllers. This meant that a UI redesign would necessarily involve extensive changes to business logic.
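The cost of that coupling can be illustrated with a small Python sketch. Everything in it is hypothetical: the class names, the fare rule, and its numbers are invented for the example, and the code uses neither the RIBs API nor anything from the driver app.

```python
from dataclasses import dataclass


class CoupledTripViewController:
    """Coupled style: the business rule lives inside the view code."""

    def render(self, distance_km: float, surge: float) -> str:
        fare = round(2.50 + 1.20 * distance_km * surge, 2)  # business rule (invented)
        return f"<b>Fare: ${fare:.2f}</b>"                   # presentation


@dataclass(frozen=True)
class TripSummary:
    """Plain data handed from business logic to any view."""
    fare: float


class FareCalculator:
    """Decoupled style: the business rule knows nothing about views."""
    BASE_FARE = 2.50  # invented value
    PER_KM = 1.20     # invented value

    def summarize(self, distance_km: float, surge: float) -> TripSummary:
        fare = round(self.BASE_FARE + self.PER_KM * distance_km * surge, 2)
        return TripSummary(fare=fare)


class LegacyTripView:
    def render(self, summary: TripSummary) -> str:
        return f"<b>Fare: ${summary.fare:.2f}</b>"


class RedesignedTripView:
    """A redesign replaces only this class; FareCalculator is untouched."""

    def render(self, summary: TripSummary) -> str:
        return f"Estimated fare\n${summary.fare:.2f}"


summary = FareCalculator().summarize(distance_km=10, surge=1.5)
print(LegacyTripView().render(summary))
print(RedesignedTripView().render(summary))
```

In the coupled version, changing the layout means editing the same method that computes the fare, so every UI change risks a business logic regression. In the decoupled version, the redesign is a new view class over the same `TripSummary`.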

Second, as mentioned earlier, the existing driver app architecture had problems that needed to be addressed. Some had to do with app logic, and in some places (Android in particular) we had run into a pattern familiar to mobile developers: various flavors of MVC had produced bloated view controllers, and much of our core code lived in controller files thousands of lines long. We did not want to make the existing architecture worse, more complex, and harder to maintain.

Finally, even if the old driver app architecture had worked perfectly, adopting RIBs made long-term strategic sense because it would eliminate architectural differences across Uber’s apps. A shared architecture multiplies the benefits of our platform: code written by one team (such as the rider app team) can be reused by another (such as the driver app team).

So if we were going to adopt RIBs, how should we go about it?

Migration

Many teams prefer a cautious, incremental migration that lets them keep developing new functionality while the system’s architecture changes. This works well in most cases; here we discuss some of the problems Uber has had with this approach.

First, we analyzed ten major migrations carried out at Uber over the past few years and found that they had a high failure rate. In a typical case, we set out to migrate to a new base library, but the migration was never completed. New features were built on the new library and some old features were migrated, but some legacy code kept running on the old library.

After further investigation, we found that such migrations were the root cause of much of the driver app’s technical debt. For example, we had race conditions on Android because our app’s publish-subscribe mechanism was split across two implementations. Our core application architecture was originally built on Android fragments and later partially migrated to an internal framework. These incomplete migrations confused developers and complicated the layers bridging old and new code, and they eventually resulted in outages that directly affected our users.
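The article does not describe exactly how the split publish-subscribe mechanism produced race conditions. The Python sketch below shows one plausible mechanism under stated assumptions: two independent buses, each with its own delivery queue, so that the order in which subscribers observe events depends on which bus is drained first. `QueuedBus`, `observed_order`, and the event names are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


class QueuedBus:
    """A tiny publish-subscribe bus that delivers events only when drained."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = {}
        self._pending: Deque[Tuple[str, str]] = deque()

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, payload: str) -> None:
        self._pending.append((topic, payload))

    def drain(self) -> None:
        # Deliver queued events in the order this bus received them.
        while self._pending:
            topic, payload = self._pending.popleft()
            for handler in self._subscribers.get(topic, []):
                handler(payload)


def observed_order(drain_order: List[str]) -> List[str]:
    """Publish two ordered events across two buses and record what is seen."""
    buses = {"legacy": QueuedBus(), "new": QueuedBus()}
    seen: List[str] = []
    for bus in buses.values():
        bus.subscribe("trip", seen.append)

    buses["legacy"].publish("trip", "accepted")  # emitted by un-migrated code
    buses["new"].publish("trip", "completed")    # emitted later by migrated code

    for name in drain_order:
        buses[name].drain()
    return seen


print(observed_order(["legacy", "new"]))  # events arrive in publish order
print(observed_order(["new", "legacy"]))  # events arrive reversed
```

Each bus preserves order on its own, but nothing orders events across the two. Completing the migration, so that only one bus exists, removes this class of bug.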

Second, we often found that migrations themselves introduced a lot of instability. Many of our outages stemmed from attempts to improve the underlying application framework, such as the networking protocol. In principle, such changes should have no direct effect on users, but they ended up affecting the app’s core functionality.

Finally, in our experience, even the promise of uninterrupted feature development went unfulfilled. A team that depends on an ongoing migration is often blocked until the migration is complete. The reverse also holds: rolling back a migration usually means rolling back many of the features built on it as well.

Therefore, when we evaluated pairing a complete product redesign with a migration to RIBs, we judged the risk too high: an incomplete migration, or an endless pile of migration layers, would greatly increase the app’s instability.

Rewrite

To some extent, we reached this decision by elimination (the other options, redesigning without RIBs and migrating, were not viable), but a rewrite also had benefits of its own, which increased our confidence in the final decision.

First, a rewrite let us redesign the app more productively, without being bound by how it worked before. That meant the design could be more flexible.

Second, choosing to rewrite meant our architecture would be cleaner, because it would follow a coherent strategic direction from the start. Had we chosen migration, we might have held on to legacy code simply because reusing it was convenient.

Third, rewriting the app forced us back to the drawing board to think more thoroughly about the product direction we wanted. As a result, some of the app’s major frameworks would be rebuilt as well.

A rewrite is a challenging opportunity for any engineer, and we could not wait to get started.

Conclusion

It’s worth emphasizing that the decision to rewrite the driver app was not based on a feeling that “it would be better if we could redo it.” In fact, some engineers might be surprised to hear that even after the rewrite, we shipped the app not only with new features and a new architecture, but also with a bit of new technical debt.

In other words, nothing turns out perfect. Engineers reading this article should question any conclusion that a rewrite is the only way to produce perfect code and that migration is never appropriate. Instead, it is important to recognize that the decision to rewrite should be made within very specific organizational, business, and technical constraints.

If we had not developed a new mobile architecture a few months earlier, we probably would not have rewritten the app. If we had not had a product team investigating a redesign, we probably would not have rewritten it. If Uber’s previous migrations had succeeded, we might not have rewritten it. What drove us to rewrite was certainly not a belief that rewrites are inherently good, or even usually a good idea.

Instead, the rewrite of the driver app grew out of wanting to create a more reliable and robust product experience for our users, while also empowering our teams with this release. The decision may not have been as exciting as designing an optimal abstraction layer, but it produced a successful and thoroughly improved mobile app.