Background

Types of Services Served

 

HBase is a database built on the Hadoop ecosystem and is naturally well suited to offline workloads. Because the LSM tree is an excellent high-throughput storage structure, HBase also serves many online services. Online services are sensitive to access latency and their access patterns tend to be random, for example order lookups and customer-service track queries. Offline services are usually scheduled, batch data-warehouse jobs that process data over a period of time and produce results; they are not sensitive to completion time but have complex processing logic, for example daily reports, security and user-behavior analysis, and model training.

 

Multilingual support

 

HBase itself offers several access paths, and the engineers (RD) on each Didi business line have their own language preferences, so multi-language support is crucial for HBase adoption within Didi. We provide access in multiple ways: the HBase Java native API, Thrift Server (mainly used from C++, PHP, and Python), Phoenix JDBC, Phoenix QueryServer (Phoenix's multi-language solution), MapReduce jobs (HTable/HFile input), Spark jobs, and Streaming.
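As an illustration of the most direct path, the Java native API, here is a minimal sketch of a single point read; the table name, column family, and qualifier are invented for the example, not actual Didi schemas.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class NativeApiExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();            // reads hbase-site.xml
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("order_status"))) {
            Get get = new Get(Bytes.toBytes("example-rowkey"));       // single point read
            Result result = table.get(get);
            byte[] status = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("status"));
            System.out.println(status == null ? "not found" : Bytes.toString(status));
        }
    }
}
```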

 

Data Types

 

At Didi, HBase stores the following types of data:

 

  1. Statistical results and report data: mainly operations, capacity, revenue, and similar results, usually queried through Phoenix with SQL. The data volume is small, but query flexibility requirements are high.
  2. Raw fact data: orders, driver and passenger GPS tracks, logs, and so on, mainly used to feed both online and offline consumers. Large data volume, high consistency and availability requirements, latency sensitive, written in real time, and queried as single points or in batches.
  3. Intermediate result data: data needed for model training. Large data volume, low availability and consistency requirements, and high throughput requirements for batch queries.
  4. Backup data of online systems: users keep the primary copy in another relational database or file service and use HBase as a remote disaster-recovery (DR) solution.

Application Scenarios

Scenario 1: Order events

 

Anyone who has used a Didi product has seen this data: the historical orders shown in the app. Queries for recent orders go to Redis; queries beyond a certain time range, or queries made when Redis is unavailable, go to HBase. The business side's requirements are as follows:

  1. Online queries of the order life cycle, including status, event_type, and order_detail; most of these queries come from the customer-service system.
  2. Online historical order detail queries. Redis sits above HBase and holds recent orders; if Redis is unavailable or the query range exceeds what Redis holds, the query goes directly to HBase.
  3. Offline analysis of order status.
  4. Write 10K events per second and read 1K events per second; data must be readable within 5 s of being written.




 

Order status table

Rowkey: reverse(order_id) + (MAX_LONG - ts); Columns: the order's status information

 

Order history table

Rowkey: reverse(passenger_id | driver_id) + (MAX_LONG - ts); Columns: the user's orders and related information within the time range
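A minimal sketch of how such rowkeys could be built with the HBase client utilities; the helper names are ours, not Didi's, and they assume string IDs and millisecond timestamps.

```java
import org.apache.hadoop.hbase.util.Bytes;

public class OrderRowkeys {
    // reverse(order_id) + (MAX_LONG - ts): reversing the ID spreads sequential orders
    // across regions, and the inverted timestamp makes the newest events sort first.
    static byte[] orderStatusRowkey(String orderId, long tsMillis) {
        String reversedId = new StringBuilder(orderId).reverse().toString();
        return Bytes.add(Bytes.toBytes(reversedId), Bytes.toBytes(Long.MAX_VALUE - tsMillis));
    }

    // reverse(passenger_id | driver_id) + (MAX_LONG - ts) for the order history table.
    static byte[] orderHistoryRowkey(String userId, long tsMillis) {
        String reversedId = new StringBuilder(userId).reverse().toString();
        return Bytes.add(Bytes.toBytes(reversedId), Bytes.toBytes(Long.MAX_VALUE - tsMillis));
    }
}
```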

 

Scenario 2: Driver and passenger tracks

 

Track data is also closely tied to Didi's users; it is consumed by online users, Didi's various business lines, and analysts. A few example scenarios: when a user views a historical order, the map shows the route that was taken; when a dispute arises between a driver and a passenger, customer service pulls up the order's track to reconstruct what happened; the map department analyzes road congestion.

 

 

  1. Real-time or near-real-time track-coordinate queries for app users and back-end analysts;
  2. Support for large-scale offline trajectory analysis;
  3. Given a geographic range, retrieve the tracks of all users within the range, or the set of users who appeared within it.

For the third requirement, the location query: MongoDB supports this kind of geographic index natively, but at Didi's data volume it could hit a storage bottleneck, whereas HBase has no problem with storage capacity and scalability; HBase, however, has no built-in location index comparable to MongoDB's, so we had to implement one ourselves. Through research we found the fairly common GeoHash algorithm for geographic indexing.

 

GeoHash encodes a two-dimensional latitude/longitude point as a string, and each string represents a rectangular region: all points inside that rectangle share the same GeoHash string. For example, if I am at the Youtang Hotel and a friend of mine is at the Youtang shopping mall, our coordinates produce the same GeoHash string. This provides some privacy (the string represents an approximate area rather than an exact point) and makes caching easier.
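A minimal sketch of the standard GeoHash encoding (interleaving longitude and latitude bits and emitting base32 characters); this is the textbook algorithm, not necessarily Didi's exact implementation.

```java
public class GeoHash {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    // Encode a latitude/longitude pair into a GeoHash of the given character length.
    public static String encode(double lat, double lon, int precision) {
        double[] latRange = {-90.0, 90.0};
        double[] lonRange = {-180.0, 180.0};
        StringBuilder hash = new StringBuilder();
        boolean evenBit = true;   // bits alternate: longitude first, then latitude
        int bit = 0, ch = 0;
        while (hash.length() < precision) {
            if (evenBit) {
                double mid = (lonRange[0] + lonRange[1]) / 2;
                if (lon >= mid) { ch = (ch << 1) | 1; lonRange[0] = mid; }
                else            { ch = ch << 1;       lonRange[1] = mid; }
            } else {
                double mid = (latRange[0] + latRange[1]) / 2;
                if (lat >= mid) { ch = (ch << 1) | 1; latRange[0] = mid; }
                else            { ch = ch << 1;       latRange[1] = mid; }
            }
            evenBit = !evenBit;
            if (++bit == 5) {                 // every 5 bits become one base32 character
                hash.append(BASE32.charAt(ch));
                bit = 0;
                ch = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        // Two nearby points share a prefix; with coarse enough precision they share the hash.
        System.out.println(encode(39.9288, 116.3889, 6));
        System.out.println(encode(39.9290, 116.3891, 6));
    }
}
```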

 




 

 

The rowkey designs for the two query scenarios are as follows:

 

  1. Single user, by order or time range: reverse(user_id) + (MAX_LONG - ts/1000)
  2. Trajectory query within a given geographic range: reverse(GeoHash) + ts/1000 + user_id (see the scan sketch after this list)
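For the range query, all points that fall into one GeoHash cell share a rowkey prefix, so they can be fetched with a prefix scan. A minimal sketch follows, assuming an open `Table` handle for the track table and a prefix string built according to the rowkey design above.

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class TrackScan {
    // Scan every track point whose rowkey starts with the given GeoHash-cell prefix.
    static void scanTracksByCell(Table table, String rowkeyPrefix) throws IOException {
        Scan scan = new Scan();
        scan.setRowPrefixFilter(Bytes.toBytes(rowkeyPrefix));
        try (ResultScanner scanner = table.getScanner(scan)) {
            for (Result r : scanner) {
                // each Result is one track point; the rest of the rowkey carries ts/1000 and user_id
                System.out.println(Bytes.toString(r.getRow()));
            }
        }
    }
}
```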

Scenario 3: ETA

 

ETA is the estimated arrival time and price shown whenever an origin and a destination are selected. The initial version ran purely offline; later versions use HBase as a key-value cache to achieve a near-real-time effect, which shortens training time, allows multiple cities to be processed in parallel, and reduces manual intervention.

The whole ETA process is as follows:

 

  1. A Spark job runs model training for each city every 30 minutes.
  2. In the first stage of training, the job reads data for all cities from HBase under the configured conditions, within about 5 minutes (a read sketch follows this list);
  3. In the second stage, the ETA model is computed within about 25 minutes;
  4. Data in HBase is periodically dumped to HDFS for testing new models and for feature extraction.
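A minimal sketch of the first stage, reading training data from HBase into Spark through `TableInputFormat`; the table name is a placeholder and the job is an illustration under those assumptions, not Didi's actual pipeline.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class EtaTrainingRead {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("eta-read"));
        Configuration conf = HBaseConfiguration.create();
        conf.set(TableInputFormat.INPUT_TABLE, "eta_features");   // placeholder table name
        // each HBase region becomes one Spark partition, so the read scales with the table
        JavaPairRDD<ImmutableBytesWritable, Result> rows =
            sc.newAPIHadoopRDD(conf, TableInputFormat.class,
                               ImmutableBytesWritable.class, Result.class);
        System.out.println("rows read: " + rows.count());
        sc.stop();
    }
}
```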

Rowkey: salt + city_id + type0 + type1 + type2 + ts; Columns: order and feature data
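A minimal sketch of the salting part of this rowkey; the number of salt buckets, the city identifier, and the field layout are assumptions for illustration.

```java
import org.apache.hadoop.hbase.util.Bytes;

public class EtaRowkey {
    // salt + city_id + type0 + type1 + type2 + ts: a one-byte salt derived from a hash of
    // the natural key spreads otherwise monotonically increasing writes across regions.
    static byte[] build(String cityId, String type0, String type1, String type2,
                        long ts, int saltBuckets) {
        String natural = cityId + type0 + type1 + type2 + ts;
        byte salt = (byte) Math.abs(natural.hashCode() % saltBuckets);
        return Bytes.add(new byte[] { salt }, Bytes.toBytes(natural));
    }
}
```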

 


Scenario 4: Monitoring tool DCM

 

DCM monitors the resource usage of the Hadoop clusters (NameNode, YARN container usage, and so on). A relational database runs into all kinds of performance problems as data accumulates along the time dimension, and we also want to analyze and query the data with SQL, so we chose Phoenix. A collector program writes data in on a schedule; reports are generated and stored in HBase, query results come back within seconds, and the results are displayed in the front end.
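A minimal sketch of how such a report query might be issued through the Phoenix JDBC driver; the ZooKeeper quorum, table, and column names are invented for illustration.

```java
import java.sql.Connection;
import java.sql.Date;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class DcmReportQuery {
    public static void main(String[] args) throws Exception {
        // "jdbc:phoenix:<zookeeper-quorum>" is the standard Phoenix thick-client URL
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk1,zk2,zk3:2181");
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT cluster, SUM(used_containers) AS used "
                 + "FROM YARN_USAGE WHERE collect_time > ? GROUP BY cluster")) {
            ps.setDate(1, Date.valueOf("2017-06-01"));
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("cluster") + "\t" + rs.getLong("used"));
                }
            }
        }
    }
}
```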

 




 






Didi manages multiple tenants in HBase

 

We believe single-cluster multi-tenancy is one of the most efficient and labor-saving approaches, but because HBase provides almost no multi-tenant management out of the box, using it bare leads to many problems: users do not analyze their resource usage before or after onboarding, storage growth is neither adjusted for nor communicated, projects go online and offline without a plan, and everyone wants as many resources and permissions as possible. Platform administrators also run into difficulties: they can only learn about user services through ad-hoc communication, the status of each project connected to HBase is unclear, it is hard to judge whether a user's requirements are reasonable, multiple tenants in one cluster compete for resources, and locating and troubleshooting problems takes a long time.

 

In view of these problems, we developed the DHS (Didi HBase Service) system for project management, and we partition user resources, data, and permissions on HBase with techniques such as namespaces and RS Groups. Resource allocation is controlled by computing cost and billing for usage.
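A minimal sketch of the namespace side of this isolation using the HBase Admin API; the namespace name is hypothetical, and the RS Group assignment itself comes from the HBASE-6721 patch described later.

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.NamespaceDescriptor;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class ProjectNamespace {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            // one namespace per DHS project; its tables can then be mapped to an RS Group
            admin.createNamespace(NamespaceDescriptor.create("project_demo").build());
        }
    }
}
```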

 




 

  1. Project life-cycle management: project approval, resource estimation and application, and adjustment and discussion of project requirements;
  2. User management: permission management and project approval;
  3. Cluster resource management;
  4. Table-level usage monitoring: read/write monitoring, memstore, block cache, and locality.

When users need HBase storage, we ask them to register a project in DHS, describing the business scenario, product details, and SLA requirements.

 

Next, tables are created and their performance requirements are estimated. We ask users to estimate the resources they are going to use as accurately as possible; if a user finds that difficult, we discuss it with them online or offline to help pin down the numbers.

A project overview page is then generated so that administrators and users can track the project's progress.

 

The JMX metrics that HBase exposes are aggregated at the region and RegionServer level. Administrators use this information often, but users rarely look at that level of detail. We therefore built table-level monitoring with permission control, so that RDs see only the tables related to their own projects and can track the throughput and storage usage of those tables.

 

Once DHS has made users' resource usage visible to them, we use RS Group technology to divide a cluster into multiple logical groups, letting users choose between shared and exclusive resources. Each option has its own advantages and disadvantages, as shown in Table 1.

 

 

  1. Data in a backup or testing phase, with low requirements on access latency, access volume, and availability: use the shared resource pool.
  2. Latency-sensitive, high-throughput, high-traffic, high-availability online services: give them an exclusive RS Group made up of a certain number of machines, provisioned with 20% to 30% headroom on top of the user's estimated resource needs.

Finally, we periodically calculate and bill users based on their use of resources.

RS Group

For details of RegionServer Groups, see the HBASE-6721 patch; on top of it, Didi tuned the allocation strategy to fit its own business scenarios. In short, an RS Group is a named list of RegionServers to which tables can be assigned as needed. If a table in one group has problems, its regions do not migrate to other groups, so each group acts as a logical sub-cluster. This provides resource isolation and reduces management cost without having to build a separate cluster for every high-SLA business line.

 

Conclusion

Throughout Didi's promotion and practice of HBase, we have found that helping users get table schema design and resource control right is crucial; with these two things in place, the probability of later problems drops sharply. Good schema design requires the user to understand HBase clearly, but most business users put their effort into business logic and know little about the underlying architecture, so the platform team has to help and guide them; a good start and a few success stories then make it easier to promote HBase to other business lines. Resource isolation and control help us effectively reduce the number of clusters and lower operational costs, freeing the platform team from the endless work of running many clusters so they can spend more effort on following the community and building the platform's management systems. This puts the platform into a virtuous cycle, improves the user experience, and better supports the development of the company's business.

 

This article was originally published in Programmer magazine

Li Yang is a senior software development engineer at Didi Chuxing. He joined the Basic Platform Department of Didi Chuxing in 2015 and is mainly responsible for HBase, Phoenix, and related distributed storage technologies. Before Didi, he worked as a data engineer at Sina, focusing on distributed computing and storage.