Original article: https://mp.weixin.qq.com/s/I3FnP1JtY5QZCY2D3QioyQ
Lindorm is an important part of the big data storage and processing layer of Alibaba's Feitian (Apsara) operating system. It is a distributed NoSQL database for big data that evolved from HBase, combining large-scale, high-throughput, fast, flexible, and real-time hybrid access capabilities, and providing world-leading high-performance, cross-domain, multi-consistency, multi-model hybrid storage and processing for massive-data scenarios. Today, Lindorm serves essentially all structured and semi-structured big data storage scenarios across the Alibaba economy.
Note: Lindorm is the internal name of the HBase branch within Alibaba. The version sold on Alibaba Cloud is called HBase Enhanced Edition. In this article, "HBase Enhanced Edition" and "Lindorm" refer to the same product.
Extreme optimization, super performance
Compared with HBase, Lindorm is deeply optimized in RPC, memory management, caching, and log writing, and introduces many new techniques that greatly improve read/write performance. On the same hardware, Lindorm achieves more than 5x the throughput of HBase, with latency spikes reduced to roughly 1/10 of HBase's. These numbers were not produced under laboratory conditions: they were measured with the open-source benchmark tool YCSB without tuning any parameters. We have published the test tools and scenarios in the Alibaba Cloud documentation, and anyone can follow the guide to reproduce the same results.
Trie Index
Lindorm's data file, LDFile (similar to HFile in HBase), is a read-only B+ tree structure in which the file index is a critical data structure. The index has high priority in the block cache and should reside in memory as much as possible. If we can shrink the file index, we free up valuable block-cache memory for data; alternatively, with the same index footprint we can increase index density and reduce data-block size to improve performance. HBase's index blocks store full Rowkeys, yet in a sorted file many rowkeys share the same prefix, so a Trie (prefix tree) structure can eliminate this redundancy and compress the index.
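To make the idea concrete, here is a minimal, illustrative sketch (not Lindorm's actual index format) of how a prefix tree can index sorted rowkeys so that a shared prefix is stored only once:

import java.util.TreeMap;

// Illustrative sketch: a prefix tree (trie) mapping rowkeys to data-block offsets.
// A shared prefix such as "order_2019_" is stored once along the trie path,
// so the index is much smaller than a flat list of full rowkeys.
public class TrieIndexSketch {

    static final class Node {
        TreeMap<Character, Node> children = new TreeMap<>();
        long blockOffset = -1;          // >= 0 if a rowkey ends at this node
    }

    private final Node root = new Node();

    // Insert a rowkey and the offset of the data block that contains it.
    public void put(String rowkey, long blockOffset) {
        Node cur = root;
        for (char c : rowkey.toCharArray()) {
            cur = cur.children.computeIfAbsent(c, k -> new Node());
        }
        cur.blockOffset = blockOffset;
    }

    // Look up the block offset for an exact rowkey, or -1 if absent.
    public long get(String rowkey) {
        Node cur = root;
        for (char c : rowkey.toCharArray()) {
            cur = cur.children.get(c);
            if (cur == null) return -1;
        }
        return cur.blockOffset;
    }

    public static void main(String[] args) {
        TrieIndexSketch index = new TrieIndexSketch();
        // The common prefix "order_2019_" is stored only once in the trie.
        index.put("order_2019_0001", 0L);
        index.put("order_2019_0002", 65536L);
        index.put("order_2019_0003", 131072L);
        System.out.println(index.get("order_2019_0002"));   // 65536
    }
}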
ZGC support, 100 GB heap average 5ms pause
ZGC (powered by Dragonwell JDK) is a representative of the next generation of pauseless GC algorithms. Its core idea is that the mutator uses a memory read barrier to detect pointer changes, so that most of the Mark and Relocate work can be performed concurrently with the application.
Running ZGC stably on a 100 GB heap relies on a series of optimizations, including:
- Lindorm memory self-management technology, which reduces the number of objects and the memory allocation rate by an order of magnitude (for example, CCSMap, contributed by the Alibaba HBase team to the community).
- AJDK ZGC page caching mechanism optimization (locks, page caching policy).
- AJDK ZGC trigger-timing optimization, eliminating ZGC concurrent-mode failures.
AJDK ZGC has run stably on Lindorm for two months and successfully passed the Double Eleven test. Its JVM pause times are stable at around 5 ms, with maximum pauses not exceeding 8 ms. ZGC greatly improves the RT and latency-spike metrics of online clusters, with average RT improved by 15%~20% and P999 RT cut roughly in half. In this year's Double Eleven, on the Ant risk-control cluster, ZGC helped reduce P999 latency from 12 ms to 5 ms.
Note: The units in the figure should be µs; the average GC pause is about 5 ms.
LindormBlockingQueue
VersionBasedSynchronizer
LDLog is used in Lindorm for data recovery during system failover, guaranteeing the atomicity and durability of data. Every write must first go to the LDLog; only after the LDLog write succeeds can subsequent operations such as writing the memstore proceed. Handlers in Lindorm therefore have to wait for the WAL write to complete before being woken up to continue. Under high pressure, ineffective wake-ups cause a large number of CPU context switches and degrade performance. To address this, Lindorm developed a version-based, highly concurrent multi-path synchronizer that greatly reduces context switching.
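The following is a minimal, illustrative sketch of the version-based idea, not Lindorm's actual VersionBasedSynchronizer: instead of signaling each handler individually, the WAL sync thread publishes a monotonically increasing "synced version", and each handler simply waits until the published version reaches the sequence number of its own write.

import java.util.concurrent.atomic.AtomicLong;

// Sketch of a version-based synchronizer: writers wait until the WAL has synced
// up to (at least) their own sequence number. One advance of the version wakes
// every waiter whose condition is now satisfied, avoiding per-handler signaling
// and the context switches it causes.
public class VersionSynchronizerSketch {

    private final AtomicLong syncedVersion = new AtomicLong(0);
    private final Object monitor = new Object();

    // Called by the WAL sync thread after fsync: publish the highest durable sequence id.
    public void advanceTo(long version) {
        syncedVersion.accumulateAndGet(version, Math::max);
        synchronized (monitor) {
            monitor.notifyAll();          // wake all handlers; each re-checks its own condition
        }
    }

    // Called by a handler: block until the WAL is durable up to `version`.
    public void awaitVersion(long version) throws InterruptedException {
        if (syncedVersion.get() >= version) return;   // fast path, no blocking
        synchronized (monitor) {
            while (syncedVersion.get() < version) {
                monitor.wait();
            }
        }
    }
}

A production-quality implementation would additionally shard waiters across several monitors (the "multi-path" part) so that a single lock does not become the new contention point.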
Completely lock-free
The HBase kernel takes a large number of locks on the critical path; in high-concurrency scenarios these locks cause thread contention and performance degradation. The Lindorm kernel removes locks from key paths such as MVCC and WAL. In addition, HBase produces various metrics at runtime, such as QPS, RT, and cache hit ratio, and even the seemingly "humble" operations that record these metrics involve many locks. To solve this, Lindorm drew on the ideas behind tcmalloc and developed LindormThreadCacheCounter to eliminate the performance cost of metrics recording.
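As a rough illustration of the tcmalloc-style idea (not Lindorm's actual LindormThreadCacheCounter), each thread accumulates into its own striped cell, so hot-path increments never contend on a shared lock, and an aggregated value is computed only when the metric is read. The JDK's LongAdder already implements exactly this striping, so the sketch simply wraps it:

import java.util.concurrent.atomic.LongAdder;

// Rough illustration of a thread-cached metrics counter: increments go to
// per-thread cells (no shared lock on the hot path); the total is aggregated
// only when the metric is reported.
public class ThreadCachedCounterSketch {

    private final LongAdder readRequests = new LongAdder();

    // Hot path: called by every handler thread, contention-free in the common case.
    public void onReadRequest() {
        readRequests.increment();
    }

    // Cold path: called periodically by the metrics reporter.
    public long snapshot() {
        return readRequests.sum();
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadCachedCounterSketch qps = new ThreadCachedCounterSketch();
        Runnable worker = () -> { for (int i = 0; i < 1_000_000; i++) qps.onReadRequest(); };
        Thread t1 = new Thread(worker), t2 = new Thread(worker);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(qps.snapshot());   // 2000000
    }
}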
Handler coroutines
In high-concurrency applications, a single RPC request is often handled by multiple sub-modules and involves several I/O operations. As these sub-modules cooperate, the system context-switches very frequently. Context-switch optimization is an unavoidable topic for high-concurrency systems, and the industry has many ideas and practices; among them, coroutines and SEDA (staged event-driven architecture) are the approaches we focused on. Considering engineering cost, maintainability, and code readability, Lindorm chose coroutines for asynchronous optimization. We use the built-in Wisp2.0 feature of the Dragonwell JDK provided by the Alibaba JVM team to run HBase handlers on coroutines. Wisp2.0 works out of the box, effectively reducing the system's resource consumption, and the optimization effect is considerable.
New Encoding algorithm
To improve performance, HBase usually loads Meta information into the block cache. If the block size is small, the Meta information becomes large and cannot be fully cached, degrading performance. If the block size is large, sequentially scanning through an encoded block to locate a row becomes the bottleneck for random reads. To solve this problem, Lindorm developed Indexable Delta Encoding, which allows fast lookup within a block through an intra-block index, greatly improving seek performance.
With Indexable Delta Encoding, the random seek performance of an HFile is doubled compared with before, and for 64 KB blocks it is almost the same as with no encoding at all (other encoding algorithms incur some performance loss). For random Gets with a full cache hit, RT decreases by 50% compared with Diff Encoding.
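To convey the general idea, here is a conceptual sketch, not Lindorm's actual on-disk format: delta (prefix) encoding stores each key as a shared-prefix length plus a suffix, and an intra-block index of "restart" entries lets a reader binary-search to a nearby position instead of decoding the block sequentially from the start.

import java.util.ArrayList;
import java.util.List;

// Conceptual sketch of "delta encoding + intra-block index".
// Each key is stored as (sharedPrefixLen, suffix) relative to the previous key.
// Every RESTART_INTERVAL-th key is stored in full and recorded in an index, so a
// seek binary-searches the index and then decodes only a few entries.
public class IndexedDeltaBlockSketch {
    private static final int RESTART_INTERVAL = 4;

    private final List<Integer> sharedLens = new ArrayList<>();
    private final List<String> suffixes = new ArrayList<>();
    private final List<Integer> restartPos = new ArrayList<>();   // entry position of each restart
    private final List<String> restartKeys = new ArrayList<>();   // full key at each restart
    private String lastKey = "";

    public void add(String key) {                                 // keys added in sorted order
        int shared = 0;
        if (sharedLens.size() % RESTART_INTERVAL == 0) {          // restart point: store the full key
            restartPos.add(sharedLens.size());
            restartKeys.add(key);
        } else {
            int max = Math.min(lastKey.length(), key.length());
            while (shared < max && lastKey.charAt(shared) == key.charAt(shared)) shared++;
        }
        sharedLens.add(shared);
        suffixes.add(key.substring(shared));
        lastKey = key;
    }

    // Return the first key >= target, or null if none.
    public String seek(String target) {
        int lo = 0, hi = restartKeys.size() - 1, startIdx = 0;
        while (lo <= hi) {                                        // binary search on the intra-block index
            int mid = (lo + hi) >>> 1;
            if (restartKeys.get(mid).compareTo(target) <= 0) { startIdx = mid; lo = mid + 1; }
            else hi = mid - 1;
        }
        String prev = "";
        for (int i = restartPos.get(startIdx); i < sharedLens.size(); i++) {
            String key = prev.substring(0, sharedLens.get(i)) + suffixes.get(i);
            if (key.compareTo(target) >= 0) return key;
            prev = key;
        }
        return null;
    }

    public static void main(String[] args) {
        IndexedDeltaBlockSketch block = new IndexedDeltaBlockSketch();
        for (String k : new String[]{"row001", "row002", "row003", "row010", "row011", "row020"}) block.add(k);
        System.out.println(block.seek("row005"));   // row010
    }
}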
Other
Compared with the HBase community edition, Lindorm includes dozens of performance optimizations and re-architected components and introduces many new technologies. Due to space constraints, only some are listed here. Other core technologies include:
- CCSMap
- Quorum-based write protocol for automatically avoiding failures
- Efficient group commit
- High-performance fragmentation-free cache: Shared BucketCache
- Memstore Bloomfilter
- Efficient data structures for reading and writing
- GC-invisible memory management
- Separation of online computing and offline job architecture
- JDK/operating system deep optimization
- FPGA offloading of Compaction
- User-mode TCP acceleration
- …
Rich query models, lower development threshold
WideColumn model (native HBase API)
WideColumn is an access model and data structure identical to HBase's, which makes Lindorm 100% compatible with the HBase API. Users can access Lindorm through the WideColumn API in the high-performance native client provided by Lindorm, or, via the Alihbase-Connector plug-in, with the standard HBase client and API (no code modification required). Lindorm also adopts a lightweight client design that pushes much of the data routing, batch dispatching, timeout, and retry logic down to the server, together with many optimizations in the network transport layer to save CPU on the application side. Compared with HBase, Lindorm improves client CPU efficiency by 60% and network bandwidth efficiency by 25%.
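Since the WideColumn model is 100% compatible with the HBase API, existing HBase code can run against Lindorm unchanged. Below is a minimal sketch using the open-source HBase client; the ZooKeeper address, table name, and column family are illustrative placeholders, and the connection configuration will differ when using Lindorm's native client or the Alihbase-Connector.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Minimal example of the standard HBase client API; the same code works
// against Lindorm's WideColumn model. Endpoint, table, and column family
// below are placeholders for illustration.
public class WideColumnExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "zk-host1,zk-host2,zk-host3");   // placeholder endpoint
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("order_table"))) {
            // Write one row.
            Put put = new Put(Bytes.toBytes("order_0001"));
            put.addColumn(Bytes.toBytes("f"), Bytes.toBytes("status"), Bytes.toBytes("paid"));
            table.put(put);
            // Read it back.
            Result result = table.get(new Get(Bytes.toBytes("order_0001")));
            System.out.println(Bytes.toString(
                result.getValue(Bytes.toBytes("f"), Bytes.toBytes("status"))));
        }
    }
}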
In addition, the HBase native API supports high-performance secondary indexes. When data is written through the HBase native API, index data can be transparently written into index tables. During queries, a Scan + Filter query that might otherwise scan the whole table is turned into a query on the index table first, greatly improving query performance.
TableService model (SQL, secondary index)
HBase supports only the Rowkey index, so multi-field queries are inefficient. Users therefore have to maintain multiple tables to serve different query scenarios, which increases application complexity and makes it hard to guarantee data consistency and write efficiency. Moreover, HBase provides only simple KV APIs such as Put, Get, and Scan, with no data types: all data must be converted and stored by the user. For developers used to SQL, the barrier to entry is high and the process error-prone.
To resolve these pain points, lower the barrier to entry, and improve development efficiency, we added the TableService model to Lindorm. It provides rich data types, a structured query expression API, and native support for SQL access and global secondary indexes, solving many technical challenges and greatly lowering the threshold for ordinary users. With SQL and SQL-like APIs, users can use Lindorm much as they would a relational database. Here is a simple example of Lindorm SQL.
-- Primary table and index DDL
create table shop_item_relation (
    shop_id varchar,
    item_id varchar,
    status varchar
constraint primary key(shop_id, item_id));
create index idx1 on shop_item_relation (item_id) include (ALL);           -- index on the second primary-key column, redundantly storing all columns
create index idx2 on shop_item_relation (shop_id, status) include (ALL);   -- multi-column index, redundantly storing all columns

-- Write data; both indexes are updated synchronously
upsert into shop_item_relation values('shop1', 'item1', 'active');
upsert into shop_item_relation values('shop1', 'item2', 'invalid');

-- Queries automatically use the appropriate index
select * from shop_item_relation where item_id = 'item2';                           -- hits idx1
select * from shop_item_relation where shop_id = 'shop1' and status = 'invalid';    -- hits idx2
FeedStream model
In these scenarios, traditional message queues (such as Kafka) have several limitations:
- Storage: not suitable for long-term data storage; data usually expires in days.
- Deletion ability: a specified data entry cannot be deleted.
- Query capability: complex query and filtering conditions are not supported.
- Consistency and performance are hard to guarantee at the same time: throughput-oriented systems such as Kafka may lose data in some cases to improve performance, while transactional message queues have limited throughput.
- Rapid partition expansion: the number of partitions in a Topic is fixed and cannot be expanded quickly.
- Physical vs. logical queues: usually only a small number of physical queues are supported (for example, each partition can be regarded as a queue), but businesses need to simulate logical queues on top of physical ones. For example, an IM system maintains a logical message queue per user, which often requires a lot of extra development work.
To meet these requirements, Lindorm provides a queue model called FeedStreamService, which can solve problems such as message synchronization, device notification, and auto-increment ID assignment for massive numbers of users.
Note: This model is already in trial on the HBase Enhanced Edition of Alibaba Cloud. Interested users can contact the cloud HBase support DingTalk account or open a ticket on Alibaba Cloud.
Full-text indexing model
Although the TableService model in Lindorm provides data types and secondary indexes, Solr and ES remain superior full-text search engines. Using Lindorm together with Solr/ES maximizes the strengths of both, allowing us to build sophisticated big data storage and retrieval services. Lindorm has a built-in external index synchronization component that automatically synchronizes data written to Lindorm to an external index component such as Solr or ES. This model is well suited to businesses that need to store large amounts of data where the fields used as query conditions are only a small part of the original data and must be combined arbitrarily in queries, for example:
- In logistics scenarios, a large amount of tracking information needs to be stored and queried with arbitrary combinations of multiple fields.
- In traffic monitoring scenarios, a large number of vehicle-passing records are stored, and records of interest are retrieved by any combination of vehicle attributes.
- In member and product search scenarios on various websites, large amounts of product/member information are stored and must support complex, arbitrary queries on a few fields to meet users' ad-hoc search needs.
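A common access pattern with this model (a sketch of the pattern, not a specific Lindorm API) is to query the search engine first for matching row keys and then fetch the full records from the wide table in one batch. The searchRowkeysFromEs helper and the table name below are hypothetical placeholders:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch of the "search engine for conditions, wide table for data" pattern:
// 1) the external index (Solr/ES) answers the multi-condition query and returns rowkeys;
// 2) a batched Get fetches the full rows from the wide table.
public class SearchThenGetSketch {

    // Hypothetical placeholder: in a real system this would call the ES/Solr client
    // with the combined conditions and return the matching primary keys.
    static List<String> searchRowkeysFromEs(String plate, String dateRange) {
        return List.of("20191111_plateA_0001", "20191111_plateA_0002");
    }

    static List<Result> fetchRows(Connection conn, List<String> rowkeys) throws Exception {
        try (Table table = conn.getTable(TableName.valueOf("vehicle_record"))) {   // illustrative table name
            List<Get> gets = new ArrayList<>();
            for (String rk : rowkeys) gets.add(new Get(Bytes.toBytes(rk)));
            Result[] results = table.get(gets);        // one batched round trip for all rows
            return Arrays.asList(results);
        }
    }
}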
More models on the way
In addition to the above models, we will develop more easy-to-use models based on business needs and pain points, to make Lindorm easier to use and further lower the barrier to entry. Time-series models, graph models, and more are on the way, so stay tuned.
High availability with zero intervention and second-level recovery
Growing from infancy to maturity, Alibaba HBase has taken many falls, some of them painful, and we are fortunate to have grown under the trust of our customers. Over nine years of use inside Alibaba we have accumulated a large body of high-availability technology, which has been applied to the HBase Enhanced Edition.
MTTR optimization
HBase is an open-source implementation based on Google's famous BigTable paper. Its core design is that data is persisted in HDFS, the underlying distributed file system, which maintains multiple replicas to guarantee the reliability of the whole system, so HBase itself does not need to manage replication or consistency. This simplifies the overall engineering, but it also introduces a "single point of service" limitation: the reads and writes of any piece of data are served by only one fixed node. When that node goes down, the data can return to service only after the log is replayed to rebuild the in-memory state and the regions are loaded onto a new node.
Adjustable consistency levels
In the HBase architecture, each region can be online on only one RegionServer. If that RegionServer goes down, the region must go through re-assignment, WAL splitting, and WAL replay before it can serve reads and writes again. This recovery can take several minutes, which is an unsolvable pain point for demanding businesses. In addition, although HBase supports active/standby synchronization, on failure the clusters can only be switched manually at cluster granularity, and data between the active and standby clusters can only be eventually consistent, whereas some services require strong consistency.
On top of this architecture, Lindorm offers several consistency levels, and users can choose among them based on their business needs.
Client HA switchover
HBase can currently run in active/standby mode, but there is no efficient client-side switchover solution on the market. An HBase client can access only an HBase cluster at a fixed address. If the primary cluster fails, the user must stop the client, modify its configuration, and restart it to connect to the standby cluster, or else design a complex set of access logic on the business side for reaching both clusters. Alibaba HBase has modified the HBase client so that traffic switchover happens inside the client: after a switchover command is delivered to the client over a high-availability channel, the client closes the old connections, opens connections to the standby cluster, and retries the request.
Cloud native, lower cost of use
Lindorm was designed for the cloud from the start of the project; its designs reuse cloud infrastructure wherever possible and are specially optimized for cloud environments. For example, in addition to cloud disks, we also support storing data in OSS, a low-cost object store, to reduce costs. We have also made many optimizations for ECS deployment, adapted to instance types with small memory, and strengthened deployment flexibility, all in the spirit of cloud native and to save customers money.
ECS + cloud disk: extreme elasticity
The HBase Enhanced Edition (Lindorm) on the cloud is deployed on ECS plus cloud disks (some large customers may use their own dedicated hardware). The ECS + cloud disk deployment gives Lindorm extreme elasticity. By contrast, traditional physical-machine deployments have the following problems:
- Service elasticity is hard to meet: when unexpected traffic peaks or abnormal requests occur, it is difficult to obtain new physical servers for capacity expansion in a short time.
- Storage/compute coupling limits flexibility: the ratio of CPU to disk on a physical server is fixed, but each service has different characteristics. With the same server type, some services run out of compute while storage sits idle, and others have spare compute but hit storage bottlenecks. This is especially true after HBase introduced hybrid storage, where it is hard to determine the right ratio of HDDs to SSDs: demanding services often exhaust SSD space while HDDs remain idle, and a large amount of SSD capacity on offline-service machines goes unused.
- Heavy O&M pressure: with physical machines, operators must track warranty expiry, disk failures, NIC failures, and other hardware faults; repairing a physical machine is a long process that requires taking the machine offline, so the O&M burden is heavy. For a massive storage service like HBase, several disk failures a day is normal. All of these problems are solved when Lindorm is deployed on ECS plus cloud disks.
Integrated cold/hot data separation
In massive-big-data scenarios, part of the data in a table becomes archival or rarely accessed as time goes by, yet this historical data, such as order data or monitoring data, is large in volume. Reducing its storage cost saves enterprises a great deal of money. Lindorm's cold/hot separation feature exists precisely to cut storage cost dramatically with minimal operation and configuration effort: Lindorm provides a new storage medium for cold data whose cost is only one third of that of an efficient cloud disk.
ZSTD-v2: another 100% improvement in compression ratio
Two years ago we switched the group's storage compression algorithm to ZSTD, gaining an extra 25% compression over SNAPPY. This year we went further and implemented a new ZSTD-v2 algorithm. For compressing small blocks of data, we proposed training a dictionary on pre-sampled data and then using the dictionary to accelerate compression. When building Lindorm's LDFiles, we take advantage of this capability to sample and train on the data, build the dictionary, and then compress. In tests on various business data sets, we achieved up to 100% higher compression ratios than the native ZSTD algorithm, which means another 50% saving in storage cost for customers.
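The general dictionary-compression workflow looks roughly like the sketch below. It uses the open-source zstd-jni binding (com.github.luben:zstd-jni); the class and method names are taken from that library as best recalled and should be treated as an assumption, not as Lindorm's internal ZSTD-v2 API.

import com.github.luben.zstd.Zstd;
import com.github.luben.zstd.ZstdDictCompress;
import com.github.luben.zstd.ZstdDictTrainer;
import java.nio.charset.StandardCharsets;

// Sketch of dictionary-based compression for small blocks: sample representative
// records, train a shared dictionary, then compress each small block with it.
public class ZstdDictSketch {
    public static void main(String[] args) {
        // 1) Train a dictionary from pre-sampled data. Dictionary training needs a
        //    reasonably large, representative sample set; these records are synthetic.
        ZstdDictTrainer trainer = new ZstdDictTrainer(4 * 1024 * 1024, 4 * 1024);
        for (int i = 0; i < 10_000; i++) {
            String record = "order_id=" + i + ",status=paid,channel=app,city=hangzhou";
            trainer.addSample(record.getBytes(StandardCharsets.UTF_8));
        }
        byte[] dict = trainer.trainSamples();

        // 2) Compress a small block with the shared dictionary (compression level 3).
        ZstdDictCompress sharedDict = new ZstdDictCompress(dict, 3);
        byte[] block = "order_id=424242,status=paid,channel=app,city=hangzhou"
                .getBytes(StandardCharsets.UTF_8);
        byte[] compressed = Zstd.compress(block, sharedDict);
        System.out.println("raw=" + block.length + " bytes, compressed=" + compressed.length + " bytes");
    }
}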
HBase Serverless, preferred for getting started
The HBase Serverless edition on Alibaba Cloud is a new HBase service built on the Lindorm kernel with a Serverless architecture, and it truly turns HBase into a service. Users no longer need to plan resources in advance, choose CPU and memory specifications, or purchase a cluster; they do not need to perform complex operations such as scaling out for business peaks or storage growth, nor waste idle resources during business troughs.
Security and multi-tenant capabilities for large accounts
The Lindorm engine has a complete user-name/password system built in, providing multiple levels of permission control and authenticating every request to prevent unauthorized data access and keep user data secure. In addition, Lindorm provides multi-tenant isolation features such as Groups and Quota limits, ensuring that different businesses within an enterprise sharing the same cluster do not affect each other and can share one big data platform securely and efficiently.
User and ACL system
The Lindorm kernel provides an easy-to-use user authentication and ACL system. For authentication, users only need to fill in the user name and password in the configuration. The password is not stored in plaintext on the server and is not transmitted in plaintext during authentication, so even if the authentication traffic is intercepted, it cannot be reused or forged.
Group isolation
When multiple users or services use the same HBase cluster, resource contention occurs, and the reads and writes of important online services may be affected by batch reads and writes of offline jobs. The Group feature of the HBase Enhanced Edition (Lindorm) is designed to solve this multi-tenant isolation problem.
Quota throttling
A complete Quota system is built into the Lindorm kernel to limit the resource usage of individual users. For each request, the Lindorm kernel precisely calculates the CU (Capacity Unit) consumed, based on the resources actually used. For example, a Scan request with a filter may return little data, yet the RegionServer may have consumed a large amount of CPU and I/O to filter it; this real consumption is counted into the CU. When using Lindorm as a shared big data platform, an enterprise administrator can assign a separate user to each business and use the Quota system to limit a user's read CUs per second or total CUs per second, preventing any single user from monopolizing resources and affecting others. Quota limits are also supported at the namespace and table level.
Conclusion
Lindorm, the new-generation NoSQL database, is the result of nine years of technical accumulation by the Alibaba HBase & Lindorm team. It provides world-leading high-performance, cross-domain, multi-consistency, multi-model hybrid storage and processing for massive-data scenarios, focusing on simultaneously meeting the demands of big data (unlimited scaling, high throughput), online services (low latency, high availability), and multi-functional queries. It offers users seamless scaling, high throughput, continuous availability, stable millisecond-level response, tunable consistency, low storage cost, and richly indexed real-time hybrid data access.
Finally
If you like this article, remember to follow me. Thank you for your support!