Preface

Hello everyone, I am ChinaManor, which literally translates to Chinese code farmer. I hope I can become a pathfinder on the road of national rejuvenation, a ploughman in the field of big data, an ordinary person who is unwilling to be mediocre.

This is the mind map for real-time technology. Manor will now update the reading notes on Chapter 5, Real-Time Technology, of “Alibaba Big Data Practice”.

The big promotion is the carnival of the e-commerce industry. During this period, the load on every business system reaches its peak. Processing the massive data of each year's big promotion poses great challenges to the performance and reliability of real-time computing.

Characteristics of the big promotion

Compared with ordinary days, the big promotion differs greatly in data volume and requirements. Details that go unnoticed on ordinary days are magnified during the big promotion, and the peak within a day is especially pronounced: data volume is several times or even dozens of times that of other periods. This places very high demands on the system's ability to withstand pressure; the system must not be overwhelmed by the flood.

  1. Millisecond-level latency. During the big promotion, both the business side and users pay close attention to real-time data, especially at midnight. The first real-time figure to tick over matters greatly to the business side because it marks the real start of the promotion carnival. Other products, such as the big screen broadcast live to global media, also need millisecond-level latency. This demands both high throughput and low latency, so specific optimizations are needed to meet the requirement.

  2. Pronounced flood peak. The big promotion is a carnival for the whole country and even the world. The peak at the midnight sale is very pronounced, generally dozens of times the daily peak, which challenges every system on the data processing link. For this reason, full-link stress tests and contingency plan reviews are carried out several times before the promotion to make sure the system can withstand the flood peak.

  3. High guarantee. Because so many people are watching, any data delay or data quality problem draws a strong reaction from the business side, which notices the anomaly immediately. The big promotion therefore generally requires a high level of guarantee, and some special cases require a strong guarantee. Strongly guaranteed data needs multi-link redundancy, with every stage from collection and processing to the data service physically isolated (see Figure 5.7). When any one link has a problem, it must be possible to switch to a standby link with one click, transparently to the business side, so that downstream consumers do not notice the switch. Because the links compute at slightly different speeds, the data may dip briefly at the moment of switching, so the consumer needs to compare metric values to keep the dip from confusing users.

  4. Public disclosure. During the big promotion, publishing data to the public in a timely manner is an important task, and it demands very high quality from data computed in real time. This involves a series of issues such as primary key filtering, deduplication precision, and consistent metric definitions. Only by getting every step right can real-time data be kept consistent with offline data.
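The brief dip described under the high-guarantee item can be handled on the consumer side by never letting a cumulative metric decrease. The following is a minimal sketch, assuming the metric is a running total that only grows; the class name `MonotonicMetric` and the sample values are hypothetical and not taken from the book.

```python
class MonotonicMetric:
    """Keeps a cumulative metric shown to users from ever going down."""

    def __init__(self) -> None:
        self._shown = 0.0

    def update(self, reported: float) -> float:
        # Keep the largest value seen so far, so a standby link that
        # lags slightly behind cannot pull the displayed number down.
        if reported > self._shown:
            self._shown = reported
        return self._shown


guard = MonotonicMetric()
primary = [100.0, 250.0, 400.0]   # hypothetical values from the primary link
standby = [380.0, 395.0, 460.0]   # standby link lags slightly at the switch
shown = [guard.update(v) for v in primary + standby]
print(shown)
```

The displayed value holds at the last primary reading until the standby link catches up. This only makes sense for metrics that are expected to be non-decreasing, such as a cumulative transaction total.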

The big promotion challenges data computation on four fronts: high throughput, low latency, high guarantee, and high accuracy.

Guarantees for the big promotion

  1. How to optimize real-time tasks

Optimization is critical in real-time computing: if throughput cannot keep up, the computation loses its real-time character. Poor throughput has many causes. Some are related to system resources, and some are related to how the task is implemented.

    (1) Strategy for exclusive and shared resources. On a machine, the shared resource pool can be preempted by multiple real-time tasks. If a task has to compete for resources for most of its running time, allocate more exclusive resources to it, so that failing to grab resources does not cause a sharp drop in throughput.

    (2) Choose a suitable caching mechanism and minimize reads and writes to the database. Memory gives the best read and write performance. Choose a caching mechanism based on the business characteristics so that the hottest and most likely used data stays in memory.

    (3) Merge computing units to reduce the topology depth. The deeper the topology, the worse the performance, because part of the data passed between nodes must be serialized and deserialized, which costs CPU and time.

    (4) Share memory objects to avoid string copies. In massive data processing, most objects exist as strings. Sharing objects between threads where appropriate greatly reduces the cost of copying characters, but beware of memory overflow caused by improper use.

    (5) Balance high throughput against low latency. The two are inherently in tension. Merging multiple database read/write operations or ACK operations into one greatly reduces the cost of network requests, but it also increases latency. The trade-off has to be made according to the business.
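The trade-off in point (5) can be illustrated with a small micro-batching writer. This is only a sketch under stated assumptions: `BatchWriter`, `max_batch`, and `max_wait_s` are hypothetical names, the flush callback stands in for a real database write, and the time limit is checked only when a new record arrives (a production version would also need a timer).

```python
import time
from typing import Callable, List, Optional


class BatchWriter:
    """Buffers records and writes them in batches.

    A larger max_batch means fewer network requests (higher throughput),
    but each record may wait longer before it is written (higher latency).
    """

    def __init__(self, flush: Callable[[List[dict]], None],
                 max_batch: int = 100, max_wait_s: float = 0.05,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._flush = flush
        self._max_batch = max_batch
        self._max_wait_s = max_wait_s
        self._clock = clock
        self._buffer: List[dict] = []
        self._first_at: Optional[float] = None

    def add(self, record: dict) -> None:
        if not self._buffer:
            # Remember when the oldest buffered record arrived.
            self._first_at = self._clock()
        self._buffer.append(record)
        too_big = len(self._buffer) >= self._max_batch
        too_old = self._clock() - self._first_at >= self._max_wait_s
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            self._flush(self._buffer)   # one request for the whole batch
            self._buffer = []
            self._first_at = None


# Usage: a list stands in for the storage layer (an assumption).
batches: List[List[dict]] = []
writer = BatchWriter(batches.append, max_batch=3, max_wait_s=60.0)
for i in range(7):
    writer.add({"order_id": i})
writer.flush()                          # drain what is left at shutdown
print([len(b) for b in batches])
```

Seven records result in three writes instead of seven. Lowering `max_batch` or `max_wait_s` moves the balance back toward latency.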

  2. How to guarantee the data link

When the data processing link is very long (data synchronization → data computation → data storage → data service), a problem at any stage stops real-time data from updating. Real-time computing is distributed computing, where the failure of a single node is normal. The effect is most visible on the live big screen: once the data stops updating, every viewer sees that something is wrong. To keep real-time data available, the entire computing link therefore has to be built with multiple links, giving disaster recovery across multiple data centers or even across regions (see Figure 5.8).

Because link problems have many possible causes and the cause usually cannot be located within seconds, a tool compares the result data computed by the multiple links. When one link has a problem, the value it computes is smaller than that of the other links, and the gap keeps growing. In that case, a one-click switch to the standby link is pushed as configuration and takes effect within seconds. All interface calls switch to the standby link immediately, which is completely transparent to the live big screen, so users are not aware that a fault occurred.

  3. How to run stress tests

In preparation for the big promotion, the real-time link is stress tested several times, mainly to simulate the “Double 11” peak and verify that the system runs normally. Stress testing is carried out in the online environment and is divided into data stress testing and product stress testing.

Data stress testing is mainly flood-storage testing: data from several hours or even several days is accumulated and then released all at once to simulate the “Double 11” flood peak, using real data. For example, by rewinding the data subscription offset of a real-time job to a few hours or days earlier, each batch reads the maximum amount of data, which puts the greatest pressure on real-time computing.

Product stress testing is further divided into stress testing of the product itself and front-end page stability testing.

    (1) Stress testing of the product itself. Collect the URLs of all read operations on the big-screen server, replay that traffic through the stress testing platform, and test against a target of 500 QPS. During the test, server performance is optimized iteratively to improve the data processing performance of the big-screen application.

    (2) Front-end page stability testing. Open the big-screen page in a browser and run a 24-hour front-end page stability test. Monitor the memory and CPU consumption of the big screen's front-end JS in the client browser, detect and fix front-end JS memory leaks and similar problems, and improve the stability of the front-end page.
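Returning to the multi-link comparison in item 2, the rule "a faulty link reports a smaller value and the gap keeps growing" can be sketched as follows. This is an illustrative sketch, not the tool the book describes: the function names, the link names `link_a` and `link_b`, and the 1% tolerance are all assumptions.

```python
from typing import Dict, List, Optional


def find_lagging_link(snapshots: List[Dict[str, float]],
                      tolerance: float = 0.01) -> Optional[str]:
    """Return a link whose value trails the best link by a growing gap.

    Each snapshot maps link name -> cumulative metric at one sample time.
    """
    if len(snapshots) < 2:
        return None
    for link in snapshots[0]:
        # Gap between this link and the best link at each sample time.
        gaps = [max(s.values()) - s[link] for s in snapshots]
        widening = all(a < b for a, b in zip(gaps, gaps[1:]))
        beyond = gaps[-1] > tolerance * max(snapshots[-1].values())
        if widening and beyond:
            return link
    return None


def choose_active(current: str, snapshots: List[Dict[str, float]],
                  tolerance: float = 0.01) -> str:
    """Keep the current link unless it is the one falling behind."""
    if find_lagging_link(snapshots, tolerance) != current:
        return current
    latest = snapshots[-1]
    return max(latest, key=latest.get)   # switch to the healthiest link


snapshots = [
    {"link_a": 1000.0, "link_b": 1000.0},
    {"link_a": 1900.0, "link_b": 2000.0},
    {"link_a": 2500.0, "link_b": 3000.0},
]
print(find_lagging_link(snapshots))
print(choose_active("link_a", snapshots))
```

In this made-up data, `link_a` falls further behind at every sample, so the active link moves to `link_b`. A real system would push the switch as configuration, as described above, rather than decide inside the comparison function.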

Conclusion

That’s all for Alibaba Big Data Practice | Real-Time Technology.

I hope you got something out of this post. If you did, please like, favorite, and follow. See you next time 👋