On today’s Internet, distributed systems of every kind have become commonplace: search engines, e-commerce sites, Weibo, WeChat, O2O platforms. All of them serve huge numbers of users under high concurrency, and none of them can avoid being distributed. Yet whatever the business and whatever the distributed system, some of the basic ideas are the same. This article organizes and summarizes those basic ideas.

Splitting

System splitting

A WeChat architect once put it as “make the big system small.” For a large, complex system, the first instinct is to break it into multiple subsystems. Each subsystem has its own storage, service, and interface layers, and each is developed, tested, deployed, and operated independently. From a team-management perspective, different teams can use the technology stacks they know best and cooperate through interfaces, so responsibilities stay clear and every team looks after its own piece.

Subsystem splitting

After the system is split into subsystems, each subsystem can in turn be layered and divided into modules. Of course, “system”, “subsystem”, “layer”, and “module” are all relative concepts. A module inside one system may become complex enough to be pulled out as a separate system; conversely, a module that is very simple in the early days may not be split out at all and simply stays inside one system. Like a living organism, the system keeps growing, evolving, merging, splitting, and changing.

Storage splitting

NoSQL: NoSQL databases such as MongoDB are distributed by nature and can be sharded easily. MySQL: for MySQL, or any other relational database, sharding means splitting the data across multiple databases and tables. Sharding, however, raises several key issues: choosing the sharding dimension, and handling joins and distributed transactions across shards.
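
As a rough sketch of the sharding dimension, routing by user id might look like the following; the shard count and database names are made up purely for illustration.

```java
import java.util.List;

// A minimal sharding sketch: route each user's data to one of N database
// shards by taking the user id modulo the shard count. Changing the shard
// count later means migrating data, which is why the sharding dimension
// has to be chosen carefully up front.
class ShardRouter {
    private static final int SHARD_COUNT = 4;
    private static final List<String> SHARDS =
            List.of("user_db_0", "user_db_1", "user_db_2", "user_db_3");

    static String shardFor(long userId) {
        return SHARDS.get((int) (Math.abs(userId) % SHARD_COUNT));
    }

    public static void main(String[] args) {
        System.out.println(shardFor(12345L));   // -> user_db_1
    }
}
```

Queries that are not keyed by the sharding dimension have to fan out to every shard, which is exactly the join difficulty mentioned above.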

Computation splitting

There are two approaches to splitting computation. Data splitting: a large data set is split into several smaller ones that are processed in parallel, as in a large-scale merge sort.

Task splitting: a long task is divided into several stages, and the stages are executed in parallel.

The multi-threaded Fork/Join framework in Java and the MapReduce framework in Hadoop are typical frameworks for splitting computation. The idea is the same: first split the computation, then merge the results.
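
As a minimal sketch of this split-then-merge shape, here is a Fork/Join task that sums a large array; a real merge sort or MapReduce job follows the same pattern of splitting the work and combining the partial results.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Split a large array sum into halves, compute the halves in parallel,
// then merge the partial results.
public class SplitSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 10_000;   // below this, compute directly
    private final long[] data;
    private final int from, to;

    public SplitSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {               // small enough: compute sequentially
            long sum = 0;
            for (int i = from; i < to; i++) sum += data[i];
            return sum;
        }
        int mid = (from + to) / 2;
        SplitSum left = new SplitSum(data, from, mid);
        SplitSum right = new SplitSum(data, mid, to);
        left.fork();                                // split: run the left half asynchronously
        long rightSum = right.compute();            // compute the right half in this thread
        return left.join() + rightSum;              // merge the partial results
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = i;
        System.out.println(new ForkJoinPool().invoke(new SplitSum(data, 0, data.length)));
    }
}
```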

A distributed search engine is a typical example: the data is split, each shard builds its own index, and the query results from the shards are merged.

Concurrency

The most common technique is multithreading, which maximizes the concurrency of a program. For example, several sequential RPC calls can be turned into concurrent calls through asynchronous RPC. Or take data sharding: if a job that scans an entire table runs for hours, sharding the data and scanning the shards with multiple threads makes it several times faster.
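
A small sketch with Java’s CompletableFuture shows how sequential calls become concurrent ones; slowCall here is just a stand-in for a real RPC.

```java
import java.util.concurrent.CompletableFuture;

// Turn sequential calls into concurrent ones: total latency is roughly
// the slowest call, not the sum of all of them.
public class ConcurrentCalls {
    static String slowCall(String name) {
        try { Thread.sleep(200); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return name + "-result";
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Issue the three calls concurrently instead of one after another.
        CompletableFuture<String> a = CompletableFuture.supplyAsync(() -> slowCall("user"));
        CompletableFuture<String> b = CompletableFuture.supplyAsync(() -> slowCall("order"));
        CompletableFuture<String> c = CompletableFuture.supplyAsync(() -> slowCall("profile"));
        CompletableFuture.allOf(a, b, c).join();            // wait for all of them
        System.out.println(a.join() + ", " + b.join() + ", " + c.join());
        System.out.println("took " + (System.currentTimeMillis() - start) + " ms");
    }
}
```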

Caching

Caching needs no introduction: whenever a performance problem comes up, caching is the first thing that comes to mind. One key design point is the granularity of the cache.

For example, in Twitter’s architecture, the cache granularity ranges from small to large: Row Cache, Vector Cache, Fragment Cache, and Page Cache.

The smaller the granularity, the better the reuse, but a single query needs multiple lookups and the data has to be assembled afterwards. The larger the granularity, the easier it is to invalidate: any small change can invalidate the whole cached entry.
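
A toy sketch of the trade-off, with plain in-memory maps standing in for real caches: the fine-grained row cache is reusable but needs assembly on every read, while the coarse-grained page cache is a single lookup but must be invalidated whenever any row changes.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

class CacheGranularity {
    static final Map<Long, String> rowCache = new ConcurrentHashMap<>();
    static final Map<String, String> pageCache = new ConcurrentHashMap<>();

    static String loadRow(long id) {
        return "row-" + id;                       // stand-in for a database read
    }

    static String renderPage(List<Long> rowIds) {
        // Fine-grained path: fetch each row from the row cache (loading on a miss)
        // and assemble the page from the pieces.
        return rowIds.stream()
                .map(id -> rowCache.computeIfAbsent(id, CacheGranularity::loadRow))
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        List<Long> ids = List.of(1L, 2L, 3L);
        pageCache.put("home", renderPage(ids));   // coarse-grained: cache the assembled page
        rowCache.put(2L, "row-2-updated");        // a single row changes...
        pageCache.remove("home");                 // ...so the whole page entry must be dropped
        System.out.println(renderPage(ids));
    }
}
```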

Online vs. offline computing / synchronous vs. asynchronous

In real business requirements, not everything has to be fully real-time. For example: an internal report query and analysis system used by product and operations staff; or the spread of a Weibo post, where it is acceptable if my followers see it a few seconds after I post it, because nobody notices a delay of a few seconds; or a blog post, which may take a few minutes to be indexed by a search engine; or an Alipay transfer, where the recipient does not necessarily receive the money the instant it is sent.

There are many examples like this. These “acceptably non-real-time” scenarios give the architecture a great deal of leeway in its design.

Because it does not have to be real-time, we can go asynchronous: use message queues, or background jobs that periodically process certain tasks.

Also because it does not have to be real-time, we can separate reads from writes, so that reads and writes are not fully synchronized, as in MySQL master-slave replication.

Full plus incremental

Full plus incremental is also online/offline thinking. For example, a search engine keeps a full index plus an incremental index: the former is built for throughput, the latter for real-time freshness. Similarly, in the OceanBase database, updates are first stored in a small incremental table.

Push vs. Pull

Every distributed system faces a basic problem: state notification between nodes (or between two subsystems). For example, when node A’s state changes, there are two strategies for notifying node B. Push: when A’s state changes, A pushes the change to B. Pull: polling, where B periodically asks A for its state.

This problem is not limited to distributed systems; it is a fundamental problem in writing code. It corresponds to the coupling problem in object-oriented programming, often referred to as “bidirectional association”.

A calls B, and B calls back into A: this comes up all the time in system development. In more complicated cases, many modules call each other, forming a spider web.

This problem is closely tied to the push/pull choice: when A calls B, the logic is written on B’s side; when B calls A, the logic ends up on A’s side. So the choice between actively invoking (pull) and being called back (push) strongly affects how responsibilities are distributed across modules and subsystems.
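
A minimal sketch of the two styles; the class and method names are invented purely for illustration.

```java
// Push: A holds a callback and invokes it when its state changes,
// so the handling logic lives on the callee's side.
interface StateListener {
    void onStateChanged(String newState);
}

class NodeA {
    private StateListener listener;                 // typically NodeB registers itself here
    void setListener(StateListener l) { this.listener = l; }
    void changeState(String newState) {
        if (listener != null) listener.onStateChanged(newState);   // push the change out
    }
    String getState() { return "current-state"; }   // exposed for the pull style
}

// Pull: B polls A on its own schedule; A knows nothing about B.
class NodeB implements StateListener {
    @Override public void onStateChanged(String s) { System.out.println("pushed: " + s); }

    void poll(NodeA a) { System.out.println("pulled: " + a.getState()); }

    public static void main(String[] args) {
        NodeA a = new NodeA();
        NodeB b = new NodeB();
        a.setListener(b);          // push style: A notifies B via the callback
        a.changeState("ready");
        b.poll(a);                 // pull style: B asks A when it wants to
    }
}
```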

Batching

Batching is another online/offline idea: turn a real-time problem into a batch problem and reduce the throughput pressure on the system. Examples include batched message sending in Kafka, or an advertising billing system that accumulates multiple clicks before charging for them.
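
A minimal batching sketch for the click example; the batch size and the flush body are placeholders, and a real implementation would also flush on a timer so events are not delayed indefinitely.

```java
import java.util.ArrayList;
import java.util.List;

// Accumulate click events and flush them in one batch once a size threshold
// is reached: one downstream write per batch instead of one per click.
class ClickBatcher {
    private static final int BATCH_SIZE = 100;
    private final List<String> buffer = new ArrayList<>();

    synchronized void record(String clickEvent) {
        buffer.add(clickEvent);
        if (buffer.size() >= BATCH_SIZE) flush();
    }

    private void flush() {
        // Stand-in for the real work: charge the advertiser, write to storage, ...
        System.out.println("flushing " + buffer.size() + " clicks in one batch");
        buffer.clear();
    }
}
```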

Heavy writes vs. heavy reads

Heavy write, light read is essentially trading space for time: do the work ahead of time and store the result, so that when the data is needed you just go and fetch it.

The way we typically use MySQL is heavy read, light write: writes are simple, while reads perform complex joins and compute the result on the fly. The advantage is that strong consistency is easy to achieve, since there are no redundant fields to drift out of sync; the drawback is that read performance can become a problem. What does heavy write, light read look like instead? Take the Feeds query: maintain a feed list, or inbox, for every user. When someone posts a Weibo, it is fanned out asynchronously in the background to every follower’s inbox, so each user only needs to read their own inbox to see their Feeds.
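
A minimal fan-out-on-write sketch of that inbox idea, with in-memory maps standing in for real storage.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Heavy write, light read: posting fans the message out to every follower's
// inbox in the background, so reading a feed is a single cheap lookup.
class FeedService {
    private final Map<Long, List<Long>> followers = new ConcurrentHashMap<>();
    private final Map<Long, List<String>> inboxes = new ConcurrentHashMap<>();
    private final ExecutorService background = Executors.newFixedThreadPool(4);

    void post(long authorId, String message) {
        // Heavy write: push the message into every follower's inbox asynchronously.
        for (long followerId : followers.getOrDefault(authorId, List.of())) {
            background.submit(() ->
                inboxes.computeIfAbsent(followerId,
                        id -> Collections.synchronizedList(new ArrayList<>())).add(message));
        }
    }

    List<String> readFeed(long userId) {
        // Light read: just return the prebuilt inbox.
        return inboxes.getOrDefault(userId, List.of());
    }

    public static void main(String[] args) throws InterruptedException {
        FeedService svc = new FeedService();
        svc.followers.put(1L, List.of(100L, 101L));   // user 1 has two followers
        svc.post(1L, "hello feeds");
        Thread.sleep(100);                            // crude wait for the background fan-out
        System.out.println(svc.readFeed(100L));
        svc.background.shutdown();
    }
}
```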

Read/write separation

Similarly, in a traditional standalone MySQL database, reads and writes are fully synchronized: what is written can be read back immediately. But in many business scenarios reads and writes do not need to be fully synchronized. In that case the data can be stored separately: writes go to one place and are asynchronously replicated to another place that serves reads. This is read/write separation. MySQL master/slave replication is a typical example: data on the slave lags slightly behind the master. Report analysis is another: OLTP and OLAP, online and offline data, are separated, and online data is periodically synchronized to a Hive cluster for analysis.
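
A minimal sketch of the routing; the JDBC URLs and credentials are placeholders, and real systems usually hide this behind a proxy or a data-access layer.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Writes go to the master, reads go to a slave and may see slightly stale data.
class RoutingDataSource {
    private static final String MASTER_URL = "jdbc:mysql://master-host:3306/app";
    private static final String SLAVE_URL  = "jdbc:mysql://slave-host:3306/app";

    Connection forWrite() throws SQLException {
        return DriverManager.getConnection(MASTER_URL, "app", "secret");
    }

    Connection forRead() throws SQLException {
        // Reads tolerate replication lag, so they can be served by the slave.
        return DriverManager.getConnection(SLAVE_URL, "app", "secret");
    }
}
```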

Dynamic and static separation

A good example of dynamic/static separation is a web site’s front end: dynamic pages are served by the web servers, while static CSS/JS/images go straight to a CDN. This not only improves performance but also greatly reduces the load on the servers.

In this vein, many large web sites work hard to turn dynamic content into static content so that it can be cached easily.

Hot and cold separation

For example, periodically moving historical (cold) data out of MySQL into Hive.

Rate limiting

Many e-commerce sites now run flash-sale (seckill) events. Their defining characteristic is very limited stock but a surge of traffic in a short time, far more requests than the servers can handle.

A basic way to deal with this kind of problem is rate limiting: since we cannot handle that many requests, and most of the people let in would not get the item anyway, simply don’t let that many people in.

It is the same principle as a scenic spot limiting the number of visitors during a holiday when too many people show up.
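
A minimal rate-limiting sketch using a semaphore as a simple concurrency cap; production systems often use token-bucket or leaky-bucket limiters instead, but the idea of not letting everyone in is the same.

```java
import java.util.concurrent.Semaphore;

// Only a fixed number of requests are allowed in at once; the rest are
// rejected immediately instead of piling up and crushing the servers.
class SeckillGate {
    private final Semaphore permits = new Semaphore(1000);   // at most 1000 concurrent requests

    String handle(Runnable placeOrder) {
        if (!permits.tryAcquire()) {
            return "sold out, please try again later";        // shed excess traffic up front
        }
        try {
            placeOrder.run();                                 // the real ordering logic
            return "ok";
        } finally {
            permits.release();
        }
    }
}
```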

Circuit breaking and service degradation

Service degradation is the system’s last line of defense. Inside a complex system, one service often calls services belonging to other large systems. Under heavy traffic, we may degrade those other services in order to keep the main flow running normally.

Degradation means that when a service is unavailable, we simply stop relying on it and return a default result. That service is down, but it does not drag down the main flow, which keeps the core system as available as possible.
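
A minimal degradation sketch: call a non-critical dependency with a timeout and fall back to a default result if it is slow or failing. Libraries such as Hystrix or Resilience4j package this pattern together with circuit breakers.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// If the recommendation service is slow or down, return an empty list so the
// page still renders; the main flow is never blocked by a side feature.
class RecommendationClient {
    List<String> recommendationsFor(long userId, Supplier<List<String>> remoteCall) {
        try {
            return CompletableFuture.supplyAsync(remoteCall)
                                    .get(200, TimeUnit.MILLISECONDS);  // don't let it block the main flow
        } catch (Exception e) {
            return List.of();   // degraded: empty recommendations, core system stays available
        }
    }
}
```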

The CAP theorem

Many of these ideas can be summed up under a larger one: CAP.

Consistency: data consistency, which is easy to understand: no dirty data. MySQL has its own notions of consistency, such as referential integrity constraints and transactions, but the C here mainly refers to consistency between multiple replicas of the same data.

Availability: stability and service availability. It also covers performance: responses must be fast, because if latency is very high and requests often time out, the service might as well be down.

Partition tolerance: this actually refers to network partitions. Once data is spread from one physical machine across several, the machines must communicate over a network, which introduces the possibility of partitions and unpredictable timeouts, the classic Two Generals’ Problem. The academic term is an “asynchronous communication environment”.

The CAP theorem says that a distributed system can satisfy at most two of the three at the same time. In practice, though, P is a given: you cannot avoid network partitions. What you can really do is trade off between C and A.

Take MySQL as an example: its C is the strongest, A comes second, and P is the weakest. If you add redundant data for the sake of A, as in heavy write/light read, C becomes hard to guarantee; if you shard the data across databases and tables for the sake of P, transactions are no longer possible.

NoSQL, by contrast, is strongest on P: the data shards easily. But C suffers, and transactions are generally not available.

If the requirement on C is relaxed, a lot of caching can be added to improve A, and the data can be sharded to increase P.

Payments and money transfers, however, have strict requirements on C, so you cannot simply throw a cache at them to improve performance.

Eventual consistency

As mentioned earlier, because data and services are sharded across machines, strong consistency is difficult to guarantee in a distributed system. The term used most often here is “eventual consistency”.

Strong consistency, weak consistency, and eventual consistency are different levels of consistency. In traditional relational databases, strong consistency is guaranteed through transactions.

In distributed systems, however, strong consistency is usually relaxed to eventual consistency in order to sidestep distributed transactions.

Implementing eventual consistency usually requires a highly reliable message queue.
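
A minimal sketch of the idea, with an in-process queue standing in for a real, reliable message queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// The local update commits first, then a message describing the change is
// handed to the queue and retried until accepted; the downstream consumer
// applies it to its own store, so the two stores converge eventually rather
// than atomically. A real system would persist the pending message (an
// "outbox") to survive crashes, and the consumer would handle duplicates.
class EventuallyConsistentTransfer {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>(10_000);

    void transfer(long fromAccount, long toAccount, long amount) throws InterruptedException {
        debitLocally(fromAccount, amount);             // step 1: local transaction commits
        String event = "credit:" + toAccount + ":" + amount;
        while (!queue.offer(event)) {                  // step 2: retry until the queue accepts it
            Thread.sleep(100);
        }
    }

    void consumerLoop() throws InterruptedException {
        while (true) {
            String event = queue.take();               // step 3: apply the change downstream
            System.out.println("applying " + event);
        }
    }

    private void debitLocally(long account, long amount) {
        System.out.println("debited " + amount + " from account " + account);
    }
}
```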