In the field of Internet, especially in the era of mobile Internet, Feed streaming products are very common. For example, moments of friends and Weibo, which we use every day, are very typical Feed streaming products. There are also Feed streaming products in another form, such as Pinterest and Pepetal. In addition, many apps have a module, either called dynamic or message squares, which are also Feed streaming products, so to speak, Feed streaming products are all over the world.

concept

Before we talk about how to design a Feed flow system, let’s take a look at some of the concepts in Feed flows:

Feed: Every status or message in the Feed stream is a Feed, for example, a status in moments is a Feed, and a microblog is a Feed.
Feed stream: A stream of information that is constantly updated and presented to users. Everyone’s moments, Micro blog page and so on is a Feed stream.
Timeline: Timeline is actually a type of Feed flow. Microblog and moments are all the Feed flows of Timeline. However, as the type of Timeline appears earliest, is the most widely used and the most well-known, it is sometimes used to represent the Feed flow.
Timeline: A page that displays other people’s Feed messages, such as moments and the home page of weibo.
Timeline: Displays the page of Feed messages sent by oneself, such as photo albums in wechat and personal pages in Weibo.

Characteristics of the

Feed streaming systems have some very typical features, such as:

Multi-account content stream: There must be thousands of accounts in the Feed stream system. You can follow, close, add friends, block and other operations between accounts. As long as this is true, you can design it as a Feed flow system.
Unstable account relationship: the relationship between users in the system will change all the time due to the existence of operations such as attention and clearance, which is an unstable state.
Reading and writing ratio 100:1: Serious imbalance in reading and writing, with more reading and writing and less writing. Generally, the reading and writing ratio is 10:1 or even more than 100:1.
The message must reach the sex requirement is high: for example, after sending a circle of friends, the result part of friends saw, some friends did not see, if the girlfriend did not see, then may produce very serious emotional contradiction, the consequence is very serious.

These are some of the features of Feed streaming products, and let’s take a look at the classification of Feed streaming systems.

classification

There are many categories for Feed flows, but the most common are two:

Timeline: Sorted by release time, with the first to be seen and the second to be released at the top, similar to wechat moments and Weibo. This is also the most common form. If the product selects the Timeline type, it is consideredThere aren't many feeds in the Feed stream, but each one is important and needs to be seen by the user.
Rank: Users are ranked by a non-chronological factor, usually by their preferences. The user’s favorite is ranked first and the user’s least favorite is ranked second. This general assumption assumes that the user may see a large number of feeds, and the user’s time is limited, so the user chooses the Top N results that the user most wants to see. The application scenarios include picture sharing, news recommendation, product recommendation, etc.

The above two are the most typical and common classification methods. In addition, there are other classification standards, and in other classification standards, there are two more types:

Aggregate: A type of aggregation. For example, several friends have watched the same movie, which can be aggregated into A Feed: A, B, and C have watched the movie Your Name. This aggregation function is suitable for the client. The Aggregate type is Timeline + client aggregation.
Notice: Notice type, which is already a functional type. Notice type is generally used for all kinds of notifications in APP, such as private messages. This is also a Timeline type or an Aggregate type.

implementation

After introducing the concepts, characteristics, and classification of Feed streaming systems, we move on to the key part: how to implement a ten-million-level Feed streaming system. Since not all users in the system can be online, and it is impossible to refresh and publish feeds at the same time, a system that can support tens of millions of Feed streams can actually support hundreds of millions of users on the product.

If you want to design a Feed streaming system, the two most critical cores are storage and push.

storage

Let’s start with storage. The content that needs to be stored in the Feed stream system is divided into two parts, one is account relationship (such as follow list) and the other is Feed message content. Regardless of the type of storage, there are several issues to consider:

How to support 100TB, or even PB data volume?
With a large amount of data, the cost is critical. How can the cost be cheaper?
How do I keep account relationships and feeds from being lost?

We’ll answer those three questions later, but keep watching the feed

push

Push system needs two functions, one is to publish the Feed, one is to read the Feed stream. For delivery systems, there are still some issues that need to be considered before selection:

How to provide tens of millions of TPS and QPS?
How can I keep read and write latency below 10ms, or even 2ms?
How can FEEDS be guaranteed to be reachable?

Before answering these questions, let’s have a general understanding of Aliyun TableStore.

TableStore

TableStore is a professional-level distributed NoSQL database independently developed by Ali Cloud. It is a semi-structured data storage platform with high performance, low cost, easy expansion and full hosting based on shared storage, supporting efficient calculation and analysis of Internet and Internet of Things data.

At present, both inside Alibaba Group and external public cloud users, there are thousands of systems in use. It covers heavy-throughput offline applications as well as heavy-stability, performance-sensitive online applications. Some systems currently in use can write more than 35 million rows per second, with more than 5GB of traffic per second, more than 10 trillion rows per table, and more than 10PB of data per table.

The specific features of table storage can be seen in the following image.

Here is not a detailed description of the TableStore (TableStore) functions and features, if interested, you can go to the official website page and cloud habitat blog to understand, address is as follows:

Form the store’s website address: www.aliyun.com/product/ots…
Table to store the cloud blog: yq.aliyun.com/teams/4/typ…
Table storage spikes AC group: 11789671

Storage System Selection

So let’s go ahead and solve the problem that we raised earlier. There are two types of systems that need to be stored in the Feed flow system: account relationships (such as follow lists) and Feed messages.

Storage Account Relationship

Let’s first look at the storage of account relationships (such as follow lists). For account relationships, it has some characteristics:

It’s a series ofVariable-length list.The length can reach 100 million grades.
This will lead toLarge amount of data, butThe relationship is extremely simple.
Another point is performance sensitivity, which directly affects the response speed of concerns and takeoffs.

The most suitable system for storing account relation (concern list) should be distributed NoSQL database, because of the large amount of data, simple relation and high performance requirement without complex join. The internal design is simple and the external user experience is good.

In addition to the above features, there is another feature:

Ordering: Ordering does not require sorting, just the ability to sort by primary key. As long as it can sort by primary key, the order of the following list and the fan list is fixed and predictable.

Use open-source HBase to store account relationships

Open source HBase is one of the distributed NoSQL databases that can meet the requirements of order. Therefore, many enterprises choose open source HBase to store account relationships or follow lists.

Although the above four characteristics are satisfied, the system can be built, but there will be some troublesome problems:

The need to operate and maintain, investigate problems and Fix bugs will bring great complexity and cost.
GC can cause large burrs that affect the user experience,

Use a TableStore to store account relationships

In addition, Ali Cloud’s table storage also belongs to the ordered distributed NoSQL database. Before, many famous systems choose to use table storage, which brings benefits to the system in the following places:

Single table support10 trillion rows plus 10 Pb rows plusThe amount of data, no matter how fast the data growth rate do not worry.
According to the dataPrimary key column sortTo ensure order and predictability.
The single-key read/write delay is inmsLevel, ensure concern, take off response time.
isAll managedDistributed NoSQL database services,No operations required.
allUsing c + +Come true throughNo GC problem, there will be no larger burrs due to GC.

A good choice is to use a TableStore to store account relationships.

Next, take a look at the storage of Feed messages.

Storing Feed messages

Feed messages have one of the biggest features:

The amount of data is large, and many times in Feed streaming systems write spread (push) mode is used, where the amount of data can be expanded by several orders of magnitude, so the amount of data can easily reach 100TB or even petabyte level.

In addition, there are some other features:

Simple data format
Data cannot be lost and has high reliability requirements
The auto-increment primary key function ensures that the message ID of the individual Feed is strictly incremented in the individual outbox, so that only one range is required to read. Because of the low concurrency of feeds posted by individuals, timestamps can meet basic requirements, but they still do not guarantee strict increments when application layer queues are clogged, network latency is high, or time is rolled back. It would be nice to have autoincrement here.
The lower the cost, the better

Potential storage systems

Based on these characteristics, the best system would be a distributed NoSQL database with primary key increment, but there are no open source systems, so there are two common approaches:

Relational database + sub – library sub – table
Relational database + distributed NoSQL database: Relational database provides primary key increment function.

Use a relational database to store Feed messages

There are a lot of users in the industry who choose to use relational database + sub-database and sub-table, including some very famous Feed stream products. Although this architecture works, it has some problems.

I’ve brought the sub-database and sub-tableOperation and maintenance complexity.
Separate library and table bring the logical layer and data layerMaximum coupling.
The primary key increment function of relational databases, such as the open source MySQL database, performs poorly. Whether using MyISAM or InnoDB engine, to ensure that the self-increment ID is strictly incrementally, you must use table locks. This granularity is very large, which severely limits concurrency and affects performance.
Some users find relational databases to be more reliable, but the reliability of relational databases is usually up to six nines, which is completely different from distributed databases, four or five levels lower.

Use TableStore to store account relationships

For these reasons, some technology companies are starting to consider using TableStore, a distributed NoSQL database with auto-increment primary keys, so that only one system is needed. In addition to the following considerations:

A single table can be up to 10 Pb, 10 trillion rows.
10 9's of SLA guaranteeFeed content is not lost.
Natural distributed database,No need for separate library and table
The two instance types are as follows: High performance Instance Uses all SSD storage media to provide excellent read and write performance. Hybrid storage uses SSD and SATA storage media to provide low storage cost.
The primary key increment function performs well. All other systems require locks when they are used to increment columns, but the primary key increment function of table storage does not require locks at all when writing to increment columns, neither table nor row locks.

From the above point of view, using TableStore is more suitable in terms of functionality, performance, scalability and cost.

After looking at the choice of push system, let’s look at the choice of push scheme.

Push plan

Let’s review the main features of the Feed streaming system:

There is a serious imbalance in reading and writing, with more reading and less writing, and the average reading and writing ratio is around 10. 1, even over 100:1.

In addition, there is another aspect that will be affected by the push scheme:

The delay in publishing, refreshing feeds is essentially caused byPush planDecided that any other operation can only be optimization, qualitative change, cannot qualitative change.

Compare the push and pull modes

In the push scheme, there are two schemes:

Pull scheme: also known asRead the diffusion.
Push plan: also becomeWrite a diffusion.

For pull and push scenarios, they are opposite in many ways, but before we look at the comparison, one thing should be emphasized:

For users of the Feed stream product, the latency sensitivity of refreshing the Feed stream (reading) is much greater than that of publishing it (writing).

Pull mode (read diffusion)	Push mode (write diffusion)
release	Individual Page Timeline (Outbox)
reading	The personal page Timeline of all followers
Maximum network cost	When the user refreshes
Read and write amplification	Enlarge reading: Read/write ratio to 10000 :1
personalized	Does not support
Targeted advertising	Does not support

A side effect of push mode

It’s clear from the above comparison that push is far better than pull, but there is a side effect:

The data will swell tremendously.

In view of this shortcoming, two aspects can be considered:

The current storage price is very low, take table storage as an example, capacity type instance storage 10TB data volume, at present (October 2017) the annual cost is 16,000 yuan, the price will continue to reduce with hardware technology upgrade, software performance optimization. And the bigger the data, the cheaper the price.
Want to save some money, that can continue to optimize:
- Pull mode is used for large V, while push mode is used for ordinary users. This mode has a disadvantage, which will be analyzed later.
- Push mode is adopted for active fans and pull mode is adopted for inactive fans (this method can better avoid the impact of heavy traffic on the platform)

Applicable scenario

After the comparison of the two schemes, the applicable scenarios of each scheme are summarized as follows:

Mode:
- Many Feed streaming products adopt this approach in their first versions, but quickly abandon it.
- On the other hand, pull + graph computing is a different story, but at this point the center of gravity is graph computing.
Driving mode:
- The most common and effective pattern in Feed streaming systems;
- The number of user relationships is more uniform, or has an upper limit, such as moments of friends;
- Partial recommendation category, the same Feed has different value for different users, need to calculate scores for different users, such as Pinterest.
Push-pull combination
- Most users have several hundred accounts, but some users have more than 10 million, such as Weibo.

After knowing the push scheme above, let’s look at the selection of push system

Push system

To implement a ten-million-scale Feed stream, a push system needs to have some features:

10 million TPS/QPS capability.
Read/write link delay is sensitive. Read/write directly affects the delay when users publish and refresh Feed streams, especially the extremely sensitive delay when refresh.
Feed messages are highly reachable.
The primary key increment function still ensures that the Feed ID in the user’s inbox is strictly incremented, ensuring that the latest unread message can be read through Scan(the maximum ID read last time –>MAX).
It is best to store all feeds in Timeline for users.

In view of the above characteristics, it is better to have a NoSQL system with excellent performance and reliable self-increasing functions as the push system. Therefore, if the industry chooses the open source system, it will choose the open source Redis on the basis of choosing the relational database as the storage system, so as to cover the above characteristics. It also keeps the Feed streaming system up and running, but it also introduces some other problems:

Pure memory system, memory price is very high, the overall cost is relatively high.
In order to support tens of millions of TPS and ensure message reachability, cluster and Replica modes are needed for a single-machine system. As a result, not only the complexity of operation and maintenance is brought, but also the increase of machines and the cost rises again.
As costs rise, some architects begin to consider whether they can save money by reducing the amount of data stored in open source Redis. There are two ways to reduce the amount of data stored in Redis:
- Only store Feed ids in open source Redis, not Feed content. The total amount of data is greatly reduced, but the Feed ID needs to be read first, and then the Feed content needs to be read in the storage system. The network overhead is doubled, and it is serial, which has a great impact on the refresh delay of the user.
- Push mode is used only for ordinary or active users, and pull mode is used directly for large V and inactive users.

Although the above two schemes can save costs, they are at the cost of sacrificing user experience, which ultimately requires a trade-off between cost and user experience.

Use TableStore as the push system

In addition to the open source system, Ali Cloud’s TahleStore can also be used. Many users choose TableStore as the push system for the following reasons:

Natural distributed, single table can support tens of millions of TPS/QPS.
The LSM storage engine is hugeOptimize the writing, high performance instances are extremely largeOptimization of reading.
Data is successfully written to a disk, and data reliability is guaranteed by SLA of 10 9.
Disk-based databases cost several orders of magnitude less than in-memory databases.
A single table can store more than ten trillion rows of data at a low price and can easily store all the Feed data in a user’s Feed stream.

The above said the use of open source Redis and Ali Cloud TableStore similarities and differences, if the use of open source can use Redis, if you choose Ali cloud research NoSQL database, you can use TableStore.

Architecture diagram

Let’s take a look at the TableStore architecture diagram. Here, for commonality, the combination of push and pull is selected, and the push mode is simpler.

storage

Let’s look at the black box in the middle, which is the data from TableStore, from left to right:

Individual page Timeline: This is the outbox of each user, which is his or her own personal page.
Timeline: This is the inbox of each user, that is, the following page of the user. The content is all the messages published by the followers.
Follow list: save account relationship, such as friend relationship in moments; The following list in micro-blog.
Virtual follow lists: This is primarily used for personalization and advertising.

Publishing Feed process

When you publish a Feed, the process goes like this: 1. The Feed enters a queue service first. 2. Read your fan list from the following list and determine whether you are a big V. 3. Write your own Feed messages into the personal page Timeline (sending box). If it’s big V, the write process ends there. 4. If you are an ordinary user, you need to write your Feed message to your fans. If you have 100 fans, you need to write your Feed message to 100 users, including the Feed content and Feed ID. 5. Steps 3 and 4 can be combined to write multiple rows of data to the TableStore at once using the BatchWriteRow interface. 6. This is the end of the Feed publishing process.

Read the Feed flow process

When refreshing your Feed stream, the process goes like this: 1. Read the list of big V’s you care about first. 2. To read your inbox, you only need a GetRange to read a range. The range starts with the ID of the last Feed read and ends with the current time or MAX. Since primary key increment was used earlier, this can be read using GetRange. 3. If there are big V’s of interest, then the outbox of each big V is read concurrently again. If there are 10 big V’s of interest, then 10 accesses are required. 4. Merge the results of steps 2 and 3, sort them by time, and return them to the user.

At this point, with push-pull publishing, the process of reading the Feed stream is complete.

Easier push mode

It’s even simpler if you just use push mode:

Publish the Feed:
1. Regardless of whether it’s big V or not, the process is the same for all users, with three steps.
Read the Feed stream:
1. No need for the first step, no need for the third step, just need the second step, reduce the previous 2 + N(N is the number of large V of concern) network cost to 1 network cost. Read latency is significantly reduced.

Personalized and targeted advertising

Personalized and targeted advertising are two strong product needs. Personalization can serve users well and increase product competitiveness and user engagement, while targeted advertising can increase profit channels for products and avoid user aversion. Then how to achieve these two ways? In the feed stream, the two functions are implemented in the same way. Let’s take targeted advertising as an example: 1. Classify users by analyzing their characteristics. For example, one of them is freshmen: freshmen who just entered university this year. TableStore + MaxCompute = TableStore + MaxCompute 2. Create an advertising account: freshmen advertising 3. Let these users with freshmen characteristics follow freshmen advertising account virtually. Users don’t see this layer of concern. 4. You will be able to send ads through your freshman AD account from July. 5. Ultimately, each user may have multiple characteristics, and it is possible to follow multiple AD accounts virtually.

The above is a relatively simple way to implement targeted advertising, and the other ways will not be described.

earnings

We’ve talked in detail about using the TableStore architecture as a storage and push system, so let’s see how much benefit we can get from the new architecture.

Using only one system, the architecture and implementation are simple. Eliminating the need to access multiple systems, architecture, development, testing, and operations can save a lot of manpower and time.
TableStore primary key autoincrement function excellent performance. Because of architectural differences, not only table locks are not required, but also row locks are not required, so the performance is much better than that of relational databases.
You can save all feeds. One is that the system can support the storage of all feeds, and the other is that it is cheap and affordable.
Feed ids and content do not need to be stored separately. It’s cheaper, and it eliminates the need to store ids and content separately.
Full hosting service, no operation and maintenance operations, not to mention the need for sub-database sub-table.
Disk type (SSD, Hybrid) database, low cost.
Reliability 10 9, data more reliable, less easy to lose.
The sharding threshold of large V and ordinary users is higher, the number of reads of large V is less, and the overall delay is lower.

A design flaw

There is a significant risk in using a large V/ ordinary user shard, with a large V using a pull pattern and a common user using a push pattern. For example, if a big V suddenly posts a very topical Feed, it is possible that all users in the entire Feed product will be unable to read new content. Here’s why:

Big V sends Feed messages.
Big V, use pull mode.
Big V’s active fans (User group A) began to read big V’s new feeds in pull mode (step 3 read in the architecture diagram, read 3 for short).
Feed content is too topical and spreads quickly.
The unlogged big V fans (user group B) will log in the product and automatically refresh after logging in. They will read the Feed content of Big V again through 3 steps.
Non-fans (user group C) go to the personal Timeline of Big V to watch, and need to read the personal Timeline of Big V again, the same as 3.

As A result, the normal traffic is only for user group A, but now it is for user group A + user group B+ user group C. The traffic increases several times or even tens of times. As A result, the service module that reads path 3 is sent to Server BUSY or the machine resources are filled up, and the read path 3 that reads large V cannot return requests. If all the users in the Feed product are paying attention to the big V, then almost all of them will be stuck on the read 3 path that reads the big V, and they won’t be able to refresh.

So when designing here, we need to focus on the following two points:

The unavailability of a single module should not prevent the entire critical read Feed stream path, if the large V is unreadable, but the ordinary user can return it, and the service is restored, then the large V content can be replenished.
When a module cannot handle this amount of traffic, it should not be completely unserviceable, but should be able to continue to provide maximum service capacity, and reject the excess.

So how do you optimize?

Do not use large V/ ordinary user optimization mode, use active/inactive user optimization mode. In this way, user group A and part of user group B can be split into other, more distributed, multiple paths. Moreover, even if the read 3 path is not available, there is still no impact on active users.
Totally use push mode can completely solve the problem, but will lead to increase storage capacity, big V weibo send total time increases, from to the first fan to send a final fans may be a few minutes (one hundred million fans, 1 million lines per second, need 100 seconds), but also for the maximum concurrent reserved resources well, if you use the table storage, because it is a cloud services, There is no need to consider the problem of reserving maximum resources.

practice

Next let’s implement a message square function. Many apps have the function of dynamic or message square. In the message square, there are generally two tabs, one is to follow people, the other is to square. We focus on following people here.

The functions to be implemented are as follows:

Users can follow each other
Users can post new messages
Users can view a list of their own posted messages
Users can view the messages of the people they follow

Take the previous approach:

Use TableStore as storage and push system
Timeline display mode is adopted, so that users can carefully view each Feed
Adopt push mode

role

Next, let’s take a look at the roles and the functionality each role requires:

The sender
- Send status: add_activity()
The receiver
- Attention: follow ()
- Read the Feed stream: get_activity()

The Feed message should contain at least the following:

Message:
- Sender: Actor
- Type: verb, such as picture, video, or text
- Text Text: message

Architecture diagram

Release a new message
- Interface: add_activity ()
- Implementation:
  - The get_range interface calls the concern list and returns the list of fans.
  - The batch_write_ROW interface writes feed content and ids in batches to the individual page table (outbox) and to all fans’ attention page tables (inbox), or multiple times if the volume is too large. Or call the asynchronous batch_write_row interface, which is currently provided by the C++ SDK and JAVA SDK.
Focus on
- Interface: follow ()
- Implementation:
  - The put_row interface writes a row of data (followers, followers) to the following list and the followers list (followers, followers).
Get the Feed flow message
- Interface: get_activity ()
- Implementation:
  - Get the ID of the last message read from the client: last_id
  - The get_range interface is used to read the latest message, starting at last_id and ending at MAX.
  - If you are reading the contents of a personal page, access the personal page table. To read the contents of a concern page, access the concern page table.

plan

The above shows how to do this using the TableStore API for tables. This only uses a few interfaces, but it still takes time to learn the TABLE storage apis and features.

For ease of use, we will provide a complete feed stream solution next, providing a LIB interface directly like add_activity(), follow() and get_activity(), which will be easier and faster to use.

extension

The above mentioned Feed flow types are all Timeline type, but there is another Feed flow type that is more common, that is, news recommendation and Rank type commonly used by photo sharing websites. Let’s review what Rank types excel at:

There are so many potential Feed contents that the user cannot and does not need to read all of them, so the user needs to choose the content she wants to read most, typically picture sharing websites, news recommendation websites, etc.

Let’s start with an architecture diagram:

This Rank mode is lightweight and suitable for push-pull scenarios.
The writing process is basically the same
In the reading process, all Feed contents are read first, which is the same as Timeline. In Timeline, the content is directly returned to the user, but the Rank type needs to be sorted by a certain attribute value in a sorting module, and all results are stored in a Timeline cache. And return the highest score of the N results, the next time to return [N+1, 2N] results.

Here’s another one:

This comparison is weighty and applies to pure extrapolation.
The write process is the same as Timeline.
Each user has two inboxes:
- One is the focus on page Timeline, which saves the original Feed content. Users cannot view the inbox directly.
- The first is rank Timeline, which stores the selected Feed content of the user and allows the user to view the inbox directly.
After the write process, there is a data processing process. The personalized ranking system obtains the new Feed content from the original Feed inbox and calculates a score according to the characteristics of the user and the Feed. The score of each Feed may be different in the Timeline of different users. After the calculation, the ranking is then written into the final rank Timeline.
This approach can truly achieve “thousands of faces” for each user.

The above two methods are relatively simple and commonly used to achieve Rank.

The last

From the above content, TableStore can support 10PB level in storage and push can support tens of millions of TPS/QPS per second, which can play a great value in the Feed stream system.

There are already a number of well-known companies using the TableStore to build their own Feed streaming systems, which have resulted in significant benefits for the system, the product, and the company.

If you are interested in table storage, or have a question about table storage, you can join:

Table stores open exchange group: pin group number: 11789671Copy the code

mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.

How to build a ten-million-level Feed streaming system

concept

Characteristics of the

classification

implementation

storage

push

TableStore

Storage System Selection

Storage Account Relationship

Use open-source HBase to store account relationships

Use a TableStore to store account relationships