The “like” function is the basic function of APP development at present

Today we are going to talk about the db database design of the likes, comments, favorites and other scenes.

1. Let’s first look at the requirements of the scene:

A. Display the number of likes

B. Determine whether the user has clicked the “like” button, which is a necessary judgment for de-duplication

C. Display a list of personal likes, usually in the user center

D. Display the likes list of articles

Let’s start with headlines and microblogs

The likes of the headlines

Likes on Weibo

Both of these are top-of-the-line traffic, and there’s definitely a complex architecture on the back end, so we’re just going to talk about generic solutions today.

Mysql 2.1 solution

Mysql solution, with the popularity of NOSQL, big data continues to be hot, but mysql is still irreplaceable, for most small and medium-sized projects, less than ten million levels of data, using mysql sub-table +cache, is fully competent, and stability is incomparable to other solutions:

-- Create table POST {post_id int(11) NOT NULL AUTO_INCREMENT,...... star_num int(11) COMMENT'Likes'} -- Create table user {user_id int(11) NOT NULL AUTO_INCREMENT,...... star_num int(11) COMMENT'Likes'} -- create table star {id int(11) NOT NULL AUTO_INCREMENT, post_id, user_id,..... }Copy the code

Common queries:

Queries articles that users have liked

select post_id from star where user_id=?

Query the likes of articles

select user_id from star where post_id=?

The number of likes can be updated to the POST and User tables by timed asynchronous statistics.

For small amounts of data, this design is basically fine,

Disadvantages:

When there is a large amount of data, a table is under great pressure during query and needs to be divided into tables. However, whether post_id or user_id is used to hash tables conflicts with our requirements. The only solution is to make two tables redundant. This increases storage and maintenance effort, and there may be consistency issues.

2.2 redis scheme

When the amount of data reaches hundreds of millions, the cache is a necessary stage, because the action of “like” is very random, many people see the thumb to click, so the data volume grows rapidly, after the data scale up, there is a great pressure on mysql reading and writing, then we should consider memcache, Redis for storage or cache.

Why generally choose Redis, Redis as a popular NOSQL, has rich data types, can adapt to the needs of multiple scenarios.

Redis has two uses, one is storage, one is pure cache, need +mysql together. Pure cache is to write data from mysql to Redis first. The user reads data from cache first, then pulls data from mysql after miss, and synchronizes data from cache.

cache

In most scenarios, both are used together without conflict.

Redis as storage:

Scenario A: The number of likes is displayed

In the “like” area, only a “like” number is displayed, which can distinguish whether the user has “like”. Ordinary users do not care about this list, this scene only needs a number, when the number is large, generally displayed as “7K”,”10W” and so on.

Take the article ID as the key

127.0.0.1:6379[2]>setIncr star:tid: 6379[2]> incr star:tid: 6379integer) 
899Copy the code

Scenario B: “Like”, avoid repeating “like”

To implement this requirement, you must have a list of uuIds for article likes. In the UID key scenario C, you can usually see the user’s own likes list in the user center

This requirement can be implemented using data from scenario B.

User-centric likes list

Scenario D: Indicates the thumbs-up list of the article, similar to scenario B, with the article ID as the key

[2]> sadd star:list:tid:888 123 456 789integer) 3 127.0.0.1:6379[2]> sismember Star :list:tid:888 456integer1)Copy the code

If you like it, it’s red, if you don’t, it’s black and white,

Toutiao is no place to see the list of likes, and micro blog click in, details page can see the list of likes, but only the latest dozens will be displayed, there is no paging display.

As shown below, I have selected a hot topic, “Pig” with many fans.

Post likes list

Some people may feel that no one cares about the likes list, and saving it will waste a lot of resources, so it is better not to save it! However, this data is a must. Two points:

A. to heavy. The likes can be imprecise, but the likes have to be precise,

B. For another social product, every bit of user behavior needs to be recorded, which is meaningful for subsequent user behavior analysis and data mining.

In addition to string, you can also use hash to store the number of “likes” by dividing article ids into 100 pieces and storing them in a hash table. Each article ID is a hash key, and value stores the user ID of “likes”. If there are many “likes”, To avoid performance problems caused by too many ids, they can be listed separately and saved with sorted set structure. After all, there are only a few hotspots.

hash

Comparison of advantages and disadvantages of schemes

Hash: Saves memory by using fewer global keys; But it also brings problems

How to route to the hash based on the article ID?

Is looking for a user ID in a hash or a set? There is uncertainty

Using hash saves space but adds complexity, and the choice is up to the individual.

Do you have any other methods besides that?

3. Data consistency

When redis is used as a storage, data persistence must be enabled. RDB and AOF must be enabled. As a result, services can only use half of the machine memory.

In addition, as long as there is data copy, there will be consistency issues, which is another very important topic. Later have time again fine talk!


A BLOG address:www.liangsonghua.com

Pay attention to wechat public number: songhua preserved egg bulletin board, get more exciting!

Introduction to our official account: We share our technical insights from working in JD, as well as JAVA technology and best practices in the industry, most of which are pragmatic, understandable and reproducible