Abstract: This article belongs to the original, welcome to reprint, reprint please reserve source: github.com/jasonGeng88…

scenario

If everyone is doing the backend development, must implement the list query interface, and of course some query condition is very simple, have to a SQL, but some query conditions are extremely complex, coupled with the library table design all kinds of unreasonable, cause the query interface is particularly difficult to write, and work overtime of what not to mention the (I wonder if you have the feeling ~).

Let’s start with an example. This is the search criteria of a shopping website. If you implement such a search interface, how will you implement it? (Of course you said you could do it with a search engine like Elasticsearch. But what I’m saying is, what if you had to do it yourself?)

As you can see from the figure above, the search is divided into six categories, each subcategory. In the middle, the intersection of various categories of conditions is taken, and there are single selection, multiple selection, and custom cases in each subclass, and the final output meets the conditions of the result set.

Ok, now that the requirements are clear, let’s start implementing them.

To achieve 1

The first entry is A student, he is writing SQL “expert”. Small A confidently said: “is not A query interface? It looks like a lot of conditions, but with my rich SQL experience, this is not difficult for me.”

The result is the following code (MYSQL as an example) :

select.from table_1
left join table_2
left join table_3
left join (select.from table_x where...). tmp_1 ...where.order by.limit m,nCopy the code

The code ran through the test environment and seemed to match, so it was ready to be preshipped. This advance, the problem began to reveal. In order to make the online environment as realistic as possible, the amount of data is naturally much larger than the test. So such a complex SQL, its execution efficiency can be imagined. Test students decisively to the small A code to hit back.

Implement 3

Little C sees room for optimization from the above scheme. He found that little B had no problem in thinking. He split the complex conditions, calculated the result sets of each sub-dimension, and finally combined all the sub-result sets to get the final result he wanted.

So he wondered if it would be possible to cache the result sets of each subdimension in advance, so that when querying, he could directly fetch the subset he wanted instead of checking the library for calculation every time.

Here little C uses Redis to store cached data, the main reason for using it is that it provides a variety of data structures, and it is very easy to do the union operation of sets in Redis.

The specific scheme is shown in the figure:

Here, each condition stores the calculated result Set ID into the corresponding key in advance, and the data structure chosen is Set. The query operations include:

  • Single subclass: directly obtain the corresponding result set according to the condition key;
  • Subclass multiple selection: Perform the union operation according to multiple condition keys to obtain the corresponding result set.
  • Final result: The final result is obtained by the intersection operation of all subclass result sets;

That’s actually what it’s calledInverted index.

And you’ll notice that there’s a price condition missing. We know from demand that the price condition is a range and is infinite. Therefore, the above key-value method of exhaustive conditions cannot be done. Here, we use another data structure of Redis, Sorted Set:

Add all goods into the ordered set with Key as price, value as commodity ID, and the score corresponding to each value is the value of commodity price. In this way, the ZRANGEBYSCORE command can be used to obtain the corresponding result set according to the score (price) interval in the ordered set of Redis.

At this point, the optimization of scheme 3 has been completed, and the data query and calculation have been separated by means of caching. In each search, the result can be obtained by simply searching Redis several times. The query speed meets the acceptance requirements.

extension

Data update

There are two ways to update index data. One is through the modification of commodity data, to trigger the update operation immediately, the other is through the timing script for batch update. Note here that the index content update, if the violence of the delete Key, then reset the Key. Since the two operations in Redis will not be atomic, there may be a blank gap between them. It is suggested to remove only invalid elements in the set and add new elements.

conclusion

This is just a simple Demo of using Redis to optimize query search, which is lighter and cheaper to learn than existing open source search engines. Second, some of its ideas are similar to those of open source search engines, and if word parsing is added, full-text search functions can also be achieved.

Finally, unfinished, to be continued…