Welcome to follow our WeChat official account: Shishan100
My new course **"C2C E-commerce System Microservice Architecture 120-Day Practical Training Camp"** is now online in the official account "ruxihu Technology Nest". Interested readers can click the link below for details:
120-Day Training Camp of C2C E-commerce System Micro-Service Architecture
Suppose you encounter an interview question like this: how do you improve query efficiency in ES when the data volume is large (on the order of billions of documents)?
This question really tests whether you have actually used ES, because ES performance is not as good as you might think.
With large data volumes, especially hundreds of millions of documents, you may be baffled to find that a single search takes 5~10 s, which is painful.
Typically the first query takes 5~10 s, and later queries are faster, maybe a few hundred milliseconds.
But every user's first visit is slow. So if you have never used ES in production, or have only played with a demo, this question can easily stump you, and it shows that you haven't really worked with ES.
To be honest, there is no silver bullet for ES performance optimization. What does that mean? Don't expect to flip a single parameter and magically fix every slow-query scenario.
In some scenarios you can change a parameter or adjust the query syntax, but not in all of them.
Performance optimization killer: Filesystem Cache
The data you write to ES is actually written to disk files, and the operating system automatically caches those files in the Filesystem Cache.
The ES search engine relies heavily on the underlying Filesystem Cache. If you give the Filesystem Cache enough memory to hold all of the index segment files, your searches will be served almost entirely from memory, and performance will be very high.
How big can the performance gap be? In many of our earlier tests and stress tests, searches that go to disk take on the order of seconds: 1 s, 5 s, even 10 s.
Searches served from the Filesystem Cache, however, are an order of magnitude faster than disk: anywhere from a few milliseconds to a few hundred milliseconds.
**Here is a real example.** A company's ES cluster has 3 machines, each with what seems like plenty of memory, 64 GB, for a total of 64 × 3 = 192 GB.
The ES JVM heap on each machine is 32 GB, leaving only 32 GB per machine for the Filesystem Cache, or 32 × 3 = 96 GB of cache across the cluster.
Meanwhile, the index data files across the three machines occupy a total of 1 TB of disk. With 1 TB of ES data, each machine holds about 300 GB. Will performance be good?
The Filesystem Cache has only about 100 GB of memory, so just one-tenth of the data fits in memory; the rest lives on disk. Most searches then go to disk, and performance is poor.
At the end of the day, if you want ES to perform well, in the best case your machines' memory should hold at least half of your total data.
Based on our own production experience, it is best to store only a small amount of data in ES: just the indexes you actually search. If the Filesystem Cache has 100 GB of memory, keep the index data within 100 GB.
That way, almost all of your searches are served from memory, and performance is very high, typically under one second.
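The sizing arithmetic from the example above can be sketched in a few lines (the numbers are the article's, not universal rules):

```python
# Back-of-envelope sizing using the numbers from the example above.
nodes = 3
ram_per_node_gb = 64
heap_per_node_gb = 32                                   # JVM heap per machine
cache_per_node_gb = ram_per_node_gb - heap_per_node_gb  # left for the OS cache

total_cache_gb = nodes * cache_per_node_gb              # 96 GB of Filesystem Cache
total_data_gb = 1024                                    # ~1 TB of index data

in_cache_ratio = total_cache_gb / total_data_gb
print(f"cache: {total_cache_gb} GB, in-memory share: ~{in_cache_ratio:.0%}")
```

With roughly 9% of the data in memory, most searches go to disk, which is exactly why the example cluster was slow.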
Let's say a row of data has 30 fields: id, name, age, and so on. But your searches only need to filter by id, name, and age.
If you naively write the entire row, all 30 fields, into ES, then 90% of that data is never searched.
It just wastes space in the Filesystem Cache on the ES machines: the larger each document, the less data the Filesystem Cache can hold.
Instead, write only the fields used for retrieval into ES, for example just the id, name, and age fields.
You can then store the remaining fields in MySQL/HBase. We generally recommend an ES + HBase architecture.
HBase is suitable for online storage of massive data: you can write huge volumes into it, but you should only run simple operations against it, such as lookups by ID or range queries, not complex searches.
A search on name and age in ES might return 20 doc ids; you then look up the complete row for each doc id in HBase and return the results to the front end.
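A minimal sketch of this ES + HBase split, with plain dicts standing in for the real ES index and HBase table (all names and data are illustrative):

```python
# "ES" holds only the searchable fields, keyed by doc id.
es_index = {
    "doc1": {"name": "alice", "age": 30},
    "doc2": {"name": "bob", "age": 25},
}

# "HBase" holds the full row (all 30 fields), keyed by the same doc id.
hbase_table = {
    "doc1": {"name": "alice", "age": 30, "address": "x", "phone": "y"},
    "doc2": {"name": "bob", "age": 25, "address": "z", "phone": "w"},
}

def search_by_name(name):
    # Step 1: the "ES" search returns matching doc ids only.
    doc_ids = [i for i, d in es_index.items() if d["name"] == name]
    # Step 2: fetch the complete rows from "HBase" by doc id.
    return [hbase_table[i] for i in doc_ids]
```

In a real system, step 1 is an ES query returning hits' ids and step 2 is a batched HBase `get`; the shape of the two-step lookup is the same.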
Ideally, the amount of data you write to ES should be no larger than, or only slightly larger than, the capacity of the Filesystem Cache.
Retrieving from ES might then take 20 ms, and fetching the 20 full rows from HBase by id another 30 ms.
Compare that with the old approach of putting 1 TB of data into ES, where each query took 5~10 s: now each query might take around 50 ms.
Data preheating
Even if you do all of this, the data on each machine in the ES cluster may still be, say, twice the size of the Filesystem Cache.
For example, a machine holds 60 GB of data but has only 30 GB of Filesystem Cache, so 30 GB of data remains on disk.
In that case you can do data preheating. Take Weibo as an example: for big-V accounts whose data many people read, you can build a background system in advance.
Every so often, this background system searches for the hot data itself, flushing it into the Filesystem Cache, so that when users actually request that hot data later, it is served straight from memory, very quickly.
Or for e-commerce: you can build a background program that, say, every minute automatically queries the hot items that are viewed the most, such as the iPhone 8, flushing them into the Filesystem Cache.
For data that you know is hot and frequently accessed, it is a good idea to build a dedicated cache-warming subsystem.
It accesses the hot data ahead of time at regular intervals so that it stays in the Filesystem Cache, and the next time someone accesses it, performance is much better.
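A minimal cache-warmer sketch. Here `run_search` is a hypothetical callable that issues the real ES query for one keyword; the keyword list and interval are illustrative assumptions:

```python
import time

HOT_KEYWORDS = ["iphone 8", "big-v-account-123"]   # illustrative hot items

def warm_once(run_search):
    # Touching the hot documents pulls their segments into the OS
    # Filesystem Cache before real users ask for them.
    for keyword in HOT_KEYWORDS:
        run_search(keyword)

def warm_forever(run_search, interval_s=60):
    # Background loop: re-warm every minute so the hot data stays cached
    # and is not gradually evicted by other reads.
    while True:
        warm_once(run_search)
        time.sleep(interval_s)
```

In production the keyword list would come from your own access statistics rather than a hard-coded constant.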
Hot and cold separation
ES supports a split similar to horizontal partitioning in MySQL: put the large volume of rarely accessed cold data into one index, and the frequently accessed hot data into a separate index.
It is best to write cold data to one index and hot data to another, so that after warming, the hot data stays in the OS Filesystem Cache as much as possible and is not evicted by cold data.
Suppose you have six machines and two indexes, one for cold data and one for hot data, each with three shards. Put the hot-data index on three machines and the cold-data index on the other three.
Most of the time you are accessing the hot index, and the hot data may account for only 10% of the total volume. Being relatively small, it fits almost entirely in the Filesystem Cache, which keeps hot-data access fast.
The cold data lives in a different index on different machines, so the two never interfere with each other.
When someone does access cold data, much of it is on disk and performance is fairly poor, but if 10% of requests hit cold data and 90% hit hot data, that doesn't matter much.
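One way to implement the split is to choose the target index at write time. A sketch, assuming recency defines "hot"; the 30-day cutoff and index names are illustrative assumptions:

```python
from datetime import datetime, timedelta, timezone

HOT_INDEX, COLD_INDEX = "orders_hot", "orders_cold"

def pick_index(doc, now=None):
    # Route recent documents to the hot index and older ones to the
    # cold index, so hot searches never touch the cold machines.
    now = now or datetime.now(timezone.utc)
    if now - doc["created_at"] <= timedelta(days=30):
        return HOT_INDEX
    return COLD_INDEX
```

The write path indexes each document into `pick_index(doc)`; a periodic job can then move documents from the hot index to the cold one as they age.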
Document Model design
With MySQL, we often run complex join queries. How does that work in ES? Avoid complex associative queries in ES as much as possible; once you use them, performance is generally poor.
It is best to perform the association in the Java application first and write the already-joined data directly into ES, so that at search time you don't need join-like associative search in ES.
Document model design is very important. For many operations, don't try to perform all sorts of complicated, messy logic at search time.
ES supports only so many operations; don't try to make it do things it isn't good at. If such logic is needed, try to handle it when you design the document model, at write time.
In addition, avoid operations like join/nested/parent-child searches wherever possible; their performance is poor.
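The write-time "join" described above, in miniature (field names are illustrative): instead of asking ES to associate orders with users at query time, the application flattens the two records into one document before indexing:

```python
# Two source records, as they might come out of MySQL.
user = {"user_id": 1, "user_name": "alice"}
order = {"order_id": 42, "user_id": 1, "amount": 99.5}

# Do the "join" in the application: copy the user fields the search
# needs into the order document, then index this flat doc into ES.
es_doc = {**order, "user_name": user["user_name"]}
```

A search by `user_name` plus order fields now hits a single flat document, with no join/nested/parent-child query needed.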
Paging performance optimization
ES pagination is notoriously inefficient. Why? Say you have 10 items per page and want to query page 100. ES actually fetches the first 1000 documents from every shard to a coordinating node.
With five shards, that is 5000 documents; the coordinating node then merges and processes those 5000 documents to produce the final 10 documents for page 100.
Because the system is distributed, to get the 10 documents on page 100 you can't just fetch 2 documents from each of the 5 shards and merge them into 10.
Each shard must return 1000 documents, which are then sorted, filtered, and so on according to your query, and paged again to reach page 100.
The deeper you page, the more documents each shard returns and the longer the coordinating node takes, which is very frustrating. So with ES pagination, you'll find that the further back you go, the slower it gets.
We ran into this ourselves: paging with ES, the first few pages took tens of milliseconds, but by page 10 or a few dozen pages in, each page took 5~10 s.
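The cost described above, in numbers: with from/size paging, every shard must return `page × page_size` documents to the coordinating node:

```python
def docs_pulled(page, page_size, shards):
    # Each shard returns its own top (page * page_size) documents; the
    # coordinating node merges all of them to produce one final page.
    per_shard = page * page_size
    return per_shard, per_shard * shards

per_shard, total = docs_pulled(page=100, page_size=10, shards=5)
print(per_shard, total)  # 1000 5000
```

The work grows linearly with page depth, which is why page 100 is two orders of magnitude more expensive than page 1.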
Is there a solution? One approach: don't allow deep paging (whose default performance is poor). Tell the product manager that the system doesn't support paging that deep, because by default the deeper the page, the worse the performance.
For feeds like an app's recommendations that you pull down page after page, or like Weibo where you scroll down page by page, you can use the Scroll API; search online for the details of how to use it.
Scroll takes a snapshot of your data up front, and each subsequent request moves through it via the scroll_id cursor to fetch the next page. The performance is far better than regular pagination, basically milliseconds per page.
The catch is that it only suits Weibo-style pull-down paging scenarios where you move through pages sequentially; you can't jump to an arbitrary page.
That is, you can't go to page 10, then page 120, then back to page 58; you can only move forward page by page.
Hence many products today don't let you jump between pages at all: in some apps and websites you can only scroll down, page by page.
The scroll parameter must be specified at initialization to tell ES how long to keep the context of this search alive. Make sure users don't keep scrolling for hours on end, or the request may fail when the context times out.
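The shapes of the two Scroll API requests, built here as plain dicts rather than sent to a live cluster (the index name and page size are illustrative; paths and body keys follow the ES REST API):

```python
# First request: open the scroll context with a keep-alive of 1 minute.
initial_request = {
    "path": "/my_index/_search?scroll=1m",
    "body": {"size": 100, "query": {"match_all": {}}},
}

def next_page_request(scroll_id):
    # Every later request passes the scroll_id returned by the previous
    # response; ES advances the cursor and renews the 1m keep-alive.
    return {
        "path": "/_search/scroll",
        "body": {"scroll": "1m", "scroll_id": scroll_id},
    }
```

You would stop looping when a response comes back with an empty hits list, and ideally delete the scroll context when done.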
Besides the Scroll API, you can also use search_after. The idea of search_after is to use the results of the previous page to retrieve the data of the next page.
Obviously, this method also doesn't allow random page jumps; you can only page forward sequentially. At initialization, you need a field with unique values as the sort field.
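A sketch of building search_after queries (field names are illustrative assumptions): each page's query carries the sort values of the last hit on the previous page, with a unique field as the tiebreaker:

```python
def page_query(size=10, after=None):
    q = {
        "size": size,
        "query": {"match_all": {}},
        # The sort must be deterministic; "order_id" stands in for a
        # field with unique values so the ordering is total.
        "sort": [{"timestamp": "desc"}, {"order_id": "asc"}],
    }
    if after is not None:
        # Sort values of the last hit from the previous page.
        q["search_after"] = after
    return q

first_page = page_query()
# Suppose the last hit of the first page sorted as [1700000000, 42]:
second_page = page_query(after=[1700000000, 42])
```

Unlike Scroll, search_after keeps no server-side context, so it also suits stateless, user-driven infinite scrolling.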
END