This is the 20th day of my participation in The August Wenwen Challenge.

Note: This continues from Elasticsearch notes (9).

Deep paging

Paging query

#POST     /{index}/_doc/_search
{
    "query": {
        "match_all": {}
    },
    "from": 0,
    "size": 10
}

What deep paging means

Deep paging means paging far into the result set. Pages 1, 2, 10, or 20 are shallow; page 10,000 or page 20,000 is deep.

For example, consider the following two request bodies:

{
    "query": {
        "match_all": {}
    },
    "from": 9990,
    "size": 10
}

{
    "query": {
        "match_all": {}
    },
    "from": 9999,
    "size": 10
}

To return the 10 documents starting at offset 9999 (from = 9999, size = 10), every shard has to produce its own top 10,009 documents (from + size). The coordinating node then gathers all of them: with 3 shards, that is 10009 * 3 = 30027 documents. It sorts those 30,027 documents only to return the final 10.
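
The arithmetic above can be sketched in a few lines of Python. The function name is illustrative and the shard count of 3 is the same assumption the example makes; the only rule taken from the text is that each shard returns from + size documents.

```python
def deep_paging_cost(from_, size, shards):
    """Estimate how many documents the coordinating node must sort.

    Each shard returns its own top (from + size) documents, so the
    coordinating node merges (from + size) * shards candidates.
    """
    per_shard = from_ + size
    return per_shard, per_shard * shards


if __name__ == "__main__":
    # The request from the text: from=9999, size=10, assuming 3 shards.
    print(deep_paging_cost(9999, 10, 3))  # (10009, 30027)
    # A shallow page for comparison: from=0, size=10.
    print(deep_paging_cost(0, 10, 3))  # (10, 30)
```

The cost grows with the offset, not with the page size, which is why the last page of a long result list is far more expensive than the first.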

Paging this deep causes performance problems because it consumes a lot of memory and CPU. For that reason, ES by default rejects any paged query where from + size exceeds 10,000. So how should deep paging be handled? The best answer is to avoid it by limiting the number of pages. For example, offer at most 100 pages and show nothing beyond page 100. Users rarely page that deep anyway: when searching on Taobao or Baidu, most of us look at about 10 pages at most.

For example, Taobao search limits paging to at most 100 pages.
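
A page limit like this can be enforced before the query ever reaches ES. The sketch below is one possible way to do it; the names `MAX_PAGES` and `page_to_from_size` are hypothetical, and clamping to the last allowed page (rather than returning an error) is just one design choice.

```python
MAX_PAGES = 100  # hypothetical limit, matching the "100 pages at most" example


def page_to_from_size(page, page_size=10, max_pages=MAX_PAGES):
    """Translate a 1-based page number into from/size, capped at max_pages."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    # Requests past the limit fall back to the last allowed page.
    page = min(page, max_pages)
    return {"from": (page - 1) * page_size, "size": page_size}


if __name__ == "__main__":
    print(page_to_from_size(1))    # {'from': 0, 'size': 10}
    print(page_to_from_size(101))  # {'from': 990, 'size': 10}
```

With 10 results per page and 100 pages, from + size never exceeds 1,000, well inside the default limit.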

Raising the result window limit

You can go beyond 10,000 by setting index.max_result_window:

#PUT     /{index}/_settings
{
    "index.max_result_window": "20000"
}

Scroll search

Fetching more than 10,000 documents in a single query often hurts performance because of the sheer volume of data. In this case you can use a scroll search. A scroll search fetches a first batch of results and then continues from where it left off. The first query returns a scroll ID, which acts like a bookmark: each subsequent request must pass the scroll ID returned by the previous one. Every request works against a snapshot of the data taken when the scroll started. If the data changes while you are scrolling, the results are not affected and still come from the snapshot.

  • scroll=1m works like a session timeout: the search context is kept alive for 1 minute.

#POST    /{index}/_search?scroll=1m
{
    "query": {
        "match_all": {}
    },
    "sort": ["_doc"],
    "size": 5
}

# Second query
#POST    /_search/scroll
{
    "scroll": "1m",
    "scroll_id": "your last scroll_id"
}
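
The two requests above are usually driven from a loop. The following is a minimal sketch of that loop, not a definitive client: `post(path, body)` is a hypothetical callable that sends a POST request to ES and returns the parsed JSON response, and the fake transport at the bottom exists only so the sketch runs without a cluster.

```python
def scroll_all(post, index, page_size=5, keep_alive="1m"):
    """Yield every hit of a match_all query, one scroll page at a time.

    `post(path, body)` is a hypothetical callable that sends a POST
    request to Elasticsearch and returns the parsed JSON response.
    """
    body = {"query": {"match_all": {}}, "sort": ["_doc"], "size": page_size}
    response = post(f"/{index}/_search?scroll={keep_alive}", body)
    # An empty page of hits means the scroll is exhausted.
    while response["hits"]["hits"]:
        yield from response["hits"]["hits"]
        # Always pass the scroll ID from the most recent response.
        response = post(
            "/_search/scroll",
            {"scroll": keep_alive, "scroll_id": response["_scroll_id"]},
        )


if __name__ == "__main__":
    # Fake transport standing in for a real cluster (illustrative data).
    pages = [[{"_id": "1"}, {"_id": "2"}], [{"_id": "3"}], []]

    def fake_post(path, body):
        return {"_scroll_id": "abc", "hits": {"hits": pages.pop(0)}}

    print([hit["_id"] for hit in scroll_all(fake_post, "user", page_size=2)])
    # ['1', '2', '3']
```

Each request renews the keep-alive window, so 1m only needs to cover the time spent processing one page, not the whole scroll.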

Official documentation: www.elastic.co/guide/cn/el…

Bulk operations

Basic syntax

A bulk request uses a different format from ordinary requests. Do not pretty-print the JSON: each action and each request body must stay on a single line, so be careful.

{ action: { metadata }}\n
{ request body        }\n
{ action: { metadata }}\n
{ request body        }\n
...
  • { action: { metadata }} gives the type of the bulk operation, which can add, modify, or delete a document
  • \n is required at the end of every line, including the last one, so that ES can parse the request
  • { request body } is the request body; it is required for add and modify operations, but not for delete operations
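
The format rules above are easy to get wrong by hand, so it can help to generate the body. Here is a small sketch under the assumption that operations are passed as (action, metadata, document) tuples; `build_bulk_body` is a hypothetical helper, not part of any ES client.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, document) tuples into a bulk body.

    `document` is None for actions that take no request body (delete).
    """
    lines = []
    for action, metadata, document in operations:
        lines.append(json.dumps({action: metadata}))
        if document is not None:
            lines.append(json.dumps(document))
    # json.dumps without indent keeps each object on one line.
    # Every line, including the last one, must end with "\n".
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    body = build_bulk_body([
        ("create", {"_id": "8001"}, {"id": "8001", "nickname": "name8001"}),
        ("delete", {"_id": "2003"}, None),
    ])
    print(body, end="")
    # {"create": {"_id": "8001"}}
    # {"id": "8001", "nickname": "name8001"}
    # {"delete": {"_id": "2003"}}
```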

Bulk operation types

The action must be one of the following:

  • create: Creates the document if it does not exist. If it already exists, an error is reported for that item; the failure does not affect the other operations.
  • index: Creates a new document or replaces an existing one.
  • update: Partially updates a document.
  • delete: Deletes a document.

The metadata specifies the _index, _type, and _id of the document to operate on. _index and _type can also be specified in the URL.

In practice

  • create: add documents, specifying the index and type in the metadata

    #POST    /_bulk
    {"create": {"_index": "user", "_type": "_doc", "_id": "2001"}}
    {"id": "2001", "nickname": "name2001"}
    {"create": {"_index": "user", "_type": "_doc", "_id": "2002"}}
    {"id": "2002", "nickname": "name2002"}
    {"create": {"_index": "user", "_type": "_doc", "_id": "2003"}}
    {"id": "2003", "nickname": "name2003"}

    # Returned result
    {
        "took": 973,
        "errors": false,
        "items": [
            {
                "create": {
                    "_index": "user",
                    "_type": "_doc",
                    "_id": "2001",
                    "_version": 1,
                    "result": "created",
                    "_shards": { "total": 2, "successful": 1, "failed": 0 },
                    "_seq_no": 0,
                    "_primary_term": 1,
                    "status": 201
                }
            },
            {
                "create": {
                    "_index": "user",
                    "_type": "_doc",
                    "_id": "2002",
                    "_version": 1,
                    "result": "created",
                    "_shards": { "total": 2, "successful": 1, "failed": 0 },
                    "_seq_no": 1,
                    "_primary_term": 1,
                    "status": 201
                }
            },
            {
                "create": {
                    "_index": "user",
                    "_type": "_doc",
                    "_id": "2003",
                    "_version": 1,
                    "result": "created",
                    "_shards": { "total": 2, "successful": 1, "failed": 0 },
                    "_seq_no": 2,
                    "_primary_term": 1,
                    "status": 201
                }
            }
        ]
    }
  • create with an ID that already exists, specifying the index and type in the URL

    #POST    /{index}/_doc/_bulk

    #POST    /user/_doc/_bulk
    {"create": {"_index": "user", "_id": "2001"}}
    {"id": "2001", "nickname": "name2001new"}
    {"create": {"_index": "user", "_id": "2004"}}
    {"id": "2004", "nickname": "name2004"}
    {"create": {"_index": "user", "_id": "2005"}}
    {"id": "2005", "nickname": "name2005"}
  • index: documents whose ID already exists are overwritten; documents whose ID does not exist are added

    #POST    /shop/_doc/_bulk
    {"index": {"_id": "2004"}}
    {"id": "2004", "nickname": "index2004"}
    {"index": {"_id": "2007"}}
    {"id": "2007", "nickname": "name2007"}
    {"index": {"_id": "2008"}}
    {"id": "2008", "nickname": "name2008"}
  • update: partially update document data

# POST    /{index}/_doc/_bulk
{"update": {"_id": "2004"}}
{"doc":{ "id": "3004"}}
{"update": {"_id": "2007"}}
{"doc":{ "nickname": "nameupdate"}}
  • delete: delete documents in bulk
#POST    /{index}/_doc/_bulk
{"delete": {"_id": "2004"}}
{"delete": {"_id": "2007"}}
  • Mixed bulk operations
  #POST    /{index}/_doc/_bulk
  {"create": {"_id": "8001"}}
  {"id": "8001", "nickname": "name8001"}
  {"update": {"_id": "2001"}}
  {"doc":{ "id": "20010"}}
  {"delete": {"_id": "2003"}}
  {"delete": {"_id": "2005"}}
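
Because one failed action does not stop the others, a bulk response has to be checked item by item. The sketch below shows one way to pick out the failures; `failed_items` is a hypothetical helper, and the sample response is hand-written for illustration (a create on an existing ID next to a successful create), not output captured from a cluster.

```python
def failed_items(bulk_response):
    """Return (action, _id, status) for every item that did not succeed."""
    # The top-level "errors" flag is false when every item succeeded.
    if not bulk_response.get("errors"):
        return []
    failures = []
    for item in bulk_response["items"]:
        # Each item is a single-key dict such as {"create": {...}}.
        for action, result in item.items():
            if "error" in result:
                failures.append((action, result["_id"], result["status"]))
    return failures


if __name__ == "__main__":
    # Hand-written sample: ID 2001 already exists, ID 2004 is new.
    sample = {
        "errors": True,
        "items": [
            {"create": {"_id": "2001", "status": 409,
                        "error": {"type": "version_conflict_engine_exception"}}},
            {"create": {"_id": "2004", "status": 201, "result": "created"}},
        ],
    }
    print(failed_items(sample))  # [('create', '2001', 409)]
```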