Elasticsearch

The usual development setup is to run the Elasticsearch service in the background and use the Kibana interface to operate on the data.

Document data format

Document-oriented search and analysis engine

(1) The data structures of an application are object-oriented and can be complex

(2) To store object data in a relational database, the object has to be taken apart and flattened into multiple tables; every query then has to reassemble the object, which is cumbersome

(3) ES is document-oriented: the data structure stored in a document is the same as that of the object. On top of this document structure, ES provides complex indexing, full-text search, analysis, aggregation and other functions

(4) An ES document is expressed in JSON

import java.util.Date;

public class Employee {

  private String email;
  private String firstName;
  private String lastName;
  private EmployeeInfo info;
  private Date joinDate;

  // getters and setters omitted
}

class EmployeeInfo {

  private String bio;         // personality
  private Integer age;        // age
  private String[] interests; // hobbies and interests

  // getters and setters omitted
}

EmployeeInfo info = new EmployeeInfo();
info.setBio("curious and modest");
info.setAge(30);
info.setInterests(new String[]{"bike", "climb"});

Employee employee = new Employee();
employee.setEmail("[email protected]");
employee.setFirstName("san");
employee.setLastName("zhang");
employee.setInfo(info);
employee.setJoinDate(new Date());
  • Employee object: contains the Employee class's own attributes as well as an EmployeeInfo object

    • In a relational database, the Employee and EmployeeInfo objects are split into two tables, employee and employee_info

    • Table employee: email, first_name, last_name, join_date

    • Table employee_info: bio, age, interests; plus a foreign key column, such as employee_id, that references the employee table

In a relational database, the data has to be split across different tables for storage, and building a complete object again means pulling rows from those tables and stitching them together. In ES, the same object is stored as a single JSON document:

{
    "email":      "[email protected]",
    "first_name": "san",
    "last_name":  "zhang",
    "info": {
        "bio": "curious and modest",
        "age": 30,
        "interests": [ "bike", "climb" ]
    },
    "join_date": "2017/01/01"
}

This is the difference between the document data format of ES and the relational data format of a database:

a relational database stores flat rows, whereas ES stores whole objects as JSON documents.
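To make the contrast concrete, here is a minimal Python sketch of the "stitching" step a relational layout requires. The two row lists stand in for the employee and employee_info tables; the id values, the comma-separated interests column and the placeholder email address are assumptions made for illustration, not part of the original example.

```python
import json

# Hypothetical flat rows standing in for the two relational tables.
employee_rows = [
    {"id": 1, "email": "user@example.com", "first_name": "san",
     "last_name": "zhang", "join_date": "2017/01/01"},
]
employee_info_rows = [
    {"employee_id": 1, "bio": "curious and modest", "age": 30,
     "interests": "bike,climb"},  # assumed storage format for the array
]


def load_employee(employee_id):
    """Rebuild one nested object from the two flat tables (a manual join)."""
    emp = next(r for r in employee_rows if r["id"] == employee_id)
    info = next(r for r in employee_info_rows
                if r["employee_id"] == employee_id)
    return {
        "email": emp["email"],
        "first_name": emp["first_name"],
        "last_name": emp["last_name"],
        "info": {
            "bio": info["bio"],
            "age": info["age"],
            "interests": info["interests"].split(","),
        },
        "join_date": emp["join_date"],
    }


# In a document store, the result of this reassembly is what gets stored directly.
employee_doc = load_employee(1)
print(json.dumps(employee_doc, indent=2))
```

With a document-oriented store, the load_employee step disappears: the nested structure is written and read back as one unit.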


Case background: product management for an e-commerce website

An e-commerce website needs to build a back-end system on ES that provides the following functions:

  1. CRUD (create, read, update, delete) operations on product information
  2. Simple structured queries
  3. Simple full-text searches, as well as more complex phrase searches
  4. Highlighting of full-text search results
  5. Simple aggregation analysis of the data

Simple cluster management (Cluster operations)

(1) Quickly check the cluster health status

ES provides a set of APIs, called the cat APIs, for viewing a wide variety of information about ES

GET /_cat/health?v

epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1624883165 20:26:05  elasticsearch yellow          1         1      1   1    0    0        1             0                  -                 50.0%

How do we read the health of the cluster from its status: green, yellow, or red?

  • Green: every primary shard and every replica shard of every index is active
  • Yellow: every primary shard of every index is active, but some replica shards are not active and are unavailable
  • Red: not all primary shards are active, so some index data is unavailable
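The three rules above can be written as a small decision function. This is only an illustrative sketch: the shard dictionaries with "primary" and "active" keys are a made-up representation, not a structure returned by ES.

```python
def cluster_status(shards):
    """Classify cluster health from a list of shard states.

    Each shard is a dict with two booleans (a hypothetical representation):
    'primary' - True for a primary shard, False for a replica
    'active'  - True if the shard is allocated and started
    """
    if any(s["primary"] and not s["active"] for s in shards):
        return "red"      # at least one primary shard is not active
    if any(not s["active"] for s in shards):
        return "yellow"   # all primaries active, some replica is not
    return "green"        # every primary and replica is active


single_node = [
    {"primary": True, "active": True},
    {"primary": False, "active": False},  # replica has nowhere to go
]
print(cluster_status(single_node))  # yellow
```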

Why is the cluster currently yellow?

We are working on a single laptop with one ES process started, so the cluster has just one node. ES currently holds one index, the one that Kibana creates for itself. By default, each index is allocated 5 primary shards and 5 replica shards (one replica per primary), and a primary shard and its replica cannot be placed on the same machine (for fault tolerance). Kibana's own index has 1 primary shard and 1 replica shard. Because there is only one node, the primary shard is allocated and started, but the replica shard has no second machine to start on, so it stays unassigned.

A small experiment: start a second ES process so that the cluster has 2 nodes. The replica shard is then allocated automatically and the cluster status becomes green.

Copy the Elasticsearch installation directory to another location and start a second instance from it. The replica shard is then allocated to the new node, so the primary shard and the replica shard no longer sit in the same Elasticsearch process.

epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1624883500 20:31:40  elasticsearch green           2         2      2   1    0    0        0             0                  -                100.0%
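The numbers in the two health outputs (1 unassigned shard at 50.0%, then 0 unassigned at 100.0%) follow from simple arithmetic. The sketch below uses a deliberately simplified model, assuming only that copies of the same shard must live on different nodes; the function name and return shape are hypothetical, and real allocation takes many more factors into account.

```python
def simulate_health(node_count, primaries, replicas_per_primary):
    """Simplified allocation model: each copy of a shard needs its own node."""
    if node_count < 1:
        raise ValueError("at least one node is required")
    total = primaries * (1 + replicas_per_primary)
    # A primary occupies one node, so at most node_count - 1 replicas fit.
    placeable = min(replicas_per_primary, node_count - 1)
    active = primaries * (1 + placeable)
    unassigned = total - active
    status = "green" if unassigned == 0 else "yellow"
    percent = round(100.0 * active / total, 1)
    return status, unassigned, percent


# .kibana index: 1 primary shard, 1 replica
print(simulate_health(1, 1, 1))  # ('yellow', 1, 50.0)
print(simulate_health(2, 1, 1))  # ('green', 0, 100.0)
```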

Simple index operations (The index operation)

(1) Quickly check the indexes in the cluster

GET /_cat/indices?v

health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana Gc-JUpQqSEuHJDfX0Kn67Q   1   1          1            0      3.1kb          3.1kb

(2) Simple index operation

Create index:

PUT /test_index?pretty

{
  "acknowledged": true,
  "shards_acknowledged": true
}

View the result:

health status index      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana    Gc-JUpQqSEuHJDfX0Kn67Q   1   1          1            0      3.1kb          3.1kb
yellow open   test_index 2VqkJ6TpTSmCgca91f6atA   5   1          0            0       650b           650b
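If the column output needs to be consumed by a script, it can be split on whitespace. The sketch below parses the text shown above; it assumes every cell is non-empty (a row with a blank column would shift the values), and parse_cat is a hypothetical helper name.

```python
CAT_OUTPUT = """\
health status index      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana    Gc-JUpQqSEuHJDfX0Kn67Q   1   1          1            0      3.1kb          3.1kb
yellow open   test_index 2VqkJ6TpTSmCgca91f6atA   5   1          0            0       650b           650b
"""


def parse_cat(text):
    """Turn whitespace-aligned cat output (with a header row) into dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


rows = parse_cat(CAT_OUTPUT)
for row in rows:
    print(row["index"], row["health"], "primaries:", row["pri"])
```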

Delete index:

DELETE /test_index?pretty

{
  "acknowledged": true
}

View the delete result:

health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana rUm9n9wMRQCCrRDEhqneBg   1   1          1            0      3.1kb          3.1kb

Document operations

CRUD operations on products

(1) Add a product: add a document and build the index

PUT /index/type/id
{
  "json data"
}

PUT /ecommerce/product/1
{
    "name" : "gaolujie yagao",
    "desc" : "gaoxiao meibai",
    "price" : 30,
    "producer" : "gaolujie producer",
    "tags": [ "meibai", "fangzhu" ]
}

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}

# The index is created automatically
health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana   Gc-JUpQqSEuHJDfX0Kn67Q   1   1          1            0      3.1kb          3.1kb
yellow open   ecommerce -yTgL5L5T7mvN5aPsNZO6g   5   1          1            0        6kb            6kb
PUT /ecommerce/product/2
{
    "name" : "jiajieshi yagao",
    "desc" : "youxiao fangzhu",
    "price" : 25,
    "producer" : "jiajieshi producer",
    "tags": [ "fangzhu" ]
}

PUT /ecommerce/product/3
{
    "name" : "zhonghua yagao",
    "desc" : "caoben zhiwu",
    "price" : 40,
    "producer" : "zhonghua producer",
    "tags": [ "qingxin" ]
}

ES creates the index and the type automatically; they do not need to be created in advance. By default, ES also builds an inverted index for every field of a document so that the field is searchable.
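As a rough illustration of what "an inverted index per field" means, the sketch below maps each term of each field to the ids of the documents containing it. It is a toy model: it lowercases and splits on whitespace, whereas ES runs configurable analyzers and handles numeric fields differently. The function name and structure are illustrative only.

```python
from collections import defaultdict

docs = {
    "1": {"name": "gaolujie yagao", "desc": "gaoxiao meibai",
          "price": 30, "producer": "gaolujie producer",
          "tags": ["meibai", "fangzhu"]},
    "2": {"name": "jiajieshi yagao", "desc": "youxiao fangzhu",
          "price": 25, "producer": "jiajieshi producer",
          "tags": ["fangzhu"]},
    "3": {"name": "zhonghua yagao", "desc": "caoben zhiwu",
          "price": 40, "producer": "zhonghua producer",
          "tags": ["qingxin"]},
}


def build_inverted_index(documents):
    """Return {field: {term: set of doc ids}} using whitespace tokenization."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, doc in documents.items():
        for field, value in doc.items():
            values = value if isinstance(value, list) else [value]
            for v in values:
                for term in str(v).lower().split():
                    index[field][term].add(doc_id)
    return index


index = build_inverted_index(docs)
print(sorted(index["name"]["yagao"]))    # ['1', '2', '3']
print(sorted(index["tags"]["fangzhu"]))  # ['1', '2']
```

Looking up a term is then a dictionary access rather than a scan over every document, which is what makes the fields searchable.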

(2) Query a product: retrieve a document

GET /index/type/id

GET /ecommerce/product/1

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "name": "gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": [
      "meibai",
      "fangzhu"
    ]
  }
}

(3) Modify a product: replace the document

PUT /ecommerce/product/1
{
    "name" : "jiaqiangban gaolujie yagao",
    "desc" : "gaoxiao meibai",
    "price" : 30,
    "producer" : "gaolujie producer",
    "tags": [ "meibai", "fangzhu" ]
}

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 2,
  "result": "updated",   # note: the result is now "updated"
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": false
}

Replacement has a drawback: the request must carry every field of the document, even the ones that are not being changed.

PUT /ecommerce/product/1
{
    "name" : "jiaqiangban gaolujie yagao"
}

# The product with id=1 now has only one field left

(4) Modify a product: update the document

A partial update changes individual fields, which suits small changes to the data:

POST /ecommerce/product/1/_update
{
  "doc": {
    "name": "jiaqiangban gaolujie yagao"
  }
}

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 3,   # note: the version keeps increasing
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}
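The difference between replacing a document and partially updating it can be sketched with plain dictionaries. The two functions below are hypothetical stand-ins for the behaviour described in (3) and (4), not ES client calls, and the merge is shallow for simplicity.

```python
stored = {
    "name": "gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": ["meibai", "fangzhu"],
}


def replace_document(current, body):
    """PUT-style replacement: the stored document becomes exactly the body."""
    return dict(body)


def partial_update(current, doc):
    """_update-style change: the given fields are merged into the document."""
    merged = dict(current)
    merged.update(doc)
    return merged


change = {"name": "jiaqiangban gaolujie yagao"}

replaced = replace_document(stored, change)
updated = partial_update(stored, change)

print(sorted(replaced))  # ['name']
print(sorted(updated))   # ['desc', 'name', 'price', 'producer', 'tags']
```

Sending only the changed field through replacement loses the other four fields, while the partial update keeps them.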

(5) Delete a product: delete the document

DELETE /ecommerce/product/1

{
  "found": true,
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 4,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}

GET /ecommerce/product/1

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "found": false
}
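The responses in this section show _version moving from 1 (created) to 2 and 3 (updated) and then 4 (deleted). The in-memory class below is a toy model of that bookkeeping, written only to make the sequence easy to follow; TinyDocStore and its methods are hypothetical and do not reflect how ES is implemented.

```python
class TinyDocStore:
    """Toy model: every write to a document id increments its version."""

    def __init__(self):
        self._docs = {}
        self._versions = {}

    def _bump(self, doc_id):
        self._versions[doc_id] = self._versions.get(doc_id, 0) + 1
        return self._versions[doc_id]

    def put(self, doc_id, body):
        created = doc_id not in self._docs
        self._docs[doc_id] = dict(body)
        return {"_id": doc_id, "_version": self._bump(doc_id),
                "result": "created" if created else "updated"}

    def update(self, doc_id, doc):
        self._docs[doc_id].update(doc)  # raises KeyError if the id is unknown
        return {"_id": doc_id, "_version": self._bump(doc_id),
                "result": "updated"}

    def delete(self, doc_id):
        if self._docs.pop(doc_id, None) is None:
            return {"_id": doc_id, "found": False}
        return {"_id": doc_id, "found": True,
                "_version": self._bump(doc_id), "result": "deleted"}

    def get(self, doc_id):
        if doc_id not in self._docs:
            return {"_id": doc_id, "found": False}
        return {"_id": doc_id, "_version": self._versions[doc_id],
                "found": True, "_source": dict(self._docs[doc_id])}


store = TinyDocStore()
r1 = store.put("1", {"name": "gaolujie yagao", "price": 30})
r2 = store.put("1", {"name": "jiaqiangban gaolujie yagao", "price": 30})
r3 = store.update("1", {"name": "jiaqiangban gaolujie yagao"})
r4 = store.delete("1")
r5 = store.get("1")
for r in (r1, r2, r3, r4, r5):
    print(r)
```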