Introduction
Elasticsearch is a search server built on Lucene. It provides a distributed, multitenant-capable full-text search engine exposed through a RESTful web interface. It is developed in Java and was originally released as open source under the Apache license. Elasticsearch is designed with cloud deployments in mind and aims to be stable, reliable, fast, and easy to install and use. Official clients are available for Java, .NET (C#), PHP, Python, Apache Groovy, Ruby, and many other languages. According to DB-Engines, Elasticsearch is the most popular enterprise search engine, followed by Apache Solr, which is also based on Lucene.
Core concepts of ES
Elasticsearch is a document-oriented database in which each piece of data is a document. For example:
{
"name" : "John",
"sex" : "Male",
"age" : 25,
"birthDate": "1990/05/01",
"about" : "I love to go rock climbing",
"interests": [ "sports", "music" ]
}
In MySQL, you would naturally store this data as a row in a User table with one column per field; in ES the same record is a document. The document belongs to a User type, and types are stored in an index. The following table compares the terminology of relational databases and ES:
Relational database | Elasticsearch |
---|---|
Database | Index |
Table | Type |
Row | Document |
Column | Field |
ES can contain multiple indexes (databases), each index can contain multiple types (tables), each type can contain multiple documents (rows), and each document can contain multiple fields (columns). Note that mapping types were deprecated in Elasticsearch 6.x and removed in later versions, where every document is stored under the single _doc type.
Physical design:
Behind the scenes, ES divides each index into multiple shards, and each shard can be moved between the servers in the cluster.
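As a rough illustration of how documents end up on different shards: Elasticsearch chooses a shard from a hash of a routing value (the document ID by default) relative to the number of primary shards. The sketch below only mimics that idea. `pick_shard` is a hypothetical helper, and `zlib.crc32` is a stand-in for the hash function Elasticsearch actually uses, so the shard numbers will not match a real cluster.

```python
import zlib


def pick_shard(routing: str, num_primary_shards: int) -> int:
    """Toy routing: hash the routing value and take it modulo the shard count."""
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    # Assumption: crc32 is used here only for illustration, not Elasticsearch's own hash.
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


if __name__ == "__main__":
    # Show which of 3 hypothetical shards each document ID would land on.
    for doc_id in ["1", "2", "3", "4"]:
        print(doc_id, "-> shard", pick_shard(doc_id, 3))
```

Because the result depends on the shard count, the same ID maps to a different shard if the count changes, which is one reason the number of primary shards is normally fixed when an index is created.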
Logical design:
An index contains types, and a type contains multiple documents. A document is located by following the path index -> type -> document ID (the ID is actually a string); this combination identifies one specific document.
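To make the index -> type -> document ID combination concrete, here is a minimal sketch that builds the REST path for a document. `document_path` is a hypothetical helper written for this article, not part of the Elasticsearch client, and the default type `_doc` is an assumption based on the `_type` value that appears in the insert response later in this article.

```python
from typing import Union
from urllib.parse import quote


def document_path(index: str, doc_id: Union[str, int], doc_type: str = "_doc") -> str:
    """Build the REST path that identifies a single document."""
    parts = [index, doc_type, str(doc_id)]
    # Percent-encode each segment so IDs containing "/" or spaces stay in one segment.
    return "/" + "/".join(quote(part, safe="") for part in parts)


print(document_path("news", 1))
```

The ID is converted to a string before it is used, which matches the remark above that the document ID is actually a string.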
Create Index
Create an index named news:
from elasticsearch import Elasticsearch
es = Elasticsearch()
result = es.indices.create(index='news', ignore=400)
print(result)
If the creation is successful, the following result is returned:
{
"acknowledged":true,
"shards_acknowledged":true,
"index":"news"
}
The result is in JSON format; an acknowledged value of true indicates that the index was created successfully.
If we run the code a second time, however, it returns something like this:
{
"error":{
"root_cause":[
{
"type":"resource_already_exists_exception",
"reason":"index [news/QM6yz2W8QE-bflKhc5oThw] already exists",
"index_uuid":"QM6yz2W8QE-bflKhc5oThw",
"index":"news"
}
],
"type":"resource_already_exists_exception",
"reason":"index [news/QM6yz2W8QE-bflKhc5oThw] already exists",
"index_uuid":"QM6yz2W8QE-bflKhc5oThw",
"index":"news"
},
"status":400
}
The status code is 400 because the index already exists.
Notice that our code passes ignore=400. This tells the client to ignore a response with status code 400: it returns the error body instead of raising an exception.
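Since ignore=400 hands the error body back as an ordinary result, the caller still has to tell the two outcomes apart. The following is a minimal sketch of such a check, assuming the client returns the parsed JSON bodies shown above as dictionaries. `index_created` is a hypothetical helper, and the sample dictionaries are abbreviated copies of the responses above.

```python
def index_created(response: dict) -> bool:
    """Return True if the index was newly created, False if it already existed."""
    if response.get("acknowledged"):
        return True
    error = response.get("error", {})
    if (response.get("status") == 400
            and error.get("type") == "resource_already_exists_exception"):
        return False
    # Anything else is unexpected, so surface it instead of hiding it.
    raise RuntimeError(f"unexpected response: {response!r}")


created = {"acknowledged": True, "shards_acknowledged": True, "index": "news"}
exists = {
    "error": {"type": "resource_already_exists_exception", "index": "news"},
    "status": 400,
}
print(index_created(created), index_created(exists))
```

Raising on any other response keeps the ignore setting from silently swallowing a different kind of 400 error, such as an invalid index name.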
If we don’t use ignore:
from elasticsearch import Elasticsearch
es = Elasticsearch()
result = es.indices.create(index='news')
print(result)
When executed again, an error is reported:
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.RequestError: TransportError(400, 'resource_already_exists_exception', 'index [news/QM6yz2W8QE-bflKhc5oThw] already exists')
An uncaught exception like this interrupts the program, so the ignore parameter is useful for status codes we expect and can tolerate, letting the program continue to run normally.
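Another way to avoid the exception is to check whether the index exists before creating it. The sketch below assumes a client object that offers `indices.exists()` and `indices.create()` with an `index` keyword, as the Python client does; `ensure_index` itself is a hypothetical helper name. The client is passed in as a parameter, so the function can be tried without a running cluster.

```python
def ensure_index(client, name: str) -> bool:
    """Create the index only if it is missing; return True when it was created."""
    if client.indices.exists(index=name):
        return False
    client.indices.create(index=name)
    return True


# Usage against a real cluster (requires the elasticsearch package and a running node):
# from elasticsearch import Elasticsearch
# ensure_index(Elasticsearch(), "news")
```

The check and the creation are two separate requests, so another process could create the index in between. Where that matters, passing ignore=400 to create() remains the simpler safeguard.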
Delete the Index
Deleting an index works in a similar way:
from elasticsearch import Elasticsearch
es = Elasticsearch()
result = es.indices.delete(index='news', ignore=[404])
print(result)
The ignore parameter is used here as well, this time with 404, so that trying to delete an index that does not exist does not raise an exception.
If the deletion is successful, the following information is displayed:
{
"acknowledged":true
}
Insert data
As with MongoDB, structured dictionary data can be inserted into Elasticsearch directly.
from elasticsearch import Elasticsearch
es = Elasticsearch()
es.indices.create(index='news', ignore=400)
data = {'title': 'is Iraq a mess ', 'url': 'http://view.news.qq.com/zt2011/usa_iraq/index.htm'}
result = es.create(index='news', id=1, body=data)
print(result)
When calling the create() method, the index parameter is the name of the index, body is the content of the document, and id is the unique identifier of the document.
The running results are as follows:
{
"_index":"news",
"_type":"_doc",
"_id":"1",
"_version":1,
"result":"created",
"_shards":{
"total":2,
"successful":1,
"failed":0
},
"_seq_no":0,
"_primary_term":1
}
In the response, a result field with the value created indicates that the data was inserted successfully.
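A program can make the same check on the returned dictionary. The sketch below assumes the response has the shape shown above; `insert_succeeded` is a hypothetical helper, and the sample dictionary is an abbreviated copy of that response.

```python
def insert_succeeded(response: dict) -> bool:
    """True when the document was newly created and no shard reported a failure."""
    shards = response.get("_shards", {})
    return response.get("result") == "created" and shards.get("failed", 0) == 0


response = {
    "_index": "news",
    "_id": "1",
    "_version": 1,
    "result": "created",
    "_shards": {"total": 2, "successful": 1, "failed": 0},
}
print(insert_succeeded(response))
```

Checking the failed count in addition to the result field is a design choice of this sketch; the result field alone is what indicates that the document was created.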