Configuring Elasticsearch 7 with Docker on Windows 10 and using Python 3 for full-text search

The original article is reprinted from Liu Yue’s Technology Blog https://v3u.cn/a_id_166

Document-based full-text search engines are nothing new. A previous article argued that RediSearch has replaced Elasticsearch, the old full-text search engine, but Elasticsearch has not gone away yet. Alibaba has also launched an Elasticsearch engine in its ECS service, so this time we will rely on Docker to feel the charm of Elasticsearch on a Win10 system.

First, install Docker. For details, please refer to the earlier article "Play with DockerToolbox under Win10 system and replace the domestic mirror source", which will not be repeated here.

Pull the Elasticsearch image. Here we use a version above 7.0, which has been optimized for performance and efficiency.

docker pull elasticsearch:7.2.0

Then run the ElasticSearch image

docker run --name es -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -d elasticsearch:7.2.0

Here es is the name we give the container. Clients interact with Elasticsearch over HTTP on port 9200, while the nodes in a cluster communicate with each other over the native transport protocol on port 9300. If port 9300 is not open, the nodes cannot form a cluster. For now we run in single-node mode, which is what discovery.type=single-node sets.

Once the container has started successfully, open http://localhost:9200 in your browser.
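The same check can be done from Python instead of the browser. The following is a minimal sketch using only the standard library; the helper names are made up for this example, and the sample body is a trimmed-down illustration of the JSON the root URL returns, not captured output.

```python
import json
from urllib.request import urlopen


def parse_root_response(body: str) -> dict:
    """Pick the cluster name and version number out of the JSON that
    Elasticsearch returns on its root URL."""
    info = json.loads(body)
    return {
        "cluster_name": info.get("cluster_name"),
        "version": info.get("version", {}).get("number"),
    }


def fetch_root_info(url: str = "http://localhost:9200", timeout: float = 5.0) -> dict:
    """GET the root URL and summarise the answer (needs the container running)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_root_response(response.read().decode("utf-8"))


# Illustrative body, shortened to the two fields used above.
sample = '{"cluster_name": "docker-cluster", "version": {"number": "7.2.0"}}'
print(parse_root_response(sample))
```

With the container running, calling fetch_root_info() should return the same two fields for the live node.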

Elasticsearch is configured through a YAML file. It is as simple as Django's settings or Flask's config: you just tell Elasticsearch what you want it to do at run time. On startup, Elasticsearch reads elasticsearch.yml and runs the service with the parameters you specified.

Now we need to copy the Elasticsearch configuration file out of the container, so that we can edit it and start the container with the configuration we specify.

docker cp <container id>:/usr/share/elasticsearch/config/elasticsearch.yml ./elasticsearch.yml

As usual, the first path is the location inside the container and the second is the location on the host. Here I copy the file to the current directory, but you can also specify an absolute path.

Next, edit elasticsearch.yml to allow cross-origin access to Elasticsearch. This is the first step in building a full-text search system for microservices.

cluster.name: "docker-cluster"
network.host: 0.0.0.0
http.cors.enabled: true
http.cors.allow-origin: "*"
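If you would rather script this edit than make it by hand, a small helper can append the two CORS keys to the copied file's text. This is only a sketch: the function name is made up for this example, and it treats the file as plain "key: value" lines rather than parsing YAML properly.

```python
CORS_SETTINGS = {
    "http.cors.enabled": "true",
    "http.cors.allow-origin": '"*"',
}


def with_cors(yml_text: str) -> str:
    """Return yml_text with the CORS keys appended, skipping keys already set."""
    lines = yml_text.rstrip("\n").splitlines()
    # Collect the keys that already appear as "key: value" lines.
    present = {line.split(":", 1)[0].strip() for line in lines if ":" in line}
    for key, value in CORS_SETTINGS.items():
        if key not in present:
            lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"


original = 'cluster.name: "docker-cluster"\nnetwork.host: 0.0.0.0\n'
print(with_cors(original))
```

Running the helper twice on the same text adds nothing the second time, so it is safe to re-run.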

Then stop the running ElasticSearch container and delete it.

docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)

This time we mount elasticsearch.yml into the container with the -v option, so that the Elasticsearch service runs with the configuration file we modified.

docker run --name es -v /es/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -d elasticsearch:7.2.0

One point needs attention here: on a Win10 host you must set up a separate shared folder. Here I named the shared folder es. On CentOS or Mac OS you can write the real physical path directly.

Go to the VirtualBox settings and create a new shared folder named es.

Then restart Docker and log in to the default virtual machine with the command: docker-machine ssh default

If you can see the shared folder you just set up at the root of the virtual machine, the setup is successful.

Elasticsearch's stored data can also be mounted with -v. If you do not mount the data directory, the data lives only inside the container and is lost when the container is deleted, so it is better to mount it on the host:

docker run --name es -v /es/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml -v /es/data:/usr/share/elasticsearch/data -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -d elasticsearch:7.2.0
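A command this long is easy to mistype, so it can help to assemble it from its parts. The sketch below builds the argument list in Python; the function name and its parameters are made up for this example, and the paths are the ones used in this article.

```python
def docker_run_args(name, image, config_path, data_path, ports=(9200, 9300)):
    """Assemble the argument list for `docker run` with both bind mounts."""
    es_home = "/usr/share/elasticsearch"
    args = ["docker", "run", "--name", name]
    # First mount: the configuration file; second mount: the data directory.
    args += ["-v", f"{config_path}:{es_home}/config/elasticsearch.yml"]
    args += ["-v", f"{data_path}:{es_home}/data"]
    for port in ports:
        args += ["-p", f"{port}:{port}"]
    args += ["-e", "discovery.type=single-node", "-d", image]
    return args


args = docker_run_args("es", "elasticsearch:7.2.0", "/es/elasticsearch.yml", "/es/data")
print(" ".join(args))
# To actually start the container: subprocess.run(args, check=True)
```

Passing the list to subprocess.run avoids shell quoting problems, which is why the environment variable needs no quotes here.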

After the container starts successfully again, we can interact with Elasticsearch from Python 3. First install the client library. Since the server is 7.2.0, pin the client to the 7.x series:

pip3 install "elasticsearch>=7.0.0,<8.0.0"

Create a new es_test.py test script

Create an Elasticsearch client instance

from elasticsearch import Elasticsearch

# Replace 127.0.0.1 with the IP of your Docker host if it differs
es = Elasticsearch(hosts=[{"host": "127.0.0.1", "port": 9200}])

Here host is the IP address of the Docker host. If you are using Windows 10, you can write 127.0.0.1.
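To avoid hard-coding the address in es_test.py, the hosts list can be built from environment variables. This is a sketch under assumptions: ES_HOST and ES_PORT are names invented for this example (neither Elasticsearch nor its Python client reads them on its own), and 192.168.99.100 is only an example address.

```python
import os


def es_hosts(env=None):
    """Build the `hosts` argument from ES_HOST / ES_PORT, with local defaults."""
    env = os.environ if env is None else env
    host = env.get("ES_HOST", "127.0.0.1")
    port = int(env.get("ES_PORT", "9200"))
    return [{"host": host, "port": port}]


print(es_hosts({}))
print(es_hosts({"ES_HOST": "192.168.99.100"}))
# In es_test.py: es = Elasticsearch(hosts=es_hosts())
```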

Create an Index. Here we create an Index called article

result = es.indices.create(index='article', ignore=400)  
print(result)

  
{'acknowledged': True, 'shards_acknowledged': True, 'index': 'article'}

An "acknowledged" value of True indicates that the creation succeeded. The ignore=400 argument tells the client not to raise an error if the index already exists.

Deleting an index is similar, as follows:

result = es.indices.delete(index='article', ignore=[400, 404])  
print(result)  
  
{'acknowledged':True}

To insert data, Elasticsearch lets you pass a structured dictionary directly. Call the index() method to insert it.

data = {'title': "I'm learning AI in Beijing", 'url': 'http://123.com'}
result = es.index(index='article', body=data)
print(result)

{'_index': 'article', '_type': '_doc', '_id': 'GyJgb3MBuQaE6wYOApTh', '_version': 1, 'result': 'created', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 5, '_primary_term': 1}

You can see that index() automatically generates a unique ID. You can also insert data with create(), but create() requires you to specify an ID yourself.
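Since create() needs an ID from the caller, one option is to derive it from the document itself so that the same document always gets the same ID. The sketch below hashes the url field; this scheme is an assumption of the example, not something the article or Elasticsearch prescribes, and doc_id_for is a made-up name.

```python
import hashlib


def doc_id_for(doc):
    """Derive a stable ID from the document's URL (SHA-1 hex digest)."""
    return hashlib.sha1(doc["url"].encode("utf-8")).hexdigest()


data = {"title": "I'm learning AI in Beijing", "url": "http://123.com"}
doc_id = doc_id_for(data)
# The digest is 40 hex characters and identical for identical URLs.
print(len(doc_id), doc_id == doc_id_for(dict(data)))
# With the client from es_test.py:
# es.create(index='article', id=doc_id, body=data)
```

With the 7.x client, calling create() a second time with an ID that already exists raises a conflict error rather than overwriting, so a stable ID also guards against accidental duplicates.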

Modifying data is also simple: specify the document ID and the new content, then call index() again:

result = es.index(index='article', body=data, id='GyJgb3MBuQaE6wYOApTh')
print(result)

{'_index': 'article', '_type': '_doc', '_id': 'GyJgb3MBuQaE6wYOApTh', '_version': 2, 'result': 'updated', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 6, '_primary_term': 1}
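The response dictionaries for inserting and updating have the same shape, so a small helper can pull out the fields that matter. This is a sketch; describe_write is a name made up for this example, and the sample response is the one printed above.

```python
def describe_write(response):
    """Summarise the dict returned by a write such as index()."""
    shards = response["_shards"]
    return {
        "id": response["_id"],
        "result": response["result"],    # e.g. 'created' or 'updated'
        "version": response["_version"],  # goes up by one on every write
        "no_shard_failures": shards["failed"] == 0,
    }


response = {
    "_index": "article", "_type": "_doc", "_id": "GyJgb3MBuQaE6wYOApTh",
    "_version": 2, "result": "updated",
    "_shards": {"total": 2, "successful": 1, "failed": 0},
    "_seq_no": 6, "_primary_term": 1,
}
print(describe_write(response))
```

Comparing 'result' and '_version' between calls is a quick way to confirm that a second index() call replaced the document instead of adding a new one.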

To delete data, you can call the delete() method, specifying the ID of the data to delete

result = es.delete(index='article', id='GyJgb3MBuQaE6wYOApTh')
print(result)

{'_index': 'article', '_type': '_doc', '_id': 'GyJgb3MBuQaE6wYOApTh', '_version': 3, 'result': 'deleted', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 7, '_primary_term': 1}

To query data, you can simply fetch all documents:

result = es.search(index='article')
print(result)

{'took': 1079, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 5, 'relation': 'eq'}, 'max_score': 1.0, 'hits': [
{'_index': 'article', '_type': 'blog', '_id': '1', '_score': 1.0, '_source': {'title': "I'm learning AI in Beijing", 'url': 'http://123.com', 'content': 'study in Beijing'}},
{'_index': 'article', '_type': 'blog', '_id': 'FyIdb3MBuQaE6wYO8JQR', '_score': 1.0, '_source': {'title': 'hello', 'content': 'hello 123'}},
{'_index': 'article', '_type': 'blog', '_id': 'GCIeb3MBuQaE6wYOnpSv', '_score': 1.0, '_source': {'title': 'hello', 'url': 'http://123.com', 'content': 'hello, 123'}},
{'_index': 'article', '_type': 'blog', '_id': 'GSJfb3MBuQaE6wYOu5RD', '_score': 1.0, '_source': {'title': 'hello', 'url': 'http://123.com', 'content': 'hello 123'}},
{'_index': 'article', '_type': 'blog', '_id': 'GiJfb3MBuQaE6wYO5pR4', '_score': 1.0, '_source': {'title': 'hello', 'url': 'http://123.com', 'content': 'hello 123'}}
]}}
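The response is deeply nested, and the documents themselves sit under hits.hits[]._source. A minimal sketch for pulling them out follows; the helper names are made up for this example, and the sample is a shortened version of the response above with two hits.

```python
def total_hits(result):
    """Number of matching documents (in 7.x, hits.total is a dict)."""
    return result["hits"]["total"]["value"]


def sources(result):
    """Return (id, _source) pairs from a search() response."""
    return [(hit["_id"], hit["_source"]) for hit in result["hits"]["hits"]]


result = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "max_score": 1.0,
        "hits": [
            {"_index": "article", "_id": "1", "_score": 1.0,
             "_source": {"title": "I'm learning AI in Beijing", "content": "study in Beijing"}},
            {"_index": "article", "_id": "FyIdb3MBuQaE6wYO8JQR", "_score": 1.0,
             "_source": {"title": "hello", "content": "hello 123"}},
        ],
    }
}

print(total_hits(result))
for doc_id, source in sources(result):
    print(doc_id, source["title"])
```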

You can also run a full-text search, which is where Elasticsearch really shines.

mapping = {'query': {'match': {'content': 'Beijing'}}}
result = es.search(index='article', body=mapping)
print(result)

{'took': 4, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 1, 'relation': 'eq'}, 'max_score': 4.075481, 'hits': [{'_index': 'article', '_type': 'blog', '_id': '1', '_score': 4.075481, '_source': {'title': "I'm learning AI in Beijing", 'url': 'http://123.com', 'content': 'study in Beijing'}}]}}

As you can see, the search runs a full-text match against the given field, and the results are sorted by their relevance to the search keywords. This is the prototype of a basic search engine.
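When the field and keywords vary, it is convenient to build the request body with a small function instead of writing the nested dictionary each time. This is a sketch: match_query is a name made up for this example, 'Beijing' is just a sample keyword, and size is the standard search parameter that caps how many hits are returned.

```python
def match_query(field, text, size=10):
    """Build the request body for a full-text `match` query."""
    return {"query": {"match": {field: text}}, "size": size}


body = match_query("content", "Beijing")
print(body)
# With the client from es_test.py:
# result = es.search(index='article', body=body)
```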

Beyond these basic operations, Elasticsearch supports many more complex queries. Refer to the version 7.2 documentation: https://www.elastic.co/guide/…

Conclusion: After playing with Elasticsearch, some people say it is really good and ask whether we can drop MySQL or Mongo and use it directly as a database. The answer is no. Elasticsearch has no transactions, its queries are only near real-time, writes are slow and only reads are fast, and it costs more than a database because it leans heavily on memory to improve performance. It is best used only as a search engine. If your business involves full-text retrieval, it is one of your preferred options.
