This is the 9th day of my participation in the August Wen Challenge.
ElasticSearch overview
ElasticSearch (ES for short) is an open-source, highly scalable, distributed full-text search engine that can store and retrieve data in near real time. It scales very well: a cluster can grow to hundreds of servers and handle petabytes of data. ES is written in Java and uses Lucene as its core for all indexing and search functions, but it aims to hide the complexity of Lucene behind a simple RESTful API to make full-text search easy.
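To give a feel for the RESTful API, here is a minimal sketch that builds (but does not send) a full-text search request using only the Python standard library. The address `http://localhost:9200`, the index name `blog`, and the field name `title` are assumptions made for illustration; `_search` and the `match` query are part of the ElasticSearch query API.

```python
import json
from urllib.request import Request


def build_search_request(base_url, index, field, text):
    """Build (without sending) a full-text search request for one index."""
    body = {"query": {"match": {field: text}}}
    return Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, index, and field names.
req = build_search_request("http://localhost:9200", "blog", "title", "elasticsearch")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it to a running node; the sketch stops before that step so it can be read without a server.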
ElasticSearch and Solr comparison
1. When indexes are built in real time, Solr suffers from IO blocking and its query performance drops; ElasticSearch has a significant advantage here.
2. As the amount of data grows, ElasticSearch's search efficiency doesn't change significantly.
3. ES is pretty easy to install; Solr is a bit more complicated to install.
4. Solr uses ZooKeeper for distributed management, while ElasticSearch comes with its own distributed coordination.
5. Solr supports more data formats, such as JSON, XML, and CSV, while ElasticSearch works with JSON.
6. ElasticSearch leaves some features to add-ons; for example, a friendly graphical interface requires Kibana.
7. Solr queries are fast, but index updates (inserts and deletes) are slow, which suits query-heavy applications such as e-commerce. ES builds indexes quickly (at some cost to plain query speed), so real-time queries are fast, which suits searches at sites such as Facebook and Sina. Solr is a powerful solution for traditional search applications, whereas ElasticSearch is better suited to newer real-time search applications.
8. Solr is mature and has a larger, more established community of users, developers, and contributors, whereas ElasticSearch has fewer developers and maintainers, releases updates very quickly, and has a higher learning cost.
Install ElasticSearch and Head
1. Download ElasticSearch from the official website, choosing the version for your platform. I downloaded the Windows version.
The directory structure
bin: startup scripts
config: configuration files
    jvm.options: Java VM configuration file
    elasticsearch.yml: ElasticSearch configuration file
lib: JAR packages
logs: log files
modules: functional modules
plugins: plugins, such as ik
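Run the startup script in bin (elasticsearch.bat on Windows) to start ElasticSearch.
Then access it in a browser; by default it listens on port 9200.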
Install the head plugin
Download it from GitHub: github.com/mobz/elasti…
Head is just a simple front-end page. Connecting it to ElasticSearch requires a change in elasticsearch.yml. Once connected, you can use it for simple operations.
**Understand ELK**
ELK is an acronym for ElasticSearch, Logstash, and Kibana. ElasticSearch is a distributed, RESTful search platform based on Lucene. Logstash is the central data-flow engine of ELK: it collects data in different formats from different sources (files, data stores, message queues), filters it, and sends it to different destinations. Kibana displays ElasticSearch data through friendly pages and provides real-time analysis.
**Install Kibana**
Download Kibana, decompress it, and run the .bat file directly. My interface is in Chinese because I set the Chinese locale in the configuration.
Core concepts of ES
What are clusters, nodes, indexes, types, documents, and mappings? An ElasticSearch cluster can contain multiple indexes, each index can contain multiple types (like tables), each type can contain multiple documents (like rows), and each document can contain multiple fields (like columns).
Physical design
ElasticSearch splits each index into shards behind the scenes, and each shard can be moved between different servers in the cluster.
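The idea can be sketched in a few lines of Python: each document is routed to one shard by hashing a routing key modulo the number of primary shards, and a separate table records which server currently holds each shard. This is a simplified illustration, not ElasticSearch's implementation: `zlib.crc32` stands in for the real hash function, and the node names and round-robin placement are assumptions.

```python
import zlib


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key (for example, a document ID) to a primary shard number."""
    # zlib.crc32 is only a stand-in; ElasticSearch uses its own hash function.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


NUM_SHARDS = 5
nodes = ["node-1", "node-2", "node-3"]  # hypothetical node names

# Hypothetical placement: spread the shards round-robin across the nodes.
shard_to_node = {shard: nodes[shard % len(nodes)] for shard in range(NUM_SHARDS)}

for doc_id in ["1", "2", "intro-to-es"]:
    shard = pick_shard(doc_id, NUM_SHARDS)
    print(f"document {doc_id!r} -> shard {shard} on {shard_to_node[shard]}")

# Moving a shard to another server only changes the placement table;
# the document-to-shard mapping stays the same.
shard_to_node[0] = "node-3"
```

Because the shard number depends only on the key and the shard count, a shard can be relocated without touching the documents inside it.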
Logic design
An index type contains multiple documents. We locate a document in this order: index, then type, then document ID. The ID doesn't have to be an integer; it can also be a string.
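As a rough sketch, the index, type, and ID lookup order corresponds to a REST path of the form `/index/type/id`. The helper name `document_path` and the names `blog` and `article` below are hypothetical, chosen only for illustration.

```python
from urllib.parse import quote


def document_path(index: str, doc_type: str, doc_id) -> str:
    """Build the path for one document: /<index>/<type>/<id>."""
    parts = (index, doc_type, doc_id)
    # str() lets the ID be an integer or a string; quote() escapes unsafe characters.
    return "/" + "/".join(quote(str(part), safe="") for part in parts)


print(document_path("blog", "article", 1))              # /blog/article/1
print(document_path("blog", "article", "intro-to-es"))  # /blog/article/intro-to-es
```

Note that this layout matches the versions described here; ElasticSearch 7 deprecated mapping types and later versions removed them.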
The document
ElasticSearch is document oriented, meaning that the smallest unit of indexing and searching is the document. Documents have several important attributes; notably, a document can be hierarchical, containing sub-documents within a single document, which is how complex logical entities are built. ElasticSearch maps this relationship automatically.
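A hierarchical document might look like the following sketch, written as a Python dictionary and serialized to JSON. All field names and values are made up for illustration.

```python
import json

# A hypothetical document: "address" is a sub-document nested inside it.
doc = {
    "name": "Alice",
    "age": 18,
    "address": {"city": "Beijing", "district": "Haidian"},
    "tags": ["python", "search"],
}

payload = json.dumps(doc)       # the JSON text that would be sent for indexing
restored = json.loads(payload)  # reading it back yields the same structure

print(payload)
print(restored["address"]["city"])  # Beijing
```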
type
Types are logical containers for documents, just as tables are containers for rows in a relational database. ElasticSearch can infer field types on its own: if our age field is 18, ElasticSearch will probably treat it as an integer. But the guess may not be what we want, so the safest approach is to define the mapping in advance, much like defining a schema in a relational database.
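The difference between guessing and declaring can be sketched as follows. `guess_field_type` is a hypothetical, heavily simplified stand-in for dynamic mapping (the real rules also cover dates, arrays, and more); the type names `long`, `float`, `boolean`, `text`, `object`, and `integer` are real ElasticSearch field types.

```python
def guess_field_type(value):
    """Very rough imitation of how a field type might be inferred from a JSON value."""
    # bool must be checked before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"unsupported value: {value!r}")


print(guess_field_type(18))    # long
print(guess_field_type("18"))  # text: the same age sent as a string is guessed differently

# Declaring the mapping in advance removes the guesswork (hypothetical fields).
explicit_mapping = {
    "properties": {
        "name": {"type": "text"},
        "age": {"type": "integer"},
    }
}
```

The second call shows why relying on inference is risky: the guess depends on how the first document happens to encode the value.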
The index
An index is a container for mapping types. An index in ElasticSearch is a very large collection of documents; it stores the fields and other settings of its mapping types and is then distributed across the nodes.
How do nodes and shards work?
A cluster has at least one node, and a node is an ElasticSearch process. A node can hold multiple indexes. By default (in versions before 7.0), a newly created index consists of 5 primary shards, and each primary shard has one replica. A shard is in fact a Lucene index: a directory of files containing an inverted index. The inverted index structure lets ElasticSearch tell you which documents contain a particular keyword without scanning every document.
case
Suppose we search blog posts by tag. With an inverted index that maps each tag to the IDs of the articles carrying it, finding articles tagged python is much faster than scanning all the raw data: just look up the tag and read off the relevant article IDs.
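A minimal sketch of such a tag index in Python, using made-up article IDs and tags. It only illustrates the lookup idea; Lucene's on-disk inverted index is far more elaborate.

```python
from collections import defaultdict

# Hypothetical raw data: article ID -> tags.
articles = {
    1: ["python"],
    2: ["python"],
    3: ["linux", "python"],
    4: ["linux"],
}


def build_inverted_index(docs):
    """Invert the mapping: tag -> set of article IDs that carry it."""
    index = defaultdict(set)
    for doc_id, tags in docs.items():
        for tag in tags:
            index[tag].add(doc_id)
    return index


index = build_inverted_index(articles)
print(sorted(index["python"]))  # [1, 2, 3]
print(sorted(index["linux"]))   # [3, 4]
```

The index is built once when articles are added; each search is then a single dictionary lookup instead of a pass over every article.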