I. IK word segmenter

  1. Download the IK word segmentation plug-in

    Download address

  2. Perform the installation

    To install the plug-in, go to the plugins directory under the ES installation directory and run the following command:

    [elsearch@localhost plugins]$ ../bin/elasticsearch-plugin install file:///usr/local/elasticsearch-7.10.2/elasticsearch-analysis-ik-7.10.2.zip

    After the installation is successful, the following information is displayed:

    -> Installing file:///usr/local/elasticsearch-7.10.2/elasticsearch-analysis-ik-7.10.2.zip
    -> Downloading file:///usr/local/elasticsearch-7.10.2/elasticsearch-analysis-ik-7.10.2.zip
    [=================================================] 100%
    @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    @     WARNING: plugin requires additional permissions     @
    @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    * java.net.SocketPermission * connect,resolve
    See http://docs.oracle.com/javase/8/docs/technotes/guides/security/permissions.html
    for descriptions of what these permissions allow and the associated risks.

    Continue with installation? [y/N]y
    -> Installed analysis-ik
  3. Restart the ElasticSearch service

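    A minimal restart sketch (the install path and run-as user are assumptions based on the directories used above; adjust to your environment):

    # find and stop the running Elasticsearch process
    [elsearch@localhost ~]$ ps -ef | grep elasticsearch
    [elsearch@localhost ~]$ kill <pid>
    # start it again in the background
    [elsearch@localhost ~]$ /usr/local/elasticsearch-7.10.2/bin/elasticsearch -d
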
  4. Test the IK word segmenter

    Standard analyzer:

    GET _analyze?pretty
    {
        "analyzer": "standard",
        "text": "I love my country"
    }

    Using IK intelligent word segmentation:

    GET _analyze?pretty
    {
        "analyzer": "ik_smart",
        "text": "I love my country"
    }

    IK maximum-granularity segmentation:

    GET _analyze?pretty
    {
        "analyzer": "ik_max_word",
        "text": "I love my country and my hometown."
    }
  5. Best practice for the IK word segmenter

    analyzer specifies the analyzer used when building the index, and search_analyzer specifies the analyzer used for the search keywords.

    In practice, ik_max_word is used when building the index to produce the finest-grained segmentation, while ik_smart is used at query time for intelligent segmentation; this combination matches results to the greatest extent.

    PUT /orders_test
    {
        "mappings": {
            "properties": {
                "goodsName": {
                    "type": "text",
                    "analyzer": "ik_max_word",
                    "search_analyzer": "ik_smart"
                }
            }
        }
    }

II. Full index construction

  1. Download Logstash

    Download address

  2. Install the logstash-input-jdbc plug-in

    Go to the logstash directory and execute:

    bin/logstash-plugin install logstash-input-jdbc
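
    Optionally, confirm that the plug-in is available:

    bin/logstash-plugin list | grep jdbc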
  3. Configure the mysql driver package

    [root@localhost bin]# mkdir mysql
    [root@localhost bin]# cp mysql-connector-java-5.1.34.jar /usr/local/logstash-7.10.2/bin/mysql/
  4. Configure the JDBC connection

    The index data is read from MySQL with a SELECT statement and imported into Elasticsearch through the logstash-input-jdbc configuration file.

    Create the jdbc.conf and jdbc.sql files in /usr/local/logstash-7.10.2/bin/mysql/.

    jdbc.conf file:

    input {
        stdin {
        }
        jdbc {
            # MySQL connection for the users database
            jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/users"
            # user name and password
            jdbc_user => "root"
            jdbc_password => "654321"
            # driver
            jdbc_driver_library => "/usr/local/logstash-7.10.2/bin/mysql/mysql-connector-java-5.1.34.jar"
            jdbc_driver_class => "com.mysql.jdbc.Driver"
            jdbc_paging_enabled => "true"
            jdbc_page_size => "50000"
            # SQL file path + name
            statement_filepath => "/usr/local/logstash-7.10.2/bin/mysql/jdbc.sql"
            # update schedule, every minute
            schedule => "* * * * *"
        }
    }
    output {
        elasticsearch {
            # ES host
            hosts => ["10.10.20.28:9200"]
            # index name
            index => "users"
            document_type => "_doc"
            # document id, taken from the id column
            document_id => "%{id}"
        }
        stdout {
            # output in JSON format
            codec => json_lines
        }
    }
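
    Before running the sync, the configuration file can optionally be checked for syntax errors; the --config.test_and_exit flag only validates the config and then exits:

    ./logstash -f mysql/jdbc.conf --config.test_and_exit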

    jdbc.sql file:

    select `id`, `institutionTypeId`, `menuCode`, `menuName`, `menuUri`, `menuLevel`, `componentSrc` from t_authority_menu
  5. Create ES index

    PUT /users?include_type_name=false
    {
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 0
        },
        "mappings": {
            "properties": {
                "id": {
                    "type": "integer"
                },
                "institutionTypeId": {
                    "type": "text",
                    "analyzer": "whitespace",
                    "fielddata": true
                },
                "menuCode": {
                    "type": "text"
                },
                "menuName": {
                    "type": "text",
                    "analyzer": "ik_max_word",
                    "search_analyzer": "ik_smart"
                },
                "menuUri": {
                    "type": "text"
                },
                "menuLevel": {
                    "type": "integer"
                },
                "componentSrc": {
                    "type": "text"
                }
            }
        }
    }
  6. Perform full synchronization

    Execute command:

    ./logstash -f mysql/jdbc.conf

    Check the results:

    GET /users/_search

III. Incremental index synchronization

  1. Modify the jdbc.conf configuration file:

    Add the following configuration:

    input {
        jdbc {
            # set the time zone
            jdbc_default_timezone => "Asia/Shanghai"
            # file that records the incremental synchronization position
            last_run_metadata_path => "/usr/local/logstash-7.10.2/bin/mysql/last_value"
        }
    }
  2. Modify the jdbc.sql configuration file

    Here is an incremental synchronization based on the last update time:

    select `id`, `institutionTypeId`, `menuCode`, `menuName`, `menuUri`, `menuLevel`, `componentSrc` from t_authority_menu where last_update_time > :sql_last_value
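
    At run time, Logstash replaces :sql_last_value with the value recorded in the last_run_metadata_path file, so the statement actually sent to MySQL looks roughly like this (illustrative, using the initial time set in the next step):

    select `id`, `institutionTypeId`, `menuCode`, `menuName`, `menuUri`, `menuLevel`, `componentSrc` from t_authority_menu where last_update_time > '2020-01-01 00:00:00'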
  3. Create the file that stores the last synchronization time

    vi /usr/local/logstash-7.10.2/bin/mysql/last_value

    Set an initial time:

    2020-01-01 00:00:00
  4. Validation

    1) Start Logstash; it loads the corresponding data starting from the initial time.

    2) When a row's update time changes, Logstash automatically detects it and synchronizes the incremental data to ES, as sketched below.
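
    A minimal way to verify this (a sketch; the id, menu name, and update-time column follow the examples above):

    -- run in MySQL: touch a row so its update time changes
    update t_authority_menu set `menuName` = 'updated menu', `last_update_time` = now() where `id` = 1;

    After the next scheduled run (every minute), search ES for the change:

    GET /users/_search
    {
        "query": {
            "match": {
                "menuName": "updated menu"
            }
        }
    }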


This article was created and shared by Mirson. For further communication, please join QQ group 19310171 or visit www.softart.cn