Preface

This article is an introductory tutorial on setting up and using the ELK log system.

Introduction to ELK

ELK stands for Elasticsearch, Logstash, and Kibana, all of which are open source software. The stack is commonly extended with Filebeat, a lightweight log collection and processing tool (agent). Filebeat consumes few resources and is well suited to collecting logs on individual servers and shipping them to Logstash; it is also the officially recommended tool.

  • Elasticsearch is an open source distributed search engine that collects, analyzes, and stores data. Its features include: distributed operation, zero configuration, automatic discovery, automatic index sharding, an index replica mechanism, a RESTful interface, multiple data sources, and automatic search load balancing.

  • Logstash is a tool for collecting, analyzing, and filtering logs, and it supports a large number of data acquisition methods. The client is installed on the hosts whose logs need to be collected; the server filters and modifies the received node logs and forwards them to Elasticsearch.

  • Kibana is also an open source and free tool. It provides a friendly web interface for analyzing the logs handled by Logstash and Elasticsearch, helping you aggregate, analyze, and search important log data.

  • Filebeat is a lightweight log collector that integrates easily with Kibana. After Filebeat starts, you can view the details of the log collection process directly in Kibana.

Introduction to Elasticsearch

What is Elasticsearch?

Elasticsearch is a JSON-based distributed search and analytics engine. It can be accessed through RESTful web service interfaces and stores data as schema-less JSON (JavaScript Object Notation) documents. It is built on the Java programming language, which lets Elasticsearch run on many different platforms and enables users to search very large amounts of data very quickly.

What can Elasticsearch do?

  • Distributed real-time file storage where each field is indexed and searchable
  • Distributed real-time analysis search engine
  • Scalable to hundreds of servers, processing petabytes of structured or unstructured data
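
To make the first two points concrete, a document can be indexed over HTTP and searched immediately afterwards. A minimal sketch using curl (the index name test-index is a hypothetical example, and 192.168.0.1:9200 is the node address used later in this article; substitute your own):

curl -X PUT "http://192.168.0.1:9200/test-index/_doc/1" -H 'Content-Type: application/json' -d '{"message": "hello elk"}'
curl "http://192.168.0.1:9200/test-index/_search?q=message:hello&pretty"

The second command should return the freshly indexed document, illustrating the real-time storage and search described above.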

What is Lucene?

Apache Lucene organizes all the information written to its indexes into an inverted index, a data structure that maps terms to the documents that contain them. It works differently from a traditional relational database in that an inverted index is largely term-oriented rather than document-oriented. A Lucene index also stores a lot of other information, such as term vectors.

Each Lucene index is composed of multiple segments. Each segment is written only once but queried many times; once created, a segment is never modified. Multiple segments are merged during segment merging, at times determined by Lucene's internal mechanisms. Merging reduces the number of segments, but the remaining segments become correspondingly larger. Segment merging is very I/O intensive, and it also cleans up information that is no longer used.

In Lucene, the process of turning data into an inverted index, and of turning whole strings into searchable terms, is called analysis. Text analysis is performed by an Analyzer, which is built from a Tokenizer, Filters, and a Character Mapper, each with a clearly defined role.
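
Elasticsearch exposes this analysis chain through its _analyze API, which is a convenient way to watch a string being turned into searchable terms. A minimal sketch against the built-in standard analyzer (node address as configured later in this article):

curl -X POST "http://192.168.0.1:9200/_analyze?pretty" -H 'Content-Type: application/json' -d '{"analyzer": "standard", "text": "The Quick Brown Foxes"}'

The response lists the individual lowercased terms produced by the tokenizer and filters.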

For more background on Elasticsearch, see my post: www.cnblogs.com/xuwujing/p/…

Introduction to Logstash

Logstash is a data stream engine:

It is an open source streaming ETL engine for data logistics. It can build a data flow pipeline in minutes, is horizontally scalable and resilient with adaptive buffering, is agnostic about data sources, offers a plugin ecosystem with over 200 integrations and processors, and uses the Elastic Stack to monitor and manage deployments.

Logstash consists of three main parts: inputs, filters, and outputs. Inputs define the rules for receiving data, for example collecting file contents; filters process the data in transit, for example filtering fields with grok rules; outputs send the processed data to a destination such as Elasticsearch.
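
The three parts can be tried directly on the command line with the -e flag, which accepts an inline pipeline. A minimal sketch (run from the Logstash installation directory set up later in this article) that reads lines from stdin, adds a field in the filter stage, and prints structured events to stdout:

./bin/logstash -e 'input { stdin { } } filter { mutate { add_field => { "note" => "hello" } } } output { stdout { codec => rubydebug } }'

Type any line and Logstash prints the resulting event, including the added note field.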

Figure:

Introduction to Kibana

Kibana is an open source data analysis and visualization platform designed to work with Elasticsearch as part of the Elastic Stack. You can use Kibana to search, view, and interact with data in Elasticsearch indexes, and easily analyze and present data in a variety of ways using charts, tables, and maps.

Kibana makes big data easy to understand. Its simple, browser-based interface makes it easy to quickly create and share dynamic dashboards that track real-time changes in Elasticsearch data.

Introduction to Filebeat

Filebeat is a lightweight log collector written in Go and a member of the Elastic Stack. In essence it is an agent that is installed on each node, reads logs from the configured locations, and reports them to the configured destination.

Filebeat is highly reliable and guarantees that logs are delivered at least once. It also handles the practical problems of log collection, such as resuming after an interruption, file renames, and log truncation.

Filebeat does not depend on Elasticsearch and can run on its own. It has built-in output components for Kafka, Elasticsearch, Redis, and so on; for debugging purposes it can also output to the console or a file. We can use an existing output component to ship logs, or write a custom output component to forward logs wherever we want.

Filebeat is part of the elastic/beats family, which also includes Heartbeat and Packetbeat. All of these Beats are built on the libbeat framework.

Filebeat consists of two main components: Harvester and Prospector.

The Harvester's main responsibility is to read the contents of a single file and send them to the output. One Harvester is started per file; it is responsible for opening and closing the file, which means the file descriptor stays open while it runs. If a file is deleted or renamed while it is being read, Filebeat continues to read it.

The Prospector's main responsibility is to manage the Harvesters and find all the file sources to read from. If the input type is log, the Prospector looks for all files whose paths match and starts a Harvester for each one. Each Prospector runs in its own Go coroutine (goroutine).

Note: Filebeat Prospector can only read local files and has no ability to connect to a remote host to read stored files or logs.

Figure:

ELK log system installation

Environment preparation

The Tsinghua University and Huawei open source mirror sites are recommended for downloading ELK.

Download addresses:

mirrors.huaweicloud.com/logstash
mirrors.tuna.tsinghua.edu.cn/ELK

ELK 7.3.2 Baidu netdisk address: pan.baidu.com/s/1tq3Czywj… Extraction code: CXNG

Elasticsearch cluster installation

Installing an Elasticsearch cluster requires a JDK. The Elasticsearch version used in this article is 7.3.2, and the corresponding JDK version is 12.

1. File preparation

Unzip the downloaded Elasticsearch archive.

Input:

tar -xvf elasticsearch-7.3.2-linux-x86_64.tar.gz

Then move it to the /opt/elk folder (create the folder if it does not exist) and rename it masternode.

Enter:

mv elasticsearch-7.3.2-linux-x86_64 /opt/elk
cd /opt/elk
mv elasticsearch-7.3.2-linux-x86_64 masternode

2. Modify the configuration

Create an elastic user and grant it ownership of the folder where Elasticsearch resides. The commands are as follows:

adduser elastic

chown -R elastic:elastic /opt/elk/masternode

Elasticsearch's data and logs will be stored under the /home directory, so create the data and log folders for Elasticsearch there. Because each machine runs both a master node and a data node, create a separate folder for each node. These folders should be created as the elastic user we just created, so switch to the elastic user first and then create the folders.

su elastic

mkdir /home/elk
mkdir /home/elk/masternode
mkdir /home/elk/masternode/data
mkdir /home/elk/masternode/logs
mkdir /home/elk/datanode
mkdir /home/elk/datanode/data
mkdir /home/elk/datanode/logs
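
Equivalently, the whole tree can be created in one line with mkdir -p and brace expansion:

mkdir -p /home/elk/{masternode,datanode}/{data,logs}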

After the masternode directories are created, we first modify the masternode configuration. Once that is done, we copy the masternode directory to one named datanode and then only change the datanode-specific settings. The files to modify are elasticsearch.yml and jvm.options. Note that this is still done as the elastic user!

cd /opt/elk/

vim masternode/config/elasticsearch.yml
vim masternode/config/jvm.options

Masternode configuration

Note that the Elasticsearch 7.x configuration differs from 6.x, mainly in the master election settings.

The masternode elasticsearch.yml configuration is as follows:

cluster.name: pancm-cluster
node.name: master-1
network.host: 192.168.0.1
path.data: /home/elk/masternode/data
path.logs: /home/elk/masternode/logs
discovery.seed_hosts: ["192.168.0.1:9300","192.168.0.2:9300","192.168.0.3:9300"]
cluster.initial_master_nodes: ["192.168.0.1:9300","192.168.0.2:9300","192.168.0.3:9300"]
node.master: true
node.data: false
transport.tcp.port: 9301
http.port: 9201
network.tcp.keep_alive: true
network.tcp.no_delay: true
transport.tcp.compress: true
cluster.routing.allocation.cluster_concurrent_rebalance: 16
cluster.routing.allocation.node_concurrent_recoveries: 16
cluster.routing.allocation.node_initial_primaries_recoveries: 16

Description of the elasticsearch.yml parameters:

  • cluster.name: the cluster name. All nodes in the same cluster must use the same value. ES automatically discovers other ES nodes on the same network segment; if multiple clusters share a segment, this attribute distinguishes them.
  • node.name: the node name. A recommended naming format is node role plus the last part of the IP address.
  • path.data: the path where data is stored.
  • path.logs: the path where logs are stored.
  • network.host: the IP address to bind, IPv4 or IPv6. The default is 0.0.0.0.
  • transport.tcp.port: the TCP port for inter-node communication. The default is 9300.
  • http.port: the HTTP port for external services. The default is 9200.
  • node.master: whether this node is eligible to be elected master. The default is true.
  • node.data: whether this node stores index data. The default is true.
  • …

jvm.options: set Xms and Xmx to 4g:

-Xms4g
-Xmx4g

Datanode configuration

Copy the masternode directory and rename the copy datanode, then change the datanode-specific settings (the items highlighted in the masternode example above). The commands are as follows:

cd /opt/elk/
cp -r masternode/ datanode
vim datanode/config/elasticsearch.yml
vim datanode/config/jvm.options

The datanode elasticsearch.yml configuration is as follows:

cluster.name: pancm-cluster
node.name: data-1
network.host: 192.168.0.1
path.data: /home/elk/datanode/data
path.logs: /home/elk/datanode/logs
discovery.seed_hosts: ["192.168.0.1:9300","192.168.0.2:9300","192.168.0.3:9300"]
cluster.initial_master_nodes: ["192.168.0.1:9300","192.168.0.2:9300","192.168.0.3:9300"]
node.master: false
node.data: true
transport.tcp.port: 9300
http.port: 9200
network.tcp.keep_alive: true
network.tcp.no_delay: true
transport.tcp.compress: true
cluster.routing.allocation.cluster_concurrent_rebalance: 16
cluster.routing.allocation.node_concurrent_recoveries: 16
cluster.routing.allocation.node_initial_primaries_recoveries: 16

For the JVM heap: if the machine has less than 64 GB of memory, set the heap to half of it; otherwise set it to 32 GB. For example, with 16 GB of memory the JVM heap can be set to 8 GB, and with 128 GB it should be set to 32 GB.

jvm.options: set Xms and Xmx to 8g:

-Xms8g
-Xmx8g

Note: After the configuration is complete, run the ll command to check whether the masternode and datanode folders belong to the elastic user. If not, run chown -R elastic:elastic + path to grant ownership.

After the above configuration is complete, repeat the same steps on the other machines, or transfer the files with an FTP tool, or use the scp command to copy them remotely and then adjust the per-machine settings. scp command examples:

JDK environment transfer:

scp -r /opt/java root@slave1:/opt
scp -r /opt/java root@slave2:/opt

ElasticSearch environment transport:

scp -r /opt/elk root@slave1:/opt
scp -r /home/elk root@slave1:/home
scp -r /opt/elk root@slave2:/opt
scp -r /home/elk root@slave2:/home

After the transfer is complete, modify node.name and network.host in the configuration on the other servers.

Starting Elasticsearch

After installing and configuring the Elasticsearch cluster, start both nodes on every machine as the elastic user! In the /opt/elk directory, enter:

su elastic

cd /opt/elk
./masternode/bin/elasticsearch -d
./datanode/bin/elasticsearch -d

After the startup succeeds, you can check with the jps command, or open IP:9200 or IP:9201 in a browser. If the following page is displayed, the startup was successful!
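
The cluster can also be verified from the command line with two standard endpoints (substitute your own node address):

curl "http://192.168.0.1:9200/_cluster/health?pretty"
curl "http://192.168.0.1:9200/_cat/nodes?v"

A green or yellow status indicates the cluster is up, and _cat/nodes should list every node that has joined.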

For the Elasticsearch 6.x version, see this article: www.cnblogs.com/xuwujing/p/…

Kibana installation

Note: Kibana only needs to be installed on one server!

1. File preparation

Unzip the kibana-7.3.2-linux-x86_64.tar.gz archive. On Linux, enter:

tar -xvf kibana-7.3.2-linux-x86_64.tar.gz

Then move it to /opt/elk and rename the folder kibana-7.3.2. Enter:

mv kibana-7.3.2-linux-x86_64 /opt/elk
cd /opt/elk
mv kibana-7.3.2-linux-x86_64 kibana-7.3.2

2. Modify the configuration

Go to the folder and modify the kibana.yml configuration file:

cd /opt/elk/kibana-7.3.2
vim config/kibana.yml

Add the following configuration to the configuration file:

server.port: 5601
server.host: "192.168.0.1"
elasticsearch.hosts: ["http://192.168.0.1:9200"]
elasticsearch.requestTimeout: 180000

3. Starting Kibana

Starting Kibana 7.x differs slightly from 6.x.

Starting as the root user

In the folder above kibana-7.3.2 (here /opt/elk), enter:

nohup ./kibana-7.3.2/bin/kibana --allow-root >/dev/null 2>&1 &

Starting as a non-root user

First change the ownership of the Kibana folder to that user. Example:

chown -R elastic:elastic /opt/elk/kibana-7.3.2

Start command for non-root users:

nohup ./kibana-7.3.2/bin/kibana >/dev/null 2>&1 &

After successful startup, open http://IP:5601 in a browser.

Figure:
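
Kibana also provides a status endpoint that can be checked from the command line; a quick sketch using the address configured above:

curl "http://192.168.0.1:5601/api/status"

It returns a JSON document describing the state of Kibana and its connection to Elasticsearch.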

Logstash installation

Note: Logstash only needs to be installed on one server!

1. File preparation

Decompress the logstash-7.3.2.tar.gz archive. On Linux, enter:

tar -xvf logstash-7.3.2.tar.gz

Then move it to /opt/elk; the extracted folder is already named logstash-7.3.2. Enter:

mv logstash-7.3.2 /opt/elk

2. Modify the configuration

Go to the folder and create the logstash-filebeat.conf configuration file:

cd /opt/elk/logstash-7.3.2
touch logstash-filebeat.conf
vim logstash-filebeat.conf

Add the following configuration to the file, modifying the filter part to match your log format:

input {
  beats {
    type => "java"
    port => "5044"
  }
}

filter {
  grok {
    match => { "message" => "|%{DATA:log_time}|%{DATA:thread}|%{DATA:log_level}|%{DATA:class_name}|-%{GREEDYDATA:content}" }
  }
  ruby {
    code => "event.set('timestamp', event.get('@timestamp').time.localtime + 8*60*60)"
  }
  ruby {
    code => "event.set('@timestamp', event.get('timestamp'))"
  }
  mutate {
    remove_field => ["timestamp"]
  }
  date {
    match => [ "log_timestamp", "yyyy-MM-dd-HH:mm:ss" ]
    locale => "cn"
  }
  mutate {
    rename => { "host" => "host.name" }
  }
}

output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["192.168.0.1:9200"]
    index => "mylogs-%{+YYYY.MM.dd}"
  }
}

  • port: the port on which Logstash receives Filebeat data. Make sure Filebeat is configured to send to this port.

  • hosts: the Elasticsearch address and port. Cluster addresses can be separated by commas (,).

  • index: the name of the index to write to. %{+YYYY.MM.dd} means a new index is created every day.

3. Starting Logstash

Start as the root user. In the Logstash folder, enter:

nohup ./bin/logstash -f logstash-filebeat.conf >/dev/null 2>&1 &

Or start with automatic hot reloading of the configuration file:

nohup ./bin/logstash -f logstash-filebeat.conf --config.reload.automatic >/dev/null 2>&1 &
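
Before starting, you can also validate the configuration file without launching the pipeline, using Logstash's built-in config test flag:

./bin/logstash -f logstash-filebeat.conf --config.test_and_exit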

Figure:

Filebeat installation

1. File preparation

Decompress the filebeat-7.3.2-linux-x86_64.tar.gz archive. On Linux, enter:

tar -xvf filebeat-7.3.2-linux-x86_64.tar.gz

Then move it to /opt/elk and rename the folder filebeat-7.3.2. Enter:

mv filebeat-7.3.2-linux-x86_64 /opt/elk
cd /opt/elk
mv filebeat-7.3.2-linux-x86_64 filebeat-7.3.2
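
The startup commands below reference a filebeat_logstash.yml file whose contents this article does not show. A minimal sketch of what such a file might contain (the log path /var/log/myapp/*.log is a hypothetical placeholder; the hosts entry points at the Logstash Beats port 5044 configured in logstash-filebeat.conf above):

filebeat.inputs:
- type: log
  enabled: true
  # hypothetical application log path; change to the files you want to collect
  paths:
    - /var/log/myapp/*.log
output.logstash:
  # Logstash host and Beats port from logstash-filebeat.conf
  hosts: ["192.168.0.1:5044"]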

To test the configuration, enter the following in the filebeat folder as root:

./filebeat test config -c filebeat_test.yml

Start command:

./filebeat -e -c filebeat_logstash.yml

Background startup command:

nohup ./filebeat -c filebeat_logstash.yml >/dev/null 2>&1 &

If Filebeat is started in the background, you can view its logs in the logs directory under the Filebeat installation directory.
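
Once logs are flowing, you can confirm in Elasticsearch that the daily indexes defined by the Logstash output are being created:

curl "http://192.168.0.1:9200/_cat/indices/mylogs-*?v"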

Other

ElasticSearch Combat Series

  • Kibana for ElasticSearch
  • ElasticSearch DSL statements
  • ElasticSearch JAVA API
  • ElasticSearch
  • Metric Aggregations for ElasticSearch
  • Logstash quick start
  • Logstash with ElasticSearch
  • Filebeat with ElasticSearch


Creating original content is not easy. If you found this article helpful, please give it a recommendation! Your support is the greatest motivation for my writing! Copyright: www.cnblogs.com/xuwujing CSDN: blog.csdn.net/qazwsxpcm Juejin: juejin.cn/user/365003… Personal blog: www.panchengming.com