Installation
Download
Currently, Logstash is split into two packages: core and community contributions. You can download source or binary versions of both packages from www.elasticsearch.org/overview/el…
- Source code
wget https://download.elasticsearch.org/logstash/logstash/logstash-1.4.2.tar.gz
wget https://download.elasticsearch.org/logstash/logstash/logstash-contrib-1.4.2.tar.gz
- Debian platform
wget https://download.elasticsearch.org/logstash/logstash/packages/debian/logstash_1.4.2-1-2c0f5a1_all.deb
wget https://download.elasticsearch.org/logstash/logstash/packages/debian/logstash-contrib_1.4.2-1-efd53ef_all.deb
- Redhat platform
wget https://download.elasticsearch.org/logstash/logstash/packages/centos/logstash-1.4.2-1_2c0f5a1.noarch.rpm
wget https://download.elasticsearch.org/logstash/logstash/packages/centos/logstash-contrib-1.4.2-1_efd53ef.noarch.rpm
Install
For these packages, you would probably prefer to install Logstash with rpm, dpkg, and similar tools, since the developers have declared dependencies in the packages. For example, logstash-1.4.2-1_2c0f5a1.noarch depends on a JRE package.
In addition, the package contains some useful scripts such as /etc/init.d/logstash.
If you have to run Logstash on some older operating system, you’ll have to deploy it with the source code bundle. Remember to install Java yourself:
yum install openjdk-jre
export JAVA_HOME=/usr/java
tar zxvf logstash-1.4.2.tar.gz
Best practices
But the real advice is: if you can, use the Elasticsearch official repository to install Logstash directly!
Debian platform
wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | apt-key add -
cat >> /etc/apt/sources.list <<EOF
deb http://packages.elasticsearch.org/logstash/1.4/debian stable main
EOF
apt-get update
apt-get install logstash
Redhat platform
rpm --import http://packages.elasticsearch.org/GPG-KEY-elasticsearch
cat > /etc/yum.repos.d/logstash.repo <<EOF
[logstash-1.4]
name=logstash repository for 1.4.x packages
baseurl=http://packages.elasticsearch.org/logstash/1.4/centos
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1
EOF
yum clean all
yum install logstash
Hello World
In the terminal, run the following command to start a Logstash process:
bin/logstash -e 'input{stdin{}}output{stdout{codec=>rubydebug}}'
The terminal then waits for input.
Type helloword and the result is displayed:
{
"@version" => "1",
"host" => "izwz99gyct1a1rh6iblyucz",
"@timestamp" => 2018-11-22T08:15:46.454Z,
"message" => "helloword"
}
Explanation
Every system administrator has written plenty of commands like this: cat randdata | awk '{print $2}' | sort | uniq -c | tee sortdata. The pipe operator | is one of the greatest inventions of the Linux world (the other being "everything is a file").
Logstash works just like such a pipeline!
You input data (like cat), filter and process it (like awk or uniq), and finally output it somewhere else (like tee).
Of course, Logstash actually does this with separate threads. If you run the top command and press H, you should see output like this:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
21401 root      16   0 1249m 303m  10m S 18.6  0.2 866:25.46 |worker
21467 root      15   0 1249m 303m  10m S  3.7  0.2 129:25.59 >elasticsearch.
21468 root      15   0 1249m 303m  10m S  3.7  0.2 128:53.39 >elasticsearch.
21400 root      15   0 1249m 303m  10m S  2.7  0.2 108:35.80 <file
21403 root      15   0 1249m 303m  10m S  1.3  0.2  49:31.89 >output
21470 root      15   0 1249m 303m  10m S  1.0  0.2  56:24.24 >elasticsearch.
Logstash adds some extra information to each event. The most important is @timestamp, which marks when the event occurred. Because this field is used throughout Logstash's internal flow, it must be a Joda timestamp object; if you try to rename a plain string field to @timestamp yourself, Logstash will report an error. So please use the filters/date plugin to manage this special field (a small sketch follows the list below).

In addition, you will usually see a few other fields:

- host marks where the event occurred.
- type marks the unique type of the event.
- tags mark some attribute of the event. This is an array; one event can have multiple tags.
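For illustration, here is a minimal filters/date sketch that sets @timestamp from the event's own time string; the logdate field name is just an assumption about your data:

filter {
    date {
        # "logdate" is a hypothetical field holding the event's original time string
        match => [ "logdate", "ISO8601" ]
    }
}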
Long-term running
1. Standard Service mode
This method is recommended for readers using the RPM and DEB packages. These packages ship with SysV- or systemd-style startup/configuration scripts that you can use directly.
Take the RPM as an example: the /etc/init.d/logstash script loads the /etc/init.d/functions library file and uses its daemon function to run the logstash process as a background program.
So just put your own configuration files in the /etc/logstash/ directory (note that every configuration file in this directory should end with .conf and no other text files should be present, because the logstash agent reads the whole folder when it starts), then run service logstash start.
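A minimal sketch of that workflow; the configuration file name here is just an example:

cp nginx-access.conf /etc/logstash/    # only *.conf files may live in this directory
service logstash start                 # start via the bundled init script
service logstash status                # most init scripts also support a status action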
2. The most basic nohup method
This is the simplest approach, and also a classic question that often confuses Linux newcomers:
First create a .conf configuration file in the Logstash directory, for example:
input {
    stdin {}
}
output {
    elasticsearch {
        hosts => '172.18.118.222'
    }
    stdout {
        codec => rubydebug
    }
}
(OOM problem)
command                        # run in the foreground; output goes to the terminal
command > /dev/null            # discard stdout; stderr is still shown
command > /dev/null 2>&1       # discard both stdout and stderr
command &                      # run in the background of the current session
command > /dev/null &          # background, stdout discarded
command > /dev/null 2>&1 &     # background, all output discarded
command &> /dev/null           # bash shorthand for redirecting both stdout and stderr
nohup command &> /dev/null     # nohup ignores SIGHUP, so the command survives closing the terminal (add a trailing & to also background it)
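Applying the last form to Logstash, a sketch of a long-running instance (assuming the configuration above is saved as logstash.conf):

nohup bin/logstash -f logstash.conf &> /dev/null &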
3. A more elegant way: screen
screen is a somewhat advanced Linux skill. A command run inside an environment created by the screen command has screen, rather than the sshd login session, as its parent process. That way the process does not disappear when you log out, and you can take over the terminal again at any time to continue working.
Create a standalone screen session as follows:
screen -dmS elkscreen_1
Attach to the newly created elkscreen_1 session to take over the connection:
screen -r elkscreen_1
You will then see exactly the same terminal as before. After starting Logstash there, do not press Ctrl+C; instead press Ctrl+A then D to detach from the environment. To take it over again, run screen -r elkscreen_1.
If you have created multiple screen sessions, list them with:
screen -list
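Putting it together, a typical screen workflow for Logstash might look like this sketch (session name and file name are just examples):

screen -dmS elkscreen_1            # create a detached session
screen -r elkscreen_1              # attach to it
bin/logstash -f logstash.conf      # start Logstash inside the session
# press Ctrl+A then D to detach while Logstash keeps running
screen -r elkscreen_1              # reattach later at any time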
4. The most recommended method: daemontools
Neither nohup nor screen is particularly easy to manage, and we want to manage the whole ELK cluster as uniformly as possible. So, for the many applications that need to run in the background long term, a daemontools-style supervisor is recommended.
Daemontools is itself the name of a piece of software, but its configuration is somewhat involved. Here I use the word to stand for a whole class of tools, including but not limited to supervisord written in Python, god written in Ruby, and similar tools written in Perl.
- Take supervisord as an example. Because it has been around for a long time, it can be installed directly from the EPEL repository:
yum -y install supervisord --enablerepo=epel
- Then add content to the /etc/supervisord.conf configuration file to define the programs you want to start:
[program:logstash]
environment=LS_HEAP_SIZE=128m
directory=/usr/local/software/logstash
command=/usr/local/software/logstash/bin/logstash -f /usr/local/software/logstash/logstash.conf --pluginpath /usr/local/software/logstash/plugins/ -w 10 -l /var/log/logstash/pro1.log
[program:elkpro_2]
environment=LS_HEAP_SIZE=128m
directory=/usr/local/software/logstash
command=/usr/local/software/logstash/bin/logstash -f /etc/logstash/pro2.conf --pluginpath /opt/logstash/plugins/ -w 10 -l /var/log/logstash/pro2.log
Then start the supervisord service:
sudo /bin/systemctl start supervisord.service
Check whether the service has started:
systemctl status supervisord.service
Logstash will then run as a child process of supervisord, and you can use the supervisorctl command to start or stop any single process among a series of logstash child processes:
supervisorctl stop elkpro_2
Common supervisorctl commands:
- supervisorctl status: view the status of all supervised processes
- supervisorctl stop <program>: stop a specific program
- supervisorctl start <program>: start a specific program
- supervisorctl restart <program>: restart a specific program
- supervisorctl update: reload the configuration after the configuration files have been modified
- supervisorctl reload: restart all programs defined in the configuration
5. Using Docker
docker pull docker.elastic.co/logstash/logstash:6.5.1
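The official image reads its pipeline configuration from /usr/share/logstash/pipeline/, so a sketch of running it with your own configuration (the host path here is just an example) could be:

docker run --rm -it \
  -v /usr/local/software/logstash/logstash.conf:/usr/share/logstash/pipeline/logstash.conf \
  docker.elastic.co/logstash/logstash:6.5.1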
Syntax
Logstash defines its own configuration DSL, a bit like Puppet's DSL (perhaps because both are written in Ruby). It has sections, comments, data types (booleans, strings, numbers, arrays, hashes), conditionals, field references, and so on.
Section (section)
Logstash uses {} to define sections. A section can contain plugin definitions, and you can define multiple plugins within one section. Inside a plugin section you can define key-value settings. For example:
input {
stdin {}
syslog {}
}
Logstash supports a small number of data value types:
- bool
debug => true
- string
host => "hostname"
- number
port => 514
- array
match => ["datetime", "UNIX", "ISO8601"]
- hash
options => {
key1 => "value1",
key2 => "value2"
}
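Putting these types together, a single plugin with several settings might look like the following sketch (the values are just examples):

input {
    syslog {
        port => 514                  # number
        type => "syslog"             # string
        tags => ["udp", "syslog"]    # array
    }
}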
Field Reference
The field is an attribute of the Logstash::Event object. We mentioned earlier that the event is like a hash, so you can imagine that the field is like a key-value pair.
Tip: We call it field because that’s what Elasticsearch calls it.
If you want to use the value of a field in your Logstash configuration, just write the field name in brackets []. This is called a field reference.
For nested fields (that is, a multi-level hash), just write each level's field name in its own pair of brackets []. For example, you can get the longitude value from geoip like this (yes, it's a slightly silly example; there is actually a separate field for it):
[geoip][location][0]
Tip: The Logstash array also supports reverse subscripts, so [geoip][location][-1] gets the value of the last element of the array.
Logstash also supports variable interpolation, using field references in strings like this:
"the longitude is %{[geoip][location][0]}"
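Field references work the same way inside plugin settings. For example, a sketch of writing each event to a per-host file with the file output (the path is just an example):

output {
    file {
        # %{host} is a field reference; %{+yyyy-MM-dd} is Logstash's date-format interpolation
        path => "/var/log/remote/%{host}/%{+yyyy-MM-dd}.log"
    }
}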
Conditionals
Logstash has supported conditional judgments and expressions since version 1.3.0.
Expressions support the following operators:
- equality: ==, !=, <, >, <=, >=
- regexp: =~, !~
- inclusion: in, not in
- boolean: and, or, nand, xor
- unary: !()
In general, you’ll use field references in expressions. Such as:
if "_grokparsefailure" not in [tags] { } else if [status] ! ~ /^2\d\d/ and [url] == "/noc.gif" { } else { }Copy the code
Command line arguments
Logstash provides a bin/logstash shell script for running it conveniently. It supports the following parameters:
- -e
Meaning execute. We already used this parameter in "Hello World". In fact, you can run bin/logstash -e '' without writing any specific configuration at all and achieve the same effect. The default value of this parameter is:
input {
stdin { }
}
output {
stdout { }
}
- --config or -f
Meaning config file. In real use we write very long configurations, possibly even exceeding the 1024 characters a shell can handle, so we save the configuration into a file and run it as bin/logstash -f agent.conf.
In addition, Logstash provides a small feature that makes it easier to organize configuration: you can run bin/logstash -f /etc/logstash.d/ directly, and Logstash will automatically read all the text files under /etc/logstash.d/ and concatenate them into one big configuration file in its own memory.
- --configtest or -t
Meaning test. Use this to check whether the configuration files Logstash reads can be parsed correctly. The Logstash configuration syntax is defined with grammar.treetop. If you use the directory-reading approach mentioned above, you should definitely test in advance.
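For example, to check a whole configuration directory before starting (assuming the /etc/logstash.d/ layout mentioned above):

bin/logstash -t -f /etc/logstash.d/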
- --log or -l
Meaning log. Logstash writes its logs to standard error by default. In production you can save them to a file instead, for example with bin/logstash -l logs/logstash.
- --filterworkers or -w
Meaning worker thread. Logstash runs multiple threads. You can force the logstash to run 5 threads for the filter plugin with bin/logstash -w 5.
Note: Logstash does not currently support multithreading for input plug-ins. The output plug-in multithreading needs to be set inside the configuration, and this command line parameter is only used to set the filter plug-in!
Tip: Logstash does not currently support monitoring or management of filter threads. If a filterworker dies, Logstash is left in a zombie state with no filters, typically showing a NoMethodError: undefined method '*' for nil:NilClass error. Handle this carefully and add checks in advance.
- --pluginpath or -p
You can write your own plugins and load them with bin/logstash --pluginpath /path/to/own/plugins.
- --verbose
Output debug logs.
Tip: If you are using a Logstash version lower than 1.3.0, you can only use bin/logstash -v instead.
- --debug
Output more debug logs.
Tip: If you are using a Logstash version lower than 1.3.0, you can only use bin/logstash -vv instead.