1. Background

When Logstash reads data from an external source, by default every field value is read as a string. Sometimes we need to change a field's type, for example from string to integer, or delete a field, rename a field, give a field a default value, and so on. The mutate filter can be used for all of these operations.
2. Requirements

1. Read data from a file whose contents are in CSV format, i.e. comma-delimited by default.
2. Perform operations on the fields that are read: delete a field, modify a field's value, change a field's type, assign a default value, merge fields, and so on.
3. Implementation steps

1. Install the csv codec plug-in

Note ⚠️: the commands below are run from the Logstash installation directory.

cd /Users/huan/soft/elastic-stack/logstash/logstash
# install the logstash-codec-csv plug-in
bin/logstash-plugin install logstash-codec-csv
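Optionally, to confirm the codec is available after installation, the installed plug-ins can be listed; piping through grep is just one convenient way to narrow the output:

bin/logstash-plugin list --verbose | grep csv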
2. Prepare the file data to be read
| user_real_name | user_english_name | age | address | education | strip_blank | language | default_value | create_time |
|---|---|---|---|---|---|---|---|---|
| Zhang San | zhangSan | 20 | Hubei province;Luotian | Education – Bachelor | remove leading and trailing spaces | java | default value | 20210512 08:47:03 |
| Li Si | lisi | 18 | Hubei province;Huanggang | Education – Junior college | remove leading space | c | | 20210512 03:12:20 |
3. Write a pipeline to read and output data
input {
  file {
    id => "mutate-id"
    path => ["/Users/huan/soft/elastic-stack/logstash/logstash/pipeline.conf/filter-mutate/mutate.csv"]
    start_position => "beginning"
    sincedb_path => "/Users/huan/soft/elastic-stack/logstash/logstash/pipeline.conf/filter-mutate/sincedb.db"
    codec => csv {
      columns => ["user_real_name","user_english_name","age","address","education","strip_blank","language","default_value","create_time"]
      charset => "UTF-8"
      separator => ","
      skip_empty_columns => false
      convert => {
        "age" => "integer"
      }
    }
  }
}

output {
  stdout {
    codec => rubydebug { }
  }
}
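To try the pipeline, Logstash can be started with this configuration; the file name mutate.conf below is only an assumed name for wherever the configuration above is saved:

bin/logstash -f /Users/huan/soft/elastic-stack/logstash/logstash/pipeline.conf/filter-mutate/mutate.conf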
CSV codec plug-in options:

- columns: defines the list of column names for the parsed CSV; these also become the field names of the resulting event
- charset: the character encoding of the data
- separator: the delimiter used to split each line of data. CSV files are usually separated by a comma, a TAB, etc.; the default is a comma
- skip_empty_columns: whether to skip a column when its value is empty
  - true: skip it
  - false: do not skip it
- convert: data type conversion. By default every value is read as a string; here the value of the age field is converted to integer
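With this configuration and the first sample row, the event printed by rubydebug would be expected to look roughly like the sketch below (only a few CSV columns are shown; Logstash also adds metadata fields such as @timestamp, host and path). The point to notice is that age comes out as an integer while the other columns remain strings:

{
    "user_real_name" => "Zhang San",
               "age" => 20,
           "address" => "Hubei province;Luotian",
    ...
}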
4. Using the mutate plug-in

Preconditions:
1. Unless stated otherwise, the test data used below is the file data prepared in implementation step 2 above.

Note:
1. Both update and replace update the value of a field, but if the field does not exist, update has no effect, while replace adds it as a new field.
2. copy copies the value of a source field to a target field; if the target field already exists, its value is overwritten, otherwise a new field is added.
1. coerce: set a default value for a field

If a field exists but its value is null, we can use coerce to set a default value for it.

1. Configuration file

filter {
  mutate {
    coerce => {
      "default_value" => "default value"
    }
  }
}
2. Execution results
2. rename: rename a field

1. Configuration file

filter {
  mutate {
    rename => {
      "user_real_name" => "[user][real_name]"
      "user_english_name" => "[user][english_name]"
      "Age" => "age"
    }
  }
}
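With the sample data, the two user_* fields would be expected to end up nested under a user object, roughly like this in the rubydebug output:

"user" => {
        "real_name" => "Zhang San",
     "english_name" => "zhangSan"
}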
2. Execution results
3. update: update the value of a field

1. Configuration file

filter {
  mutate {
    update => {
      "user_address" => "User's address is: %{address}"
    }
  }
}
2. Execution results
3. Explanation

update updates the value of a field, but the field must already exist; otherwise the operation has no effect. Since the sample data has no user_address field, this configuration changes nothing.
4. replace: update the value of a field

1. Configuration file

filter {
  mutate {
    replace => {
      "user_address" => "User's address is: %{address}"
    }
  }
}

Unlike update, replace adds user_address as a new field if it does not already exist.
2. Execution results
5. convert: convert the data type

1. Data types that can be converted:

integer, integer_eu, float, float_eu, string, boolean

2. Configuration file

filter {
  mutate {
    # convert the age field to a string
    convert => {
      "age" => "string"
    }
  }
}
3. Execution results
6. gsub: replace the contents of a field

1. Configuration file

filter {
  mutate {
    # replace ";" with "--" in the address field
    gsub => ["address", ";", "--"]
  }
}
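With the sample data, the address field would be expected to change roughly as follows:

# before:  "address" => "Hubei province;Luotian"
# after:   "address" => "Hubei province--Luotian"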
2. Execution results
7. uppercase / capitalize / lowercase: change the case of a field

1. Configuration file

filter {
  mutate {
    uppercase  => ["user_english_name"]
    # capitalize => ["user_english_name"]
    # lowercase  => ["user_english_name"]
  }
}

Note that when several of these options appear in the same mutate block, they execute in a fixed order (uppercase, then capitalize, then lowercase), so the order needs to be taken into account; see the sketch after the execution results below.
2. Execution results
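As a hedged illustration of the priority note above: within a single mutate block the case options always run in the fixed order uppercase, capitalize, lowercase, regardless of how they are written. To force a different order, they can be placed in separate mutate blocks, which execute top to bottom, for example:

filter {
  # first block: convert the whole value to upper case -> "ZHANGSAN"
  mutate {
    uppercase => ["user_english_name"]
  }
  # second block: then capitalize it -> "Zhangsan"
  mutate {
    capitalize => ["user_english_name"]
  }
}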
8. strip: remove leading and trailing spaces

1. Configuration file

filter {
  mutate {
    # remove leading and trailing spaces from the strip_blank field
    strip => ["strip_blank"]
  }
}
2. Execution results
9. remove_field: remove a field

1. Configuration file

filter {
  mutate {
    # remove the user_real_name field and the dynamically named field foo_%{username} (e.g. foo_zhangsan)
    remove_field => ["user_real_name", "foo_%{username}"]
  }
}
2. Execution results
10. split: split a field

1. Configuration file

filter {
  mutate {
    # split the address field into an array on ";"
    split => {
      "address" => ";"
    }
  }
}
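With the sample data, the address field would be expected to become an array, roughly:

"address" => [
    [0] "Hubei province",
    [1] "Luotian"
]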
2. Execution results
11. join: join a field

1. Configuration file

filter {
  mutate {
    # 1. first split the address field into an array on ";"
    split => {
      "address" => ";"
    }
    # 2. then join the array elements back together with "***"
    join => {
      "address" => "***"
    }
  }
}

The field is first split into an array, and then join is applied; within a single mutate block split always runs before join, which is why this works.
2. Execution results
12. merge: merge two fields

1. Combinations that can be merged:

- `array` + `string` will work
- `string` + `string` will result in a 2-entry array in `dest_field`
- `array` and `hash` will not work

2. Configuration file

filter {
  mutate {
    # merge the value of user_english_name into user_real_name
    merge => {
      "user_real_name" => "user_english_name"
    }
  }
}
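Since both fields are strings here, the destination field would be expected to end up as a two-entry array, roughly:

"user_real_name" => [
    [0] "Zhang San",
    [1] "zhangSan"
]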
3. Execution results
13. copy: copy a field

1. Configuration file

filter {
  mutate {
    # copy the value of user_real_name into user_name;
    # if user_name already exists, its value is overwritten, otherwise a new field is added
    copy => {
      "user_real_name" => "user_name"
    }
  }
}
2. Execution results
4. Priority of mutate
1. Order in which the mutate options are executed

coerce, rename, update, replace, convert, gsub, uppercase, capitalize, lowercase, strip, remove, split, join, merge, copy

Within a single mutate block the options always run in this order, regardless of how they are written: coerce executes first and copy executes last.
2. Priority of multiple mutate blocks
filter {
  # mutate block 1 executes first
  mutate {
  }
  # mutate block 2 executes second
  mutate {
  }
}
Note ⚠️:

Separate mutate blocks are executed in the order in which they appear in the configuration file. For example, if we want to copy the age field and also convert its data type, we can use two mutate blocks, as above, to control the order in which this happens.
1. Configuration file

filter {
  # block 1: copy the age field first (the target field name age_copy is illustrative)
  mutate {
    copy => {
      "age" => "age_copy"
    }
  }
  # block 2: then convert the age field to a string
  mutate {
    convert => {
      "age" => "string"
    }
  }
}

Placing copy and convert in the same mutate block gives a different result, because within a single block convert always runs before copy.
2. Execution results
5. Reference documents

1. https://www.elastic.co/guide/en/logstash/7.12/working-with-plugins.html
2. https://www.elastic.co/guide/en/logstash/current/plugins-codecs-csv.html
3. https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html