
Preface

ELK stands for Elasticsearch, Logstash, and Kibana, but the company I work for is not a big one. If Logstash and Kibana are both deployed on the same server, an ordinary server cannot handle the load; from past experience, once the data volume grows, Logstash alone wants at least 8 GB of memory. So Logstash is dropped here, and Elasticsearch's built-in parsing (the ingest pipeline) is used instead, which is enough for general business requirements.

Specific steps

Configure the Elasticsearch ingest pipeline

See the official ingest documentation; the most important part of it is the processors. The pipeline definition looks like this:

```json
{
    // Description
    "description": "java spring boot common parse",
    "processors": [{
        // Grok processor
        "grok": {
            // The field to be parsed
            "field": "message",
            // Parse patterns (same as in a logstash configuration)
            "patterns": [
                "%{TIMESTAMP_ISO8601:time} %{LOGLEVEL:level} %{NUMBER:pid} --- \\[%{MYSELF:thread}\\] %{JAVACLASS:class} %{MYSELF:message}",
                "%{MYSELF:message}"
            ],
            "pattern_definitions": {
                "MYSELF": "[\\s\\S]*"
            }
        }
    }]
}
```

You can test it by sending a request to POST /_ingest/pipeline/_simulate:

```json
{
    "pipeline": {
        "description": "java spring boot common parse",
        "processors": [{
            "grok": {
                "field": "message",
                "patterns": [
                    "%{TIMESTAMP_ISO8601:time} %{LOGLEVEL:level} %{NUMBER:pid} --- \\[%{MYSELF:thread}\\] %{JAVACLASS:class} %{MYSELF:message}",
                    "%{MYSELF:message}"
                ],
                "pattern_definitions": {
                    "MYSELF": "[\\s\\S]*"
                }
            }
        }]
    },
    "docs": [{
        "_source": {
            "message": "2019-04-25 11:12:31.740 INFO 4013 --- [-8081-exec-1247] c.e.i.framework.aspect.ValidationAspect : accept the request and return ====>{\"controllerName\":\"com.eco.device.controller.MessageController\"}"
        }
    }]
}
```

The response:

```json
{
    "docs": [{
        "doc": {
            "_index": "_index",
            "_type": "_type",
            "_id": "_id",
            "_source": {
                "level": "INFO",
                "pid": "4013",
                "time": "2019-04-25 11:12:31.740",
                "thread": "-8081-exec-1247",
                "message": ": accept the request and return ====>{\"controllerName\":\"com.eco.device.controller.MessageController\"}",
                "class": "c.e.i.framework.aspect.ValidationAspect"
            },
            "_ingest": {
                "timestamp": "2020-03-21T08:01:38.605Z"
            }
        }
    }]
}
```

If the test is successful, register the pipeline through the ingest API with PUT /_ingest/pipeline/java_log, where java_log is a custom name:

```json
{
    // Description
    "description": "java spring boot common parse",
    "processors": [{
        // Grok processor
        "grok": {
            // The field to be parsed
            "field": "message",
            // Parse patterns (same as in a logstash configuration)
            "patterns": [
                "%{TIMESTAMP_ISO8601:time} %{LOGLEVEL:level} %{NUMBER:pid} --- \\[%{MYSELF:thread}\\] %{JAVACLASS:class} %{MYSELF:message}",
                "%{MYSELF:message}"
            ],
            "pattern_definitions": {
                "MYSELF": "[\\s\\S]*"
            }
        }
    }]
}
```

Configure Filebeat

```yaml
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["172.17.0.107:9092"]
  index: "callback-%{[fields.project]}-%{+yyyy.MM.dd}"
  # The pipeline name that was just configured
  pipeline: java_log
```
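The %{[fields.project]} part of the index name relies on a custom field attached to the Filebeat input. A sketch of where that value could come from; the log path and the project value are placeholders, and the exact input syntax depends on your Filebeat version:

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.log        # placeholder path
    fields:
      # Referenced by the index setting above as %{[fields.project]}
      project: demo-service       # placeholder value
```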

Then start Filebeat and test.
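A quick way to verify, assuming the same hypothetical localhost:9200 address, is to check that the pipeline is registered and that parsed documents are arriving with the time/level/pid/class fields:

```sh
# Confirm the pipeline exists
curl -X GET "http://localhost:9200/_ingest/pipeline/java_log?pretty"

# Look at one document in the callback-* indices and check the parsed fields
curl -X GET "http://localhost:9200/callback-*/_search?size=1&pretty"
```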

Delete useless fields

One thing I dislike in actual use is that there are far too many fields in Kibana. I could not stand it any longer, so this time the remove processor is used to improve the pipeline further.

```json
{
    "description": "java spring boot common parse",
    "processors": [{
            "grok": {
                "field": "message",
                "patterns": [
                    "%{TIMESTAMP_ISO8601:time} %{LOGLEVEL:level} %{NUMBER:pid} --- \\[%{MYSELF:thread}\\] %{JAVACLASS:class} %{MYSELF:message}",
                    "%{MYSELF:message}"
                ],
                "pattern_definitions": {
                    "MYSELF": "[\\s\\S]*"
                }
            }
        },
        {
            "remove": {
                "field": [
                    "beat.version", "host.architecture", "host.containerized", "offset",
                    "host.id", "host.os.family", "host.os.codename", "host.os.name",
                    "host.os.platform", "host.os.version", "input.type",
                    "meta.cloud.region", "meta.cloud.provider", "meta.cloud.instance_id",
                    "meta.cloud.availability_zone"
                ],
                // Ignore failures (for example, a field that does not exist)
                "ignore_failure": true
            }
        }
    ]
}
```

Note that I have tried both ignore_failure and on_failure, and the former works a little better. Tests show that if the remove list contains a field that does not actually exist and on_failure is configured, the log document is lost. With ignore_failure, the fields after the one that raised the exception are still left in place, but the log is not lost.
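For comparison, a minimal sketch of what an on_failure handler on the remove processor can look like; this is not necessarily the exact configuration from the tests above, and the remove_error field name is a hypothetical choice for illustration:

```json
{
    "remove": {
        "field": ["some.field.that.may.not.exist"],
        // Runs when the remove fails; a set processor records the failure message on the document
        "on_failure": [{
            "set": {
                "field": "remove_error",
                "value": "{{ _ingest.on_failure_message }}"
            }
        }]
    }
}
```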