Tech blog: github.com/yongxinz/te…
Also, feel free to follow my WeChat official account AlwaysBeta, where more great content is waiting for you.
The previous article explained how to log elegantly in Django. This article discusses how to manage and view logs.
Speaking of viewing logs, is something this simple really worth an article? The file is right there; just open it with vim and you're done. That said, sometimes that approach simply doesn't cut it.
With a single server, you can view the local file directly, and with a few Linux commands you can usually locate the problem quickly. In reality, though, most of our services are deployed across multiple servers. When something fails, which server is at fault? It is hard to tell, so you end up logging in to each server one by one to check the logs, which is inefficient.
So there needs to be a centralized place to manage logs, one that aggregates them from all the servers. When a failure occurs, we go to the centralized logging platform, quickly locate the problem, and know exactly which server it came from. Why not?
This article is aimed at solving this problem.
How do you solve it? That part is easy to answer, because there is already a very mature log analysis stack called ELK, which has been applied successfully at all kinds of Internet companies, with plenty of material about it online.
Since my company already has a log analysis platform in place, all I have to do is ship the log content to it.
Instead of Logstash, I used the lighter Filebeat, which is also easier to configure.
Filebeat log source configuration:
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log
  # Change to true to enable this input configuration.
  enabled: true
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /log/error.log
  # These three lines parse JSON log lines into key-value pairs; without them,
  # the entire JSON content would end up in a single message field.
  json.keys_under_root: true
  json.add_error_key: true
  json.overwrite_keys: true
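To make the json.* options concrete: assuming the Django logger writes one JSON object per line to /log/error.log, a made-up line such as

{"levelname": "ERROR", "asctime": "2020-04-16 12:00:00", "message": "order service timeout"}

is shipped with levelname, asctime and message as top-level fields of the event, instead of the whole string being stuffed into a single message field.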
Filebeat sends to Elasticsearch:
#==================== Elasticsearch template setting ==========================
setup.template.name: "weblog"
setup.template.pattern: "weblog_*"
setup.template.overwrite: false
setup.template.enabled: true
setup.template.settings:
  index.number_of_shards: 1

#-------------------------- Elasticsearch output ------------------------------
output.elasticsearch:
  hosts: ["127.0.0.1:9200"]
  # Build the index by month.
  index: "weblog_%{+YYYY.MM}"
  # Protocol - either `http` (default) or `https`.
  # protocol: "https"
  # Authentication credentials - either API key or username/password.
  # api_key: "id:api_key"
  username: "elastic"
  password: "changeme"
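With this setting, events collected in April 2020, for example, end up in an index named weblog_2020.04, which matches the weblog_* pattern declared in the template section above. Once Filebeat starts shipping, the new index should show up in Kibana's index management, or via Elasticsearch's _cat/indices API.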
I ran into a problem with the output to Elasticsearch, and it took quite a while to fix. The error looked like this:
(status=404): {"type":"type_missing_exception","reason":"type[doc] missing","index_uuid":"j9yKwou6QDqwEdhn4ZfYmQ","index":"secops-seclog_2020.04.16","caused_by":{"type":"illegal_state_exception","reason":"trying to auto create mapping, but dynamic mapping is disabled"}}
Most of the solutions online say to configure document_type, but the Filebeat version I use is 5.6, where this parameter has been removed, so I had no choice but to find another way.
In the end, just as I was about to give up, I fixed the problem by setting the Elasticsearch template type to doc rather than a custom value.
And I have noticed a strange phenomenon: the problem always gets solved right when I am about to give up, which is exactly why it pays to persist.
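For anyone who hits the same error, it may be worth inspecting what the installed template actually defines before changing anything on the Filebeat side. Since setup.template manages a legacy index template here, a GET request to /_template/weblog on the cluster shows the mapping type it declares, which is roughly what the type[doc] missing message is complaining about.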
Once the data is in Elasticsearch, you can query it in Kibana, but sending it there directly is not the most common setup. A more typical architecture is to send the data to Kafka first, consume it with consumer applications, and then store it in Elasticsearch or other storage components.
Filebeat sends to Kafka:
output.kafka:
  hosts: ["kafka1:9092"]
  topic: 'web-log'
  username: 'XXX'
  password: 'XXX'
  partition.round_robin:
    reachable_only: false
  required_acks: 1
  compression: gzip
  max_message_bytes: 1000000
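To illustrate the consumer side of that architecture, here is a rough Python sketch of my own, not part of the original setup: it reads the web-log topic and writes each event into a monthly index, mirroring the weblog_%{+YYYY.MM} pattern used above. The client libraries, consumer group id and index naming are assumptions for the example, and the SASL credentials from the Kafka config are omitted.

# A minimal consumer sketch, not a production setup. It assumes the
# kafka-python and elasticsearch packages, the broker and topic from the
# config above, and monthly indices named like the Filebeat template.
import json
from datetime import datetime

from kafka import KafkaConsumer          # pip install kafka-python
from elasticsearch import Elasticsearch  # pip install elasticsearch

consumer = KafkaConsumer(
    "web-log",
    bootstrap_servers=["kafka1:9092"],
    group_id="web-log-es-writer",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)
es = Elasticsearch(["http://127.0.0.1:9200"])

for message in consumer:
    event = message.value  # the JSON document Filebeat produced
    index = "weblog_{}".format(datetime.utcnow().strftime("%Y.%m"))
    # elasticsearch-py 8.x uses document=; older clients use body= instead.
    es.index(index=index, document=event)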
Remove unneeded fields:
Filebeat adds some fields when sending logs. If you don’t want these fields, you can filter them by using the following configuration.
#================================ Processors =====================================
# Configure processors to enhance or manipulate events generated by the beat.
processors:
  - drop_fields:
      fields: ["agent", "ecs", "host", "input", "log"]
  # - add_host_metadata: ~
  # - add_cloud_metadata: ~
  # - add_docker_metadata: ~
  # - add_kubernetes_metadata: ~
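For context, these are metadata fields that Filebeat attaches to every event, typically carrying things like the Beat version, host name, input type, and the source file path and offset. If all you care about are the JSON fields your application logs itself, dropping them keeps the indexed documents noticeably smaller.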
That is all of the Filebeat configuration. If you want to run the full set of ELK components in production, you will most likely still need to rely on your company's big data infrastructure, and building, deploying, testing and tuning it yourself is a long process.
But if you just want to set up a test environment to play with, it is fairly simple: follow the corresponding documentation on the official website and the configuration is straightforward. With a decent Internet connection it should not take long. Have fun.
That's all.
Reference Documents:
www.elastic.co/guide/index…