Log-pilot collects K8s logs
At this point our ELK stack has been deployed. Next we deploy log-pilot, which collects container logs and pushes them to Logstash; Logstash processes the logs and sends them on to Elasticsearch.
When a container starts with the appropriate label configured, log-pilot resolves that label and fetches the container's logs.
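For reference, the Logstash side of this chain is just a pipeline that listens on the Beats port and writes to Elasticsearch. Below is a minimal sketch, assuming Logstash listens on 5044 (the port the DaemonSet below targets); the Elasticsearch host and index pattern are placeholders, not values taken from this deployment.

input {
  beats {
    port => 5044
  }
}
filter {
  # parsing / enrichment (grok, json, mutate ...) would go here
}
output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]   # placeholder host
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
  }
}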
Features
Log-pilot provides an automatic discovery mechanism. Once a container is labelled, the collection component detects it, traces the log file handles through checkpoints, and marks the log data automatically. Because the tag added to the container is recorded in the log, the data can be distinguished by that tag when it is read back out. The configuration takes effect dynamically: when containers are scaled out or in, log duplication, log loss and log source tagging are handled automatically.
Labels
aliyun.logs.$name=$path. The variable name is the log name and may contain only 0-9, a-z, A-Z and hyphens (-). The variable path is the path of the logs to collect; /var/log/he.log and /var/log/*.log are both valid values, but a bare directory such as /var/log is not allowed. stdout is a special value that means standard output.
aliyun.logs.$name.format: the log format. Currently supported: none (no parsing, plain text) and json (each line is a complete JSON string).
aliyun.logs.$name.tags: additional fields to report, in the format k1=v1,k2=v2, separated by commas. For example aliyun.logs.access.tags="name=hello,stage=test". If Elasticsearch is used as the log store, the target tag has a special meaning: it specifies the Elasticsearch index.
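For a plain Docker container these labels can be attached directly, for example in a compose file. The sketch below is only illustrative; the service name, image and log path are assumptions borrowed from a typical Tomcat setup, not part of this deployment.

version: "2"
services:
  tomcat:
    image: tomcat:8
    labels:
      aliyun.logs.catalina: stdout
      aliyun.logs.access: /usr/local/tomcat/logs/localhost_access_log.*.txt
      aliyun.logs.access.tags: "name=access,stage=test"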
Usage
- Configure a log-pilot DaemonSet and publish it to K8s so that every node runs a collection component
- Add a label to the Docker container; in Kubernetes the key question is how the label gets added (it is injected via environment variables, shown further below)
PILOT_LOG_PREFIX: "aliyun,custom". Modifying this environment variable changes the label prefix; the default is aliyun (it does not work in some versions).
docker pull log-pilot:0.9.6-filebeat
Deployment YAML
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: log-pilot
  namespace: kube-system
  labels:
    k8s-app: log-pilot
    kubernetes.io/cluster-service: "true"
spec:
  template:
    metadata:
      labels:
        k8s-app: log-es
        kubernetes.io/cluster-service: "true"
        version: v1.22
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      serviceAccountName: dashboard-admin
      containers:
      - name: log-pilot
        # For available versions, see https://github.com/AliyunContainerService/log-pilot/releases
        image: log-pilot:latest
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        env:
        - name: "NODE_NAME"
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: "LOGGING_OUTPUT"
          value: "logstash"
        - name: "LOGSTASH_HOST"
          value: "10.90.7.0.x.x"   # Logstash address
        - name: "LOGSTASH_PORT"
          value: "5044"
        - name: "LOGSTASH_LOADBALANCE"
          value: "true"
        #- name: "FILEBEAT_OUTPUT"
        #  value: "elasticsearch"
        #- name: "ELASTICSEARCH_HOST"
        #  value: "elasticsearch"
        #- name: "ELASTICSEARCH_PORT"
        #  value: "9200"
        #- name: "ELASTICSEARCH_USER"
        #  value: "elastic"
        #- name: "ELASTICSEARCH_PASSWORD"
        #  value: "changeme"
        volumeMounts:
        - name: sock
          mountPath: /var/run/docker.sock
        - name: root
          mountPath: /host
          readOnly: true
        - name: varlib
          mountPath: /var/lib/filebeat
        - name: varlog
          mountPath: /var/log/filebeat
        securityContext:
          capabilities:
            add:
            - SYS_ADMIN
      terminationGracePeriodSeconds: 30
      volumes:
      - name: sock
        hostPath:
          path: /var/run/docker.sock
      - name: root
        hostPath:
          path: /
      - name: varlib
        hostPath:
          path: /var/lib/filebeat
          type: DirectoryOrCreate
      - name: varlog
        hostPath:
          path: /var/log/filebeat
          type: DirectoryOrCreate
Inject environment variables into the application's Deployment; assume the application name is monitor-center.
- name: aliyun_logs_monitor-center-stdout
  # collect the console (stdout)
  value: "stdout"
- name: aliyun_logs_monitor-center-tomcat
  # collect the specified directory
  value: "/usr/local/tomcat/logs/*.log"
- name: aliyun_logs_monitor-center-netcore
  # collect the specified directory
  value: "/app/logs/*.log"
- name: aliyun_logs_monitor-center-java
  # collect the specified directory
  value: "/logs/*.log"
- name: aliyun_logs_monitor-center-stdout_tags
  # tags for the aliyun_logs_monitor-center-stdout (console) collection above; the entries below follow the same pattern
  value: "app=monitor-center,lang=all,sourceType=stdout"
- name: aliyun_logs_monitor-center-tomcat_tags
  value: "app=monitor-center,lang=java,sourceType=log"
- name: aliyun_logs_monitor-center-netcore_tags
  value: "app=monitor-center,lang=net,sourceType=log"
- name: aliyun_logs_monitor-center-java_tags
  value: "app=monitor-center,lang=java,sourceType=log"
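Putting this together, a minimal sketch of where these variables sit in a Deployment is shown below. The image name and the Tomcat log path are illustrative assumptions; also note that for file-path collection the log directory usually has to be declared as a volume (an emptyDir, for example) so that log-pilot can locate the files on the host.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: monitor-center
spec:
  replicas: 1
  selector:
    matchLabels:
      app: monitor-center
  template:
    metadata:
      labels:
        app: monitor-center
    spec:
      containers:
      - name: monitor-center
        image: monitor-center:latest        # illustrative image
        env:
        - name: aliyun_logs_monitor-center-stdout
          value: "stdout"
        - name: aliyun_logs_monitor-center-tomcat
          value: "/usr/local/tomcat/logs/*.log"
        - name: aliyun_logs_monitor-center-tomcat_tags
          value: "app=monitor-center,lang=java,sourceType=log"
        volumeMounts:
        - name: tomcat-logs
          mountPath: /usr/local/tomcat/logs
      volumes:
      - name: tomcat-logs
        emptyDir: {}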
Check whether the Filebeat configuration generated by log-pilot is correct:
kubectl -n kube-system get pod | grep log-pilot
kubectl exec -it log-pilot-nspdv sh -n kube-system
cat /etc/filebeat/filebeat.yml
Other knowledge points
Common log collection components
- filebeat
- Logstash
- log-pilot
- fluentd
ElasticSearch Curator
A Python tool that makes Elasticsearch easier to operate (for example, cleaning up old indices) without sending HTTP requests by hand. In higher versions (7.x and later) the tool is considered outdated, as its role is largely covered by index lifecycle management.
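A typical Curator action file is a small YAML document. The sketch below, with an assumed index prefix and a 30-day retention, deletes old log indices:

actions:
  1:
    action: delete_indices
    description: "Delete log indices older than 30 days"
    options:
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: monitor-center-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 30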
elastalert
An extension around ES: it processes logs that match specific rules and generates alerts.
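An ElastAlert rule is a YAML file as well. The sketch below is illustrative (index pattern, threshold and mail address are assumptions) and fires when error logs spike within a five-minute window:

name: monitor-center-error-spike
type: frequency
index: monitor-center-*
num_events: 50
timeframe:
  minutes: 5
filter:
- query:
    query_string:
      query: "message: ERROR"
alert:
- "email"
email:
- "ops@example.com"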
Elastic Beats
Beats means lightweight shippers that sit at the data source, such as log files (Filebeat), network data (Packetbeat), and server metrics (Metricbeat).
So the community came up with the concept of ELKB, and our use of log-pilot to collect container data already follows the ELKB idea.
https://www.cnblogs.com/sanduzxcvbnm/p/12076383.html
About the type
5.x versions can create multiple types per index; 6.x versions can create only one type per index (and 7.x removes types altogether).
Elasticsearch is organized around indices and no longer needs the type (roughly a table in a relational database); dropping it helps speed up queries.
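For example, in 7.x an index is created with mappings only and no type at all (the index name and fields below are illustrative):

PUT /monitor-center-java-2021.01.01
{
  "mappings": {
    "properties": {
      "message": { "type": "text" },
      "app":     { "type": "keyword" }
    }
  }
}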
About the node
Master nodes, data nodes, and pre-processing (coordinating) nodes. Node roles can be specified explicitly; by default every node serves all roles. The elected master node is responsible for synchronizing cluster state, and coordinating nodes are responsible for forwarding requests. If a coordinating node also takes part in data processing, its load becomes too high and it cannot forward requests, which may hurt overall performance. With many nodes, say more than 10, it is worth dedicating nodes to the coordinating role: since they do not store data, their CPU and memory requirements are not very high.
It is better to put Nginx in front as a load balancer and poll the nodes round-robin, rather than sending every request to the same node.
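A dedicated coordinating-only node is configured by switching the other roles off in elasticsearch.yml; a sketch for 7.x before 7.9 (newer versions use an empty node.roles list instead):

# elasticsearch.yml on a coordinating-only node: no master role, no data, no ingest
node.master: false
node.data: false
node.ingest: false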
About node rebalancing
A brief outage does not have to trigger cluster rebalancing: you can either disable shard allocation before shutting a node down, or set a delay before reallocation is triggered.
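For example, delayed allocation can be set so that a node dropping out briefly does not immediately trigger shard reallocation (the five-minute value is illustrative):

PUT /_all/_settings
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "5m"
  }
}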
About automatic index creation
If automatic index creation is not allowed, ELK logs pushed to ES will not be visible in the front end, so the indices have to be created manually. This behaviour can be turned on or off in the configuration.
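Besides the configuration file, automatic index creation can also be toggled dynamically through the cluster settings API:

PUT /_cluster/settings
{
  "persistent": {
    "action.auto_create_index": "true"
  }
}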
The default heap
The default minimum and maximum heap of ES is 2 GB, so if you bring it up for testing without being aware of the heap settings, the Docker container may fail to start because the default value is too large for the machine.
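For a single-node test instance the heap can be shrunk through ES_JAVA_OPTS; a docker-compose sketch, where the image tag and heap size are illustrative:

version: "3"
services:
  elasticsearch:
    image: elasticsearch:7.5.0
    environment:
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"   # keep the heap small on a test machine
    ports:
      - "9200:9200"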
Modifying ES in Docker
If you want to adjust the ES JVM parameters, the steps are:
- Stop the container and remove it (docker rm)
- Modify the configuration file and comment out the security-related settings, otherwise the container will not start
- Modify the JVM settings and start the container again
- Copy the keyword-related files back into the container
- Modify the configuration file again and remove the comments (restore the security settings)
- Restart the container
Because security checks were enabled, the keyword files disappeared when the container was destroyed; note that installed plug-ins are gone as well.
As long as the stored data is mapped to the local host, it does not matter when the cluster loses a node: the indices are rebalanced onto the other nodes, and when a new node joins the cluster the data is balanced again. Data that lived on the failed node is rewritten elsewhere, and when the failed node rejoins, new data is written to it. In short, in Docker mode you do not need to pay too much attention to the data files.
This is effectively a redeployment. There is also a non-standard way to operate without destroying the container, but it is less stable, so it is best to plan the JVM size from the start. If you do need to modify it, first try the non-standard route, that is, editing the files inside the Docker container; it is a bit fiddly, but the data stays in place. If the cluster has many nodes and replicated shards, the container can simply be destroyed and no data is lost.
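To keep the keyword files, plug-ins and data across container rebuilds, the relevant directories can be mapped to the host; a sketch where the host paths are illustrative (the elasticsearch.yml on the host must exist first, for example copied out of a running container):

version: "3"
services:
  elasticsearch:
    image: elasticsearch:7.5.0
    volumes:
      - /data/es/data:/usr/share/elasticsearch/data
      - /data/es/plugins:/usr/share/elasticsearch/plugins
      - /data/es/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml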
The problem summary
Solutions to common problems are recorded here.
search.max_buckets is too small
trying to create too many buckets. must be less than or equal to: [100000] but was [100001]. this limit can be set by changing the [search.max_buckets] cluster level setting.
The error above may be caused by the search.max_buckets setting; it can be modified temporarily as a transient setting:
curl -X PUT -H 'Content-Type: application/json' ip:port/_cluster/settings -d '{ "transient": { "search.max_buckets": 217483647 } }'
Via Kibana (Dev Tools), using persistent makes it a persistent configuration:
PUT /_cluster/settings
{
  "persistent": {
    "search.max_buckets": 217483647
  }
}
Or change the configuration file directly
Official reference: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-aggregations-bucket.html
The number of shards is too small
The shard limit needs to be adjusted. In 7.5 the default limit (derived from cluster.max_shards_per_node) came out at 2,000 shards here; once it is exceeded, new shards cannot be created in the cluster.
PUT /_cluster/settings
{
  "persistent": {
    "cluster": {
      "max_shards_per_node": 10000
    }
  }
}
The write queue is too small
Modify the configuration file to change the write thread pool queue size: thread_pool.write.queue_size: 1000
Modify the default result window (maximum number of returned results)
index.max_result_window: 1000000. By default only 10,000 results can be returned; this setting may need to be increased if the call chain is long (for example with SkyWalking).
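The thread-pool queue size is a static setting that lives in elasticsearch.yml and needs a restart, while max_result_window is an index-level setting that can also be changed at runtime; a sketch, with the index pattern being illustrative:

PUT /monitor-center-*/_settings
{
  "index.max_result_window": 1000000
}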
ELK architecture references: https://www.one-tab.com/page/tto_XdDeQlS44BY-ziLvKg and https://www.one-tab.com/page/Fb3B3qd2Q9yR9W92dZ2pYQ