Preface

Let's continue with ELK today. Earlier posts covered the basic operation of ELK and log retrieval. This time we look at how to configure email notifications for sensitive information. As a programmer, you cannot watch the ELK log dashboard all the time (log visualization will be covered later), yet you still need to notice error logs promptly to limit their impact. As an example, we will send an email notification for 503 errors every 10 minutes.


1. Use Elasticsearch Watcher in Kibana

1.1. Edit /etc/elasticsearch/elasticsearch.yml and add the email sender settings at the end of the file.

xpack.notification.email.account: 
 outlook_account:
  profile: outlook
  smtp: 
   auth: true
   starttls.enable: true
   host: smtp.office365.com
   port: 587
   user: [email protected]
   password: xxx

1.2. Create a custom watch in Kibana (or add the watch with a curl command).

Kibana => Management => Elasticsearch => Watcher => Create new watch => Advanced Watch:

{
  "trigger": { "schedule": { "cron": "0 0/10 * * * ?" } },
  "input": {
    "search": {
      "request": {
        "indices": ["test-qa-access*"],
        "body": {
          "query": {
            "bool": {
              "must": { "match": { "response": 503 } },
              "filter": {
                "range": {
                  "@timestamp": {
                    "gte": "{{ctx.trigger.scheduled_time}}||-10m",
                    "lte": "{{ctx.trigger.triggered_time}}"
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition": { "compare": { "ctx.payload.hits.total": { "gt": 0 } } },
  "actions": {
    "email_admin": {
      "email": {
        "from": "[email protected]",
        "to": "[email protected]",
        "subject": "TEST-QA-ACCESS-LOG - Encountered 503 errors - {{ctx.payload.hits.total}} times",
        "body": "Body test"
      }
    }
  }
}

Test the watch. Set the action mode to “Execute”; if the condition is met, a real email is sent to you.
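
Step 1.2 mentions that the watch can also be added without the Kibana UI. The sketch below shows one way to prepare that request in Python instead of curl. It assumes Elasticsearch 7.x or later, where the Watcher API path is `_watcher/watch/<id>` (6.x clusters use `_xpack/watcher/watch/<id>`), and a cluster that accepts unauthenticated requests. The function name and the watch ID used in the note below are hypothetical.

```python
import json
import urllib.request


def build_put_watch_request(base_url, watch_id, watch):
    """Build (but do not send) the HTTP request that stores a watch."""
    # Assumption: Elasticsearch 7.x+ path. On 6.x, replace "_watcher"
    # with "_xpack/watcher".
    url = "%s/_watcher/watch/%s" % (base_url.rstrip("/"), watch_id)
    return urllib.request.Request(
        url,
        data=json.dumps(watch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

To send it, load the watch JSON from step 1.2 into a dict, call `build_put_watch_request("http://localhost:9200", "qa_503_alert", watch)` and pass the result to `urllib.request.urlopen`. A secured cluster would additionally need an `Authorization` header.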

2. Use a cron job that queries Elasticsearch

2.1. Create a script, alert.py, that checks for 503 errors in the last 10 minutes. If any are found, it sends an alert email with the 503 error details in the email body.

import json
import smtplib
from datetime import date
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

from elasticsearch import Elasticsearch

# Connect to the local node (the 6.x/7.x clients default to localhost:9200).
es = Elasticsearch()

# The index name carries today's date, e.g. test-qa-access-logs-cw-2021.03.25.
datestr = date.today().strftime("%Y.%m.%d")
searchidx = "test-qa-access-logs-cw-" + datestr
print(searchidx)

# Match 503 responses logged in the last 10 minutes.
query = {"query": {"bool": {"must": [
    {"match": {"response": 503}},
    {"range": {"@timestamp": {"gte": "now-10m", "lt": "now"}}},
]}}}
# doc_type is only meaningful on Elasticsearch 6.x and earlier; omit it on newer versions.
res = es.search(index=searchidx, doc_type="doc,teste-type", body=query)

# hits.total is an integer on Elasticsearch 6.x and an object with a "value" key on 7.x.
total = res['hits']['total']
hitstotal = total['value'] if isinstance(total, dict) else total
print("%d documents found" % hitstotal)

if hitstotal > 0:
    fromaddr = "[email protected]"
    toaddr = "[email protected]"
    msg = MIMEMultipart()
    msg['From'] = fromaddr
    msg['To'] = toaddr
    msg['Subject'] = "503 ALERT Test"

    # Put the matching documents into the mail body as JSON.
    body = json.dumps(res['hits']['hits'])
    msg.attach(MIMEText(body, 'plain'))

    # Send through Office 365 over STARTTLS.
    server = smtplib.SMTP('smtp.office365.com', 587)
    server.starttls()
    server.login(fromaddr, "xxxxxx")
    server.sendmail(fromaddr, toaddr, msg.as_string())
    server.quit()
else:
    print("no hit")
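
The script above puts the raw JSON of the hits into the email body, which can be hard to read. As a possible refinement, the following sketch turns each hit into one line of text. The field names `@timestamp` and `response` come from the query above; `request` is a hypothetical field, so adjust the list to whatever your log documents actually contain.

```python
def format_hits(hits, fields=("@timestamp", "response", "request")):
    """Turn Elasticsearch hits into one readable line per document."""
    lines = []
    for hit in hits:
        source = hit.get("_source", {})
        # Missing fields are shown as "-" instead of raising KeyError.
        values = [str(source.get(name, "-")) for name in fields]
        lines.append(" | ".join(values))
    return "\n".join(lines)
```

In alert.py this would replace the `json.dumps` call, for example `body = format_hits(res['hits']['hits'])`.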

2.2. Set up the cron job

*/10 * * * * python /app/errorlogs/alert.py

3. Use AWS CloudWatch

3.1. Enable remote access to Elasticsearch

Open /etc/elasticsearch/elasticsearch.yml with vim and set the network.host field to network.host: 0.0.0.0. Then restart the Elasticsearch service for the change to take effect.

3.2. Install the elasticsearch-py package on the server
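
After the restart, it can help to confirm from the machine that will run the script that the Elasticsearch port is reachable. The following is a minimal sketch using only the standard library; the helper name is hypothetical, and it checks TCP connectivity only, not that Elasticsearch itself answers.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For the node used later in this post, the call would be `is_port_open("192.168.0.100", 9200)`.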

pip install elasticsearch

Create the logs-httpcode-metrics.py script, which counts the HTTP error codes and puts the data into AWS CloudWatch.

import datetime
from datetime import date

import boto3
from elasticsearch import Elasticsearch


def getHitTotal(responseCode, searchIndicesPrefix):
    """Count documents with the given response code in the last 10 minutes."""
    datestr = date.today().strftime("%Y.%m.%d")
    searchidx = searchIndicesPrefix + "-" + datestr
    # searchidx = searchIndicesPrefix + "-" + "2021.03.25"
    # doc_type is only meaningful on Elasticsearch 6.x and earlier.
    searchtype = "doc,teste-type"

    es = Elasticsearch([{'host': '192.168.0.100', 'port': 9200}])
    query = {"query": {"bool": {"must": [
        {"match": {"response": responseCode}},
        {"range": {"@timestamp": {"gte": "now-10m", "lt": "now"}}},
    ]}}}
    # ignore=404 returns the error body instead of raising if the index is missing.
    countresult = es.count(index=searchidx, doc_type=searchtype,
                           body=query, ignore=404)
    print(searchIndicesPrefix + ":")
    if 'count' in countresult:
        hitstotal = countresult['count']
        print(" %d - %d documents found" % (responseCode, hitstotal))
    else:
        hitstotal = 0
        print(countresult)
    return hitstotal


def put_metric(responseCode, searchIndicesPrefix):
    """Publish the count for one response code as a CloudWatch metric."""
    # Hard-coded strings as credentials, not recommended.
    cloudwatch = boto3.client('cloudwatch',
                              aws_access_key_id='xxx',
                              aws_secret_access_key='xxx',
                              region_name='ap-northeast-1')
    metricName = 'Logs_HTTPCode_' + str(responseCode) + '_Count'
    hittotal = getHitTotal(responseCode, searchIndicesPrefix)
    if hittotal > 0:
        cloudwatch.put_metric_data(
            Namespace='ELK/HTTPErrorCode',
            MetricData=[{
                'MetricName': metricName,
                'Dimensions': [{'Name': 'Elasticsearch Log Indices',
                                'Value': searchIndicesPrefix}],
                # Use UTC so the data point is not shifted by the server's time zone.
                'Timestamp': datetime.datetime.utcnow(),
                'Unit': 'Count',
                'Value': hittotal,
            }])


def runTask():
    listCodes = [499, 502, 503, 401, 403, 429]
    listPrefixs = ['test-qa-access-logs-cw', 'test1-qa-access-logs-cw']
    for currPrefix in listPrefixs:
        for code in listCodes:
            put_metric(code, currPrefix)


runTask()
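
The script above makes one CloudWatch call per response code and skips codes with no hits. A possible variation, sketched here, builds all data points for one index prefix first and sends them in a single `put_metric_data` call, including zero counts so the metric has a data point in every interval. The function names are hypothetical, credentials are assumed to come from boto3's default credential chain rather than hard-coded keys, and whether zero counts are wanted depends on how you configure the alarm.

```python
import datetime


def build_metric_data(counts, index_prefix, timestamp=None):
    """Build the MetricData list for one put_metric_data call.

    counts maps an HTTP status code to the number of matching documents.
    """
    if timestamp is None:
        timestamp = datetime.datetime.now(datetime.timezone.utc)
    return [
        {
            "MetricName": "Logs_HTTPCode_%d_Count" % code,
            "Dimensions": [
                {"Name": "Elasticsearch Log Indices", "Value": index_prefix}
            ],
            "Timestamp": timestamp,
            "Unit": "Count",
            "Value": count,
        }
        for code, count in sorted(counts.items())
    ]


def publish(counts, index_prefix, region_name="ap-northeast-1"):
    """Send all counts for one index prefix in a single API call."""
    import boto3  # imported here so build_metric_data works without boto3

    # Assumption: credentials come from the environment, config file or an IAM role.
    cloudwatch = boto3.client("cloudwatch", region_name=region_name)
    cloudwatch.put_metric_data(
        Namespace="ELK/HTTPErrorCode",
        MetricData=build_metric_data(counts, index_prefix),
    )
```

With the counting function from the script, a call could look like `publish({code: getHitTotal(code, prefix) for code in listCodes}, prefix)`.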

3.3. Set up the cron job

*/10 * * * * python /home/ubuntu/workarea/tools/logs-httpcode-metrics.py

3.4. Create an AWS CloudWatch alarm based on the custom metrics published by the script in the previous steps
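
The alarm can be created in the AWS console. As an alternative, here is a minimal sketch that creates it with boto3's `put_metric_alarm`. The namespace, metric name and dimension match the metrics script; the alarm name, the SNS topic ARN, the threshold and the choice to treat missing data as not breaching are assumptions to adapt.

```python
def build_alarm_params(response_code, index_prefix, sns_topic_arn, threshold=0):
    """Build keyword arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": "ELK-%s-HTTP-%d" % (index_prefix, response_code),  # hypothetical name
        "Namespace": "ELK/HTTPErrorCode",
        "MetricName": "Logs_HTTPCode_%d_Count" % response_code,
        "Dimensions": [
            {"Name": "Elasticsearch Log Indices", "Value": index_prefix}
        ],
        "Statistic": "Sum",
        "Period": 600,  # seconds, matching the 10-minute cron interval
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # The metrics script publishes nothing when there are no hits.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


def create_alarm(response_code, index_prefix, sns_topic_arn,
                 region_name="ap-northeast-1"):
    """Create or update the alarm; needs boto3 and AWS credentials."""
    import boto3  # imported here so build_alarm_params works without boto3

    params = build_alarm_params(response_code, index_prefix, sns_topic_arn)
    boto3.client("cloudwatch", region_name=region_name).put_metric_alarm(**params)
```

The SNS topic passed as `sns_topic_arn` is what delivers the notification, for example through an email subscription.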
