ElastAlert Tutorial, Chapter 11: Installing ElastAlert

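In the previous chapters we shipped data from log files into ES, so we can now generate log alerts by querying ES continuously. It is finally time to introduce ElastAlert.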

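Alerting tools to choose from include Alert Management, Elasticsearch Watcher, and ElastAlert. Each has its strengths and weaknesses, and some are paid while others are free. Here we choose ElastAlert.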


Installing Python 3, which ElastAlert depends on

ElastAlert depends on Python 3, so Python 3 has to be installed first.

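CentOS usually ships with Python 2 as the default interpreter. You can run our one-step Python 3 install script below. It installs Python 3 while leaving Python 2 usable, so there is nothing to worry about: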

wget http://66-ai.com/download/script-litte-prince/app/install-python3.sh -O /root/install-python3.sh  && sh /root/install-python3.sh

If you are curious about the script above, here is its full source:

#!/usr/bin/env bash
# Install Python 3
# Also point yum's shebang lines at python2 so that yum keeps working

yum install -y openssl-devel bzip2-devel expat-devel gdbm-devel readline-devel sqlite-devel

if ! [ -x "$(command -v axel)" ]; then
  wget http://66-ai.com/download/script-litte-prince/Python-3.6.5.tgz -O /root/Python-3.6.5.tgz
else
  axel -n 10 -a http://66-ai.com/download/script-litte-prince/Python-3.6.5.tgz -o /root/Python-3.6.5.tgz
fi

# The tarball was saved to /root, so unpack it there whatever the current directory is
cd /root
tar -xzvf Python-3.6.5.tgz
cd Python-3.6.5
./configure --prefix=/usr/local/python
make
make install
cd /usr/bin
mv python python.bak
mv pip pip.bak
ln -s /usr/local/python/bin/python3.6 /usr/bin/python
ln -s /usr/local/python/bin/pip3.6 /usr/bin/pip


# Point yum and its helper tools back at python2.
# The pattern is anchored to line 1 and to the end of the line, so running
# the script again leaves an already rewritten shebang untouched.
# "#! \?" also covers urlgrabber-ext-down, whose shebang has a space after "#!".
for f in /usr/bin/yum /usr/bin/yum-builddep /usr/bin/yum-config-manager \
         /usr/bin/yum-debug-dump /usr/bin/yum-debug-restore \
         /usr/bin/yumdownloader /usr/bin/yum-groups-manager \
         /usr/libexec/urlgrabber-ext-down; do
    if [ -f "$f" ]; then
        sed -i '1s@^#! \?/usr/bin/python$@#!/usr/bin/python2@' "$f"
    fi
done
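
The last step of the script rewrites shebang lines so that yum keeps running under Python 2. As a rough illustration of what a safe rewrite has to do, here is a minimal Python sketch of the same idea as a pure function. The name `pin_to_python2` is made up for this example and is not part of the install script; the sketch only shows that an unversioned shebang is changed once and that a shebang already ending in `python2` is left alone.

```python
import re

# Matches "#!/usr/bin/python" or "#! /usr/bin/python" as the whole first line.
# Versioned shebangs such as "#!/usr/bin/python2" do not match.
_UNVERSIONED_SHEBANG = re.compile(r"^#! ?/usr/bin/python$")


def pin_to_python2(script_text: str) -> str:
    """Return script_text with an unversioned python shebang pointed at python2."""
    first, sep, rest = script_text.partition("\n")
    if _UNVERSIONED_SHEBANG.match(first):
        first = "#!/usr/bin/python2"
    return first + sep + rest


once = pin_to_python2("#!/usr/bin/python\nimport yum\n")
twice = pin_to_python2(once)  # applying it again changes nothing
print(once == twice)
```

Because the pattern covers the whole first line, applying the function a second time cannot produce something like `python22`.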

After the installation finishes, run the python command:

[root@k8s-nfs ~]# python
Python 3.6.5 (default, Jan 15 2021, 17:46:27) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 

If the reported Python version is 3.6.5, the installation succeeded.
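
The same check can be done from inside the interpreter instead of reading the banner by eye. The sketch below is only an illustration: the helper name `is_supported` is hypothetical, and the minimum of 3.6 is an assumption taken from the 3.6.5 build installed above, not a requirement stated by ElastAlert here.

```python
import sys


def is_supported(version_info=None, minimum=(3, 6)):
    """Return True when the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


# Report the running interpreter's version and whether it passes the check.
print(sys.version.split()[0], "ok" if is_supported() else "too old")
```

Run it with the `python` command you just installed; if the old Python 2 interpreter is still the one being picked up, the second word printed is `too old`.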


Installing ElastAlert

Run the following commands to install ElastAlert:

cd /root
yum install -y git
git clone https://github.com/Yelp/elastalert.git
cd elastalert
python setup.py install
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/

If the commands above feel too long, you can run the one-step install script instead. Copy the following line into your shell:

wget http://66-ai.com/download/script-litte-prince/app/elastalert-install.sh -O /root/elastalert-install.sh && sh /root/elastalert-install.sh

After the installation, run elastalert with -h. If the help text is displayed, the installation succeeded:

/usr/local/python/bin/elastalert -h

The output is the help message:

/usr/local/python/bin/elastalert -h
usage: elastalert [-h] [--config CONFIG] [--debug] [--rule RULE]
                  [--silence SILENCE] [--start START] [--end END] [--verbose]
                  [--patience TIMEOUT] [--pin_rules] [--es_debug]
                  [--es_debug_trace ES_DEBUG_TRACE]

optional arguments:
  -h, --help            show this help message and exit
  --config CONFIG       Global config file (default: config.yaml)
  --debug               Suppresses alerts and prints information instead. Not
                        compatible with `--verbose`
  --rule RULE           Run only a specific rule (by filename, must still be
                        in rules folder)
  --silence SILENCE     Silence rule for a time period. Must be used with
                        --rule. Usage: --silence <units>=<number>, eg.
                        --silence hours=2
  --start START         YYYY-MM-DDTHH:MM:SS Start querying from this
                        timestamp. Use "NOW" to start from current time.
                        (Default: present)
  --end END             YYYY-MM-DDTHH:MM:SS Query to this timestamp. (Default:
                        present)
  --verbose             Increase verbosity without suppressing alerts. Not
                        compatible with `--debug`
  --patience TIMEOUT    Maximum time to wait for ElasticSearch to become
                        responsive. Usage: --patience <units>=<number>. e.g.
                        --patience minutes=5
  --pin_rules           Stop ElastAlert from monitoring config file changes
  --es_debug            Enable verbose logging from Elasticsearch queries
  --es_debug_trace ES_DEBUG_TRACE
                        Enable logging from Elasticsearch queries as curl
                        command. Queries will be logged to file. Note that
                        this will incorrectly display localhost:9200 as the
                        host/port

The installed jira package may be too old. If so, upgrade it with the following command:

# Quote the requirement; an unquoted ">" would be treated by the shell as output redirection
pip install "jira>=2.0.0"

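The original version is too old and needs to be replaced. With that, the installation is done. Come on, let's go get some barbecue.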


Configuring the first alert

First, copy a configuration file:

cp config.yaml.example config.yaml    # create the config file from the template
vim config.yaml                       # edit the config

Don't rush to run anything yet. Let's look at config.yaml first. Listing the whole file costs no extra paper, only a little more electricity, so here is config.yaml in full:

# Directory that contains the rule files; rule files use the .yaml extension
rules_folder: example_rules

# How often ElastAlert queries Elasticsearch; here it is once per minute.
# Accepts weeks, days, hours, minutes, or seconds
run_every:
  minutes: 1

# ElastAlert will buffer results from the most recent
# period of time, in case some log sources are not in real time
buffer_time:
  minutes: 15

# Elasticsearch host address. Note that each rule can also set its own Elasticsearch host
es_host: localhost

# Elasticsearch port; 9200 is the default
es_port: 9200

# The AWS region to use. Set this when using AWS-managed elasticsearch
#aws_region: us-east-1

# The AWS profile to use. Use this if you are using an aws-cli profile.
# See http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html
# for details
#profile: test

# Optional URL prefix for Elasticsearch
#es_url_prefix: elasticsearch

# Optional: connect to Elasticsearch over SSL (True or False).
# Usually left off when Elasticsearch is only reachable on a trusted internal network
#use_ssl: True

# Optional: whether to verify TLS certificates (True or False). Defaults to True
#verify_certs: True

# GET request with body is the default option for Elasticsearch.
# If it fails for some reason, you can pass 'GET', 'POST' or 'source'.
# See http://elasticsearch-py.readthedocs.io/en/master/connection.html?highlight=send_get_body_as#transport
# for details
#es_send_get_body_as: GET

# Elasticsearch username and password; leave commented out if no authentication is configured
#es_username: someusername
#es_password: somepassword

# Use SSL authentication with client certificates client_cert must be
# a pem file containing both cert and key for client
#verify_certs: True
#ca_certs: /path/to/cacert.pem
#client_cert: /path/to/client_cert.pem
#client_key: /path/to/client_key.key

# The index in Elasticsearch where ElastAlert stores the records it generates
writeback_index: elastalert_status
writeback_alias: elastalert_alerts

# If an alert fails for some reason, ElastAlert will retry
# sending the alert until this time period has elapsed
alert_time_limit:
  days: 2

# Custom logging configuration
# If you want to setup your own logging configuration to log into
# files as well or to Logstash and/or modify log levels, use
# the configuration below and adjust to your needs.
# Note: if you run ElastAlert with --verbose/--debug, the log level of
# the "elastalert" logger is changed to INFO, if not already INFO/DEBUG.
#logging:
#  version: 1
#  incremental: false
#  disable_existing_loggers: false
#  formatters:
#    logline:
#      format: '%(asctime)s %(levelname)+8s %(name)+20s %(message)s'
#
#    handlers:
#      console:
#        class: logging.StreamHandler
#        formatter: logline
#        level: DEBUG
#        stream: ext://sys.stderr
#
#      file:
#        class : logging.FileHandler
#        formatter: logline
#        level: DEBUG
#        filename: elastalert.log
#
#    loggers:
#      elastalert:
#        level: WARN
#        handlers: []
#        propagate: true
#
#      elasticsearch:
#        level: WARN
#        handlers: []
#        propagate: true
#
#      elasticsearch.trace:
#        level: WARN
#        handlers: []
#        propagate: true
#
#      '':  # root logger
#        level: WARN
#          handlers:
#            - console
#            - file
#        propagate: false

For the full list of configuration options, see https://elastalert.readthedocs.io/en/latest/ruletypes.html#rule-configuration-cheat-sheet
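
The time settings in config.yaml (run_every, buffer_time, alert_time_limit) are all written as a unit name mapped to a number. One way to read them, sketched below, is as keyword arguments to Python's `datetime.timedelta`, which accepts the same unit names (weeks, days, hours, minutes, seconds). The `config` dictionary is a hand-written stand-in for the parsed file and `to_timedelta` is a hypothetical helper; how ElastAlert itself parses these values is not shown in this chapter.

```python
from datetime import timedelta

# Hand-written stand-in for the time settings in config.yaml above.
config = {
    "run_every": {"minutes": 1},
    "buffer_time": {"minutes": 15},
    "alert_time_limit": {"days": 2},
}


def to_timedelta(setting):
    """Turn a mapping such as {"minutes": 15} into a timedelta."""
    return timedelta(**setting)


durations = {name: to_timedelta(value) for name, value in config.items()}
for name, value in durations.items():
    # Print each setting as a number of seconds.
    print(name, value.total_seconds())
```

With the values from the template, run_every comes out as 60 seconds, buffer_time as 900 seconds, and alert_time_limit as 172800 seconds.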


A first alert rule example: the frequency alert

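ElastAlert has 11 types of alert rules. We start with the frequency type, which fires when some event happens X times within a given period. It is very useful and is typically used for cases such as: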

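  1. A user logs in 10 times within 1 minute, which may mean a bot is trying to break into the site.
  2. 1,000 people register within 1 minute, which may mean bots are signing up.
  3. 100 API call errors occur within 1 minute, which may mean something in the system has failed.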
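
The counting idea behind these cases can be sketched in a few lines of Python. This is only an illustration of "X events within a time window" using a sliding window over timestamps; it is not ElastAlert's implementation, which works on query results from Elasticsearch. The function name `frequency_matches`, the sample data, and the choice to reset the window after a match are all assumptions made for this example.

```python
from collections import deque
from datetime import datetime, timedelta


def frequency_matches(timestamps, num_events, timeframe):
    """Yield the timestamp of each event that brings the number of events
    inside the trailing `timeframe` window up to `num_events`."""
    window = deque()
    for ts in sorted(timestamps):
        window.append(ts)
        # Drop events that are older than the window ending at ts.
        while ts - window[0] > timeframe:
            window.popleft()
        if len(window) >= num_events:
            yield ts
            window.clear()  # assumption: start counting afresh after a match


start = datetime(2021, 1, 15, 12, 0, 0)
# Ten logins, 5 seconds apart, from one hypothetical user.
logins = [start + timedelta(seconds=5 * i) for i in range(10)]
hits = list(frequency_matches(logins, num_events=10, timeframe=timedelta(minutes=1)))
print(hits)
```

In this sample the tenth login arrives 45 seconds after the first, so the window holds 10 events and a single match is reported at that moment. If the same ten logins were spread 30 seconds apart, the window would never hold more than three events and nothing would match.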

Here is the example rule file:

# Alert when the rate of events exceeds a threshold

# (Optional)
# Elasticsearch host
es_host: localhost

# (Optional)
# Elasticsearch port
es_port: 9200

# (Optional) Connect with SSL to Elasticsearch
#use_ssl: True

# (Optional) basic-auth username and password for Elasticsearch
#es_username: someusername
#es_password: somepassword

# (Required)
# Rule name, must be unique
name: Example frequency rule

# (Required)
# Type of alert.
# The frequency rule type alerts when num_events events occur within timeframe
type: frequency

# (Required)
# Index to search, wildcard supported
index: filebeat-7.1.1*

# (Required, frequency specific)
# Alert when this many documents matching the query occur within a timeframe
num_events: 5

# (Required, frequency specific)
# num_events must occur within this amount of time to trigger an alert
timeframe:
  hours: 1

# (Required)
# A list of Elasticsearch filters used to find events
# These filters are joined with AND and nested in a filtered query
# For more info: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html
filter:
- term:
    message: "aaa"

# (Required)
# The alert to use when a match is found
alert:
- "email"

# (Required, email specific)
# A list of email addresses to send alerts to
email:
- "hewebgl3@foxmail.com"

Let's go through this rule file in detail: