Prometheus series: deploying blackbox_exporter

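blackbox_exporter is an exporter provided officially by Prometheus. It probes target instances over HTTP, HTTPS, DNS, TCP, and ICMP and exposes the results as metrics. Prometheus does not contact the monitored servers itself: it calls blackbox_exporter, which probes each target and returns the collected metrics.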

blackbox_exporter download page: Download | Prometheus (https://prometheus.io/download/)

Download and install

[root@openresty-dev software]# wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.26.0/blackbox_exporter-0.26.0.linux-amd64.tar.gz
[root@openresty-dev software]# tar -zxvf blackbox_exporter-0.26.0.linux-amd64.tar.gz 
[root@openresty-dev software]# mv blackbox_exporter-0.26.0.linux-amd64 /usr/local/blackbox_exporter

systemd service

[root@prometheus software]# vim /etc/systemd/system/blackbox_exporter.service
[Unit]
Description=Prometheus Blackbox Exporter
After=network.target
Documentation=https://github.com/prometheus/blackbox_exporter

[Service]
Type=simple
ExecStart=/usr/local/blackbox_exporter/blackbox_exporter  \
  --config.file=/usr/local/blackbox_exporter/blackbox.yml \
  --web.listen-address=:9115 
Restart=on-failure 

[Install]
WantedBy=multi-user.target

[root@prometheus software]# systemctl daemon-reload
[root@prometheus software]# systemctl enable blackbox_exporter
[root@prometheus software]# systemctl start blackbox_exporter
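Before wiring the exporter into Prometheus, it can help to probe one target by hand. The following is a minimal sketch, assuming the exporter listens on localhost:9115 as configured above and that blackbox.yml defines an http_2xx module; the helper name probe_ok is hypothetical and not part of blackbox_exporter.

```shell
#!/bin/sh
# probe_ok (hypothetical helper): reads the text returned by the exporter's
# /probe endpoint on stdin and succeeds only if it reports "probe_success 1".
probe_ok() {
  grep -q '^probe_success 1$'
}

# Against a live exporter (assumes localhost:9115 and the http_2xx module):
#   curl -s 'http://localhost:9115/probe?module=http_2xx&target=https://xp.sb' | probe_ok \
#     && echo "probe succeeded" || echo "probe failed"

# Offline demonstration with a canned response line:
printf 'probe_success 1\n' | probe_ok && echo "probe succeeded"
```

If the live check fails, `systemctl status blackbox_exporter` is the first place to look for startup errors.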

Add to Prometheus

[root@prometheus software]# vim /usr/local/prometheus/prometheus.yml 
...
  - job_name: 'http_status'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets: ['http://192.168.3.232','https://xp.sb']  # multiple targets may be listed, separated by commas
        labels:
          instance: http_status  # fixed value shared by every target; the url label set below tells the targets apart
          group: web
    relabel_configs:
      - source_labels: [__address__]  # copy __address__ (the URL of the target being probed) into __param_target so it is passed to blackbox_exporter
        target_label: __param_target  # becomes the target query parameter, e.g. www.test.com
      - source_labels: [__param_target]    # keep the probed URL as a label
        target_label: url
      - target_label: __address__  # overwrite __address__ so the scrape request is sent to the blackbox_exporter server
        replacement: 192.168.3.232:9115  # address of the blackbox_exporter server
[root@prometheus software]# systemctl restart prometheus
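To make the relabeling concrete: for each entry in targets, Prometheus ends up requesting the exporter's /probe endpoint with the original URL passed as the target parameter. The sketch below only builds that URL as a string; scrape_url is a hypothetical helper, and the target is left unencoded for readability.

```shell
#!/bin/sh
# scrape_url (hypothetical helper): prints the URL that is effectively
# requested once the relabel rules above have been applied.
#   $1 = blackbox_exporter address (the final __address__)
#   $2 = module (from params.module)
#   $3 = original target (copied into __param_target)
scrape_url() {
  printf 'http://%s/probe?module=%s&target=%s\n' "$1" "$2" "$3"
}

scrape_url 192.168.3.232:9115 http_2xx https://xp.sb
# prints: http://192.168.3.232:9115/probe?module=http_2xx&target=https://xp.sb
```

Fetching the printed URL with curl from the Prometheus host is a quick way to confirm that the exporter is reachable and that the module name is valid.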

Grafana

Once the configuration is confirmed to work, import a dashboard template into Grafana (ID 9965 or ID 13587) to view the probe results more intuitively.

Alerting

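In the prometheus/rules directory, create the alerting rule file blackbox_exporter.yml. Prometheus only loads it if the path is matched by a rule_files entry in prometheus.yml.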

[root@prometheus prometheus]# cat > prometheus/rules/blackbox_exporter.yml <<"EOF"
groups:
- name: Blackbox
  rules:
  - alert: BlackboxProbeFailed
    expr: probe_success == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Blackbox probe failed: {{ $labels.instance }}"
      description: "Blackbox probe failed, current value: {{ $value }}"
  - alert: BlackboxSlowProbe
    expr: avg_over_time(probe_duration_seconds[1m]) > 1
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Slow request: {{ $labels.instance }}"
      description: "Request took longer than 1 second, value: {{ $value }}"
  - alert: BlackboxHttpStatusCodeFailure
    expr: probe_http_status_code <= 199 OR probe_http_status_code >= 400
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "HTTP status code check failed: {{ $labels.instance }}"
      description: "HTTP status code is outside 200-399, current status code: {{ $value }}"
  - alert: BlackboxSslCertExpiringSoon
    expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 30
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Certificate expiring soon: {{ $labels.instance }}"
      description: "SSL certificate expires in less than 30 days, seconds remaining: {{ $value }}"
  - alert: BlackboxSslCertExpiringSoon
    expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 3
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Certificate expiring soon: {{ $labels.instance }}"
      description: "SSL certificate expires in less than 3 days, seconds remaining: {{ $value }}"
  - alert: BlackboxSslCertExpired
    expr: probe_ssl_earliest_cert_expiry - time() <= 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Certificate expired: {{ $labels.instance }}"
      description: "SSL certificate has already expired; confirm whether it is still in use"
EOF

After the file is written, reload the Prometheus service; the rules then show up as loaded in the web UI.
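The certificate rules above compare the seconds left until expiry against 86400 * N, where 86400 is the number of seconds in a day. As a rough sketch of that arithmetic, the hypothetical helper cert_days_left converts the same difference into whole days; the timestamps in the demonstration are made-up values chosen to hit the thresholds exactly.

```shell
#!/bin/sh
# cert_days_left (hypothetical helper): whole days between "now" and a
# certificate's expiry, both given as Unix timestamps in seconds.
#   $1 = expiry timestamp (what probe_ssl_earliest_cert_expiry reports)
#   $2 = current timestamp (what time() returns in the rule expression)
cert_days_left() {
  echo $(( ($1 - $2) / 86400 ))
}

# 86400 * 30 = 2592000 seconds, the threshold of the 30-day rule.
cert_days_left 2592000 0     # prints 30
# A certificate that expired one day ago yields a negative value.
cert_days_left 0 86400       # prints -1
```

With real data, the second argument would be `$(date +%s)` and the first would be the value of probe_ssl_earliest_cert_expiry read from the exporter's probe output.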

Now stop the nginx service on the 232 host to simulate a site outage. This should trigger the alerts and lets us check that the alerting setup works.

[root@openresty-dev blackbox_exporter]# pkill -9 nginx

If an alert email arrives in the mailbox, the alerting rules are configured correctly.