Alias: Prometheus
Tag: 💻 🖥️ 🔎
Date: 2022-03-19
DocType: Personal
Hierarchy: NonRoot
TimeStamp:
location: 47.3639129, 8.55627491017841
CollapseMetaTable: true

Parent:: Selfhosting, Configuring Caddy, Server Tools


^Top


Configuring Prometheus

title: Summary
collapse: open
This note walks through installing Prometheus and using it as a monitoring tool.
Prometheus works better with JSON logs than with the Common Log Format, which is Caddy's output.

style: number


Introduction

#^Top

Prometheus is a free and open-source monitoring and alerting tool written in Go. It was first used to monitor metrics at SoundCloud in 2012 and has since been adopted by many organizations to monitor their infrastructure.

Prometheus monitors and records real-time events in a time-series database. Its flexible queries and real-time alerting help with quick diagnosis and troubleshooting of errors.

Prometheus comprises the following major components:

  • The main Prometheus server for scraping and storing time-series data
  • Dedicated exporters for services such as Graphite, HAProxy, StatsD, and many more
  • An alert manager for handling alerts
  • A push-gateway for supporting transient jobs
  • Client libraries for instrumenting application code


Installing Prometheus

#^Top

Installing the main modules

First, create the configuration and data directories for Prometheus.

To create the configuration directory, run the command:

~~~bash
sudo mkdir -p /etc/prometheus
~~~

For the data directory, execute:

~~~bash
sudo mkdir -p /var/lib/prometheus
~~~

Once the directories are created, grab the compressed installation file:

~~~bash
wget https://github.com/prometheus/prometheus/releases/download/v2.31.0/prometheus-2.31.0.linux-amd64.tar.gz 
~~~

Once downloaded, extract the tarball file.

~~~bash
tar -xvf prometheus-2.31.0.linux-amd64.tar.gz
~~~

Then navigate to the Prometheus folder.

~~~bash
cd prometheus-2.31.0.linux-amd64
~~~
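
The download, extract, and cd commands all have to name the same release. One way to keep them in sync, sketched here, is to derive every name from a single variable. `PROM_VERSION`, `PROM_DIR`, `PROM_ARCHIVE`, and `PROM_DOWNLOAD_URL` are names invented for this sketch, not something Prometheus defines.

```bash
# Pin the release once; every later name is derived from it.
PROM_VERSION="2.31.0"   # assumption: the release used in the wget command above
PROM_DIR="prometheus-${PROM_VERSION}.linux-amd64"
PROM_ARCHIVE="${PROM_DIR}.tar.gz"
PROM_DOWNLOAD_URL="https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/${PROM_ARCHIVE}"

# Print the commands instead of running them, so the names can be checked first.
echo "wget ${PROM_DOWNLOAD_URL}"
echo "tar -xvf ${PROM_ARCHIVE}"
echo "cd ${PROM_DIR}"
```

Changing `PROM_VERSION` alone is then enough to switch all three commands to another release.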

Once in the directory, move the prometheus and promtool binaries to /usr/local/bin/.

~~~bash
sudo mv prometheus promtool /usr/local/bin/
~~~

Next, move the consoles/ and console_libraries/ directories to /etc/prometheus/.

~~~bash
sudo mv consoles/ console_libraries/ /etc/prometheus/
~~~

Finally, move the prometheus.yml template configuration file to /etc/prometheus/.

~~~bash
sudo mv prometheus.yml /etc/prometheus/prometheus.yml
~~~

At this point, Prometheus has been successfully installed. To check the version of Prometheus installed, run the command:

~~~bash
prometheus --version
~~~

Output:

~~~bash
prometheus, version 2.31.3 (branch: HEAD, revision: f29caccc42557f6a8ec30ea9b3c8c089391bd5df)
build user: root@5cff4265f0e3
build date: 20211005-16:10:52
go version: go1.17.1
platform: linux/amd64
~~~

~~~bash
promtool --version
~~~

Output:

~~~bash
promtool, version 2.31.3 (branch: HEAD, revision: f29caccc42557f6a8ec30ea9b3c8c089391bd5df)
build user: root@5cff4265f0e3
build date: 20211005-16:10:52
go version: go1.17.1
platform: linux/amd64
~~~

If your output resembles the above (the version and build details will reflect the release you downloaded), you are on the right track. The next step is to create a system group and user.

Permissions & User Management

#^Top

Create a prometheus group and user before moving on to the next step, which is creating a systemd service file for Prometheus.

To create the prometheus group, run:

~~~bash
sudo groupadd --system prometheus
~~~

Then create the prometheus user and assign it to the newly created prometheus group.

~~~bash
sudo useradd -s /sbin/nologin --system -g prometheus prometheus
~~~

Next, configure the directory ownership and permissions as follows.

~~~bash
sudo chown -R prometheus:prometheus /etc/prometheus/ /var/lib/prometheus/
sudo chmod -R 775 /etc/prometheus/ /var/lib/prometheus/
~~~

The only part remaining is to make Prometheus a systemd service so that we can easily manage its running status.

Configuring the service

#^Top

Using your favorite text editor, create a systemd service file:

~~~bash
sudo nano /etc/systemd/system/prometheus.service
~~~

Paste the following lines of code.

~~~bash
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Restart=always
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file=/etc/prometheus/prometheus.yml \
    --storage.tsdb.path=/var/lib/prometheus/ \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries \
    --web.listen-address=0.0.0.0:9090

[Install]
WantedBy=multi-user.target
~~~

Save the changes and exit the editor.

Reload systemd so that it picks up the new unit file (sudo systemctl daemon-reload), then start the Prometheus service.

~~~bash
sudo systemctl start prometheus
~~~

Enable the Prometheus service so that it starts at boot:

~~~bash
sudo systemctl enable prometheus
~~~

Then confirm the status of the Prometheus service.

~~~bash
sudo systemctl status prometheus
~~~
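
Beyond systemd's view of the process, Prometheus exposes a `/-/healthy` endpoint that answers over HTTP once the server is up. A minimal sketch, assuming the listen address from the unit file above; `PROM_ADDR` and `status` are names chosen for this example.

```bash
# Assumption: Prometheus listens on localhost:9090, as set by --web.listen-address.
PROM_ADDR="${PROM_ADDR:-http://localhost:9090}"

# -f makes curl fail on HTTP errors, so the branch reflects real health.
if curl -fsS --max-time 3 "${PROM_ADDR}/-/healthy" >/dev/null 2>&1; then
  status="healthy"
else
  status="unreachable"
fi
echo "Prometheus at ${PROM_ADDR}: ${status}"
```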


Configuration of user access

#^Top

Finally, to access Prometheus, configure your reverse proxy (see Configuring Caddy) to point to the service.

The service listens on internal port 9090 and is reachable at:

https://prometheus.mfxm.fr



Configuring alerts

#^Top

Install Alertmanager

Download the latest version of Alert Manager (v0.23.0 at the time of this writing) with the following command:

~~~bash
wget https://github.com/prometheus/alertmanager/releases/download/v0.23.0/alertmanager-0.23.0.linux-amd64.tar.gz

~~~

Once the download completes, you should find a new archive file alertmanager-0.23.0.linux-amd64.tar.gz in your current working directory.

Extract the archive with the following command:

~~~bash
tar xzf alertmanager-0.23.0.linux-amd64.tar.gz
~~~

You should find a new directory alertmanager-0.23.0.linux-amd64/ in your current working directory.

Now, move the alertmanager-0.23.0.linux-amd64 directory to /opt/ directory and rename it to alertmanager as follows:

~~~bash
sudo mv -v alertmanager-0.23.0.linux-amd64 /opt/alertmanager
~~~

Change the user and group of all the files and directories of the /opt/alertmanager/ directory to root as follows:

~~~bash
sudo chown -Rfv root:root /opt/alertmanager
~~~

In the /opt/alertmanager directory, you should find the alertmanager binary and the Alert Manager configuration file alertmanager.yml. You will use both later.

Creating a Data Directory

#^Top

Alert Manager needs a directory where it can store its data. As you will be running Alert Manager as the prometheus system user, that user must have access (read, write, and execute permissions) to the data directory.

You can create the data/ directory in the /opt/alertmanager/ directory as follows:

~~~bash
sudo mkdir -v /opt/alertmanager/data
~~~

Change the owner and group of the /opt/alertmanager/data/ directory to prometheus with the following command:

~~~bash
sudo chown -Rfv prometheus:prometheus /opt/alertmanager/data
~~~

The owner and group of the /opt/alertmanager/data/ directory should be changed to prometheus.

Starting Alert Manager on Boot

#^Top

Now, create a systemd service file for Alert Manager so that you can easily manage (start, stop, restart, and add to startup) the alertmanager service with systemd.

To create a systemd service file alertmanager.service, run the following command:

~~~bash
sudo nano /etc/systemd/system/alertmanager.service
~~~

Type in the following lines in the alertmanager.service file.

~~~bash
[Unit]  
Description=Alertmanager for prometheus

[Service]  
Restart=always  
User=prometheus  
ExecStart=/opt/alertmanager/alertmanager --config.file=/opt/alertmanager/alertmanager.yml --storage.path=/opt/alertmanager/data              
ExecReload=/bin/kill -HUP $MAINPID  
TimeoutStopSec=20s  
SendSIGKILL=no

[Install]  
WantedBy=multi-user.target
~~~

For the systemd changes to take effect, run the following command:

~~~bash
sudo systemctl daemon-reload
~~~

Now, start the alertmanager service with the following command:

~~~bash
sudo systemctl start alertmanager.service
~~~

Add the alertmanager service to the system startup so that it automatically starts on boot with the following command:

~~~bash
sudo systemctl enable alertmanager.service
~~~

Check that the alertmanager service is active (running) and enabled (it will start automatically on boot):

~~~bash
sudo systemctl status alertmanager.service
~~~

Configuring Prometheus

#^Top

Now, configure Prometheus to use Alert Manager. You can also monitor Alert Manager with Prometheus. This section shows how to do both.

First, find the IP address of the computer where you have installed Alert Manager with the following command:

~~~bash
hostname -I
~~~

Now, open the Prometheus configuration file /etc/prometheus/prometheus.yml with the nano text editor as follows:

~~~bash
sudo nano /etc/prometheus/prometheus.yml
~~~

Type in the following lines in the scrape_configs section to add Alert Manager for monitoring with Prometheus.

~~~bash
- job_name: 'alertmanager'  
  static_configs:  
  - targets: ['localhost:9093']
~~~

Also, type in the IP address and port number of Alert Manager in the alerting > alertmanagers section.
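
The note does not show this fragment, so the following is only a sketch of what the `alerting` section could look like, assuming Alert Manager listens on its default port 9093 on the same host (otherwise replace `localhost` with the address found above). It writes the fragment to a scratch file (`scratch_cfg` is a name chosen here) so it can be checked with `promtool` before being copied into /etc/prometheus/prometheus.yml.

```bash
# Write the fragment to a scratch file rather than the live config.
scratch_cfg="$(mktemp)"
cat > "$scratch_cfg" <<'EOF'
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']   # assumption: same host, default port
EOF

# Validate only when promtool is on the PATH.
if command -v promtool >/dev/null 2>&1; then
  promtool check config "$scratch_cfg"
fi
cat "$scratch_cfg"
```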

For the changes to take effect, restart the prometheus service as follows:

~~~bash
sudo systemctl restart prometheus
~~~

Visit http://192.168.20.161:9090/targets from your favorite web browser (replace the IP address with the one found above), and you should see that alertmanager is in the UP state, meaning Prometheus can reach Alert Manager.

Creating a Prometheus Alert Rule

#^Top

On Prometheus, you can use the up expression to find the state of the targets added to Prometheus.

Targets in the UP state (running and accessible to Prometheus) have the value 1, and targets in the DOWN state (not running or inaccessible to Prometheus) have the value 0.

Suppose you stop one of the targets, for example node_exporter:

~~~bash
sudo systemctl stop node_exporter.service
~~~

The up value of that target should then be 0.

So, you can use the expression up == 0 to list only the targets that are not running or are inaccessible to Prometheus.

This expression can be used to create a Prometheus Alert and send alerts to Alert Manager when one or more targets are not running or inaccessible to Prometheus.
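
The up == 0 expression can also be evaluated from the command line through the Prometheus HTTP API (`/api/v1/query`). A sketch, assuming the server listens on localhost:9090 as in the unit file; `PROM_ADDR`, `QUERY`, and `result` are names chosen for this example.

```bash
PROM_ADDR="${PROM_ADDR:-http://localhost:9090}"   # assumption: listen address from the unit file
QUERY='up == 0'

# -G turns the urlencoded form data into a GET query string.
if result="$(curl -fsS --max-time 5 -G --data-urlencode "query=${QUERY}" "${PROM_ADDR}/api/v1/query" 2>/dev/null)"; then
  echo "$result"
else
  echo "Prometheus is not reachable at ${PROM_ADDR}" >&2
fi
```

The response is JSON; an empty result list means no target is currently down.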

To create a Prometheus Alert, create a new file rules.yml in the /etc/prometheus/ directory as follows:

~~~bash
sudo nano /etc/prometheus/rules.yml
~~~

Now, type in the following lines in the rules.yml file.

~~~yaml
groups:
  - name: test
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
~~~

Here, the alert InstanceDown will be fired when targets are not running or inaccessible to Prometheus (that is up == 0) for a minute (1m).

Now, open the Prometheus configuration file /etc/prometheus/prometheus.yml with the nano text editor as follows:

~~~bash
sudo nano /etc/prometheus/prometheus.yml
~~~

Add the rules.yml file in the rule_files section of the prometheus.yml configuration file.

Another important option in prometheus.yml is evaluation_interval, which sets how often Prometheus evaluates its rules. The template configuration file sets it to 15s (Prometheus's built-in default is 1m), so the alert rules in rules.yml will be checked every 15 seconds.
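
The note does not reproduce the resulting configuration, so here is a sketch of how `rule_files` and `evaluation_interval` could sit together. It builds both files in a scratch directory (`scratch_dir` is a name chosen here) so that `promtool` can check them before anything under /etc/prometheus/ is touched.

```bash
scratch_dir="$(mktemp -d)"

# Same rule as above, so the config has something to load.
cat > "$scratch_dir/rules.yml" <<'EOF'
groups:
  - name: test
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
EOF

cat > "$scratch_dir/prometheus.yml" <<'EOF'
global:
  evaluation_interval: 15s   # how often alert rules are evaluated
rule_files:
  - rules.yml                # relative paths are resolved from this file's directory
EOF

# Validate only when promtool is on the PATH.
if command -v promtool >/dev/null 2>&1; then
  promtool check config "$scratch_dir/prometheus.yml"
fi
```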

For the changes to take effect, restart the prometheus service as follows:

~~~bash
sudo systemctl restart prometheus
~~~

Now, navigate to http://localhost:9090/rules from your favorite web browser (adjust the host and port to your setup), and you should see the InstanceDown rule that you've just added.

As you stopped node_exporter earlier, the alert is active and waiting out the 1m period before it is sent to Alert Manager.

After a minute has passed, the alert InstanceDown should be in the FIRING state, which means it has been sent to Alert Manager.


Configuring monitoring modules

#^Top

Node-Exporter

To begin, download the latest version of Node Exporter here: Node-Exporter

~~~bash
wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
~~~

Unpack the archive:

~~~bash
tar -xf node_exporter-1.3.1.linux-amd64.tar.gz
~~~

Then move the binary to a directory from which the system can run it:

~~~bash
sudo mv node_exporter-1.3.1.linux-amd64/node_exporter /usr/local/bin/
~~~

Installation & Service Setup

Node Exporter is not really installed as such; we just create a system service that runs the command.

For that, create a node_exporter user that will run the service.

~~~bash
sudo useradd -rs /bin/false node_exporter
~~~

Next, create the service itself.

~~~bash
sudo nano /etc/systemd/system/node_exporter.service
~~~

The file must contain the following:

~~~bash
[Unit]
Description=Node Exporter
After=network.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
~~~

Now reload the systemd daemon:

~~~bash
sudo systemctl daemon-reload
~~~

Then start node_exporter:

~~~bash
sudo systemctl start node_exporter
~~~

Check that node_exporter is running:

~~~bash
sudo systemctl status node_exporter
~~~

If all is well, enable the service so that it starts at boot:

~~~bash
sudo systemctl enable node_exporter
~~~

To confirm that metrics are being served:

~~~bash
curl http://localhost:9100/metrics
~~~

Adding the host to Prometheus

To add the host, edit the Prometheus configuration file:

~~~bash
sudo nano /etc/prometheus/prometheus.yml
~~~

Add a target with the desired IP address below the existing target.

~~~yaml
  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9100']
~~~

Restarting Prometheus

For the changes to take effect, restart the prometheus service:

~~~bash
sudo systemctl restart prometheus
~~~

Verification

To check that everything works, open your Prometheus interface (http://prometheus-ip:9090/targets) or Grafana and see whether your host appears.


Configuring rules and alerts

#^Top

Introduction

Alert rules are loaded through /etc/prometheus/prometheus.yml, which references rule files in the same folder. The generic process is:

  1. Reference the rule file (rules.yml) in the rule_files section of Prometheus' config file

  2. Create the rule file

~~~bash
sudo nano /etc/prometheus/rules.yml
~~~

  3. Add the rule definitions (see the external resource below for examples)

  4. Restart Prometheus

~~~bash
sudo systemctl restart prometheus
~~~

If Prometheus fails to restart, this points to a problem in the configuration or rule file. Check whitespace and other formatting issues before trying to restart the daemon again.
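
A failed restart can usually be avoided by validating the files first. `promtool`, installed earlier alongside `prometheus`, has `check config` and `check rules` subcommands for this. The sketch below wraps them in a helper; `check_prometheus_files` is a name invented here, and the default paths are the ones used in this note.

```bash
# Validate the config and the rule file; returns non-zero if either check fails.
check_prometheus_files() {
  local config="${1:-/etc/prometheus/prometheus.yml}"
  local rules="${2:-/etc/prometheus/rules.yml}"
  if ! command -v promtool >/dev/null 2>&1; then
    echo "promtool not found on PATH; nothing was validated" >&2
    return 2
  fi
  promtool check config "$config" && promtool check rules "$rules"
}

# Restart only when both checks pass:
# check_prometheus_files && sudo systemctl restart prometheus
```

Chaining the restart behind the check keeps a running Prometheus from being taken down by a malformed file.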

External resource

Awesome Prometheus alerts | Collection of alerting rules


Using Prometheus to monitor Caddy

#^Top

Global parameters

Caddy metrics API https://tools.mfxm.fr:7784
Prometheus web listening port 9010

Adding a monitoring job

#^Top

Monitoring jobs are called scrape jobs and are defined in /etc/prometheus/prometheus.yml under the scrape_configs: YAML key. Below is an example job definition.

~~~yaml
scrape_configs:
  - job_name: caddy
    scheme: https
    static_configs:
      - targets:
          - tools.mfxm.fr:7784
~~~


Using Telegram for notifications

#^Top

Installing the Telegram Bridge

To set up the Telegram bot (see Configuring Telegram bots), first clone the bridge from its GitHub repository:

~~~bash
git clone https://github.com/inCaller/prometheus_bot ~/prometheus_bot
~~~

Move to the created folder:

~~~bash
cd ~/prometheus_bot
~~~

Build the program with Go:

~~~bash
export GOPATH="your go path"
make clean
make
~~~

Update the config file:

/home/melchiorbv/prometheus_bot/config.yaml

~~~yaml
telegram_token: "token goes here"
# ONLY IF YOU ARE USING THE DATA FORMATTING FUNCTION. Note for developers: important, or tests fail
time_outdata: "02/01/2006 15:04:05"
template_path: "/home/melchiorbv/prometheus_bot/template.tmpl" # ONLY IF YOU ARE USING A TEMPLATE
time_zone: "Europe/Amsterdam" # ONLY IF YOU ARE USING A TEMPLATE
split_msg_byte: 4000
send_only: true # use the bot only to send messages
~~~

Then, update the template file:

/home/melchiorbv/prometheus_bot/template.tmpl

~~~yaml
Type: {{.CommonAnnotations.description}}

Summary: {{.CommonAnnotations.summary}}

Alertname: {{ .CommonLabels.alertname }}

Instance: {{ .CommonLabels.instance }}

Severity: {{ .CommonLabels.severity }}

Status: {{ .Status }}
~~~

Run the daemon with:

~~~bash
./prometheus_bot
~~~

First part done.

Linking the bot to Alertmanager

#^Top

Edit the Alert Manager config file /opt/alertmanager/alertmanager.yml and add the following receiver:

~~~yaml
- name: 'admins'
  webhook_configs:
  - send_resolved: True
    url: http://127.0.0.1:9087/alert/chat_id
~~~

Replace chat_id with the value you got from your bot, using everything inside the quotes. (Some chat IDs start with a -; in that case, include the - in the URL.) To use multiple chats, add more receivers.
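
A receiver only gets alerts if a route points at it. A minimal complete alertmanager.yml might look like the sketch below, written to a scratch file first (`am_cfg` is a name chosen here). The `route` block is an assumption, since this note shows only the receiver, and `-123456789` is a placeholder chat ID. `amtool`, which ships in the Alert Manager tarball, can validate the file when it is present.

```bash
am_cfg="$(mktemp)"
cat > "$am_cfg" <<'EOF'
route:
  receiver: 'admins'          # assumption: send every alert to the Telegram bridge
receivers:
  - name: 'admins'
    webhook_configs:
      - send_resolved: true
        url: 'http://127.0.0.1:9087/alert/-123456789'   # placeholder chat_id
EOF

# Validate only if amtool was unpacked with Alert Manager.
if [ -x /opt/alertmanager/amtool ]; then
  /opt/alertmanager/amtool check-config "$am_cfg"
fi
cat "$am_cfg"
```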

Relaunch the AlertManager:

~~~bash
sudo systemctl restart alertmanager.service
~~~

#^Top