| Alias | Tag | Date | DocType | Hierarchy | TimeStamp | location | CollapseMetaTable |
|---|---|---|---|---|---|---|---|
| | | 2022-03-19 | Personal | NonRoot | | | true |
Parent:: Selfhosting, Configuring Caddy, Server Tools
Configuring Prometheus
Summary
This note runs through the installation and use of Prometheus as a monitoring tool.
Prometheus interacts better with JSON logs than with the Common Log Format, which is Caddy's output.
Introduction
Prometheus is a free and open-source monitoring and alerting tool that was initially used for monitoring metrics at SoundCloud back in 2012. Since then, it has grown in leaps and bounds and has been adopted by many organizations to monitor their infrastructure metrics. It is written in the Go programming language.
Prometheus monitors and records real-time events in a time-series database. It provides flexible queries and real-time alerting, which helps in quick diagnosis and troubleshooting of errors.
Prometheus comprises the following major components:
- The main Prometheus server for scraping and storing time-series data.
- Special-purpose exporters for services such as Graphite, HAProxy, StatsD, and more.
- An alert manager for handling alerts
- A push-gateway for supporting transient jobs
- Client libraries for instrumenting application code
Installing Prometheus
Installing the main modules
First, create the configuration and data directories for Prometheus.
To create the configuration directory, run the command:
~~~bash
sudo mkdir -p /etc/prometheus
~~~
For the data directory, execute:
~~~bash
sudo mkdir -p /var/lib/prometheus
~~~
Once the directories are created, grab the compressed installation file:
~~~bash
wget https://github.com/prometheus/prometheus/releases/download/v2.31.0/prometheus-2.31.0.linux-amd64.tar.gz
~~~
Once downloaded, extract the tarball. The file name must match the release downloaded above (2.31.0 here).
~~~bash
tar -xvf prometheus-2.31.0.linux-amd64.tar.gz
~~~
Then navigate to the Prometheus folder.
~~~bash
cd prometheus-2.31.0.linux-amd64
~~~
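The download, extract, and cd steps all depend on the same release number, and a mismatch between them breaks the sequence. A minimal sketch, assuming a hypothetical PROM_VERSION variable and helper names that are not part of Prometheus, derives all three from one value and prints the commands rather than running them:

```bash
#!/usr/bin/env bash
# Assumption: PROM_VERSION and the helper names below are illustrative only.
PROM_VERSION="${PROM_VERSION:-2.31.0}"
PROM_PLATFORM="linux-amd64"

# Name of the release directory (and of the tarball, minus .tar.gz).
prom_release_name() {
  printf 'prometheus-%s.%s\n' "$1" "$PROM_PLATFORM"
}

# Download URL on the Prometheus GitHub releases page.
prom_release_url() {
  printf 'https://github.com/prometheus/prometheus/releases/download/v%s/%s.tar.gz\n' \
    "$1" "$(prom_release_name "$1")"
}

# Print the three commands so they can be reviewed before being run.
echo "wget $(prom_release_url "$PROM_VERSION")"
echo "tar -xvf $(prom_release_name "$PROM_VERSION").tar.gz"
echo "cd $(prom_release_name "$PROM_VERSION")"
```

Setting PROM_VERSION in the environment before running the script switches every step to another release at once.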
Once in the directory, move the prometheus and promtool binary files to the /usr/local/bin/ folder.
~~~bash
sudo mv prometheus promtool /usr/local/bin/
~~~
Additionally, move the console files in the consoles directory and the library files in the console_libraries directory to the /etc/prometheus/ directory.
~~~bash
sudo mv consoles/ console_libraries/ /etc/prometheus/
~~~
Also, move the prometheus.yml template configuration file to the /etc/prometheus/ directory.
~~~bash
sudo mv prometheus.yml /etc/prometheus/prometheus.yml
~~~
At this point, Prometheus has been successfully installed. To check the version of Prometheus installed, run the command:
~~~bash
prometheus --version
~~~
Output:
~~~bash
prometheus, version 2.31.3 (branch: HEAD, revision: f29caccc42557f6a8ec30ea9b3c8c089391bd5df)
build user: root@5cff4265f0e3
build date: 20211005-16:10:52
go version: go1.17.1
platform: linux/amd64
~~~
~~~bash
promtool --version
~~~
Output:
~~~bash
promtool, version 2.31.3 (branch: HEAD, revision: f29caccc42557f6a8ec30ea9b3c8c089391bd5df)
build user: root@5cff4265f0e3
build date: 20211005-16:10:52
go version: go1.17.1
platform: linux/amd64
~~~
If your output resembles the above (the version shown should match the release you downloaded), you are on the right track. In the next step, we will create a system group and user.
Permissions & User Management
It's essential to create a Prometheus group and user before proceeding to the next step, which involves creating a systemd unit file for Prometheus.
To create a prometheus group, execute the command:
~~~bash
sudo groupadd --system prometheus
~~~
Thereafter, create a prometheus user and assign it to the just-created prometheus group.
~~~bash
sudo useradd -s /sbin/nologin --system -g prometheus prometheus
~~~
Next, configure the directory ownership and permissions as follows.
~~~bash
sudo chown -R prometheus:prometheus /etc/prometheus/ /var/lib/prometheus/
sudo chmod -R 775 /etc/prometheus/ /var/lib/prometheus/
~~~
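To confirm that the ownership and mode changes took effect, the directories can be inspected with stat. A small sketch, assuming GNU coreutils stat (the -c flag) and a hypothetical helper name:

```bash
#!/usr/bin/env bash
# Assumption: GNU stat; BSD/macOS stat uses different flags.
# Prints "<owner>:<group> <octal mode> <path>" for each path given.
show_perms() {
  local path
  for path in "$@"; do
    stat -c '%U:%G %a %n' "$path"
  done
}

# Usage (after the chown/chmod above):
#   show_perms /etc/prometheus /var/lib/prometheus
```

Each line should start with prometheus:prometheus 775 if the two commands above succeeded.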
The only part remaining is to make Prometheus a systemd service so that we can easily manage its running status.
Configuring the service
Using your favorite text editor, create a systemd service file:
~~~bash
sudo nano /etc/systemd/system/prometheus.service
~~~
Paste the following lines of code.
~~~ini
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Restart=always
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file=/etc/prometheus/prometheus.yml \
    --storage.tsdb.path=/var/lib/prometheus/ \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries \
    --web.listen-address=0.0.0.0:9090

[Install]
WantedBy=multi-user.target
~~~
Save the changes and exit the editor.
Reload systemd so that it picks up the new unit file, then start the Prometheus service.
~~~bash
sudo systemctl daemon-reload
sudo systemctl start prometheus
~~~
To enable the Prometheus service at startup, run:
~~~bash
sudo systemctl enable prometheus
~~~
Then confirm the status of the Prometheus service.
~~~bash
sudo systemctl status prometheus
~~~
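Beyond the systemd status, Prometheus exposes a readiness endpoint at /-/ready that can be polled after a start or restart. A minimal sketch, assuming a hypothetical wait_until helper and the listen address from the unit file above:

```bash
#!/usr/bin/env bash
# Retry a command until it succeeds or the attempts run out.
# usage: wait_until <attempts> <delay-seconds> <command> [args...]
wait_until() {
  local attempts=$1 delay=$2 i
  shift 2
  for ((i = 0; i < attempts; i++)); do
    "$@" && return 0
    sleep "$delay"
  done
  return 1
}

# Usage, assuming Prometheus listens on localhost:9090:
#   wait_until 10 1 curl -fsS http://localhost:9090/-/ready
```

The helper returns a non-zero status if the command never succeeds, so it can gate later steps in a script.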
Configuration of user access
Finally, to access Prometheus, configure your reverse proxy (see Configuring Caddy) to point to the service, which listens on internal port 9090.
It is accessible at:
https://prometheus.mfxm.fr
Configuring alerts
Install Alertmanager
Download the latest version of Alert Manager (v0.23.0 at the time of this writing) with the following command:
~~~bash
wget https://github.com/prometheus/alertmanager/releases/download/v0.23.0/alertmanager-0.23.0.linux-amd64.tar.gz
~~~
The download may take a while to complete. Once it finishes, you should find a new archive file alertmanager-0.23.0.linux-amd64.tar.gz in your current working directory.
Extract the alertmanager-0.23.0.linux-amd64.tar.gz archive with the following command:
~~~bash
tar xzf alertmanager-0.23.0.linux-amd64.tar.gz
~~~
You should find a new directory alertmanager-0.23.0.linux-amd64/.
Now, move the alertmanager-0.23.0.linux-amd64 directory to the /opt/ directory and rename it to alertmanager as follows:
~~~bash
sudo mv -v alertmanager-0.23.0.linux-amd64 /opt/alertmanager
~~~
Change the user and group of all the files and directories of the /opt/alertmanager/ directory to root as follows:
~~~bash
sudo chown -Rfv root:root /opt/alertmanager
~~~
In the /opt/alertmanager directory, you should find the alertmanager binary and the Alert Manager configuration file alertmanager.yml. Both are used later.
Creating a Data Directory
Alert Manager needs a directory where it can store its data. As you will be running Alert Manager as the prometheus system user, that user must have access (read, write, and execute permissions) to the data directory.
You can create the data/ directory in the /opt/alertmanager/ directory as follows:
~~~bash
sudo mkdir -v /opt/alertmanager/data
~~~
Change the owner and group of the /opt/alertmanager/data/ directory to prometheus with the following command:
~~~bash
sudo chown -Rfv prometheus:prometheus /opt/alertmanager/data
~~~
The owner and group of the /opt/alertmanager/data/ directory should be changed to prometheus.
Starting Alert Manager on Boot
Now, you have to create a systemd service file for Alert Manager so that you can easily manage (start, stop, restart, and add to startup) the alertmanager service with systemd.
To create a systemd service file alertmanager.service, run the following command:
~~~bash
sudo nano /etc/systemd/system/alertmanager.service
~~~
Type in the following lines in the alertmanager.service file.
~~~ini
[Unit]
Description=Alertmanager for prometheus

[Service]
Restart=always
User=prometheus
ExecStart=/opt/alertmanager/alertmanager --config.file=/opt/alertmanager/alertmanager.yml --storage.path=/opt/alertmanager/data
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
SendSIGKILL=no

[Install]
WantedBy=multi-user.target
~~~
For the systemd changes to take effect, run the following command:
~~~bash
sudo systemctl daemon-reload
~~~
Now, start the alertmanager service with the following command:
~~~bash
sudo systemctl start alertmanager.service
~~~
Add the alertmanager service to the system startup so that it automatically starts on boot with the following command:
~~~bash
sudo systemctl enable alertmanager.service
~~~
Check that the alertmanager service is active (running) and enabled (it will start automatically on boot):
~~~bash
sudo systemctl status alertmanager.service
~~~
Configuring Prometheus
Now, you have to configure Prometheus to use Alert Manager. You can also monitor Alert Manager with Prometheus. This section covers both.
First, find the IP address of the computer where you have installed Alert Manager with the following command:
~~~bash
hostname -I
~~~
Now, open the Prometheus configuration file /etc/prometheus/prometheus.yml with the nano text editor as follows:
~~~bash
sudo nano /etc/prometheus/prometheus.yml
~~~
Type in the following lines in the scrape_configs section to add Alert Manager for monitoring with Prometheus.
~~~yaml
  - job_name: 'alertmanager'
    static_configs:
      - targets: ['localhost:9093']
~~~
Also, type in the IP address and port number of Alert Manager in the alerting > alertmanagers section.
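The alerting section is not shown above, so here is a hedged sketch of what it might look like. It assumes Alert Manager runs on the same host on its default port 9093; the alerting_stanza helper name is hypothetical and only prints the YAML so it can be pasted into prometheus.yml:

```bash
#!/usr/bin/env bash
# Print an "alerting" stanza for prometheus.yml.
# Assumption: Alert Manager listens on its default port 9093.
alerting_stanza() {
  local host=${1:-localhost} port=${2:-9093}
  cat <<EOF
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['${host}:${port}']
EOF
}

alerting_stanza localhost 9093
```

After pasting the output into /etc/prometheus/prometheus.yml, `promtool check config /etc/prometheus/prometheus.yml` can confirm that the file still parses.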
For the changes to take effect, restart the prometheus service as follows:
~~~bash
sudo systemctl restart prometheus
~~~
Visit the URL http://192.168.20.161:9090/targets (substitute your own server's IP address) from your favorite web browser, and you should see that alertmanager is in the UP state. So, Prometheus can access Alert Manager just fine.
Creating a Prometheus Alert Rule
On Prometheus, you can use the up expression to find the state of the targets added to Prometheus.
The targets that are in the UP state (running and accessible to Prometheus) will have the value 1, and targets that are in the DOWN state (not running or inaccessible to Prometheus) will have the value 0.
Suppose you stop one of the targets, say node_exporter:
~~~bash
sudo systemctl stop node_exporter.service
~~~
The up value of that target should then be 0.
So, you can use the expression up == 0 to list only the targets that are not running or are inaccessible to Prometheus.
This expression can be used to create a Prometheus Alert and send alerts to Alert Manager when one or more targets are not running or inaccessible to Prometheus.
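The same expression can be evaluated from the command line through the Prometheus HTTP API (the /api/v1/query endpoint). A minimal sketch, assuming a hypothetical PROM_URL variable and prom_query helper, with Prometheus reachable on localhost:9090:

```bash
#!/usr/bin/env bash
# Run an instant PromQL query against the Prometheus HTTP API.
# Assumption: PROM_URL is a hypothetical variable; adjust host and port.
PROM_URL="${PROM_URL:-http://localhost:9090}"

prom_query() {
  # -G sends the data as a URL query string;
  # --data-urlencode escapes spaces and operators in the expression.
  curl -fsS -G "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}

# Usage: list targets that are down
#   prom_query 'up == 0'
```

The response is JSON; an empty result list means that no target is currently down.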
To create a Prometheus Alert, create a new file rules.yml in the /etc/prometheus/ directory as follows:
~~~bash
sudo nano /etc/prometheus/rules.yml
~~~
Now, type in the following lines in the rules.yml file.
~~~yaml
groups:
  - name: test
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
~~~
Here, the alert InstanceDown will be fired when targets are not running or inaccessible to Prometheus (that is up == 0) for a minute (1m).
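Alert rules can also carry labels and annotations, which notification templates can display. The sketch below writes an extended version of the rule to a scratch file and validates it with promtool when available; the severity value and the annotation texts are assumptions chosen for illustration:

```bash
#!/usr/bin/env bash
# Write an example rules file to a scratch location (not /etc/prometheus).
# Assumption: the label and annotation values are illustrative.
rules_file=$(mktemp)
cat > "$rules_file" <<'EOF'
groups:
  - name: test
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.job }} target {{ $labels.instance }} has been unreachable for over 1 minute."
EOF

# Validate the syntax if promtool is on the PATH.
if command -v promtool >/dev/null 2>&1; then
  promtool check rules "$rules_file"
fi
echo "wrote $rules_file"
```

Once the check passes, the content can be copied into /etc/prometheus/rules.yml.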
Now, open the Prometheus configuration file /etc/prometheus/prometheus.yml with the nano text editor as follows:
~~~bash
sudo nano /etc/prometheus/prometheus.yml
~~~
Add the rules.yml file in the rule_files section of the prometheus.yml configuration file.
Another important option of the prometheus.yml file is evaluation_interval. Prometheus evaluates the rules once every evaluation_interval. The template prometheus.yml sets it to 15s (15 seconds); if the option is omitted, Prometheus falls back to its built-in default of 1m. So, with the template value, the alert rules in the rules.yml file are checked every 15 seconds.
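As a rough illustration of the two settings just described, the helper below prints the relevant prometheus.yml keys. The rules_stanza name is hypothetical, and the path and interval defaults simply mirror the values used in this note:

```bash
#!/usr/bin/env bash
# Print the rule-related keys of prometheus.yml.
# Assumption: rules_stanza is an illustrative helper, not a Prometheus tool.
rules_stanza() {
  local rules_path=${1:-/etc/prometheus/rules.yml} interval=${2:-15s}
  cat <<EOF
global:
  evaluation_interval: ${interval}
rule_files:
  - ${rules_path}
EOF
}

rules_stanza /etc/prometheus/rules.yml 15s
```

In an existing prometheus.yml, evaluation_interval belongs under the global key that is already there, and rule_files is a top-level key.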
For the changes to take effect, restart the prometheus service as follows:
~~~bash
sudo systemctl restart prometheus
~~~
Now, navigate to the /rules page of your Prometheus instance from your favorite web browser (http://localhost:9090/rules with the listen address configured above; adjust the port if you changed --web.listen-address), and you should see the rule InstanceDown that you've just added.
As you stopped node_exporter earlier, the alert is active but still pending: it has not yet been sent to Alert Manager.
After a minute has passed, the alert InstanceDown should be in the FIRING state, which means that the alert has been sent to Alert Manager.
Configuring monitoring modules
Node-Exporter
To start, download the latest version of Node Exporter here: Node-Exporter
~~~bash
wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
~~~
Extract the archive:
~~~bash
tar -xf node_exporter-1.3.1.linux-amd64.tar.gz
~~~
Then move the binary to a directory from which the system can run it:
~~~bash
sudo mv node_exporter-1.3.1.linux-amd64/node_exporter /usr/local/bin/
~~~
Installation & Service Setup
In reality, Node Exporter is not really installed; we just create a system service that runs the command.
For that, create a node_exporter user that will run the service.
~~~bash
sudo useradd -rs /bin/false node_exporter
~~~
Next, create the service itself.
~~~bash
sudo nano /etc/systemd/system/node_exporter.service
~~~
The file must contain the following:
~~~ini
[Unit]
Description=Node Exporter
After=network.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
~~~
Now reload the systemd daemon:
~~~bash
sudo systemctl daemon-reload
~~~
Then start node_exporter:
~~~bash
sudo systemctl start node_exporter
~~~
Check that node_exporter is running:
~~~bash
sudo systemctl status node_exporter
~~~
If all is well, enable the service at startup:
~~~bash
sudo systemctl enable node_exporter
~~~
To confirm that metrics are being served:
~~~bash
sudo curl http://localhost:9100/metrics
~~~
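The /metrics endpoint returns the Prometheus text exposition format, which is long. A small sketch for pulling out a single metric, assuming a hypothetical metric_values helper and samples without trailing timestamps (as Node Exporter emits them):

```bash
#!/usr/bin/env bash
# Print the value of every sample whose metric name matches exactly.
# Reads Prometheus text exposition format on stdin; comment lines start with '#'.
metric_values() {
  local name=$1
  awk -v name="$name" '
    /^#/ { next }
    {
      metric = $1
      sub(/\{.*/, "", metric)        # drop the {label="..."} part, if any
      if (metric == name) print $NF  # assumes no trailing timestamp
    }'
}

# Usage:
#   curl -s http://localhost:9100/metrics | metric_values node_load1
```

Matching on the exact name avoids picking up similarly named metrics such as node_load15 when asking for node_load1.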
Adding the host to Prometheus
To add the host, edit the Prometheus configuration file:
~~~bash
sudo nano /etc/prometheus/prometheus.yml
~~~
Add a target with the desired IP address below the existing target.
~~~yaml
  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9100']
~~~
Restarting Prometheus
For the changes to take effect, restart the prometheus service:
~~~bash
sudo systemctl restart prometheus
~~~
Verification
To check that all is well, visit your Prometheus interface (http://prometheus-ip:9090/targets) or Grafana and confirm that your host appears.
Configuring rules and alerts
Introduction
Rules defining alerts are set up in /etc/prometheus/prometheus.yml by referencing rule files in the same folder. As a generic process, here is what to do:
- Define the rule file and reference it (rules.yml) in Prometheus' config file.
- Create the rule file:
~~~bash
sudo nano /etc/prometheus/rules.yml
~~~
- Add the defined rule. See the external resource for examples.
- Relaunch Prometheus:
~~~bash
sudo systemctl restart prometheus
~~~
If Prometheus fails to restart after this, the configuration file likely contains an error. Check whitespace and other formatting issues before restarting the daemon again.
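One way to catch such errors before they stop the service is to run promtool, installed earlier in /usr/local/bin, ahead of the restart. A minimal sketch, assuming a hypothetical safe_restart helper:

```bash
#!/usr/bin/env bash
# Restart Prometheus only if the configuration passes promtool's check.
# Assumption: safe_restart is a hypothetical helper name.
safe_restart() {
  local config=${1:-/etc/prometheus/prometheus.yml}
  if promtool check config "$config"; then
    sudo systemctl restart prometheus
  else
    echo "configuration check failed; not restarting" >&2
    return 1
  fi
}

# Usage:
#   safe_restart /etc/prometheus/prometheus.yml
```

promtool check config also loads the rule files referenced in rule_files, so indentation mistakes in rules.yml are reported at the same time.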
External resource
Awesome Prometheus alerts | Collection of alerting rules
Using Prometheus to monitor Caddy
Global parameters
Caddy metrics API | https://tools.mfxm.fr:7784 |
Prometheus web listening port | 9010 |
Adding a monitoring job
Monitoring jobs are called scrape jobs and are defined in the /etc/prometheus/prometheus.yml file under the scrape_configs: YAML key. Below is an example of a job definition.
~~~yaml
scrape_configs:
  - job_name: caddy
    scheme: https
    static_configs:
      - targets:
          - tools.mfxm.fr:7784
~~~
Using Telegram for notifications
Installing the Telegram Bridge
To set up the bridge (see Configuring Telegram bots), first clone its GitHub repository:
~~~bash
git clone https://github.com/inCaller/prometheus_bot
~~~
Move to the created folder:
~~~bash
cd ~/prometheus_bot
~~~
Compile the programme in Go:
~~~bash
export GOPATH="your go path"
make clean
make
~~~
Update the config file:
/home/melchiorbv/prometheus_bot/config.yaml
~~~yaml
telegram_token: "token goes here"
# Only if you use the date formatting function. Note for developers: important, or tests fail
time_outdata: "02/01/2006 15:04:05"
template_path: "/home/melchiorbv/prometheus_bot/template.tmpl" # only if you use a template
time_zone: "Europe/Amsterdam" # only if you use a template
split_msg_byte: 4000
send_only: true # use bot only to send messages.
~~~
Then, update the template file:
/home/melchiorbv/prometheus_bot/template.tmpl
~~~yaml
Type: {{.CommonAnnotations.description}}
Summary: {{.CommonAnnotations.summary}}
Alertname: {{ .CommonLabels.alertname }}
Instance: {{ .CommonLabels.instance }}
Severity: {{ .CommonLabels.severity }}
Status: {{ .Status }}
~~~
Run the daemon with:
~~~bash
./prometheus_bot
~~~
First part done.
Linking the bot to Alertmanager
Edit the AlertManager config file under /opt/alertmanager/alertmanager.yml and add the following to its receivers section:
~~~yaml
  - name: 'admins'
    webhook_configs:
      - send_resolved: true
        url: http://127.0.0.1:9087/alert/chat_id
Replace chat_id with the value you got from your bot, with everything inside the quotes. (Some chat_ids start with a -; in this case, you must also include the - in the URL.) To use multiple chats, just add more receivers.
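A receiver is only used if the routing tree refers to it. The sketch below prints a minimal alertmanager.yml in which the top-level route sends everything to the admins receiver. The am_config helper name and the example chat id are hypothetical, and a real file may need more routing options:

```bash
#!/usr/bin/env bash
# Print a minimal Alert Manager configuration that routes all alerts
# to the Telegram bridge. Assumption: the chat id is a placeholder.
am_config() {
  local chat_id=$1
  cat <<EOF
route:
  receiver: 'admins'
receivers:
  - name: 'admins'
    webhook_configs:
      - send_resolved: true
        url: http://127.0.0.1:9087/alert/${chat_id}
EOF
}

am_config "-123456"
```

The amtool binary shipped in the same tarball can validate the resulting file with `/opt/alertmanager/amtool check-config /opt/alertmanager/alertmanager.yml`.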
Relaunch the AlertManager:
~~~bash
sudo systemctl restart alertmanager.service
~~~
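To test the whole chain without stopping a real service, a hand-made alert can be posted to Alert Manager's v2 API (/api/v2/alerts), which should then forward it to the bridge. This is a sketch under assumptions: the helper names, alert name, and annotation texts are made up for testing, and Alert Manager is taken to listen on its default port 9093:

```bash
#!/usr/bin/env bash
# Build a one-alert JSON body for Alert Manager's v2 API.
# Assumption: alert name, instance, and annotation text are made up for testing.
test_alert_body() {
  local name=${1:-TestAlert} instance=${2:-localhost:9100}
  cat <<EOF
[{"labels": {"alertname": "${name}", "instance": "${instance}", "severity": "info"},
  "annotations": {"summary": "Test alert", "description": "Manual end-to-end test"}}]
EOF
}

# POST the alert to a local Alert Manager (default port 9093).
send_test_alert() {
  curl -fsS -X POST -H 'Content-Type: application/json' \
    -d "$(test_alert_body "$@")" http://localhost:9093/api/v2/alerts
}

# Usage:
#   send_test_alert TestAlert localhost:9100
```

If the message does not arrive in Telegram, the Alert Manager web interface on port 9093 shows whether the alert was received at all.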