|
|
---
|
|
|
|
|
|
Alias: ["Prometheus"]
|
|
|
Tag: ["Computer", "Server", "Monitoring"]
|
|
|
Date: 2022-03-19
|
|
|
DocType: "Personal"
|
|
|
Hierarchy: "NonRoot"
|
|
|
TimeStamp:
|
|
|
location: [47.3639129,8.55627491017841]
|
|
|
CollapseMetaTable: true
|
|
|
|
|
|
---
|
|
|
|
|
|
Parent:: [[Selfhosting]], [[Configuring Caddy|caddy]], [[Server Tools]]
|
|
|
|
|
|
---
|
|
|
|
|
|
 
|
|
|
|
|
|
^Top
|
|
|
|
|
|
```button
|
|
|
name Save
|
|
|
type command
|
|
|
action Save current file
|
|
|
id Save
|
|
|
```
|
|
|
^button-ConfiguringPrometheusNSave
|
|
|
|
|
|
 
|
|
|
|
|
|
# Configuring Prometheus
|
|
|
|
|
|
 
|
|
|
|
|
|
```ad-abstract
|
|
|
title: Summary
|
|
|
collapse: open
|
|
|
This note runs through the installation and use of Prometheus as a monitoring tool.
|
|
|
Prometheus works better with JSON logs than with the common log format, which is Caddy's output.
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
```toc
|
|
|
style: number
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
---
|
|
|
|
|
|
 
|
|
|
|
|
|
### Introduction
|
|
|
[[#^Top|TOP]]
|
|
|
 
|
|
|
|
|
|
[Prometheus](https://prometheus.io/docs/introduction/overview/) is a free and open-source monitoring and alerting tool, first used for monitoring metrics at SoundCloud in 2012. It is written in the Go programming language.
|
|
|
|
|
|
Prometheus monitors and records real-time events in a time-series database. Since its beginnings, it has grown rapidly and has been adopted by many organizations to monitor their infrastructure metrics. Prometheus provides flexible queries and real-time alerting, which help in the quick diagnosis and troubleshooting of errors.
|
|
|
|
|
|
Prometheus comprises the following major components:
|
|
|
|
|
|
- The main Prometheus server for scraping and storing time-series data.
|
|
|
- Special-purpose exporters for services such as Graphite, HAProxy, StatsD, and more
|
|
|
- An alert manager for handling alerts
|
|
|
- A push-gateway for supporting transient jobs
|
|
|
- Client libraries for instrumenting application code
|
|
|
|
|
|
 
|
|
|
|
|
|
---
|
|
|
|
|
|
 
|
|
|
|
|
|
### Installing Prometheus
|
|
|
[[#^Top|TOP]]
|
|
|
 
|
|
|
|
|
|
#### Installing the main modules
|
|
|
|
|
|
First, create the configuration and data directories for Prometheus.
|
|
|
|
|
|
To create the configuration directory, run the command:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo mkdir -p /etc/prometheus
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
For the data directory, execute:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo mkdir -p /var/lib/prometheus
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Once the directories are created, grab the compressed installation file:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
wget https://github.com/prometheus/prometheus/releases/download/v2.31.0/prometheus-2.31.0.linux-amd64.tar.gz
|
|
|
~~~
|
|
|
```
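
Before extracting, it can be worth verifying the download. The sketch below assumes that the release page also publishes a `sha256sums.txt` file next to the tarballs (an assumption to confirm on the releases page) and that GNU `sha256sum` is available. `verify_download` is a hypothetical helper name, not part of Prometheus.

```bash
# Hypothetical helper: check every downloaded file listed in sha256sums.txt.
# Assumes GNU coreutils (for --ignore-missing) and that sha256sums.txt
# sits in the same directory as the tarball.
verify_download() {
  # $1: directory holding the tarball and sha256sums.txt
  ( cd "$1" && sha256sum --check --ignore-missing sha256sums.txt )
}

# Example usage (commented out; the URL below is an assumption, not from this note):
# wget https://github.com/prometheus/prometheus/releases/download/v2.31.0/sha256sums.txt
# verify_download .
```

The function returns a non-zero status if a checksum does not match or if none of the listed files is present, so it can gate the extraction step.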
|
|
|
|
|
|
 
|
|
|
|
|
|
Once downloaded, extract the tarball file.
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
tar -xvf prometheus-2.31.0.linux-amd64.tar.gz
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Then navigate to the Prometheus folder.
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
cd prometheus-2.31.0.linux-amd64
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Once in the directory, [move](https://linoxide.com/mv-command-in-linux/) the `prometheus` and `promtool` binary files to the `/usr/local/bin/` folder.
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo mv prometheus promtool /usr/local/bin/
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Additionally, move the console files in the `consoles` directory and the library files in the `console_libraries` directory to the `/etc/prometheus/` directory.
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo mv consoles/ console_libraries/ /etc/prometheus/
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Also, make sure to move the `prometheus.yml` template configuration file to the **`/etc/prometheus/`** directory.
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo mv prometheus.yml /etc/prometheus/prometheus.yml
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
At this point, Prometheus has been successfully installed. To check the version of Prometheus installed, run the command:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
prometheus --version
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Output:
|
|
|
|
|
|
```ad-code
|
|
|
~~~bash
|
|
|
prometheus, version 2.31.3 (branch: HEAD, revision: f29caccc42557f6a8ec30ea9b3c8c089391bd5df)
|
|
|
build user: root@5cff4265f0e3
|
|
|
build date: 20211005-16:10:52
|
|
|
go version: go1.17.1
|
|
|
platform: linux/amd64
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
promtool --version
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Output:
|
|
|
|
|
|
```ad-code
|
|
|
~~~bash
|
|
|
promtool, version 2.31.3 (branch: HEAD, revision: f29caccc42557f6a8ec30ea9b3c8c089391bd5df)
|
|
|
build user: root@5cff4265f0e3
|
|
|
build date: 20211005-16:10:52
|
|
|
go version: go1.17.1
|
|
|
platform: linux/amd64
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
If your output resembles the above (the version and build details reflect the release you downloaded), then you are on the right track. In the next step, we will create a system group and user.
|
|
|
|
|
|
 
|
|
|
|
|
|
#### Permissions & User Management
|
|
|
[[#^Top|TOP]]
|
|
|
We need to create a `prometheus` group and user before the next step, which involves creating a systemd service file for Prometheus.
|
|
|
|
|
|
To create a `prometheus` [group](https://linoxide.com/groupadd-command/) execute the command:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo groupadd --system prometheus
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Then create the `prometheus` user and assign it to the newly created `prometheus` group.
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo useradd -s /sbin/nologin --system -g prometheus prometheus
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Next, configure the directory ownership and permissions as follows.
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo chown -R prometheus:prometheus /etc/prometheus/ /var/lib/prometheus/
sudo chmod -R 775 /etc/prometheus/ /var/lib/prometheus/
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
The only part remaining is to make Prometheus a systemd service so that we can easily manage its running status.
|
|
|
|
|
|
 
|
|
|
|
|
|
#### Configuring the service
|
|
|
[[#^Top|TOP]]
|
|
|
Using your favorite text editor, create a systemd service file:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo nano /etc/systemd/system/prometheus.service
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Paste the following lines of code.
|
|
|
|
|
|
|
|
|
```ad-code
|
|
|
~~~bash
|
|
|
[Unit]
|
|
|
Description=Prometheus
|
|
|
Wants=network-online.target
|
|
|
After=network-online.target
|
|
|
|
|
|
[Service]
|
|
|
User=prometheus
|
|
|
Group=prometheus
|
|
|
Restart=always
|
|
|
Type=simple
|
|
|
ExecStart=/usr/local/bin/prometheus \
|
|
|
--config.file=/etc/prometheus/prometheus.yml \
|
|
|
--storage.tsdb.path=/var/lib/prometheus/ \
|
|
|
--web.console.templates=/etc/prometheus/consoles \
|
|
|
--web.console.libraries=/etc/prometheus/console_libraries \
|
|
|
--web.listen-address=0.0.0.0:9090
|
|
|
|
|
|
[Install]
|
|
|
WantedBy=multi-user.target
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
|
|
|
Save the changes and exit the systemd file.
|
|
|
|
|
|
Then proceed and start the Prometheus service.
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl start prometheus
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
To have the Prometheus service start at boot, enable it with the command:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl enable prometheus
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Then confirm the status of the Prometheus service.
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl status prometheus
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
![Check status of Prometheus services](https://linoxide.com/wp-content/uploads/2021/11/2021-10-1003-Check-status-of-Prometheus-services.png)
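
Beyond the systemd status, Prometheus exposes `/-/healthy` and `/-/ready` management endpoints that can be polled from a script. The sketch below assumes the server listens on `localhost:9090`, as set by `--web.listen-address` in the service file; `PROM_URL` and `prom_ready` are hypothetical names chosen for illustration.

```bash
# Base URL is an assumption matching --web.listen-address in the unit file.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Returns 0 only if both management endpoints answer with an HTTP 2xx status.
prom_ready() {
  curl -fsS --max-time 5 "$PROM_URL/-/healthy" >/dev/null &&
    curl -fsS --max-time 5 "$PROM_URL/-/ready" >/dev/null
}

if prom_ready 2>/dev/null; then
  echo "Prometheus is healthy and ready"
else
  echo "Prometheus is not reachable at $PROM_URL"
fi
```

If your instance listens on another port, export `PROM_URL` before running the snippet.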
|
|
|
|
|
|
 
|
|
|
|
|
|
#### Configuration of user access
|
|
|
[[#^Top|TOP]]
|
|
|
Finally, to access Prometheus, configure your reverse proxy ([[Configuring Caddy|caddy]]) to point to the service.
|
|
|
|
|
|
It is then reachable at the address below, proxied to internal port 9090:
|
|
|
|
|
|
```ad-address
|
|
|
https://prometheus.mfxm.fr
|
|
|
```
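
The reverse-proxy step can be sketched as a Caddyfile site block. This is a minimal, hypothetical example: it assumes Caddy v2, that Prometheus listens on `localhost:9090` as in the service file above, and it writes the block to a scratch file for review instead of touching the live Caddyfile.

```bash
# Write a hypothetical Caddyfile site block to a scratch file for review.
# The upstream address assumes Prometheus listens on port 9090 as configured above.
caddy_snippet="$(mktemp)"
cat > "$caddy_snippet" <<'EOF'
prometheus.mfxm.fr {
    reverse_proxy localhost:9090
}
EOF

# Show the block so it can be merged into the real Caddyfile by hand.
cat "$caddy_snippet"
```

Adjust the upstream port if your Prometheus instance listens elsewhere, then reload Caddy once the block is merged.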
|
|
|
|
|
|
 
|
|
|
|
|
|
|
|
|
![prometheus dashboard](https://linoxide.com/wp-content/uploads/2021/11/2021-10-1003-Prometheus-dashboard-1024x440.png)
|
|
|
|
|
|
 
|
|
|
|
|
|
---
|
|
|
|
|
|
 
|
|
|
|
|
|
### Configuring alerts
|
|
|
[[#^Top|TOP]]
|
|
|
 
|
|
|
|
|
|
#### Install Alertmanager
|
|
|
|
|
|
Download the latest version of Alert Manager (v0.23.0 at the time of this writing) with the following command:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
wget https://github.com/prometheus/alertmanager/releases/download/v0.23.0/alertmanager-0.23.0.linux-amd64.tar.gz
|
|
|
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
The download may take a while to complete. Once it is done, you should find a new archive file **alertmanager-0.23.0.linux-amd64.tar.gz** in your current working directory.
|
|
|
|
|
|
Extract the **alertmanager-0.23.0.linux-amd64.tar.gz** archive with the following command:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
tar xzf alertmanager-0.23.0.linux-amd64.tar.gz
|
|
|
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
You should find a new directory **alertmanager-0.23.0.linux-amd64/** in your current working directory.
|
|
|
|
|
|
Now, move the **alertmanager-0.23.0.linux-amd64** directory to the **/opt/** directory and rename it to **alertmanager** as follows:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo mv -v alertmanager-0.23.0.linux-amd64 /opt/alertmanager
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Change the user and group of all the files and directories of the `/opt/alertmanager/` directory to root as follows:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo chown -Rfv root:root /opt/alertmanager
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
In the **/opt/alertmanager** directory, you should find the **alertmanager** binary and the Alert Manager configuration file **alertmanager.yml**. You will use both later.
|
|
|
|
|
|
 
|
|
|
|
|
|
#### Creating a Data Directory
|
|
|
[[#^Top|TOP]]
|
|
|
Alert Manager needs a directory where it can store its data. As you will be running Alert Manager as the **prometheus** system user, the **prometheus** system user must have access (read, write, and execute permissions) to that data directory.
|
|
|
|
|
|
You can create the **data/** directory in the **/opt/alertmanager/** directory as follows:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo mkdir -v /opt/alertmanager/data
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Change the owner and group of the **/opt/alertmanager/data/** directory to **prometheus** with the following command:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo chown -Rfv prometheus:prometheus /opt/alertmanager/data
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
The **/opt/alertmanager/data/** directory should now be owned by the **prometheus** user and group.
|
|
|
|
|
|
 
|
|
|
|
|
|
#### Starting Alert Manager on Boot
|
|
|
[[#^Top|TOP]]
|
|
|
Now, you have to create a systemd service file for Alert Manager so that you can easily manage (start, stop, restart, and add to startup) the alertmanager service with systemd.
|
|
|
|
|
|
To create a systemd service file **alertmanager.service**, run the following command:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo nano /etc/systemd/system/alertmanager.service
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Type in the following lines in the **alertmanager.service** file.
|
|
|
|
|
|
```ad-code
|
|
|
~~~bash
|
|
|
[Unit]
|
|
|
Description=Alertmanager for prometheus
|
|
|
|
|
|
[Service]
|
|
|
Restart=always
|
|
|
User=prometheus
|
|
|
ExecStart=/opt/alertmanager/alertmanager --config.file=/opt/alertmanager/alertmanager.yml --storage.path=/opt/alertmanager/data
|
|
|
ExecReload=/bin/kill -HUP $MAINPID
|
|
|
TimeoutStopSec=20s
|
|
|
SendSIGKILL=no
|
|
|
|
|
|
[Install]
|
|
|
WantedBy=multi-user.target
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
For the systemd changes to take effect, run the following command:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl daemon-reload
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Now, start the **alertmanager** service with the following command:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl start alertmanager.service
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Add the **alertmanager** service to the system startup so that it automatically starts on boot with the following command:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl enable alertmanager.service
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Finally, check that the **alertmanager** service is **active/running** and **enabled** (it will start automatically on boot):
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl status alertmanager.service
|
|
|
~~~
|
|
|
```
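
If the service fails to start, a malformed `alertmanager.yml` is a common cause. The tarball also ships the `amtool` binary, whose `check-config` subcommand validates the file. The sketch below assumes the `/opt/alertmanager` layout used in this note; `AM_DIR` and `check_alertmanager_config` are hypothetical names.

```bash
# Directory where the Alertmanager tarball was moved (assumption from this note).
AM_DIR="${AM_DIR:-/opt/alertmanager}"

# Validate the Alertmanager configuration with amtool from the same tarball.
check_alertmanager_config() {
  "$AM_DIR/amtool" check-config "$AM_DIR/alertmanager.yml"
}

# Example usage (commented out so the sketch has no side effects):
# check_alertmanager_config && sudo systemctl restart alertmanager.service
```

Chaining the restart behind the check avoids restarting the service with a configuration that cannot be parsed.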
|
|
|
|
|
|
 
|
|
|
|
|
|
#### Configuring Prometheus
|
|
|
[[#^Top|TOP]]
|
|
|
Now, you have to configure Prometheus to use Alert Manager. You can also monitor Alert Manager with Prometheus. I will show you how to do both in this section.
|
|
|
|
|
|
First, find the IP address of the computer where you have installed Alert Manager with the following command:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
hostname -I
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Now, open the Prometheus configuration file **/etc/prometheus/prometheus.yml** with the **nano** text editor as follows:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo nano /etc/prometheus/prometheus.yml
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Type in the following lines in the **scrape_configs** section to add Alert Manager for monitoring with Prometheus.
|
|
|
|
|
|
```ad-code
|
|
|
~~~yaml
  - job_name: 'alertmanager'
    static_configs:
      - targets: ['localhost:9093']
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Also, type in the IP address and port number of Alert Manager in the **alerting > alertmanagers** section.
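
As a hedged illustration of that section, the snippet below prints what the `alerting` block could look like. It assumes Alert Manager runs on the same host on port 9093, the same address as in the scrape job above; replace `localhost` with the IP address found earlier if it runs elsewhere. The block is written to a scratch file so that nothing in `/etc/prometheus/` is modified.

```bash
# Print a sample alerting block for review before editing prometheus.yml.
# localhost:9093 is an assumption (Alert Manager on the same host, default port).
alerting_fragment="$(mktemp)"
cat > "$alerting_fragment" <<'EOF'
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093
EOF

cat "$alerting_fragment"
```

The indentation matters: `alerting` sits at the top level of `prometheus.yml`, alongside `global` and `scrape_configs`.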
|
|
|
|
|
|
For the changes to take effect, restart the **prometheus** service as follows:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl restart prometheus
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Visit the URL [http://192.168.20.161:9090/targets](http://192.168.20.161:9090/targets) from your favorite web browser, and you should see that **alertmanager** is in the **UP** state. So, Prometheus can access Alert Manager just fine.
|
|
|
|
|
|
 
|
|
|
|
|
|
#### Creating a Prometheus Alert Rule
|
|
|
[[#^Top|TOP]]
|
|
|
On Prometheus, you can use the **up** expression to find the state of the targets added to Prometheus.
|
|
|
|
|
|
The targets that are in the **UP** state (running and accessible to Prometheus) will have the value **1**, and targets that are in the **DOWN** state (not running or inaccessible to Prometheus) will have the value **0**.
|
|
|
|
|
|
Suppose you stop one of the targets, say **node_exporter**:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl stop node_exporter.service
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
The **up** value of that target should then be **0**.
|
|
|
|
|
|
So, you can use the **up == 0** expression to list only the targets that are not running or inaccessible to Prometheus.
|
|
|
|
|
|
This expression can be used to create a Prometheus Alert and send alerts to Alert Manager when one or more targets are not running or inaccessible to Prometheus.
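
The same expression can also be evaluated from the command line through the Prometheus HTTP API (`/api/v1/query`). The helper below is a minimal sketch: `PROM_URL` and `prom_query` are hypothetical names, and the default address assumes Prometheus listens on `localhost:9090`.

```bash
# Base URL is an assumption; adjust it to where your Prometheus listens.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Run an instant PromQL query through the HTTP API and print the JSON reply.
prom_query() {
  curl -fsS --max-time 5 --get "$PROM_URL/api/v1/query" \
    --data-urlencode "query=$1"
}

# Example usage (commented out; needs a running Prometheus):
# prom_query 'up == 0'
```

`--data-urlencode` takes care of escaping the spaces and `=` signs in the expression, and `--get` sends it as a query string.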
|
|
|
|
|
|
To create a Prometheus Alert, create a new file **rules.yml** in the **/etc/prometheus/** directory as follows:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo nano /etc/prometheus/rules.yml
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Now, type in the following lines in the **rules.yml** file.
|
|
|
|
|
|
```ad-code
|
|
|
~~~yaml
|
|
|
groups:
  - name: test
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Here, the alert **InstanceDown** fires when a target has been down or unreachable (that is, **up == 0**) for one minute (**1m**).
|
|
|
|
|
|
Now, open the Prometheus configuration file **/etc/prometheus/prometheus.yml** with the **nano** text editor as follows:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo nano /etc/prometheus/prometheus.yml
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Add the **rules.yml** file in the **rule_files** section of the prometheus.yml configuration file.
|
|
|
|
|
|
|
|
|
Another important option of the **prometheus.yml** file is **evaluation_interval**. Prometheus evaluates the rules once every **evaluation_interval**. The sample configuration shipped with Prometheus sets it to 15s (**15** seconds), while the built-in default is 1m. With the shipped value, the Alert rules in the **rules.yml** file are checked every 15 seconds.
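
To make the two settings concrete, the sketch below builds a scratch copy of a minimal `prometheus.yml` and `rules.yml` in a temporary directory and, if `promtool` is installed, validates them together. The file contents are illustrative assumptions, not a copy of the live configuration.

```bash
# Build a scratch copy of the two files so they can be validated together.
workdir="$(mktemp -d)"

cat > "$workdir/rules.yml" <<'EOF'
groups:
  - name: test
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
EOF

cat > "$workdir/prometheus.yml" <<'EOF'
global:
  evaluation_interval: 15s
rule_files:
  - rules.yml
EOF

# A relative rule_files entry is looked up next to the config file.
if command -v promtool >/dev/null 2>&1; then
  promtool check config "$workdir/prometheus.yml"
else
  echo "promtool not found; skipping validation"
fi
```

In the live file, the `rule_files` entry can be written the same way because `rules.yml` sits next to `prometheus.yml` in `/etc/prometheus/`.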
|
|
|
|
|
|
For the changes to take effect, restart the **prometheus** service as follows:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl restart prometheus
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Now, navigate to the URL [http://localhost:9010/rules](http://localhost:9010/rules) from your favorite web browser, and you should see the rule **InstanceDown** that you’ve just added.
|
|
|
|
|
|
As you stopped **node_exporter** earlier, the alert is active (pending) and is waiting for the **for** duration to elapse before it is sent to the Alert Manager.
|
|
|
|
|
|
After a minute has passed, the alert **InstanceDown** should be in the **FIRING** state. It means that the alert is sent to the Alert Manager.
|
|
|
|
|
|
 
|
|
|
|
|
|
---
|
|
|
|
|
|
 
|
|
|
|
|
|
### Configuring monitoring modules
|
|
|
[[#^Top|TOP]]
|
|
|
 
|
|
|
|
|
|
#### Node-Exporter
|
|
|
|
|
|
To begin, download the latest version of Node Exporter here: [Node-Exporter](https://prometheus.io/download/#node_exporter)
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
##### Unpacking
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
tar -xf node_exporter-1.3.1.linux-amd64.tar.gz
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
Then move the binary to a directory where the system can manage it:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo mv node_exporter-1.3.1.linux-amd64/node_exporter /usr/local/bin/
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
##### Installation & Service Setup
|
|
|
|
|
|
In reality, Node Exporter is not really installed; we just create a system service that runs the command.
|
|
|
|
|
|
For that, create a `node_exporter` user that will run the service.
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo useradd -rs /bin/false node_exporter
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Next, create the service itself.
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo nano /etc/systemd/system/node_exporter.service
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
The file must contain the following:
|
|
|
|
|
|
```ad-code
|
|
|
~~~bash
|
|
|
[Unit]
|
|
|
Description=Node Exporter
|
|
|
After=network.target
|
|
|
|
|
|
[Service]
|
|
|
User=node_exporter
|
|
|
Group=node_exporter
|
|
|
Type=simple
|
|
|
ExecStart=/usr/local/bin/node_exporter
|
|
|
|
|
|
[Install]
|
|
|
WantedBy=multi-user.target
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Now reload the systemd daemon:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl daemon-reload
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Then start node_exporter:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl start node_exporter
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Check that node_exporter is running:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl status node_exporter
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
If all is well, enable the service so that it starts at boot:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl enable node_exporter
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
To confirm that metrics are being served:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
curl http://localhost:9100/metrics
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
##### Adding the host to Prometheus
|
|
|
|
|
|
To add the host, edit the Prometheus configuration file:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo nano /etc/prometheus/prometheus.yml
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Add a target with the desired IP address below the existing target.
|
|
|
|
|
|
```ad-code
|
|
|
~~~yaml
|
|
|
  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9100']
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
##### Restarting Prometheus
|
|
|
|
|
|
For the changes to take effect, restart the prometheus service:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl restart prometheus
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
##### Verification
|
|
|
|
|
|
To check that everything works, open your Prometheus interface ([http://prometheus-ip:9090/targets](http://prometheus-ip:9090/targets)) or Grafana and confirm that your host appears.
|
|
|
|
|
|
 
|
|
|
|
|
|
---
|
|
|
|
|
|
 
|
|
|
|
|
|
### Configuring rules and alerts
|
|
|
[[#^Top|TOP]]
|
|
|
 
|
|
|
|
|
|
#### Introduction
|
|
|
|
|
|
Alert rules are declared by referencing rule files from `/etc/prometheus/prometheus.yml`; the rule files live in the same folder. The generic process is as follows:
|
|
|
|
|
|
1. Define & reference the rule file in Prometheus' config file
|
|
|
`rules.yml`
|
|
|
|
|
|
2. Create the rule file
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo nano /etc/prometheus/rules.yml
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
3. Add the defined rule
|
|
|
See external resource for examples.
|
|
|
|
|
|
4. Restart Prometheus
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl restart prometheus
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
If Prometheus fails to restart after this, the cause is usually a problem in the configuration file. Check the indentation and other formatting issues before trying to restart the daemon again.
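
A quick way to catch such formatting problems before restarting is `promtool check config`, which also checks the rule files referenced by the configuration. The wrapper below is a hedged sketch: `validate_prometheus_config` is a hypothetical helper name, and the default path is the one used in this note.

```bash
# Validate a Prometheus config (and the rule files it references) with promtool.
validate_prometheus_config() {
  # $1: path to the config file; defaults to the path used in this note
  promtool check config "${1:-/etc/prometheus/prometheus.yml}"
}

# Example usage (commented out so the sketch has no side effects):
# validate_prometheus_config && sudo systemctl restart prometheus
# If the service still fails, the journal usually shows the reported error:
# sudo journalctl -u prometheus --no-pager -n 50
```

Because the restart is chained with `&&`, the running service is left untouched when validation fails.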
|
|
|
|
|
|
 
|
|
|
|
|
|
#### External resource
|
|
|
|
|
|
[Awesome Prometheus alerts | Collection of alerting rules](https://awesome-prometheus-alerts.grep.to/rules.html)
|
|
|
|
|
|
 
|
|
|
|
|
|
---
|
|
|
|
|
|
 
|
|
|
|
|
|
### Using Prometheus to monitor Caddy
|
|
|
[[#^Top|TOP]]
|
|
|
 
|
|
|
|
|
|
#### Global parameters
|
|
|
|
|
|
| | |
|
|
|
| --------------------- | -------------------------- |
|
|
|
| **Caddy metrics API** | https://tools.mfxm.fr:7784 |
|
|
|
| **Prometheus web listening port** | 9010 |
|
|
|
|
|
|
 
|
|
|
|
|
|
#### Adding a monitoring job
|
|
|
[[#^Top|TOP]]
|
|
|
Monitoring jobs are called scrape jobs and are defined in the `/etc/prometheus/prometheus.yml` file under the `scrape_configs:` YAML key. Below is an example job definition.
|
|
|
|
|
|
```ad-code
|
|
|
~~~yaml
scrape_configs:
  - job_name: caddy
    scheme: https
    static_configs:
      - targets:
          - tools.mfxm.fr:7784
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
---
|
|
|
|
|
|
 
|
|
|
|
|
|
### Using Telegram for notifications
|
|
|
[[#^Top|TOP]]
|
|
|
 
|
|
|
|
|
|
#### Installing the Telegram Bridge
|
|
|
|
|
|
In order to set up the [[Configuring Telegram bots|Telegram bot]], first clone its GitHub repository:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
git clone https://github.com/inCaller/prometheus_bot
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Move to the created folder:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
cd ~/prometheus_bot
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Compile the program with Go:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
export GOPATH="your go path"
|
|
|
make clean
|
|
|
make
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Update the config file:
|
|
|
|
|
|
```ad-path
|
|
|
/home/melchiorbv/prometheus_bot/config.yaml
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
```ad-code
|
|
|
~~~yaml
|
|
|
telegram_token: "token goes here"
|
|
|
# ONLY IF YOU USING DATA FORMATTING FUNCTION, NOTE for developer: important or test fail
|
|
|
time_outdata: "02/01/2006 15:04:05"
|
|
|
template_path: "/home/melchiorbv/prometheus_bot/template.tmpl" # ONLY IF YOU USING TEMPLATE
|
|
|
time_zone: "Europe/Amsterdam" # ONLY IF YOU USING TEMPLATE
|
|
|
split_msg_byte: 4000
|
|
|
send_only: true # use bot only to send messages.
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Then, update the template file:
|
|
|
|
|
|
```ad-path
|
|
|
/home/melchiorbv/prometheus_bot/template.tmpl
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
```ad-code
|
|
|
~~~yaml
|
|
|
Type: {{.CommonAnnotations.description}}
|
|
|
|
|
|
Summary: {{.CommonAnnotations.summary}}
|
|
|
|
|
|
Alertname: {{ .CommonLabels.alertname }}
|
|
|
|
|
|
Instance: {{ .CommonLabels.instance }}
|
|
|
|
|
|
Severity: {{ .CommonLabels.severity }}
|
|
|
|
|
|
Status: {{ .Status }}
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
|
|
|
Run the daemon with:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
./prometheus_bot
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
First part done.
|
|
|
|
|
|
 
|
|
|
|
|
|
#### Linking the bot to Alertmanager
|
|
|
[[#^Top|TOP]]
|
|
|
|
|
|
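Edit the Alertmanager config file at `/opt/alertmanager/alertmanager.yml` and add the following receiver under `receivers:` (the `route` section must name it, for example `receiver: 'admins'`, for alerts to reach it):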
|
|
|
|
|
|
```ad-code
|
|
|
~~~yaml
|
|
|
  - name: 'admins'
    webhook_configs:
      - send_resolved: true
        url: http://127.0.0.1:9087/alert/chat_id
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
Replace `chat_id` with the value you got from your bot, ***with everything inside the quotes***. Some chat IDs start with a `-`; in that case, include the `-` in the URL. To use multiple chats, add more receivers.
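
To try the bridge without waiting for a real alert, you can post a hand-written payload in Alertmanager's webhook JSON format to the same URL. Everything in the payload below is made-up test data, `send_test_alert` is a hypothetical helper name, and the address assumes the bot listens on `127.0.0.1:9087` as in the receiver above.

```bash
# Hypothetical test payload in Alertmanager's webhook JSON format.
payload="$(mktemp)"
cat > "$payload" <<'EOF'
{
  "version": "4",
  "status": "firing",
  "receiver": "admins",
  "groupLabels": {"alertname": "InstanceDown"},
  "commonLabels": {"alertname": "InstanceDown", "instance": "localhost:9100", "severity": "critical"},
  "commonAnnotations": {"summary": "Test alert", "description": "Manual test of the Telegram bridge"},
  "externalURL": "http://localhost:9093",
  "alerts": [
    {
      "status": "firing",
      "labels": {"alertname": "InstanceDown", "instance": "localhost:9100", "severity": "critical"},
      "annotations": {"summary": "Test alert", "description": "Manual test of the Telegram bridge"},
      "startsAt": "2022-03-19T12:00:00Z",
      "endsAt": "0001-01-01T00:00:00Z",
      "generatorURL": "http://localhost:9090/graph"
    }
  ]
}
EOF

# Post the payload to the bridge; $1 is the Telegram chat ID.
send_test_alert() {
  curl -fsS --max-time 5 -X POST -H 'Content-Type: application/json' \
    --data @"$payload" "http://127.0.0.1:9087/alert/$1"
}

# Example usage (commented out; needs the bot running):
# send_test_alert chat_id
```

The `commonLabels` and `commonAnnotations` keys are the ones the template file reads, so a successful test also confirms that the template renders as expected.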
|
|
|
|
|
|
Restart Alertmanager:
|
|
|
|
|
|
```ad-command
|
|
|
~~~bash
|
|
|
sudo systemctl restart alertmanager.service
|
|
|
~~~
|
|
|
```
|
|
|
|
|
|
 
|
|
|
 
|
|
|
|
|
|
[[#^Top|TOP]]