---
Alias: ["Prometheus"]
Tag: ["Computer", "Server", "Monitoring"]
Date: 2022-03-19
DocType: "Personal"
Hierarchy: "NonRoot"
TimeStamp:
location: [47.3639129,8.55627491017841]
CollapseMetaTable: Yes
---
Parent:: [[Selfhosting]], [[Configuring Caddy|caddy]], [[Server Tools]]

---

^Top

```button
name Save
type command
action Save current file
id Save
```
^button-ConfiguringPrometheusNSave

# Configuring Prometheus

```ad-abstract
title: Summary
collapse: open
This note runs through the installation and use of Prometheus as a monitoring tool.
Prometheus interacts better with JSON logs than with the common log format, which is Caddy's output.
```

```toc
style: number
```

---

### Introduction
[[#^Top|TOP]]

[Prometheus](https://prometheus.io/docs/introduction/overview/) is a free and open-source monitoring and alerting tool that was initially used for monitoring metrics at SoundCloud back in 2012. It is written in the Go programming language.

Prometheus monitors and records real-time events in a time-series database. Since then it has grown in leaps and bounds and has been adopted by many organizations to monitor their infrastructure metrics. Prometheus provides flexible queries and real-time alerting, which helps in quick diagnosis and troubleshooting of errors.

Prometheus comprises the following major components:
- The main Prometheus server for scraping and storing time-series data
- Special-purpose exporters for services such as Graphite, HAProxy, StatsD and many more
- An alert manager for handling alerts
- A push-gateway for supporting transient jobs
- Client libraries for instrumenting application code

---

### Installing Prometheus
[[#^Top|TOP]]

#### Installing the main modules
First, we need to create the configuration and data directories for Prometheus.
To create the configuration directory, run the command:
```ad-command
~~~bash
sudo mkdir -p /etc/prometheus
~~~
```

For the data directory, execute:
```ad-command
~~~bash
sudo mkdir -p /var/lib/prometheus
~~~
```

Once the directories are created, grab the compressed installation file:
```ad-command
~~~bash
wget https://github.com/prometheus/prometheus/releases/download/v2.31.0/prometheus-2.31.0.linux-amd64.tar.gz
~~~
```

Once downloaded, extract the tarball. The file name must match the release you downloaded.
```ad-command
~~~bash
tar -xvf prometheus-2.31.0.linux-amd64.tar.gz
~~~
```

Then navigate to the Prometheus folder.
```ad-command
~~~bash
cd prometheus-2.31.0.linux-amd64
~~~
```

Once in the directory, [move](https://linoxide.com/mv-command-in-linux/) the `prometheus` and `promtool` binaries to the `/usr/local/bin/` folder.
```ad-command
~~~bash
sudo mv prometheus promtool /usr/local/bin/
~~~
```

Additionally, move the console files in the `consoles` directory and the library files in the `console_libraries` directory to the `/etc/prometheus/` directory.
```ad-command
~~~bash
sudo mv consoles/ console_libraries/ /etc/prometheus/
~~~
```

Also make sure to move the `prometheus.yml` template configuration file to the **`/etc/prometheus/`** directory.
```ad-command
~~~bash
sudo mv prometheus.yml /etc/prometheus/prometheus.yml
~~~
```

At this point, Prometheus has been successfully installed.
To check the version of Prometheus installed, run the command:
```ad-command
~~~bash
prometheus --version
~~~
```

Output:
```ad-code
~~~bash
prometheus, version 2.31.3 (branch: HEAD, revision: f29caccc42557f6a8ec30ea9b3c8c089391bd5df)
  build user: root@5cff4265f0e3
  build date: 20211005-16:10:52
  go version: go1.17.1
  platform: linux/amd64
~~~
```

```ad-command
~~~bash
promtool --version
~~~
```

Output:
```ad-code
~~~bash
promtool, version 2.31.3 (branch: HEAD, revision: f29caccc42557f6a8ec30ea9b3c8c089391bd5df)
  build user: root@5cff4265f0e3
  build date: 20211005-16:10:52
  go version: go1.17.1
  platform: linux/amd64
~~~
```
If your output resembles this (with the version number of the release you downloaded), you are on the right track. In the next step, we will create a system group and user.

#### Permissions & User Management
[[#^Top|TOP]]

We need to create a Prometheus group and user before creating the systemd service file for Prometheus.
To create a `prometheus` [group](https://linoxide.com/groupadd-command/), execute the command:
```ad-command
~~~bash
sudo groupadd --system prometheus
~~~
```

Then create the `prometheus` user and assign it to the newly created `prometheus` group.
```ad-command
~~~bash
sudo useradd -s /sbin/nologin --system -g prometheus prometheus
~~~
```

Next, configure the directory ownership and permissions as follows.
```ad-command
~~~bash
sudo chown -R prometheus:prometheus /etc/prometheus/ /var/lib/prometheus/
sudo chmod -R 775 /etc/prometheus/ /var/lib/prometheus/
~~~
```
The only remaining part is to make Prometheus a systemd service so that we can easily manage its running status.

#### Configuring the service
[[#^Top|TOP]]

Using your favorite text editor, create a systemd service file:
```ad-command
~~~bash
sudo nano /etc/systemd/system/prometheus.service
~~~
```

Paste the following lines of code.
```ad-code
~~~bash
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Restart=always
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file=/etc/prometheus/prometheus.yml \
    --storage.tsdb.path=/var/lib/prometheus/ \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries \
    --web.listen-address=0.0.0.0:9090

[Install]
WantedBy=multi-user.target
~~~
```
Save the changes and exit the editor.
Then start the Prometheus service.
```ad-command
~~~bash
sudo systemctl start prometheus
~~~
```

Enable the Prometheus service to run at startup:
```ad-command
~~~bash
sudo systemctl enable prometheus
~~~
```

Then confirm the status of the Prometheus service.
```ad-command
~~~bash
sudo systemctl status prometheus
~~~
```
![Check status of Prometheus services](https://linoxide.com/wp-content/uploads/2021/11/2021-10-1003-Check-status-of-Prometheus-services.png)

#### Configuration of user access
[[#^Top|TOP]]

Finally, to access Prometheus, configure your reverse proxy ([[Configuring Caddy|caddy]]) to point back to the service.
It is then reachable at the address below, which proxies to internal port 9090:
```ad-address
https://prometheus.mfxm.fr
```

![prometheus dashboard](https://linoxide.com/wp-content/uploads/2021/11/2021-10-1003-Prometheus-dashboard-1024x440.png)

---

### Configuring alerts
[[#^Top|TOP]]

#### Install Alertmanager
Download Alert Manager (v0.23.0 at the time of this writing) with the following command:
```ad-command
~~~bash
wget https://github.com/prometheus/alertmanager/releases/download/v0.23.0/alertmanager-0.23.0.linux-amd64.tar.gz
~~~
```

The download may take a while to complete. Once it is done, you should find a new archive file **alertmanager-0.23.0.linux-amd64.tar.gz** in your current working directory.
Extract the **alertmanager-0.23.0.linux-amd64.tar.gz** archive with the following command:
```ad-command
~~~bash
tar xzf alertmanager-0.23.0.linux-amd64.tar.gz
~~~
```

You should find a new directory **alertmanager-0.23.0.linux-amd64/**.
Now, move the **alertmanager-0.23.0.linux-amd64** directory to the **/opt/** directory and rename it to **alertmanager** as follows:
```ad-command
~~~bash
sudo mv -v alertmanager-0.23.0.linux-amd64 /opt/alertmanager
~~~
```

Change the user and group of all the files and directories of the `/opt/alertmanager/` directory to root as follows:
```ad-command
~~~bash
sudo chown -Rfv root:root /opt/alertmanager
~~~
```

In the **/opt/alertmanager** directory, you should find the **alertmanager** binary and the Alert Manager configuration file **alertmanager.yml**. Both are used in the following steps.

#### Creating a Data Directory
[[#^Top|TOP]]

Alert Manager needs a directory where it can store its data.
As you will be running Alert Manager as the **prometheus** system user, that user must have read, write, and execute permissions on the data directory.
You can create the **data/** directory in the **/opt/alertmanager/** directory as follows:
```ad-command
~~~bash
sudo mkdir -v /opt/alertmanager/data
~~~
```

Change the owner and group of the **/opt/alertmanager/data/** directory to **prometheus** with the following command:
```ad-command
~~~bash
sudo chown -Rfv prometheus:prometheus /opt/alertmanager/data
~~~
```

#### Starting Alert Manager on Boot
[[#^Top|TOP]]

Now, create a systemd service file for Alert Manager so that you can easily manage (start, stop, restart, and add to startup) the alertmanager service with systemd.
To create the systemd service file **alertmanager.service**, run the following command:
```ad-command
~~~bash
sudo nano /etc/systemd/system/alertmanager.service
~~~
```

Type in the following lines in the **alertmanager.service** file.
```ad-code
~~~bash
[Unit]
Description=Alertmanager for prometheus

[Service]
Restart=always
User=prometheus
ExecStart=/opt/alertmanager/alertmanager --config.file=/opt/alertmanager/alertmanager.yml --storage.path=/opt/alertmanager/data
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
SendSIGKILL=no

[Install]
WantedBy=multi-user.target
~~~
```

For the systemd changes to take effect, run the following command:
```ad-command
~~~bash
sudo systemctl daemon-reload
~~~
```

Now, start the **alertmanager** service with the following command:
```ad-command
~~~bash
sudo systemctl start alertmanager.service
~~~
```

Add the **alertmanager** service to the system startup so that it automatically starts on boot:
```ad-command
~~~bash
sudo systemctl enable alertmanager.service
~~~
```

Check the status: the **alertmanager** service should be **active/running** and **enabled** (it will start automatically on boot).
```ad-command
~~~bash
sudo systemctl status alertmanager.service
~~~
```

#### Configuring Prometheus
[[#^Top|TOP]]

Now, configure Prometheus to use Alert Manager. You can also monitor Alert Manager itself with Prometheus. This section covers both.
First, find the IP address of the computer where you have installed Alert Manager with the following command:
```ad-command
~~~bash
hostname -I
~~~
```

Now, open the Prometheus configuration file **/etc/prometheus/prometheus.yml** with the **nano** text editor as follows:
```ad-command
~~~bash
sudo nano /etc/prometheus/prometheus.yml
~~~
```

Type in the following lines in the **scrape_configs** section to add Alert Manager for monitoring with Prometheus.
```ad-code
~~~yaml
  - job_name: 'alertmanager'
    static_configs:
    - targets: ['localhost:9093']
~~~
```

Also, type in the IP address and port number of Alert Manager in the **alerting > alertmanagers** section.
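As a sketch of what that section could look like, assuming Alert Manager runs on the same host and listens on port 9093 (the same target as in the scrape job above). If it runs on another machine, replace `localhost` with the IP address returned by `hostname -I`.

```yaml
# Fragment of /etc/prometheus/prometheus.yml
# Assumption: Alert Manager is reachable at localhost:9093.
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093
```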
For the changes to take effect, restart the **prometheus** service as follows:
```ad-command
~~~bash
sudo systemctl restart prometheus
~~~
```

Visit the URL [http://192.168.20.161:9090/targets](http://192.168.20.161:9090/targets) (substitute the address of your own Prometheus server) from your favorite web browser, and you should see that **alertmanager** is in the **UP** state. This confirms that Prometheus can reach Alert Manager.

#### Creating a Prometheus Alert Rule
[[#^Top|TOP]]

In Prometheus, you can use the **up** expression to find the state of the targets added to Prometheus.
Targets that are in the **UP** state (running and accessible to Prometheus) have the value **1**, and targets that are in the **DOWN** state (not running or inaccessible to Prometheus) have the value **0**.
To see this, stop one of the targets, for example **node_exporter**:
```ad-command
~~~bash
sudo systemctl stop node_exporter.service
~~~
```

The **up** value of that target should now be **0**.
So, you can use the **up == 0** expression to list only the targets that are not running or inaccessible to Prometheus.
This expression can be used to create a Prometheus alert and send alerts to Alert Manager when one or more targets are not running or inaccessible to Prometheus.
To create a Prometheus alert, create a new file **rules.yml** in the **/etc/prometheus/** directory as follows:
```ad-command
~~~bash
sudo nano /etc/prometheus/rules.yml
~~~
```

Now, type in the following lines in the **rules.yml** file.
```ad-code
~~~yaml
groups:
  - name: test
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
~~~
```

Here, the alert **InstanceDown** fires when a target has not been running or has been inaccessible to Prometheus (that is, **up == 0**) for a minute (**1m**).
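Optionally, a rule can carry labels and annotations that travel with the alert to Alert Manager and into notifications. The variant below is only a sketch: the `severity` value and the annotation texts are illustrative assumptions, not part of the original setup.

```yaml
groups:
  - name: test
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          # Assumed label value, used here for illustration only
          severity: critical
        annotations:
          # {{ $labels.* }} is expanded by Prometheus for each firing target
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been unreachable for more than 1 minute."
```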
Now, open the Prometheus configuration file **/etc/prometheus/prometheus.yml** with the **nano** text editor as follows:
```ad-command
~~~bash
sudo nano /etc/prometheus/prometheus.yml
~~~
```

Add the **rules.yml** file in the **rule_files** section of the prometheus.yml configuration file.
Another important option of the **prometheus.yml** file is **evaluation_interval**. Prometheus evaluates the rules once every **evaluation_interval**. The template configuration file sets it to 15s (**15** seconds), whereas Prometheus' built-in default is 1m. With the template value, the alert rules in the **rules.yml** file are checked every 15 seconds.
For the changes to take effect, restart the **prometheus** service as follows:
```ad-command
~~~bash
sudo systemctl restart prometheus
~~~
```

Now, navigate to the URL [http://localhost:9010/rules](http://localhost:9010/rules) from your favorite web browser, and you should see the rule **InstanceDown** that you've just added.
As you stopped **node_exporter** earlier, the alert is active and pending: it is waiting to be sent to Alert Manager.
After a minute has passed, the alert **InstanceDown** should be in the **FIRING** state, which means that the alert has been sent to Alert Manager.

---

### Configuring monitoring modules
[[#^Top|TOP]]

#### Node-Exporter
To begin, download the latest version of Node Exporter here: [Node-Exporter](https://prometheus.io/download/#node_exporter)
```ad-command
~~~bash
wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
~~~
```

##### Unpacking
```ad-command
~~~bash
tar -xf node_exporter-1.3.1.linux-amd64.tar.gz
~~~
```
Then move the binary to a directory from which the system can manage it:
```ad-command
~~~bash
sudo mv node_exporter-1.3.1.linux-amd64/node_exporter /usr/local/bin/
~~~
```

##### Installation & service setup
Strictly speaking, Node Exporter is not really installed; we just create a system service that runs the command.
For that, create a `node_exporter` user that will run the service.
```ad-command
~~~bash
sudo useradd -rs /bin/false node_exporter
~~~
```

Next, create the service itself.
```ad-command
~~~bash
sudo nano /etc/systemd/system/node_exporter.service
~~~
```

The file must contain the following:
```ad-code
~~~bash
[Unit]
Description=Node Exporter
After=network.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
~~~
```

Now reload the systemd daemon:
```ad-command
~~~bash
sudo systemctl daemon-reload
~~~
```

Then start node_exporter:
```ad-command
~~~bash
sudo systemctl start node_exporter
~~~
```

Check that node_exporter is running:
```ad-command
~~~bash
sudo systemctl status node_exporter
~~~
```

If all is well, enable the service at startup:
```ad-command
~~~bash
sudo systemctl enable node_exporter
~~~
```

To confirm that metrics are being served:
```ad-command
~~~bash
curl http://localhost:9100/metrics
~~~
```

##### Adding the host to Prometheus
To add the host, edit the Prometheus configuration file:
```ad-command
~~~bash
sudo nano /etc/prometheus/prometheus.yml
~~~
```

Add a target with the desired IP address below the existing target.
```ad-code
~~~yaml
  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9100']
~~~
```

##### Restarting Prometheus
For the changes to take effect, restart the prometheus service:
```ad-command
~~~bash
sudo systemctl restart prometheus
~~~
```

##### Verification
To check that everything works, open your Prometheus interface ([http://prometheus-ip:9090/targets](http://prometheus-ip:9090/targets)) or Grafana and confirm that your host appears.

---

### Configuring rules and alerts
[[#^Top|TOP]]

#### Introduction
Alerting rules are enabled in `/etc/prometheus/prometheus.yml` by referencing rule files stored in the same folder.
As a generic process, here is what to do:
1. Define & reference the rule file (for example `rules.yml`) in Prometheus' config file `prometheus.yml`
2. Create the rule file
```ad-command
~~~bash
sudo nano /etc/prometheus/rules.yml
~~~
```

3. Add the defined rule. See the external resource below for examples.
4. Relaunch Prometheus
```ad-command
~~~bash
sudo systemctl restart prometheus
~~~
```

If Prometheus fails to restart after this, the cause is most likely a problem in the configuration file. Check indentation, whitespace and other formatting issues before trying to restart the daemon again.

#### External resource
[Awesome Prometheus alerts | Collection of alerting rules](https://awesome-prometheus-alerts.grep.to/rules.html)

---

### Using Prometheus to monitor Caddy
[[#^Top|TOP]]

#### Global parameters
| | |
| --------------------- | -------------------------- |
| **Caddy metrics API** | https://tools.mfxm.fr:7784 |
| **Prometheus web listening port** | 9010 |

#### Adding a monitoring job
[[#^Top|TOP]]

Monitoring jobs are called `scrape` jobs and are defined in the `/etc/prometheus/prometheus.yml` file under the `scrape_configs:` YAML key.
Below is an example of a job definition.
```ad-code
~~~yaml
scrape_configs:
  - job_name: caddy
    scheme: https
    static_configs:
      - targets:
          - tools.mfxm.fr:7784
~~~
```

---

### Using Telegram for notifications
[[#^Top|TOP]]

#### Installing the Telegram Bridge
In order to set up the [[Configuring Telegram bots|Telegram bot]], first clone its GitHub repository into your home directory:
```ad-command
~~~bash
git clone https://github.com/inCaller/prometheus_bot
~~~
```

Move to the created folder:
```ad-command
~~~bash
cd ~/prometheus_bot
~~~
```

Compile the program, which is written in Go:
```ad-command
~~~bash
export GOPATH="your go path"
make clean
make
~~~
```

Update the config file:
```ad-path
/home/melchiorbv/prometheus_bot/config.yaml
```

```ad-code
~~~yaml
telegram_token: "token goes here"
# ONLY IF YOU USING DATA FORMATTING FUNCTION, NOTE for developer: important or test fail
time_outdata: "02/01/2006 15:04:05"
template_path: "/home/melchiorbv/prometheus_bot/template.tmpl" # ONLY IF YOU USING TEMPLATE
time_zone: "Europe/Amsterdam" # ONLY IF YOU USING TEMPLATE
split_msg_byte: 4000
send_only: true # use bot only to send messages.
~~~
```

Then, update the template file:
```ad-path
/home/melchiorbv/prometheus_bot/template.tmpl
```

```ad-code
~~~yaml
Type: {{.CommonAnnotations.description}}
Summary: {{.CommonAnnotations.summary}}
Alertname: {{ .CommonLabels.alertname }}
Instance: {{ .CommonLabels.instance }}
Severity: {{ .CommonLabels.severity }}
Status: {{ .Status }}
~~~
```

Run the daemon with:
```ad-command
~~~bash
./prometheus_bot
~~~
```
The first part is done.

#### Linking the bot to Alertmanager
[[#^Top|TOP]]

Edit the `AlertManager` config file at `/opt/alertmanager/alertmanager.yml` and add the following under `receivers:`:
```ad-code
~~~yaml
- name: 'admins'
  webhook_configs:
  - send_resolved: True
    url: http://127.0.0.1:9087/alert/chat_id
~~~
```
Replace `chat_id` with the value you got from your bot, ***with everything inside the quotes***.
(Some chat IDs start with a `-`; in that case, you must also include the `-` in the URL.)
To use multiple chats, just add more receivers.
Relaunch the AlertManager:
```ad-command
~~~bash
sudo systemctl restart alertmanager.service
~~~
```

[[#^Top|TOP]]
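
For orientation, here is a minimal sketch of how the whole `alertmanager.yml` might look once the receiver is wired to a route. The chat ID `-123456789` is a made-up placeholder, and routing every alert to a single receiver is an assumption made for illustration.

```yaml
# Sketch of /opt/alertmanager/alertmanager.yml
route:
  # Every alert goes to the receiver named below (assumed single-receiver setup)
  receiver: 'admins'

receivers:
  - name: 'admins'
    webhook_configs:
      - send_resolved: true
        # Hypothetical chat ID; use the value returned by your own bot
        url: http://127.0.0.1:9087/alert/-123456789
```

The `receiver` named under `route` has to match the `name` of an entry in `receivers`, otherwise Alert Manager rejects the configuration.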