---
Alias: ["Prometheus"]
Tag: ["💻", "🖥️", "🔎"]
Date: 2022-03-19
DocType: "Personal"
Hierarchy: "NonRoot"
TimeStamp:
location: [47.3639129,8.55627491017841]
CollapseMetaTable: true
---

Parent:: [[Selfhosting]], [[Configuring Caddy|caddy]], [[Server Tools]]

---

^Top

```button
name Save
type command
action Save current file
id Save
```
^button-ConfiguringPrometheusNSave

# Configuring Prometheus

```ad-abstract
title: Summary
collapse: open
This note runs through the installation and use of Prometheus as a monitoring tool.

Prometheus interacts better with JSON logs than with the Common Log Format, which is Caddy's output.
```

```toc
style: number
```

---

### Introduction
[[#^Top|TOP]]

[Prometheus](https://prometheus.io/docs/introduction/overview/) is a free and open-source monitoring and alerting tool, written in Go, that was initially used for monitoring metrics at SoundCloud back in 2012. Since then it has grown by leaps and bounds and has been adopted by many organizations to monitor their infrastructure metrics. Prometheus records real-time events in a time-series database and provides flexible queries and real-time alerting, which helps with quick diagnosis and troubleshooting of errors.

Prometheus comprises the following major components:
- The main Prometheus server, which scrapes and stores time-series data
- Special-purpose exporters for services such as Graphite, HAProxy, StatsD and many more
- An alert manager for handling alerts
- A push gateway for supporting transient jobs
- Client libraries for instrumenting application code

---

### Installing Prometheus
[[#^Top|TOP]]

#### Installing the main modules
First, we need to create the configuration and data directories for Prometheus.
To create the configuration directory, run the command:
```ad-command
~~~bash
sudo mkdir -p /etc/prometheus
~~~
```

For the data directory, execute:
```ad-command
~~~bash
sudo mkdir -p /var/lib/prometheus
~~~
```

Once the directories are created, grab the compressed installation file:
```ad-command
~~~bash
wget https://github.com/prometheus/prometheus/releases/download/v2.31.0/prometheus-2.31.0.linux-amd64.tar.gz
~~~
```

Once downloaded, extract the tarball. The file name must match the release you downloaded.
```ad-command
~~~bash
tar -xvf prometheus-2.31.0.linux-amd64.tar.gz
~~~
```

Then navigate to the Prometheus folder.
```ad-command
~~~bash
cd prometheus-2.31.0.linux-amd64
~~~
```

Once in the directory, [move](https://linoxide.com/mv-command-in-linux/) the `prometheus` and `promtool` binaries to the `/usr/local/bin/` folder.
```ad-command
~~~bash
sudo mv prometheus promtool /usr/local/bin/
~~~
```

Additionally, move the console files in the `consoles` directory and the library files in the `console_libraries` directory to the `/etc/prometheus/` directory.
```ad-command
~~~bash
sudo mv consoles/ console_libraries/ /etc/prometheus/
~~~
```

Also, make sure to move the `prometheus.yml` template configuration file to the **`/etc/prometheus/`** directory.
```ad-command
~~~bash
sudo mv prometheus.yml /etc/prometheus/prometheus.yml
~~~
```

At this point, Prometheus has been successfully installed.
To check the version of Prometheus installed, run the command:
```ad-command
~~~bash
prometheus --version
~~~
```

Output (the version line reflects the release you installed):
```ad-code
~~~bash
prometheus, version 2.31.3 (branch: HEAD, revision: f29caccc42557f6a8ec30ea9b3c8c089391bd5df)
  build user:       root@5cff4265f0e3
  build date:       20211005-16:10:52
  go version:       go1.17.1
  platform:         linux/amd64
~~~
```

```ad-command
~~~bash
promtool --version
~~~
```

Output:
```ad-code
~~~bash
promtool, version 2.31.3 (branch: HEAD, revision: f29caccc42557f6a8ec30ea9b3c8c089391bd5df)
  build user:       root@5cff4265f0e3
  build date:       20211005-16:10:52
  go version:       go1.17.1
  platform:         linux/amd64
~~~
```
If your output resembles this, you are on the right track. In the next step, we will create a system group and user.

#### Permissions & User Management
[[#^Top|TOP]]

We need to create a Prometheus group and user before proceeding to the next step, which involves creating a systemd unit file for Prometheus.

To create a `prometheus` [group](https://linoxide.com/groupadd-command/), execute the command:
```ad-command
~~~bash
sudo groupadd --system prometheus
~~~
```

Then create the `prometheus` user and assign it to the just-created `prometheus` group.
```ad-command
~~~bash
sudo useradd -s /sbin/nologin --system -g prometheus prometheus
~~~
```

Next, configure the directory ownership and permissions as follows.
```ad-command
~~~bash
sudo chown -R prometheus:prometheus /etc/prometheus/ /var/lib/prometheus/
sudo chmod -R 775 /etc/prometheus/ /var/lib/prometheus/
~~~
```
The only part remaining is to make Prometheus a systemd service so that we can easily manage its running status.

#### Configuring the service
[[#^Top|TOP]]

Using your favorite text editor, create a systemd service file:
```ad-command
~~~bash
sudo nano /etc/systemd/system/prometheus.service
~~~
```

Paste the following lines of code.
```ad-code
~~~bash
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Restart=always
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file=/etc/prometheus/prometheus.yml \
    --storage.tsdb.path=/var/lib/prometheus/ \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries \
    --web.listen-address=0.0.0.0:9090

[Install]
WantedBy=multi-user.target
~~~
```
Save the changes and exit the editor. Then reload systemd so that it picks up the new unit file, and start the Prometheus service.
```ad-command
~~~bash
sudo systemctl daemon-reload
sudo systemctl start prometheus
~~~
```

Enable the Prometheus service to run at startup:
```ad-command
~~~bash
sudo systemctl enable prometheus
~~~
```

Then confirm the status of the Prometheus service.
```ad-command
~~~bash
sudo systemctl status prometheus
~~~
```
![Check status of Prometheus services](https://linoxide.com/wp-content/uploads/2021/11/2021-10-1003-Check-status-of-Prometheus-services.png)

#### Configuration of user access
[[#^Top|TOP]]

Finally, to access Prometheus, configure your reverse proxy ([[Configuring Caddy|caddy]]) to point back to the service.
It is reachable at the address below and listens internally on port 9090:
```ad-address
https://prometheus.mfxm.fr
```

![prometheus dashboard](https://linoxide.com/wp-content/uploads/2021/11/2021-10-1003-Prometheus-dashboard-1024x440.png)

---

### Configuring alerts
[[#^Top|TOP]]

#### Install Alertmanager
Download Alert Manager (v0.23.0 was the latest version at the time of writing) with the following command:
```ad-command
~~~bash
wget https://github.com/prometheus/alertmanager/releases/download/v0.23.0/alertmanager-0.23.0.linux-amd64.tar.gz
~~~
```

The download may take a while to complete. Once it is done, you should find a new archive file **alertmanager-0.23.0.linux-amd64.tar.gz** in your current working directory.

Extract the **alertmanager-0.23.0.linux-amd64.tar.gz** archive with the following command:
```ad-command
~~~bash
tar xzf alertmanager-0.23.0.linux-amd64.tar.gz
~~~
```

You should find a new directory **alertmanager-0.23.0.linux-amd64/**.

Now, move the **alertmanager-0.23.0.linux-amd64** directory to the **/opt/** directory and rename it to **alertmanager** as follows:
```ad-command
~~~bash
sudo mv -v alertmanager-0.23.0.linux-amd64 /opt/alertmanager
~~~
```

Change the user and group of all the files and directories of the `/opt/alertmanager/` directory to root as follows:
```ad-command
~~~bash
sudo chown -Rfv root:root /opt/alertmanager
~~~
```

In the **/opt/alertmanager** directory, you should find the **alertmanager** binary and the Alert Manager configuration file **alertmanager.yml**. You will use both later.

#### Creating a Data Directory
[[#^Top|TOP]]

Alert Manager needs a directory where it can store its data.
As you will be running Alert Manager as the **prometheus** system user, the **prometheus** system user must have access (read, write, and execute permissions) to that data directory.

You can create the **data/** directory in the **/opt/alertmanager/** directory as follows:
```ad-command
~~~bash
sudo mkdir -v /opt/alertmanager/data
~~~
```

Change the owner and group of the **/opt/alertmanager/data/** directory to **prometheus** with the following command:
```ad-command
~~~bash
sudo chown -Rfv prometheus:prometheus /opt/alertmanager/data
~~~
```

The owner and group of the **/opt/alertmanager/data/** directory should be changed to **prometheus**.

#### Starting Alert Manager on Boot
[[#^Top|TOP]]

Now, you have to create a systemd service file for Alert Manager so that you can easily manage (start, stop, restart, and add to startup) the alertmanager service with systemd.

To create a systemd service file **alertmanager.service**, run the following command:
```ad-command
~~~bash
sudo nano /etc/systemd/system/alertmanager.service
~~~
```

Type in the following lines in the **alertmanager.service** file.
```ad-code
~~~bash
[Unit]
Description=Alertmanager for prometheus

[Service]
Restart=always
User=prometheus
ExecStart=/opt/alertmanager/alertmanager --config.file=/opt/alertmanager/alertmanager.yml --storage.path=/opt/alertmanager/data
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
SendSIGKILL=no

[Install]
WantedBy=multi-user.target
~~~
```

For the systemd changes to take effect, run the following command:
```ad-command
~~~bash
sudo systemctl daemon-reload
~~~
```

Now, start the **alertmanager** service with the following command:
```ad-command
~~~bash
sudo systemctl start alertmanager.service
~~~
```

Add the **alertmanager** service to the system startup so that it automatically starts on boot with the following command:
```ad-command
~~~bash
sudo systemctl enable alertmanager.service
~~~
```

Check the status of the service. It should be reported as **active/running** and **enabled** (it will start automatically on boot).
```ad-command
~~~bash
sudo systemctl status alertmanager.service
~~~
```

#### Configuring Prometheus
[[#^Top|TOP]]

Now, you have to configure Prometheus to use Alert Manager. You can also monitor Alert Manager with Prometheus. Both are covered in this section.

First, find the IP address of the computer where you have installed Alert Manager with the following command:
```ad-command
~~~bash
hostname -I
~~~
```

Now, open the Prometheus configuration file **/etc/prometheus/prometheus.yml** with the **nano** text editor as follows:
```ad-command
~~~bash
sudo nano /etc/prometheus/prometheus.yml
~~~
```

Type in the following lines in the **scrape_configs** section to add Alert Manager for monitoring with Prometheus.
```ad-code
~~~yaml
- job_name: 'alertmanager'
  static_configs:
  - targets: ['localhost:9093']
~~~
```

Also, type in the IP address and port number of Alert Manager in the **alerting > alertmanagers** section.
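As an illustration of the **alerting > alertmanagers** section mentioned above, here is a minimal sketch. It assumes Alert Manager runs on the same host as Prometheus and listens on port 9093, as in the scrape job; the scratch file path is hypothetical and only serves to review the fragment before merging it by hand into `prometheus.yml`.

```bash
# Scratch file holding the fragment (hypothetical path, for review only).
fragment="${TMPDIR:-/tmp}/alerting-fragment.yml"

# Assumption: Alert Manager is reachable on localhost:9093.
cat > "$fragment" <<'EOF'
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 'localhost:9093'
EOF

# Show the fragment so it can be copied into /etc/prometheus/prometheus.yml.
cat "$fragment"

# Once merged, validate the full configuration if promtool is installed.
if command -v promtool >/dev/null 2>&1 && [ -r /etc/prometheus/prometheus.yml ]; then
  promtool check config /etc/prometheus/prometheus.yml \
    || echo "prometheus.yml did not validate" >&2
fi
```

If Alert Manager runs on another machine, replace `localhost` with the address returned by `hostname -I` on that machine.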
For the changes to take effect, restart the **prometheus** service as follows:
```ad-command
~~~bash
sudo systemctl restart prometheus
~~~
```

Visit the URL [http://192.168.20.161:9090/targets](http://192.168.20.161:9090/targets) from your favorite web browser (replace the IP address with your own), and you should see that **alertmanager** is in the **UP** state. So, Prometheus can access Alert Manager just fine.

#### Creating a Prometheus Alert Rule
[[#^Top|TOP]]

On Prometheus, you can use the **up** expression to find the state of the targets added to Prometheus.

The targets that are in the **UP** state (running and accessible to Prometheus) will have the value **1**, and targets that are in the **DOWN** state (not running or inaccessible to Prometheus) will have the value **0**.

For example, stop one of the targets, say **node_exporter**:
```ad-command
~~~bash
sudo systemctl stop node_exporter.service
~~~
```

The **up** value of that target should now be **0**.

So, you can use the **up == 0** expression to list only the targets that are not running or are inaccessible to Prometheus.

This expression can be used to create a Prometheus Alert and send alerts to Alert Manager when one or more targets are not running or are inaccessible to Prometheus.

To create a Prometheus Alert, create a new file **rules.yml** in the **/etc/prometheus/** directory as follows:
```ad-command
~~~bash
sudo nano /etc/prometheus/rules.yml
~~~
```

Now, type in the following lines in the **rules.yml** file.
```ad-code
~~~yaml
groups:
- name: test
  rules:
  - alert: InstanceDown
    expr: up == 0
    for: 1m
~~~
```

Here, the alert **InstanceDown** will be fired when targets are not running or are inaccessible to Prometheus (that is **up == 0**) for a minute (**1m**).
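Since indentation mistakes in a rule file can prevent Prometheus from restarting, it may help to validate the file first. The sketch below wraps `promtool check rules` (shipped with Prometheus and installed earlier in `/usr/local/bin/`) in a small helper; the function name `check_rules` is hypothetical and only used for illustration.

```bash
# Validate a Prometheus rule file before restarting the service.
# Usage: check_rules /etc/prometheus/rules.yml
check_rules() {
  local file="$1"
  if [ ! -r "$file" ]; then
    echo "rule file not readable: $file" >&2
    return 1
  fi
  if ! command -v promtool >/dev/null 2>&1; then
    echo "promtool not found in PATH; skipping validation" >&2
    return 0
  fi
  # promtool exits non-zero on YAML or PromQL errors.
  promtool check rules "$file"
}

check_rules /etc/prometheus/rules.yml || echo "fix rules.yml before restarting Prometheus"
```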
Now, open the Prometheus configuration file **/etc/prometheus/prometheus.yml** with the **nano** text editor as follows:
```ad-command
~~~bash
sudo nano /etc/prometheus/prometheus.yml
~~~
```

Add the **rules.yml** file in the **rule_files** section of the prometheus.yml configuration file.

Another important option of the **prometheus.yml** file is **evaluation_interval**. Prometheus evaluates its rules once every **evaluation_interval**. The template configuration file sets it to 15s (**15** seconds), while the built-in default is 1m. With the template value, the alert rules in the **rules.yml** file are checked every 15 seconds.

For the changes to take effect, restart the **prometheus** service as follows:
```ad-command
~~~bash
sudo systemctl restart prometheus
~~~
```

Now, navigate to the URL [http://localhost:9010/rules](http://localhost:9010/rules) from your favorite web browser (adjust the port to match your `--web.listen-address`), and you should see the rule **InstanceDown** that you've just added.

As you stopped **node_exporter** earlier, the alert is active, and it is waiting to be sent to the Alert Manager.

After a minute has passed, the alert **InstanceDown** should be in the **FIRING** state. It means that the alert is sent to the Alert Manager.

---

### Configuring monitoring modules
[[#^Top|TOP]]

#### Node-Exporter
To begin, download the latest version of Node Exporter here: [Node-Exporter](https://prometheus.io/download/#node_exporter)
```ad-command
~~~bash
wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
~~~
```

##### Unpacking
```ad-command
~~~bash
tar -xf node_exporter-1.3.1.linux-amd64.tar.gz
~~~
```
Then move the binary to a directory from which the system can manage it.
```ad-command
~~~bash
sudo mv node_exporter-1.3.1.linux-amd64/node_exporter /usr/local/bin/
~~~
```

##### Installation & Service setup
In reality, Node Exporter is not really installed; we simply create a system service that runs the command.
To do that, create a `node_exporter` user that will own the service.
```ad-command
~~~bash
sudo useradd -rs /bin/false node_exporter
~~~
```

Next, create the service itself.
```ad-command
~~~bash
sudo nano /etc/systemd/system/node_exporter.service
~~~
```

The file must contain the following:
```ad-code
~~~bash
[Unit]
Description=Node Exporter
After=network.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
~~~
```

Now reload the systemd daemon:
```ad-command
~~~bash
sudo systemctl daemon-reload
~~~
```

Then start node_exporter:
```ad-command
~~~bash
sudo systemctl start node_exporter
~~~
```

Check that node_exporter is running:
```ad-command
~~~bash
sudo systemctl status node_exporter
~~~
```

If all is well, enable the service at startup:
```ad-command
~~~bash
sudo systemctl enable node_exporter
~~~
```

To confirm that metrics are being served:
```ad-command
~~~bash
curl http://localhost:9100/metrics
~~~
```

##### Adding the host to Prometheus
To add the host, edit the Prometheus configuration file:
```ad-command
~~~bash
sudo nano /etc/prometheus/prometheus.yml
~~~
```

Add a target with the desired IP address below the existing target.
```ad-code
~~~yaml
  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9100']
~~~
```

##### Restarting Prometheus
For the changes to be taken into account, restart the prometheus service:
```ad-command
~~~bash
sudo systemctl restart prometheus
~~~
```

##### Verification
To check that everything works, open your Prometheus interface ([http://prometheus-ip:9090/targets](http://prometheus-ip:9090/targets)) or Grafana and confirm that your host appears.
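The same check can be done from the command line through the Prometheus HTTP API. This is a minimal sketch, assuming Prometheus listens on `localhost:9090` as in the unit file of this note and that the job is named `node_exporter`; the `PROM_URL` variable and the `query_url` helper are hypothetical names introduced for illustration.

```bash
# Base URL of the Prometheus server (assumption: port from the unit file in this note).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Print the instant-query endpoint of the HTTP API.
query_url() {
  printf '%s/api/v1/query' "$PROM_URL"
}

# Ask Prometheus for the "up" series of the node_exporter job.
# A sample value of "1" in the JSON response means the target is being scraped.
if command -v curl >/dev/null 2>&1; then
  curl -fsS --max-time 5 -G "$(query_url)" \
    --data-urlencode 'query=up{job="node_exporter"}' \
    || echo "Prometheus not reachable at $PROM_URL" >&2
fi
```

If your instance listens on another port, set `PROM_URL` before running the snippet.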
---

### Configuring rules and alerts
[[#^Top|TOP]]

#### Introduction
Alert rules are declared by referencing rule files in `/etc/prometheus/prometheus.yml`. The rule files themselves live in the same folder.

As a generic process, here is what to do:
1. Reference the rule file `rules.yml` in Prometheus' config file `prometheus.yml`
2. Create the rule file
```ad-command
~~~bash
sudo nano /etc/prometheus/rules.yml
~~~
```

3. Add the defined rule. See the external resource for examples.
4. Relaunch Prometheus
```ad-command
~~~bash
sudo systemctl restart prometheus
~~~
```

Once this is done, Prometheus may fail to restart, which points to a problem in the configuration file. Please check whitespace and other formatting issues before trying to restart the daemon again.

#### External resource
[Awesome Prometheus alerts | Collection of alerting rules](https://awesome-prometheus-alerts.grep.to/rules.html)

---

### Using Prometheus to monitor Caddy
[[#^Top|TOP]]

#### Global parameters

|                                   |                            |
| --------------------------------- | -------------------------- |
| **Caddy metrics API**             | https://tools.mfxm.fr:7784 |
| **Prometheus web listening port** | 9010                       |

#### Adding a monitoring job
[[#^Top|TOP]]

Monitoring jobs are called `scrape` jobs and are defined in the `/etc/prometheus/prometheus.yml` file under the `scrape_configs:` YAML key.

Below is an example of job definition.
```ad-code
~~~yaml
scrape_configs:
  - job_name: caddy
    scheme: https
    static_configs:
      - targets:
          - tools.mfxm.fr:7784
~~~
```

---

### Using Telegram for notifications
[[#^Top|TOP]]

#### Installing the Telegram Bridge
In order to set up the [[Configuring Telegram bots|Telegram bot]], first clone its GitHub repository into your home directory:
```ad-command
~~~bash
cd ~
git clone https://github.com/inCaller/prometheus_bot
~~~
```

Move to the created folder:
```ad-command
~~~bash
cd ~/prometheus_bot
~~~
```

Compile the program with Go:
```ad-command
~~~bash
export GOPATH="your go path"
make clean
make
~~~
```

Update the config file:
```ad-path
/home/melchiorbv/prometheus_bot/config.yaml
```

```ad-code
~~~yaml
telegram_token: "token goes here"
# ONLY IF YOU ARE USING THE DATE FORMATTING FUNCTION, NOTE for developer: important or test fail
time_outdata: "02/01/2006 15:04:05"
template_path: "/home/melchiorbv/prometheus_bot/template.tmpl" # ONLY IF YOU ARE USING A TEMPLATE
time_zone: "Europe/Amsterdam" # ONLY IF YOU ARE USING A TEMPLATE
split_msg_byte: 4000
send_only: true # use bot only to send messages.
~~~
```

Then, update the template file:
```ad-path
/home/melchiorbv/prometheus_bot/template.tmpl
```

```ad-code
~~~yaml
Type: {{.CommonAnnotations.description}}
Summary: {{.CommonAnnotations.summary}}
Alertname: {{ .CommonLabels.alertname }}
Instance: {{ .CommonLabels.instance }}
Severity: {{ .CommonLabels.severity }}
Status: {{ .Status }}
~~~
```

Run the daemon with:
```ad-command
~~~bash
./prometheus_bot
~~~
```
First part done.

#### Linking the bot to Alertmanager
[[#^Top|TOP]]

Edit the `AlertManager` config file under `/opt/alertmanager/alertmanager.yml` and add:
```ad-code
~~~yaml
- name: 'admins'
  webhook_configs:
  - send_resolved: True
    url: http://127.0.0.1:9087/alert/chat_id
~~~
```
Replace `chat_id` with the value you got from your bot, ***with everything inside the quotes***.
(Some chat IDs start with a `-`. In this case, you must also include the `-` in the URL.)

To use multiple chats, just add more receivers.

Relaunch the AlertManager:
```ad-command
~~~bash
sudo systemctl restart alertmanager.service
~~~
```

[[#^Top|TOP]]