[Prometheus](https://prometheus.io/docs/introduction/overview/) is a free and open-source monitoring and alerting tool, written in Go, that was first used to monitor metrics at SoundCloud in 2012.
Prometheus monitors and records real-time events in a time-series database. Since 2012 it has grown by leaps and bounds and has been adopted by many organizations to monitor their infrastructure metrics. Its flexible queries and real-time alerting help with quick diagnosis and troubleshooting of errors.
Prometheus comprises the following major components:
- The main Prometheus server for scraping and storing time-series data
- Special-purpose exporters for services such as Graphite, HAProxy, and StatsD
- An alert manager for handling alerts
- A push-gateway for supporting transient jobs
- Client libraries for instrumenting application code
Enable the Prometheus service so that it starts at boot:
```ad-command
~~~bash
sudo systemctl enable prometheus
~~~
```
 
Then confirm the status of the Prometheus service.
```ad-command
~~~bash
sudo systemctl status prometheus
~~~
```
![Check status of Prometheus services](https://linoxide.com/wp-content/uploads/2021/11/2021-10-1003-Check-status-of-Prometheus-services.png)
Change the owner and group of all the files and directories in the `/opt/alertmanager/` directory to root as follows:
```ad-command
~~~bash
sudo chown -Rfv root:root /opt/alertmanager
~~~
```
 
In the **/opt/alertmanager** directory, you should find the **alertmanager** binary and the Alert Manager configuration file **alertmanager.yml**. You will use both later.
 
#### Creating a Data Directory
[[#^Top|TOP]]
Alert Manager needs a directory where it can store its data. As you will be running Alert Manager as the **prometheus** system user, the **prometheus** system user must have access (read, write, and execute permissions) to that data directory.
You can create the **data/** directory in the **/opt/alertmanager/** directory as follows:
```ad-command
~~~bash
sudo mkdir -v /opt/alertmanager/data
~~~
```
 
Change the owner and group of the **/opt/alertmanager/data/** directory to **prometheus**, for example with `sudo chown -Rfv prometheus:prometheus /opt/alertmanager/data`.
 
#### Starting Alert Manager on Boot
[[#^Top|TOP]]
Now, you have to create a systemd service file for Alert Manager so that you can easily manage (start, stop, restart, and add to startup) the alertmanager service with systemd.
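Create a systemd service file named **alertmanager.service** that runs the **alertmanager** binary as the **prometheus** user with the **alertmanager.yml** configuration file and the **data/** directory created above.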
For the systemd changes to take effect, run the following command:
```ad-command
~~~bash
sudo systemctl daemon-reload
~~~
```
 
Now, start the **alertmanager** service with the following command:
```ad-command
~~~bash
sudo systemctl start alertmanager.service
~~~
```
 
Add the **alertmanager** service to the system startup so that it automatically starts on boot with the following command:
```ad-command
~~~bash
sudo systemctl enable alertmanager.service
~~~
```
 
Check the status of the **alertmanager** service with the following command. It should be **active/running** and **enabled** (it will start automatically on boot).
```ad-command
~~~bash
sudo systemctl status alertmanager.service
~~~
```
 
#### Configuring Prometheus
[[#^Top|TOP]]
Now, you have to configure Prometheus to use Alert Manager. You can also monitor Alert Manager with Prometheus. I will show you how to do both in this section.
First, find the IP address of the computer where you have installed Alert Manager with the following command:
```ad-command
~~~bash
hostname -I
~~~
```
 
Now, open the Prometheus configuration file **/etc/prometheus/prometheus.yml** with the **nano** text editor as follows:
```ad-command
~~~bash
sudo nano /etc/prometheus/prometheus.yml
~~~
```
 
Type in the following lines in the **scrape_configs** section to add Alert Manager for monitoring with Prometheus.
```ad-code
~~~yaml
  - job_name: 'alertmanager'
    static_configs:
      - targets: ['localhost:9093']
~~~
```
 
Also, type in the IP address and port number of Alert Manager in the **alerting > alertmanagers** section.
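A minimal sketch of that section, assuming Alert Manager listens on its default port 9093 on the same host as Prometheus (replace `localhost` with the IP address found earlier if it runs elsewhere):
```ad-code
~~~yaml
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            # Assumption: Alert Manager on the same host, default port
            - localhost:9093
~~~
```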
For the changes to take effect, restart the **prometheus** service as follows:
```ad-command
~~~bash
sudo systemctl restart prometheus
~~~
```
 
Visit the URL [http://192.168.20.161:9090/targets](http://192.168.20.161:9090/targets) from your favorite web browser, and you should see that **alertmanager** is in the **UP** state. So, Prometheus can access Alert Manager just fine.
 
#### Creating a Prometheus Alert Rule
[[#^Top|TOP]]
On Prometheus, you can use the **up** expression to find the state of the targets added to Prometheus.
The targets that are in the **UP** state (running and accessible to Prometheus) will have the value **1**, and targets that are in the **DOWN** state (not running or inaccessible to Prometheus) will have the value **0**.
Suppose you stop one of the targets, for example **node_exporter**:
```ad-command
~~~bash
sudo systemctl stop node-exporter.service
~~~
```
 
The **up** value of that target should now be **0**.
So, you can use the **up == 0** expression to list only the targets that are not running or are inaccessible to Prometheus.
This expression can be used to create a Prometheus Alert and send alerts to Alert Manager when one or more targets are not running or are inaccessible to Prometheus.
To create a Prometheus Alert, create a new file **rules.yml** in the **/etc/prometheus/** directory as follows:
```ad-command
~~~bash
sudo nano /etc/prometheus/rules.yml
~~~
```
 
Now, type in the following lines in the **rules.yml** file.
```ad-code
~~~yaml
groups:
  - name: test
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
~~~
```
 
Here, the alert **InstanceDown** fires when a target has not been running or has been inaccessible to Prometheus (that is, **up == 0**) for a minute (**1m**).
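Alert rules can also carry labels and annotations, which Alert Manager passes on to notification templates. A sketch extending the rule above; the `severity: critical` label and the annotation texts are illustrative assumptions, not part of the original setup:
```ad-code
~~~yaml
groups:
  - name: test
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          # Assumed label value, pick whatever scheme suits your setup
          severity: critical
        annotations:
          # $labels.instance and $labels.job come from the scraped target
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."
~~~
```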
Now, open the Prometheus configuration file **/etc/prometheus/prometheus.yml** with the **nano** text editor as follows:
```ad-command
~~~bash
sudo nano /etc/prometheus/prometheus.yml
~~~
```
 
Add the **rules.yml** file to the **rule_files** section of the **prometheus.yml** configuration file.
Another important option of the **prometheus.yml** file is **evaluation_interval**, which sets how often Prometheus evaluates its rules. The sample configuration shipped with Prometheus sets it to 15s (**15** seconds); if the option is omitted, the built-in default is 1m. With 15s, the alert rules in the **rules.yml** file are evaluated every 15 seconds.
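Put together, the two settings might look like the sketch below. It assumes **rules.yml** sits in the same directory as **prometheus.yml** (relative paths in **rule_files** are resolved against the configuration file's directory) and uses the 15s interval mentioned above:
```ad-code
~~~yaml
global:
  # How often the alert rules are evaluated
  evaluation_interval: 15s

rule_files:
  # Assumption: rules.yml is stored next to prometheus.yml
  - rules.yml
~~~
```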
For the changes to take effect, restart the **prometheus** service as follows:
```ad-command
~~~bash
sudo systemctl restart prometheus
~~~
```
 
Now, navigate to the URL [http://localhost:9010/rules](http://localhost:9010/rules) from your favorite web browser, and you should see the rule **InstanceDown** that you’ve just added.
As you stopped **node_exporter** earlier, the alert is already active and is in the **PENDING** state, waiting to be sent to the Alert Manager.
After a minute has passed, the alert **InstanceDown** should be in the **FIRING** state, which means that the alert has been sent to the Alert Manager.
 
---
 
### Configuring monitoring modules
[[#^Top|TOP]]
 
 
---
 
### Configuring rules and alerts
[[#^Top|TOP]]
 
#### Introduction
Alert rules are enabled in `/etc/prometheus/prometheus.yml` by referencing rule files stored in the same folder. The generic process is:
1. Reference the rule file, for example `rules.yml`, in the `rule_files` section of Prometheus' configuration file
2. Create the rule file
```ad-command
~~~bash
sudo nano /etc/prometheus/rules.yml
~~~
```
 
3. Add the rule definitions to the file
See the external resource below for examples.
4. Restart Prometheus
```ad-command
~~~bash
sudo systemctl restart prometheus
~~~
```
 
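If Prometheus fails to restart after this, the cause is usually a problem in the configuration or rule file. Check whitespace (YAML is indentation-sensitive) and other formatting issues before restarting the daemon again. Running `promtool check config /etc/prometheus/prometheus.yml`, a tool that ships with Prometheus, reports such errors.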
 
#### External resource
[Awesome Prometheus alerts | Collection of alerting rules](https://awesome-prometheus-alerts.grep.to/rules.html)
Monitoring jobs are called `scrape` jobs and are defined in the `/etc/prometheus/prometheus.yml` file under the `scrape_configs:` YAML key. Below is an example of a job definition.
```ad-code
~~~yaml
scrape_configs:
  - job_name: caddy
    scheme: https
    static_configs:
      - targets:
          - tools.mfxm.fr:7784
~~~
```
 
---
 
### Using Telegram for notifications
[[#^Top|TOP]]
 
#### Installing the Telegram Bridge
To set up the [[Configuring Telegram bots|Telegram bot]], first pull the bridge from its GitHub repository, then adjust its configuration file:
```ad-code
~~~yaml
# ONLY IF YOU USING DATA FORMATTING FUNCTION, NOTE for developer: important or test fail
time_outdata: "02/01/2006 15:04:05"
template_path: "/home/melchiorbv/prometheus_bot/template.tmpl" # ONLY IF YOU USING TEMPLATE
time_zone: "Europe/Amsterdam" # ONLY IF YOU USING TEMPLATE
split_msg_byte: 4000
send_only: true # use bot only to send messages.
~~~
```
 
Then, update the template file:
```ad-path
/home/melchiorbv/prometheus_bot/template.tmpl
```
 
```ad-code
~~~yaml
Type: {{.CommonAnnotations.description}}
Summary: {{.CommonAnnotations.summary}}
Alertname: {{ .CommonLabels.alertname }}
Instance: {{ .CommonLabels.instance }}
Severity: {{ .CommonLabels.severity }}
Status: {{ .Status }}
~~~
```
 
Run the daemon with:
```ad-command
~~~bash
./prometheus_bot
~~~
```
First part done.
 
#### Linking the bot to Alertmanager
[[#^Top|TOP]]
Edit the Alertmanager configuration file at `/opt/alertmanager/alertmanager.yml` and add the following receiver:
```ad-code
~~~yaml
- name: 'admins'
  webhook_configs:
  - send_resolved: True
    url: http://127.0.0.1:9087/alert/chat_id
~~~
```
Replace `chat_id` with the value you got from your bot, ***including everything inside the quotes***. Some chat IDs start with a `-`; in that case, the `-` must also be included in the URL. To use multiple chats, add more receivers.
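A receiver only gets alerts if a route points at it. A minimal sketch of a complete `alertmanager.yml`, assuming `admins` is the only receiver and using a made-up chat ID (`-1001234567890` is a placeholder, not a real value):
```ad-code
~~~yaml
route:
  # Send every alert to the Telegram bridge by default
  receiver: 'admins'

receivers:
  - name: 'admins'
    webhook_configs:
      - send_resolved: true
        # Hypothetical chat ID; note the leading "-" is part of the URL
        url: http://127.0.0.1:9087/alert/-1001234567890
~~~
```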