---
stage: Monitor
group: Health
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---

# Set up alerts for Prometheus metrics **(FREE)**

> [Moved](https://gitlab.com/gitlab-org/gitlab/-/issues/42640) to GitLab Free in 12.10.

After [configuring metrics for your CI/CD environment](index.md), you can set up
alerting for Prometheus metrics depending on the location of your instances, and
[trigger actions from alerts](#trigger-actions-from-alerts) to notify
your team when environment performance falls outside of the boundaries you set.

## Managed Prometheus instances

> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/6590) in [GitLab Ultimate](https://about.gitlab.com/pricing/) 11.2 for [custom metrics](index.md#adding-custom-metrics), and GitLab 11.3 for [library metrics](../../user/project/integrations/prometheus_library/index.md).

For managed Prometheus instances using auto configuration, you can
[configure alerts for metrics](index.md#adding-custom-metrics) directly in the
[metrics dashboard](index.md). To set an alert:

1. In your project, navigate to **Operations > Metrics**.
1. Identify the metric you want to create the alert for, and click the
   **ellipsis** **{ellipsis_v}** icon in the top right corner of the metric.
1. Choose **Alerts**.
1. Set the threshold and operator.
1. (Optional) Add a Runbook URL.
1. Click **Add** to save and activate the alert.

![Adding an alert](img/prometheus_alert.png)

To remove the alert, click the alert icon for the desired metric again, and then click **Delete**.

### Link runbooks to alerts

> Runbook URLs [introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/39315) in GitLab 13.3.

When creating alerts from the metrics dashboard for [managed Prometheus instances](#managed-prometheus-instances),
you can also link a runbook. When the alert triggers, the
[chart context menu](dashboards/index.md#chart-context-menu) on the metrics chart
links to the runbook, making it easy for you to locate and access the correct runbook
as soon as the alert fires:

![Linked Runbook in charts](img/linked_runbooks_on_charts.png)

## External Prometheus instances

> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/9258) in GitLab Ultimate 11.8.
> - [Moved](https://gitlab.com/gitlab-org/gitlab/-/issues/42640) to GitLab Free in 12.10.

For manually configured Prometheus servers, GitLab provides a notify endpoint for
use with Prometheus webhooks. If you have manual configuration enabled, an
**Alerts** section is added to **Settings > Integrations > Prometheus**.
This section contains the needed **URL** and **Authorization Key**. The
**Reset Key** button invalidates the key and generates a new one.

![Prometheus service configuration of Alerts](img/prometheus_service_alerts.png)

To send alert notifications to GitLab, copy the **URL** and **Authorization Key** into the
[`webhook_configs`](https://prometheus.io/docs/alerting/latest/configuration/#webhook_config)
section of your Prometheus Alertmanager configuration:

```yaml
receivers:
  - name: gitlab
    webhook_configs:
      - http_config:
          # Authorization Key copied from the GitLab Alerts section
          bearer_token: 9e1cbfcd546896a9ea8be557caf13a76
        send_resolved: true
        # URL copied from the GitLab Alerts section
        url: http://192.168.178.31:3001/root/manual_prometheus/prometheus/alerts/notify.json
        # Rest of configuration omitted
        # ...
```
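The snippet above omits the rest of the Alertmanager configuration. As a minimal sketch, assuming every alert should be forwarded to GitLab, the top-level `route` could name the `gitlab` receiver as its default. Your own routing tree may differ:

```yaml
# Assumption: no other receivers or sub-routes are needed.
route:
  # Must match the `name` of the receiver that holds the GitLab webhook.
  receiver: gitlab
```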
For GitLab to associate your alerts with an [environment](../../ci/environments/index.md),
you must configure a `gitlab_environment_name` label on the alerts you set up in
Prometheus. The label's value must match the name of your environment in GitLab.

You can display alerts with a `gitlab_environment_name` of `production`
[on a dashboard](../../user/operations_dashboard/index.md#adding-a-project-to-the-dashboard).

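As an illustration, the label can be set on an alerting rule in a Prometheus rules file. The following is a minimal sketch, not a definitive configuration: the group name, alert name, job name, and expression are hypothetical, and `production` is assumed to match an environment name in your GitLab project:

```yaml
groups:
  - name: example-gitlab-alerts        # hypothetical group name
    rules:
      - alert: InstanceDown            # hypothetical alert name
        expr: 'up{job="my-app"} == 0'  # hypothetical expression and job
        for: 5m
        labels:
          # Must match the environment name in GitLab.
          gitlab_environment_name: production
```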
In GitLab 13.1 and later, you can set up your manually configured
Prometheus server to use the
[Generic alerts integration](../incident_management/integrations.md).

## Trigger actions from alerts **(ULTIMATE)**

> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/4925) in [GitLab Ultimate](https://about.gitlab.com/pricing/) 11.11.
> - [From GitLab Ultimate 12.5](https://gitlab.com/gitlab-org/gitlab/-/issues/13401), when GitLab receives a recovery alert, it automatically closes the associated issue.

Alerts can be used to trigger actions, like opening an issue automatically
(disabled by default in GitLab 13.1 and later). To configure the actions:

1. Navigate to your project's **Settings > Operations > Incidents**.
1. Enable the option to create issues.
1. Choose the [issue template](../../user/project/description_templates.md) to create the issue from.
1. Optionally, select whether to send an email notification to the developers of the project.
1. Click **Save changes**.

After you enable this option, GitLab automatically opens an issue when an alert is
triggered. The issue contains values extracted from the
[`alerts` field in the webhook payload](https://prometheus.io/docs/alerting/latest/configuration/#webhook_config):

- Issue author: `GitLab Alert Bot`
- Issue title: Extracted from the alert payload fields `annotations/title`, `annotations/summary`, or `labels/alertname`.
- Issue description: Extracted from alert payload field `annotations/description`.
- Alert `Summary`: A list of properties from the alert's payload.
  - `starts_at`: Alert start time from the payload's `startsAt` field
  - `full_query`: Alert query extracted from the payload's `generatorURL` field
  - Optional list of attached annotations extracted from `annotations/*`
- Alert [GFM](../../user/markdown.md): GitLab Flavored Markdown from the payload's `annotations/gitlab_incident_markdown` field.
- Alert Severity (introduced in GitLab version [13.9](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/50871)):
  Extracted from the alert payload field `labels/severity`. Maps case-insensitive
  value to [Alert's severity](../incident_management/alerts.md#alert-severity):
  - **Critical**: `critical`, `s1`, `p1`, `emergency`, `fatal`, or any value not in this list
  - **High**: `high`, `s2`, `p2`, `major`, `page`
  - **Medium**: `medium`, `s3`, `p3`, `error`, `alert`
  - **Low**: `low`, `s4`, `p4`, `warn`, `warning`
  - **Info**: `info`, `s5`, `p5`, `debug`, `information`, `notice`

When GitLab receives a **Recovery Alert**, it closes the associated issue.
This action is recorded as a system message on the issue indicating that it
was closed automatically by the GitLab Alert bot.

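To illustrate how the payload fields listed above map onto an alert, here is a sketch of a Prometheus alerting rule that sets several of them. The group name, alert name, expression, annotation text, and the `backend` label in the quick action are hypothetical placeholders, not values GitLab requires:

```yaml
groups:
  - name: example-gitlab-incidents     # hypothetical group name
    rules:
      - alert: InstanceDown            # sent as labels/alertname
        expr: 'up{job="my-app"} == 0'  # hypothetical expression and job
        for: 5m
        labels:
          severity: s2                 # mapped to High, per the list above
          gitlab_environment_name: production
        annotations:
          # One of the fields the issue title can be extracted from.
          title: Instance is down
          # Used for the issue description.
          description: The application instance has been unreachable for 5 minutes.
          # GitLab Flavored Markdown added to the issue; may contain quick actions.
          gitlab_incident_markdown: |
            Check the application logs before restarting the instance.
            /label ~backend
```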
To further customize the issue, you can add labels, mentions, or any other supported
[quick action](../../user/project/quick_actions.md) in the selected issue template,
which applies to all incidents. To limit quick actions or other information to
only specific types of alerts, use the `annotations/gitlab_incident_markdown` field.

Since [version 12.2](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/63373),
GitLab tags each incident issue with the `incident` label automatically. If the label
does not yet exist, it is also created automatically.

If the metric exceeds the threshold of the alert for over 5 minutes, GitLab sends
an email to all [Maintainers and Owners](../../user/permissions.md#project-members-permissions)
of the project.