---
stage: Monitor
group: Respond
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
---
# Prometheus metrics development guidelines
## Adding to the library
We strive to support the 2-4 most important metrics for each common system service that supports Prometheus. If you need support for an exporter that has not yet been added to the library, you can add it to the [`common_metrics.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/prometheus/common_metrics.yml) file.
### Query identifier
When you add a new metric, each query must have a unique identifier (`id`). The identifier is used to update the metric later when the query changes:
```yaml
- group: Response metrics (NGINX Ingress)
  metrics:
    - title: "Throughput"
      y_axis:
        name: "Requests / Sec"
        format: "number"
        precision: 2
      queries:
        - id: response_metrics_nginx_ingress_throughput_status_code
          query_range: 'sum(rate(nginx_upstream_responses_total{upstream=~"%{kube_namespace}-%{ci_environment_slug}-.*"}[2m])) by (status_code)'
          unit: req / sec
          label: Status Code
```
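
Because each `id` must be unique, it can help to check a metrics file for duplicates before importing it. The following is a minimal sketch, not part of GitLab: the `duplicate_query_ids` helper and the `SAMPLE` document are hypothetical, and the sketch assumes the file is a top-level list of groups shaped like the example above (`metrics` containing `queries`, each with an `id`).

```ruby
require "yaml"

# Hypothetical helper (not a GitLab API): returns the query ids that occur
# more than once in a document shaped like the example above.
def duplicate_query_ids(yaml_text)
  groups = YAML.safe_load(yaml_text) || []

  # Collect every query id across all groups and metrics.
  ids = groups.flat_map do |group|
    Array(group["metrics"]).flat_map do |metric|
      Array(metric["queries"]).map { |query| query["id"] }
    end
  end

  # Keep only the ids that appear at least twice.
  ids.compact.group_by(&:itself).select { |_id, seen| seen.size > 1 }.keys
end

# Illustrative input: the second metric reuses the id of the first one.
SAMPLE = <<~DOC
  - group: Response metrics (NGINX Ingress)
    metrics:
      - title: "Throughput"
        queries:
          - id: response_metrics_nginx_ingress_throughput_status_code
            query_range: 'sum(rate(nginx_upstream_responses_total[2m])) by (status_code)'
      - title: "Throughput (copy)"
        queries:
          - id: response_metrics_nginx_ingress_throughput_status_code
            query_range: 'sum(rate(nginx_upstream_responses_total[2m]))'
DOC

puts "Duplicate ids: #{duplicate_query_ids(SAMPLE).inspect}"
```

An empty result means every query in the document can be matched unambiguously when the metric is updated later.
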
### Update existing metrics
After you add a new common metric or change an existing one, you must [re-run the import script](../administration/raketasks/maintenance.md#import-common-metrics), which queries and updates all existing metrics.
Or, you can create a database migration:
```ruby
class ImportCommonMetrics < Gitlab::Database::Migration[2.1]
  def up
    ::Gitlab::DatabaseImporters::CommonMetrics::Importer.new.execute
  end

  def down
    # no-op
  end
end
```
If a query metric (identified by its `id:`) is removed from the file, it is not removed from the database by default.
You might want to add an additional database migration that decides what to do with the removed metric.
For example, you might want to migrate all dependent data to a different metric.
## GitLab Prometheus metrics
GitLab provides [Prometheus metrics](../administration/monitoring/prometheus/gitlab_metrics.md)
to monitor itself.
### Adding a new metric
This section describes how to add new metrics for self-monitoring
([example](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/15440)).
1. Select the [type of metric](https://gitlab.com/gitlab-org/prometheus-client-mmap#metrics):
   - `Gitlab::Metrics.counter`
   - `Gitlab::Metrics.gauge`
   - `Gitlab::Metrics.histogram`
   - `Gitlab::Metrics.summary`
1. Select the appropriate name for your metric. Refer to the guidelines
   for [Prometheus metric names](https://prometheus.io/docs/practices/naming/#metric-names).
1. Update the list of [GitLab Prometheus metrics](../administration/monitoring/prometheus/gitlab_metrics.md).
1. Carefully choose the labels you add to your metric. Labels with high-cardinality values,
   like `project_path` or `project_id`, are strongly discouraged because they can affect the availability
   of our services: each set of labels is exposed as a new entry in the `/metrics` endpoint.
   For example, a histogram with 10 buckets and a label with 100 values would generate 1000
   entries in the export endpoint.
1. Trigger the relevant page or code that records the new metric.
1. Check that the new metric appears at `/-/metrics`.
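
The cardinality arithmetic from the label guidance above can be sketched as follows. This is only an illustration: the `estimated_entries` helper, the `status` label, and the value counts are hypothetical, and the estimate follows the simplified model used in the text (buckets multiplied by label value combinations), ignoring any extra series a histogram may also expose.

```ruby
# Hypothetical helper (not a GitLab API): estimates how many entries a metric
# exposes, as the product of the number of values per label, multiplied by
# the number of buckets for a histogram.
def estimated_entries(label_value_counts, buckets: 1)
  label_value_counts.values.reduce(1, :*) * buckets
end

# The example from the text: 10 buckets and one label with 100 values.
puts estimated_entries({ status: 100 }, buckets: 10)
# prints 1000

# Adding a high-cardinality label such as project_id multiplies the total again.
# The figure of 50,000 projects is an assumption for illustration.
puts estimated_entries({ status: 100, project_id: 50_000 }, buckets: 10)
# prints 50000000
```

The second call shows why labels like `project_id` are discouraged: the number of entries grows with every distinct value the label can take.
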
For metrics that are not bound to a specific context (`request`, `process`, `machine`, `namespace`, and so on),
generate them from a cron-based Sidekiq job:

- For Geo-related metrics, check `Geo::MetricsUpdateService`.
- For other "global" or instance-wide metrics, check `Metrics::GlobalMetricsUpdateService`.

When exporting data from Sidekiq in an installation with more than one Sidekiq instance,
there is no guarantee that the same exporter is always queried.
You can read more about the caveats in [issue 406583](https://gitlab.com/gitlab-org/gitlab/-/issues/406583),
where we also discuss a possible solution using a push-gateway.