---
stage: Analytics
group: Product Intelligence
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
---

# Metrics instrumentation guide

This guide describes how to develop Service Ping metrics using metrics instrumentation.

<i class="fa fa-youtube-play youtube" aria-hidden="true"></i>
For a video tutorial, see the [Adding Service Ping metric via instrumentation class](https://youtu.be/p2ivXhNxUoY).

## Nomenclature

- **Instrumentation class**:
  - Inherits one of the metric classes: `DatabaseMetric`, `RedisMetric`, `RedisHLLMetric`, `NumbersMetric` or `GenericMetric`.
  - Implements the logic that calculates the value for a Service Ping metric.

- **Metric definition**:
  The Service Data metric YAML definition.

- **Hardening**:
  Hardening a method is the process that ensures the method fails safe, returning a fallback value like -1.

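As a rough illustration of hardening, the sketch below wraps a computation so that any error yields the fallback value instead of propagating. The helper name `with_fallback` and the constant `FALLBACK` are hypothetical and are not taken from the GitLab codebase.

```ruby
# Hypothetical helper, for illustration only.
FALLBACK = -1

def with_fallback(fallback: FALLBACK)
  yield
rescue StandardError
  # Fail safe: report the fallback instead of propagating the error.
  fallback
end

healthy = with_fallback { 42 }
broken  = with_fallback { raise IOError, "database unavailable" }
```

Here `healthy` holds the computed value, while `broken` holds the fallback because the block raised.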
## How it works

A metric definition has the [`instrumentation_class`](metrics_dictionary.md) field, which can be set to a class.

The defined instrumentation class should inherit one of the existing metric classes: `DatabaseMetric`, `RedisMetric`, `RedisHLLMetric`, `NumbersMetric` or `GenericMetric`.

The current convention is that a single instrumentation class corresponds to a single metric. On rare occasions, there are exceptions to that convention, like [Redis metrics](#redis-metrics). To use a single instrumentation class for more than one metric, reach out to one of the `@gitlab-org/analytics-section/product-intelligence/engineers` members to consult about your case.

Using the instrumentation classes ensures that metrics can fail safe individually, without breaking the entire
process of Service Ping generation.

We have built a domain-specific language (DSL) to define the metrics instrumentation.

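To make the relationship between a definition and its class more concrete, here is a minimal plain-Ruby sketch of a class-level DSL and of resolving a class from its name. The names `ToyInstrumentations`, `ToyGenericMetric`, `InstanceNameMetric`, and the `key_path` entry are illustrative assumptions. The real base classes do considerably more.

```ruby
module ToyInstrumentations
  # Toy base class: only supports a `value` macro.
  class ToyGenericMetric
    class << self
      attr_reader :value_block

      # Class-level macro: remembers the block that computes the value.
      def value(&block)
        @value_block = block
      end
    end

    # Instance-level reader: runs the block stored by the macro.
    def value
      self.class.value_block.call
    end
  end

  class InstanceNameMetric < ToyGenericMetric
    value { "example-instance" }
  end
end

# A definition names its class as a string, similar to a YAML field.
definition = {
  "key_path" => "instance_name",
  "instrumentation_class" => "InstanceNameMetric"
}

metric_class = ToyInstrumentations.const_get(definition["instrumentation_class"])
report = { definition["key_path"] => metric_class.new.value }
```

The point of the design is that the definition stays declarative: it only names a class, and the class carries the logic.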
## Database metrics

You can use database metrics to track data kept in the database, for example, a count of issues that exist on a given instance.

- `operation`: Operation for the given `relation`. One of `count`, `distinct_count`, `estimate_batch_distinct_count`, `sum`, and `average`.
- `relation`: `ActiveRecord::Relation` for the objects on which to perform the `operation`.
- `start`: Specifies the start value of the batch counting. The default is `relation.minimum(:id)`.
- `finish`: Specifies the end value of the batch counting. The default is `relation.maximum(:id)`.
- `cache_start_and_finish_as`: Specifies the cache key for `start` and `finish` values and sets up caching them. Use this call when `start` and `finish` are expensive queries that should be reused between different metric calculations.
- `available?`: Specifies whether the metric should be reported. The default is `true`.
- `timestamp_column`: Optionally specifies the timestamp column used to filter records for time-constrained metrics. The default is `created_at`.

[Example of a merge request that adds a database metric](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/60022).

```ruby
module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class CountBoardsMetric < DatabaseMetric
          operation :count

          relation { Board }
        end
      end
    end
  end
end
```

### Ordinary batch counters example

```ruby
module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class CountIssuesMetric < DatabaseMetric
          operation :count

          start { Issue.minimum(:id) }
          finish { Issue.maximum(:id) }

          relation { Issue }
        end
      end
    end
  end
end
```

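The role of `start` and `finish` can be pictured with a small plain-Ruby sketch of batch counting over an ID range. It is a simplified stand-in that works on an in-memory array and is not how the real batch counter is implemented. `RECORD_IDS`, `batch_count`, and `batch_size` are hypothetical names.

```ruby
# Stand-in for a table: each record is represented by its id.
RECORD_IDS = [1, 2, 3, 5, 8, 13, 21, 34].freeze

# Counts records whose id lies in [start, finish], one id window at a time.
def batch_count(ids, start:, finish:, batch_size: 10)
  total = 0
  batch_start = start

  while batch_start <= finish
    batch_end = [batch_start + batch_size - 1, finish].min
    total += ids.count { |id| id.between?(batch_start, batch_end) }
    batch_start = batch_end + 1
  end

  total
end

all_records = batch_count(RECORD_IDS, start: RECORD_IDS.min, finish: RECORD_IDS.max)
narrowed = batch_count(RECORD_IDS, start: 3, finish: 13, batch_size: 4)
```

Each pass only inspects one window of IDs, which is why the boundaries matter: a narrower `start` and `finish` range means fewer windows to scan.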
### Distinct batch counters example

```ruby
# frozen_string_literal: true

module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class CountUsersAssociatingMilestonesToReleasesMetric < DatabaseMetric
          operation :distinct_count, column: :author_id

          relation { Release.with_milestones }

          start { Release.minimum(:author_id) }
          finish { Release.maximum(:author_id) }
        end
      end
    end
  end
end
```

### Sum example

```ruby
# frozen_string_literal: true

module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class JiraImportsTotalImportedIssuesCountMetric < DatabaseMetric
          operation :sum, column: :imported_issues_count

          relation { JiraImportState.finished }
        end
      end
    end
  end
end
```

### Average example

```ruby
# frozen_string_literal: true

module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class CountIssuesWeightAverageMetric < DatabaseMetric
          operation :average, column: :weight

          relation { Issue }
        end
      end
    end
  end
end
```

## Redis metrics

You can use Redis metrics to track events not kept in the database, for example, a count of how many times the search bar has been used.

[Example of a merge request that adds a `Redis` metric](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/97009).

The `RedisMetric` class can only be used as the `instrumentation_class` for Redis metrics with simple counter classes (classes that only inherit `BaseCounter` and set the `PREFIX` and `KNOWN_EVENTS` constants). If the counter class includes additional logic, you must create a new `instrumentation_class` that inherits from `RedisMetric` and includes the additional logic from the counter class.

Example: count occurrences of the `source_code_pushes` event.

Required options:

- `event`: the event name.
- `prefix`: the value of the `PREFIX` constant used in the counter classes from the `Gitlab::UsageDataCounters` namespace.

```yaml
time_frame: all
data_source: redis
instrumentation_class: RedisMetric
options:
  event: pushes
  prefix: source_code
```

### Availability-restrained Redis metrics

If the Redis metric should only be available in the report under some conditions, then you must specify these conditions in a new class that is a child of the `RedisMetric` class.

```ruby
# frozen_string_literal: true

module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class MergeUsageCountRedisMetric < RedisMetric
          available? { Feature.enabled?(:merge_usage_data_missing_key_paths) }
        end
      end
    end
  end
end
```

You must also use the class's name in the YAML setup.

```yaml
time_frame: all
data_source: redis
instrumentation_class: MergeUsageCountRedisMetric
options:
  event: pushes
  prefix: source_code
```

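The effect of an `available?` condition on the generated report can be sketched in plain Ruby. Everything here is a toy model under stated assumptions: `FEATURE_FLAGS` stands in for a feature-flag lookup, and `ToyRedisMetric`, `PushesMetric`, `MergeUsageMetric`, and `build_report` are hypothetical names, not GitLab classes.

```ruby
# Hypothetical feature-flag store standing in for a real flag lookup.
FEATURE_FLAGS = { merge_usage_data_missing_key_paths: false }

class ToyRedisMetric
  class << self
    # Class-level macro: stores the condition under which the metric is reported.
    def available?(&block)
      @availability_block = block
    end

    # Metrics without an explicit condition are always available.
    def availability_block
      @availability_block || -> { true }
    end
  end

  def available?
    self.class.availability_block.call
  end
end

class PushesMetric < ToyRedisMetric
  def value
    3
  end
end

class MergeUsageMetric < ToyRedisMetric
  available? { FEATURE_FLAGS[:merge_usage_data_missing_key_paths] }

  def value
    7
  end
end

# Only metrics whose condition holds end up in the report.
def build_report(metric_classes)
  metric_classes.each_with_object({}) do |metric_class, report|
    metric = metric_class.new
    report[metric_class.name] = metric.value if metric.available?
  end
end

report = build_report([PushesMetric, MergeUsageMetric])
```

Because the condition is a block, it is evaluated when the report is built, so flipping the flag changes the next report without redefining the class.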
## Redis HyperLogLog metrics

You can use Redis HyperLogLog metrics to track events that are not kept in the database and are counted per unique value, such as unique users.
For example, a count of how many different users used the search bar.

[Example of a merge request that adds a `RedisHLL` metric](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/61685).

Count unique values for the `i_quickactions_approve` event.

```yaml
time_frame: 28d
data_source: redis_hll
instrumentation_class: RedisHLLMetric
options:
  events:
    - i_quickactions_approve
```

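The difference between a plain counter and a unique count can be shown with a few lines of Ruby. This is only a conceptual sketch with made-up data: it uses an exact `Set`, whereas HyperLogLog produces an estimate while using much less memory.

```ruby
require "set"

# Each entry is the id of the user who triggered a (made-up) search event.
search_events = [101, 102, 101, 103, 102, 101]

# A plain counter grows with every event.
total_searches = search_events.size

# A unique count grows only when a new user appears.
unique_searchers = search_events.to_set.size
```

With this data, the plain counter sees six events while the unique count sees three users.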
### Availability-restrained Redis HyperLogLog metrics

If the Redis HyperLogLog metric should only be available in the report under some conditions, then you must specify these conditions in a new class that is a child of the `RedisHLLMetric` class.

```ruby
# frozen_string_literal: true

module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class MergeUsageCountRedisHLLMetric < RedisHLLMetric
          available? { Feature.enabled?(:merge_usage_data_missing_key_paths) }
        end
      end
    end
  end
end
```

You must also use the class's name in the YAML setup.

```yaml
time_frame: 28d
data_source: redis_hll
instrumentation_class: MergeUsageCountRedisHLLMetric
options:
  events:
    - i_quickactions_approve
```

## Aggregated metrics

The aggregated metrics feature provides insight into the number of data attributes, for example `pseudonymized_user_ids`, that occurred in a collection of events. For example, you can aggregate the number of users who perform multiple actions such as creating a new issue and opening
a new merge request.

You can use a YAML file to define your aggregated metrics. The following arguments are required:

- `options.events`: List of event names to aggregate into metric data. All events in this list must
  use the same data source. Additional data source requirements are described in
  [Database sourced aggregated metrics](implement.md#database-sourced-aggregated-metrics) and
  [Redis sourced aggregated metrics](implement.md#redis-sourced-aggregated-metrics).
- `options.aggregate.operator`: Operator that defines how the aggregated metric data is counted. Available operators are:
  - `OR`: Removes duplicates and counts all entries that triggered any of the listed events.
  - `AND`: Removes duplicates and counts all entries that were observed triggering all of the listed events.
- `options.aggregate.attribute`: Information pointing to the attribute that is being aggregated across events.
- `time_frame`: One or more valid time frames. Use these to limit the data included in aggregated metrics to events within a specific date range. Valid time frames are:
  - `7d`: The last 7 days of data.
  - `28d`: The last 28 days of data.
  - `all`: All historical data, only available for `database` sourced aggregated metrics.
- `data_source`: Data source used to collect all events data included in the aggregated metrics. Valid data sources are:
  - [`database`](implement.md#database-sourced-aggregated-metrics)
  - [`redis_hll`](implement.md#redis-sourced-aggregated-metrics)

Refer to merge request [98206](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/98206) for an example of a merge request that adds an `AggregatedMetric` metric.

Count unique `user_ids` that occurred in at least one of the events: `incident_management_alert_status_changed`,
`incident_management_alert_assigned`, `incident_management_alert_todo`, `incident_management_alert_create_incident`.

```yaml
time_frame: 28d
instrumentation_class: AggregatedMetric
data_source: redis_hll
options:
  aggregate:
    operator: OR
    attribute: user_id
  events:
    - incident_management_alert_status_changed
    - incident_management_alert_assigned
    - incident_management_alert_todo
    - incident_management_alert_create_incident
```

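The `OR` and `AND` operators behave like set union and set intersection over the attribute values seen for each event. The plain-Ruby sketch below illustrates this with made-up data. `EVENT_USERS` and `aggregated_count` are hypothetical names, and exact sets are used in place of any real data source.

```ruby
require "set"

# Made-up data: the user ids observed for each event.
EVENT_USERS = {
  "issue_created" => Set[1, 2, 3],
  "merge_request_opened" => Set[2, 3, 4]
}.freeze

def aggregated_count(event_names, operator:)
  user_sets = event_names.map { |name| EVENT_USERS.fetch(name) }

  combined =
    case operator
    when "OR" then user_sets.reduce(:|)   # users seen in any event
    when "AND" then user_sets.reduce(:&)  # users seen in every event
    else raise ArgumentError, "unknown operator: #{operator}"
    end

  combined.size
end

events = %w[issue_created merge_request_opened]
either = aggregated_count(events, operator: "OR")
both = aggregated_count(events, operator: "AND")
```

In this data set, four distinct users triggered at least one event, and two users triggered both.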
### Availability-restrained aggregated metrics

If the aggregated metric should only be available in the report under specific conditions, then you must specify these conditions in a new class that is a child of the `AggregatedMetric` class.

```ruby
# frozen_string_literal: true

module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class MergeUsageCountAggregatedMetric < AggregatedMetric
          available? { Feature.enabled?(:merge_usage_data_missing_key_paths) }
        end
      end
    end
  end
end
```

You must also use the class's name in the YAML setup.

```yaml
time_frame: 28d
instrumentation_class: MergeUsageCountAggregatedMetric
data_source: redis_hll
options:
  aggregate:
    operator: OR
    attribute: user_id
  events:
    - incident_management_alert_status_changed
    - incident_management_alert_assigned
    - incident_management_alert_todo
    - incident_management_alert_create_incident
```

## Numbers metrics

- `operation`: Operation for the given `data` block. Currently, only the `add` operation is supported.
- `data`: A block that returns an array of numbers.
- `available?`: Specifies whether the metric should be reported. The default is `true`.

```ruby
# frozen_string_literal: true

module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class IssuesBoardsCountMetric < NumbersMetric
          operation :add

          data do |time_frame|
            [
              CountIssuesMetric.new(time_frame: time_frame).value,
              CountBoardsMetric.new(time_frame: time_frame).value
            ]
          end
        end
      end
    end
  end
end
```

You must also include the instrumentation class name in the YAML setup.

```yaml
time_frame: 28d
instrumentation_class: IssuesBoardsCountMetric
```

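What an `add` operation over the `data` array amounts to can be sketched in a few lines. The handling of negative inputs is an assumption made for this sketch: it treats a negative number as the fallback of a failed component metric and propagates it rather than producing a misleading sum. `FALLBACK` and this standalone `add` method are hypothetical.

```ruby
FALLBACK = -1

# Assumption: a negative input means a component metric failed and returned
# its fallback, so the fallback is propagated instead of a partial sum.
def add(*values)
  return FALLBACK if values.any?(&:negative?)

  values.sum
end

issues_and_boards = add(120, 15)
with_failed_part = add(120, FALLBACK)
```

Under that assumption, a single failed component marks the combined metric as failed instead of silently under-reporting it.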
## Generic metrics

You can use generic metrics for other metrics, for example, an instance's database version. Observations-type data always has a generic metric counter type.

- `value`: Specifies the value of the metric.
- `available?`: Specifies whether the metric should be reported. The default is `true`.

[Example of a merge request that adds a generic metric](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/60256).

```ruby
module Gitlab
  module Usage
    module Metrics
      module Instrumentations
        class UuidMetric < GenericMetric
          value do
            Gitlab::CurrentSettings.uuid
          end
        end
      end
    end
  end
end
```

## Support for instrumentation classes

There is support for:

- `count`, `distinct_count`, `estimate_batch_distinct_count`, `sum`, and `average` for [database metrics](#database-metrics).
- [Redis metrics](#redis-metrics).
- [Redis HLL metrics](#redis-hyperloglog-metrics).
- `add` for [numbers metrics](#numbers-metrics).
- [Generic metrics](#generic-metrics), which are metrics based on settings or configurations.

There is no support for:

- `add` and `histogram` for database metrics.

You can [track the progress to support these](https://gitlab.com/groups/gitlab-org/-/epics/6118).

## Create a new metric instrumentation class

To create a stub instrumentation for a Service Ping metric, you can use a dedicated [generator](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/generators/gitlab/usage_metric_generator.rb).

The generator takes the class name as an argument and the following options:

- `--type=TYPE` Required. Indicates the metric type. It must be one of: `database`, `generic`, `redis`, `numbers`.
- `--operation` Required for the `database` and `numbers` types.
  - For `database` it must be one of: `count`, `distinct_count`, `estimate_batch_distinct_count`, `sum`, `average`.
  - For `numbers` it must be: `add`.
- `--ee` Indicates if the metric is for EE.

```shell
rails generate gitlab:usage_metric CountIssues --type database --operation distinct_count
        create lib/gitlab/usage/metrics/instrumentations/count_issues_metric.rb
        create spec/lib/gitlab/usage/metrics/instrumentations/count_issues_metric_spec.rb
```

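The naming pattern visible in the generator output above can be reproduced with a short plain-Ruby sketch. It only mirrors the two paths shown in the example. The `underscore` and `generated_paths` helpers are hypothetical, and the `ee/` prefix for EE metrics is an assumption rather than documented generator behavior.

```ruby
# Converts "CountIssues" to "count_issues" with a simple regular expression.
def underscore(class_name)
  class_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Returns the class file and spec file paths for a metric class name.
def generated_paths(class_name, ee: false)
  file_name = "#{underscore(class_name)}_metric"
  prefix = ee ? "ee/" : "" # assumption: EE files live under `ee/`

  [
    "#{prefix}lib/gitlab/usage/metrics/instrumentations/#{file_name}.rb",
    "#{prefix}spec/lib/gitlab/usage/metrics/instrumentations/#{file_name}_spec.rb"
  ]
end

paths = generated_paths("CountIssues")
```

Knowing the pattern helps when you look for the class or spec file of an existing metric by its class name.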
## Migrate Service Ping metrics to instrumentation classes

This section describes how to migrate a Service Ping metric from [`lib/gitlab/usage_data.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data.rb) or [`ee/lib/ee/gitlab/usage_data.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/usage_data.rb) to instrumentation classes.

1. Choose the metric type:

   - [Database metric](#database-metrics)
   - [Redis HyperLogLog metrics](#redis-hyperloglog-metrics)
   - [Redis metric](#redis-metrics)
   - [Numbers metric](#numbers-metrics)
   - [Generic metric](#generic-metrics)

1. Determine the location of the instrumentation class: either under `ee` or outside `ee`.

1. [Generate the instrumentation class file](#create-a-new-metric-instrumentation-class).

1. Fill the instrumentation class body:

   - Add code logic for the metric. This might be similar to the metric implementation in `usage_data.rb`.
   - Add tests for the individual metric in [`spec/lib/gitlab/usage/metrics/instrumentations/`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/lib/gitlab/usage/metrics/instrumentations).
   - Add tests for Service Ping.

1. [Generate the metric definition file](metrics_dictionary.md#create-a-new-metric-definition).

1. Remove the code from [`lib/gitlab/usage_data.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data.rb) or [`ee/lib/ee/gitlab/usage_data.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/usage_data.rb).

1. Remove the tests from [`spec/lib/gitlab/usage_data_spec.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/lib/gitlab/usage_data_spec.rb) or [`ee/spec/lib/ee/gitlab/usage_data_spec.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/spec/lib/ee/gitlab/usage_data_spec.rb).