debian-mirror-gitlab/doc/administration/monitoring/prometheus/gitlab_metrics.md
2022-11-25 23:54:43 +05:30

63 KiB

stage group info
Monitor Respond To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments

GitLab Prometheus metrics (FREE SELF)

To enable the GitLab Prometheus metrics:

  1. Log in to GitLab as a user with administrator access.
  2. On the top bar, select Main menu > Admin.
  3. On the left sidebar, select Settings > Metrics and profiling.
  4. Find the Metrics - Prometheus section, and select Enable GitLab Prometheus metrics endpoint.
  5. Restart GitLab for the changes to take effect.

For installations from source you must configure it yourself.

Collecting the metrics

GitLab monitors its own internal service metrics, and makes them available at the /-/metrics endpoint. Unlike other Prometheus exporters, to access the metrics, the client IP address must be explicitly allowed.

These metrics are enabled and collected for Omnibus GitLab and Chart installations. For source installations, these metrics must be enabled manually and collected by a Prometheus server.

For enabling and viewing metrics from Sidekiq nodes, see Sidekiq metrics.

Metrics available

The following metrics are available:

Metric Type Since Description Labels
gitlab_banzai_cached_render_real_duration_seconds Histogram 9.4 Duration of rendering Markdown into HTML when cached output exists controller, action
gitlab_banzai_cacheless_render_real_duration_seconds Histogram 9.4 Duration of rendering Markdown into HTML when cached output does not exist controller, action
gitlab_cache_misses_total Counter 10.2 Cache read miss controller, action
gitlab_cache_operation_duration_seconds Histogram 10.2 Cache access time
gitlab_cache_operations_total Counter 12.2 Cache operations by controller or action controller, action, operation
gitlab_ci_pipeline_builder_scoped_variables_duration Histogram 14.5 Time in seconds it takes to create the scoped variables for a CI/CD job
gitlab_ci_pipeline_creation_duration_seconds Histogram 13.0 Time in seconds it takes to create a CI/CD pipeline
gitlab_ci_pipeline_size_builds Histogram 13.1 Total number of builds within a pipeline grouped by a pipeline source source
gitlab_ci_runner_authentication_success_total Counter 15.2 Total number of times that runner authentication has succeeded type
gitlab_ci_runner_authentication_failure_total Counter 15.2 Total number of times that runner authentication has failed
job_waiter_started_total Counter 12.9 Number of batches of jobs started where a web request is waiting for the jobs to complete worker
job_waiter_timeouts_total Counter 12.9 Number of batches of jobs that timed out where a web request is waiting for the jobs to complete worker
gitlab_ci_active_jobs Histogram 14.2 Count of active jobs when pipeline is created
gitlab_database_transaction_seconds Histogram 12.1 Time spent in database transactions, in seconds
gitlab_method_call_duration_seconds Histogram 10.2 Method calls real duration controller, action, module, method
gitlab_page_out_of_bounds Counter 12.8 Counter for the PageLimiter pagination limit being hit controller, action, bot
gitlab_rails_boot_time_seconds Gauge 14.8 Time elapsed for Rails primary process to finish startup
gitlab_rails_queue_duration_seconds Histogram 9.4 Measures latency between GitLab Workhorse forwarding a request to Rails
gitlab_sql_duration_seconds Histogram 10.2 SQL execution time, excluding SCHEMA operations and BEGIN / COMMIT
gitlab_sql_<role>_duration_seconds Histogram 13.10 SQL execution time, excluding SCHEMA operations and BEGIN / COMMIT, grouped by database roles (primary/replica)
gitlab_ruby_threads_max_expected_threads Gauge 13.3 Maximum number of threads expected to be running and performing application work
gitlab_ruby_threads_running_threads Gauge 13.3 Number of running Ruby threads by name
gitlab_transaction_cache_<key>_count_total Counter 10.2 Counter for total Rails cache calls (per key)
gitlab_transaction_cache_<key>_duration_total Counter 10.2 Counter for total time (seconds) spent in Rails cache calls (per key)
gitlab_transaction_cache_count_total Counter 10.2 Counter for total Rails cache calls (aggregate)
gitlab_transaction_cache_duration_total Counter 10.2 Counter for total time (seconds) spent in Rails cache calls (aggregate)
gitlab_transaction_cache_read_hit_count_total Counter 10.2 Counter for cache hits for Rails cache calls controller, action
gitlab_transaction_cache_read_miss_count_total Counter 10.2 Counter for cache misses for Rails cache calls controller, action
gitlab_transaction_duration_seconds Histogram 10.2 Duration for successful requests (gitlab_transaction_* metrics) controller, action
gitlab_transaction_event_build_found_total Counter 9.4 Counter for build found for API /jobs/request
gitlab_transaction_event_build_invalid_total Counter 9.4 Counter for build invalid due to concurrency conflict for API /jobs/request
gitlab_transaction_event_build_not_found_cached_total Counter 9.4 Counter for cached response of build not found for API /jobs/request
gitlab_transaction_event_build_not_found_total Counter 9.4 Counter for build not found for API /jobs/request
gitlab_transaction_event_change_default_branch_total Counter 9.4 Counter when default branch is changed for any repository
gitlab_transaction_event_create_repository_total Counter 9.4 Counter when any repository is created
gitlab_transaction_event_etag_caching_cache_hit_total Counter 9.4 Counter for ETag cache hit. endpoint
gitlab_transaction_event_etag_caching_header_missing_total Counter 9.4 Counter for ETag cache miss - header missing endpoint
gitlab_transaction_event_etag_caching_key_not_found_total Counter 9.4 Counter for ETag cache miss - key not found endpoint
gitlab_transaction_event_etag_caching_middleware_used_total Counter 9.4 Counter for ETag middleware accessed endpoint
gitlab_transaction_event_etag_caching_resource_changed_total Counter 9.4 Counter for ETag cache miss - resource changed endpoint
gitlab_transaction_event_fork_repository_total Counter 9.4 Counter for repository forks (RepositoryForkWorker). Only incremented when source repository exists
gitlab_transaction_event_import_repository_total Counter 9.4 Counter for repository imports (RepositoryImportWorker)
gitlab_transaction_event_patch_hard_limit_bytes_hit_total Counter 13.9 Counter for diff patch size limit hits
gitlab_transaction_event_push_branch_total Counter 9.4 Counter for all branch pushes
gitlab_transaction_event_rails_exception_total Counter 9.4 Counter for number of rails exceptions
gitlab_transaction_event_receive_email_total Counter 9.4 Counter for received emails handler
gitlab_transaction_event_remote_mirrors_failed_total Counter 10.8 Counter for failed remote mirrors
gitlab_transaction_event_remote_mirrors_finished_total Counter 10.8 Counter for finished remote mirrors
gitlab_transaction_event_remote_mirrors_running_total Counter 10.8 Counter for running remote mirrors
gitlab_transaction_event_remove_branch_total Counter 9.4 Counter when a branch is removed for any repository
gitlab_transaction_event_remove_repository_total Counter 9.4 Counter when a repository is removed
gitlab_transaction_event_remove_tag_total Counter 9.4 Counter when a tag is remove for any repository
gitlab_transaction_event_sidekiq_exception_total Counter 9.4 Counter of Sidekiq exceptions
gitlab_transaction_event_stuck_import_jobs_total Counter 9.4 Count of stuck import jobs projects_without_jid_count, projects_with_jid_count
gitlab_transaction_event_update_build_total Counter 9.4 Counter for update build for API /jobs/request/:id
gitlab_transaction_new_redis_connections_total Counter 9.4 Counter for new Redis connections
gitlab_transaction_rails_queue_duration_total Counter 9.4 Measures latency between GitLab Workhorse forwarding a request to Rails controller, action
gitlab_transaction_view_duration_total Counter 9.4 Duration for views controller, action, view
gitlab_view_rendering_duration_seconds Histogram 10.2 Duration for views (histogram) controller, action, view
http_requests_total Counter 9.4 Rack request count method, status
http_request_duration_seconds Histogram 9.4 HTTP response time from rack middleware for successful requests method
gitlab_transaction_db_count_total Counter 13.1 Counter for total number of SQL calls controller, action
gitlab_transaction_db_<role>_count_total Counter 13.10 Counter for total number of SQL calls, grouped by database roles (primary/replica) controller, action
gitlab_transaction_db_write_count_total Counter 13.1 Counter for total number of write SQL calls controller, action
gitlab_transaction_db_cached_count_total Counter 13.1 Counter for total number of cached SQL calls controller, action
gitlab_transaction_db_<role>_cached_count_total Counter 13.1 Counter for total number of cached SQL calls, grouped by database roles (primary/replica) controller, action
gitlab_transaction_db_<role>_wal_count_total Counter 14.0 Counter for total number of WAL (write ahead log location) queries, grouped by database roles (primary/replica) controller, action
gitlab_transaction_db_<role>_wal_cached_count_total Counter 14.1 Counter for total number of cached WAL (write ahead log location) queries, grouped by database roles (primary/replica) controller, action
http_elasticsearch_requests_duration_seconds (PREMIUM) Histogram 13.1 Elasticsearch requests duration during web transactions controller, action
http_elasticsearch_requests_total (PREMIUM) Counter 13.1 Elasticsearch requests count during web transactions controller, action
pipelines_created_total Counter 9.4 Counter of pipelines created
rack_uncaught_errors_total Counter 9.4 Rack connections handling uncaught errors count
user_session_logins_total Counter 9.4 Counter of how many users have logged in since GitLab was started or restarted
upload_file_does_not_exist Counter 10.7 Number of times an upload record could not find its file.
failed_login_captcha_total Gauge 11.0 Counter of failed CAPTCHA attempts during login
successful_login_captcha_total Gauge 11.0 Counter of successful CAPTCHA attempts during login
auto_devops_pipelines_completed_total Counter 12.7 Counter of completed Auto DevOps pipelines, labeled by status
artifact_report_<report_type>_builds_completed_total Counter 15.3 Counter of completed CI Builds with report-type artifacts, grouped by report type and labeled by status
gitlab_metrics_dashboard_processing_time_ms Summary 12.10 Metrics dashboard processing time in milliseconds service, stages
action_cable_active_connections Gauge 13.4 Number of ActionCable WS clients currently connected server_mode
action_cable_broadcasts_total Counter 13.10 The number of ActionCable broadcasts emitted server_mode
action_cable_pool_min_size Gauge 13.4 Minimum number of worker threads in ActionCable thread pool server_mode
action_cable_pool_max_size Gauge 13.4 Maximum number of worker threads in ActionCable thread pool server_mode
action_cable_pool_current_size Gauge 13.4 Current number of worker threads in ActionCable thread pool server_mode
action_cable_pool_largest_size Gauge 13.4 Largest number of worker threads observed so far in ActionCable thread pool server_mode
action_cable_pool_pending_tasks Gauge 13.4 Number of tasks waiting to be executed in ActionCable thread pool server_mode
action_cable_pool_tasks_total Gauge 13.4 Total number of tasks executed in ActionCable thread pool server_mode
gitlab_ci_trace_operations_total Counter 13.4 Total amount of different operations on a build trace operation
gitlab_ci_trace_bytes_total Counter 13.4 Total amount of build trace bytes transferred
action_cable_single_client_transmissions_total Counter 13.10 The number of ActionCable messages transmitted to any client in any channel server_mode
action_cable_subscription_confirmations_total Counter 13.10 The number of ActionCable subscriptions from clients confirmed server_mode
action_cable_subscription_rejections_total Counter 13.10 The number of ActionCable subscriptions from clients rejected server_mode
action_cable_transmitted_bytes Histogram 14.1 Message size, in bytes, transmitted over action cable operation, channel
gitlab_issuable_fast_count_by_state_total Counter 13.5 Total number of row count operations on issue/merge request list pages
gitlab_issuable_fast_count_by_state_failures_total Counter 13.5 Number of soft-failed row count operations on issue/merge request list pages
gitlab_ci_trace_finalize_duration_seconds Histogram 13.6 Duration of build trace chunks migration to object storage
gitlab_external_http_total Counter 13.8 Total number of HTTP calls to external systems controller, action
gitlab_external_http_duration_seconds Counter 13.8 Duration in seconds spent on each HTTP call to external systems
gitlab_external_http_exception_total Counter 13.8 Total number of exceptions raised when making external HTTP calls
ci_report_parser_duration_seconds Histogram 13.9 Time to parse CI/CD report artifacts parser
pipeline_graph_link_calculation_duration_seconds Histogram 13.9 Total time spent calculating links, in seconds
pipeline_graph_links_total Histogram 13.9 Number of links per graph
pipeline_graph_links_per_job_ratio Histogram 13.9 Ratio of links to job per graph
gitlab_ci_pipeline_security_orchestration_policy_processing_duration_seconds Histogram 13.12 Time in seconds it takes to process Security Policies in CI/CD pipeline
gitlab_spamcheck_request_duration_seconds Histogram 13.12 The duration for requests between Rails and the anti-spam engine
service_desk_thank_you_email Counter 14.0 Total number of email responses to new Service Desk emails
service_desk_new_note_email Counter 14.0 Total number of email notifications on new Service Desk comment
email_receiver_error Counter 14.1 Total number of errors when processing incoming emails
gitlab_snowplow_events_total Counter 14.1 Total number of GitLab Snowplow product intelligence events emitted
gitlab_snowplow_failed_events_total Counter 14.1 Total number of GitLab Snowplow product intelligence events emission failures
gitlab_snowplow_successful_events_total Counter 14.1 Total number of GitLab Snowplow product intelligence events emission successes
gitlab_ci_build_trace_errors_total Counter 14.4 Total amount of different error types on a build trace error_reason
gitlab_presentable_object_cacheless_render_real_duration_seconds Histogram 15.3 Duration of real time spent caching and representing specific web request objects controller, action
cached_object_operations_total Counter 15.3 Total number of objects cached for specific web requests controller, action

Metrics controlled by a feature flag

The following metrics can be controlled by feature flags:

Metric Feature Flag
gitlab_method_call_duration_seconds prometheus_metrics_method_instrumentation
gitlab_view_rendering_duration_seconds prometheus_metrics_view_instrumentation

Praefect metrics

You can configure Praefect to report metrics. For information on available metrics, see the relevant documentation.

Sidekiq metrics

Sidekiq jobs may also gather metrics, and these metrics can be accessed if the Sidekiq exporter is enabled: for example, using the monitoring.sidekiq_exporter configuration option in gitlab.yml. These metrics are served from the /metrics path on the configured port.

Metric Type Since Description Labels
sidekiq_jobs_cpu_seconds Histogram 12.4 Seconds of CPU time to run Sidekiq job queue, boundary, external_dependencies, feature_category, job_status, urgency
sidekiq_jobs_completion_seconds Histogram 12.2 Seconds to complete Sidekiq job queue, boundary, external_dependencies, feature_category, job_status, urgency
sidekiq_jobs_db_seconds Histogram 12.9 Seconds of DB time to run Sidekiq job queue, boundary, external_dependencies, feature_category, job_status, urgency
sidekiq_jobs_gitaly_seconds Histogram 12.9 Seconds of Gitaly time to run Sidekiq job queue, boundary, external_dependencies, feature_category, job_status, urgency
sidekiq_redis_requests_duration_seconds Histogram 13.1 Duration in seconds that a Sidekiq job spent querying a Redis server queue, boundary, external_dependencies, feature_category, job_status, urgency
sidekiq_elasticsearch_requests_duration_seconds Histogram 13.1 Duration in seconds that a Sidekiq job spent in requests to an Elasticsearch server queue, boundary, external_dependencies, feature_category, job_status, urgency
sidekiq_jobs_queue_duration_seconds Histogram 12.5 Duration in seconds that a Sidekiq job was queued before being executed queue, boundary, external_dependencies, feature_category, urgency
sidekiq_jobs_failed_total Counter 12.2 Sidekiq jobs failed queue, boundary, external_dependencies, feature_category, urgency
sidekiq_jobs_retried_total Counter 12.2 Sidekiq jobs retried queue, boundary, external_dependencies, feature_category, urgency
sidekiq_jobs_interrupted_total Counter 15.2 Sidekiq jobs interrupted queue, boundary, external_dependencies, feature_category, urgency
sidekiq_jobs_dead_total Counter 13.7 Sidekiq dead jobs (jobs that have run out of retries) queue, boundary, external_dependencies, feature_category, urgency
sidekiq_redis_requests_total Counter 13.1 Redis requests during a Sidekiq job execution queue, boundary, external_dependencies, feature_category, job_status, urgency
sidekiq_elasticsearch_requests_total Counter 13.1 Elasticsearch requests during a Sidekiq job execution queue, boundary, external_dependencies, feature_category, job_status, urgency
sidekiq_running_jobs Gauge 12.2 Number of Sidekiq jobs running queue, boundary, external_dependencies, feature_category, urgency
sidekiq_concurrency Gauge 12.5 Maximum number of Sidekiq jobs
sidekiq_mem_total_bytes Gauge 15.3 Number of bytes allocated for both objects consuming an object slot and objects that required a malloc'
geo_db_replication_lag_seconds Gauge 10.2 Database replication lag (seconds) url
geo_repositories Gauge 10.2 Total number of repositories available on primary url
geo_repositories_synced Gauge 10.2 Number of repositories synced on secondary url
geo_repositories_failed Gauge 10.2 Number of repositories failed to sync on secondary url
geo_lfs_objects Gauge 10.2 Number of LFS objects on primary url
geo_lfs_objects_checksummed Gauge 14.6 Number of LFS objects checksummed successfully on primary url
geo_lfs_objects_checksum_failed Gauge 14.6 Number of LFS objects failed to calculate the checksum on primary url
geo_lfs_objects_checksum_total Gauge 14.6 Number of LFS objects tried to checksum on primary url
geo_lfs_objects_synced Gauge 10.2 Number of syncable LFS objects synced on secondary url
geo_lfs_objects_failed Gauge 10.2 Number of syncable LFS objects failed to sync on secondary url
geo_lfs_objects_registry Gauge 14.6 Number of LFS objects in the registry url
geo_lfs_objects_verified Gauge 14.6 Number of LFS objects verified on secondary url
geo_lfs_objects_verification_failed Gauge 14.6 Number of LFS objects' verifications failed on secondary url
geo_lfs_objects_verification_total Gauge 14.6 Number of LFS objects' verifications tried on secondary url
geo_attachments Gauge 10.2 Total number of file attachments available on primary url
geo_attachments_synced Gauge 10.2 Number of attachments synced on secondary url
geo_attachments_failed Gauge 10.2 Number of attachments failed to sync on secondary url
geo_last_event_id Gauge 10.2 Database ID of the latest event log entry on the primary url
geo_last_event_timestamp Gauge 10.2 UNIX timestamp of the latest event log entry on the primary url
geo_cursor_last_event_id Gauge 10.2 Last database ID of the event log processed by the secondary url
geo_cursor_last_event_timestamp Gauge 10.2 Last UNIX timestamp of the event log processed by the secondary url
geo_status_failed_total Counter 10.2 Number of times retrieving the status from the Geo Node failed url
geo_last_successful_status_check_timestamp Gauge 10.2 Last timestamp when the status was successfully updated url
geo_job_artifacts_synced_missing_on_primary Gauge 10.7 Number of job artifacts marked as synced due to the file missing on the primary url
geo_repositories_checksummed Gauge 10.7 Number of repositories checksummed on primary url
geo_repositories_checksum_failed Gauge 10.7 Number of repositories failed to calculate the checksum on primary url
geo_wikis_checksummed Gauge 10.7 Number of wikis checksummed on primary url
geo_wikis_checksum_failed Gauge 10.7 Number of wikis failed to calculate the checksum on primary url
geo_repositories_verified Gauge 10.7 Number of repositories verified on secondary url
geo_repositories_verification_failed Gauge 10.7 Number of repositories failed to verify on secondary url
geo_repositories_checksum_mismatch Gauge 10.7 Number of repositories that checksum mismatch on secondary url
geo_wikis_verified Gauge 10.7 Number of wikis verified on secondary url
geo_wikis_verification_failed Gauge 10.7 Number of wikis failed to verify on secondary url
geo_wikis_checksum_mismatch Gauge 10.7 Number of wikis that checksum mismatch on secondary url
geo_repositories_checked Gauge 11.1 Number of repositories that have been checked via git fsck url
geo_repositories_checked_failed Gauge 11.1 Number of repositories that have a failure from git fsck url
geo_repositories_retrying_verification Gauge 11.2 Number of repositories verification failures that Geo is actively trying to correct on secondary url
geo_wikis_retrying_verification Gauge 11.2 Number of wikis verification failures that Geo is actively trying to correct on secondary url
geo_package_files Gauge 13.0 Number of package files on primary url
geo_package_files_checksummed Gauge 13.0 Number of package files checksummed on primary url
geo_package_files_checksum_failed Gauge 13.0 Number of package files failed to calculate the checksum on primary url
geo_package_files_synced Gauge 13.3 Number of syncable package files synced on secondary url
geo_package_files_failed Gauge 13.3 Number of syncable package files failed to sync on secondary url
geo_package_files_registry Gauge 13.3 Number of package files in the registry url
geo_terraform_state_versions Gauge 13.5 Number of terraform state versions on primary url
geo_terraform_state_versions_checksummed Gauge 13.5 Number of terraform state versions checksummed successfully on primary url
geo_terraform_state_versions_checksum_failed Gauge 13.5 Number of terraform state versions failed to calculate the checksum on primary url
geo_terraform_state_versions_checksum_total Gauge 13.12 Number of terraform state versions tried to checksum on primary url
geo_terraform_state_versions_synced Gauge 13.5 Number of syncable terraform state versions synced on secondary url
geo_terraform_state_versions_failed Gauge 13.5 Number of syncable terraform state versions failed to sync on secondary url
geo_terraform_state_versions_registry Gauge 13.5 Number of terraform state versions in the registry url
geo_terraform_state_versions_verified Gauge 13.12 Number of terraform state versions verified on secondary url
geo_terraform_state_versions_verification_failed Gauge 13.12 Number of terraform state versions verifications failed on secondary url
geo_terraform_state_versions_verification_total Gauge 13.12 Number of terraform state versions verifications tried on secondary url
global_search_bulk_cron_queue_size Gauge 12.10 Number of database records waiting to be synchronized to Elasticsearch
global_search_awaiting_indexing_queue_size Gauge 13.2 Number of database updates waiting to be synchronized to Elasticsearch while indexing is paused
geo_merge_request_diffs Gauge 13.4 Number of merge request diffs on primary url
geo_merge_request_diffs_checksum_total Gauge 13.12 Number of merge request diffs tried to checksum on primary url
geo_merge_request_diffs_checksummed Gauge 13.4 Number of merge request diffs successfully checksummed on primary url
geo_merge_request_diffs_checksum_failed Gauge 13.4 Number of merge request diffs failed to calculate the checksum on primary url
geo_merge_request_diffs_synced Gauge 13.4 Number of syncable merge request diffs synced on secondary url
geo_merge_request_diffs_failed Gauge 13.4 Number of syncable merge request diffs failed to sync on secondary url
geo_merge_request_diffs_registry Gauge 13.4 Number of merge request diffs in the registry url
geo_merge_request_diffs_verification_total Gauge 13.12 Number of merge request diffs verifications tried on secondary url
geo_merge_request_diffs_verified Gauge 13.12 Number of merge request diffs verified on secondary url
geo_merge_request_diffs_verification_failed Gauge 13.12 Number of merge request diffs verifications failed on secondary url
geo_snippet_repositories Gauge 13.4 Number of snippets on primary url
geo_snippet_repositories_checksummed Gauge 13.4 Number of snippets checksummed on primary url
geo_snippet_repositories_checksum_failed Gauge 13.4 Number of snippets failed to calculate the checksum on primary url
geo_snippet_repositories_synced Gauge 13.4 Number of syncable snippets synced on secondary url
geo_snippet_repositories_failed Gauge 13.4 Number of syncable snippets failed on secondary url
geo_snippet_repositories_registry Gauge 13.4 Number of syncable snippets in the registry url
geo_group_wiki_repositories Gauge 13.10 Number of group wikis on primary url
geo_group_wiki_repositories_checksummed Gauge 13.10 Number of group wikis checksummed on primary url
geo_group_wiki_repositories_checksum_failed Gauge 13.10 Number of group wikis failed to calculate the checksum on primary url
geo_group_wiki_repositories_synced Gauge 13.10 Number of syncable group wikis synced on secondary url
geo_group_wiki_repositories_failed Gauge 13.10 Number of syncable group wikis failed on secondary url
geo_group_wiki_repositories_registry Gauge 13.10 Number of syncable group wikis in the registry url
geo_pages_deployments Gauge 14.3 Number of pages deployments on primary url
geo_pages_deployments_checksum_total Gauge 14.6 Number of pages deployments tried to checksum on primary url
geo_pages_deployments_checksummed Gauge 14.6 Number of pages deployments successfully checksummed on primary url
geo_pages_deployments_checksum_failed Gauge 14.6 Number of pages deployments failed to calculate the checksum on primary url
geo_pages_deployments_synced Gauge 14.3 Number of syncable pages deployments synced on secondary url
geo_pages_deployments_failed Gauge 14.3 Number of syncable pages deployments failed to sync on secondary url
geo_pages_deployments_registry Gauge 14.3 Number of pages deployments in the registry url
geo_pages_deployments_verification_total Gauge 14.6 Number of pages deployments verifications tried on secondary url
geo_pages_deployments_verified Gauge 14.6 Number of pages deployments verified on secondary url
geo_pages_deployments_verification_failed Gauge 14.6 Number of pages deployments verifications failed on secondary url
geo_job_artifacts Gauge 14.8 Number of job artifacts on primary url
geo_job_artifacts_checksum_total Gauge 14.8 Number of job artifacts tried to checksum on primary url
geo_job_artifacts_checksummed Gauge 14.8 Number of job artifacts successfully checksummed on primary url
geo_job_artifacts_checksum_failed Gauge 14.8 Number of job artifacts failed to calculate the checksum on primary url
geo_job_artifacts_synced Gauge 14.8 Number of syncable job artifacts synced on secondary url
geo_job_artifacts_failed Gauge 14.8 Number of syncable job artifacts failed to sync on secondary url
geo_job_artifacts_registry Gauge 14.8 Number of job artifacts in the registry url
geo_job_artifacts_verification_total Gauge 14.8 Number of job artifacts verifications tried on secondary url
geo_job_artifacts_verified Gauge 14.8 Number of job artifacts verified on secondary url
geo_job_artifacts_verification_failed Gauge 14.8 Number of job artifacts verifications failed on secondary url
limited_capacity_worker_running_jobs Gauge 13.5 Number of running jobs worker
limited_capacity_worker_max_running_jobs Gauge 13.5 Maximum number of running jobs worker
limited_capacity_worker_remaining_work_count Gauge 13.5 Number of jobs waiting to be enqueued worker
destroyed_job_artifacts_count_total Counter 13.6 Number of destroyed expired job artifacts
destroyed_pipeline_artifacts_count_total Counter 13.8 Number of destroyed expired pipeline artifacts
gitlab_optimistic_locking_retries Histogram 13.10 Number of retry attempts to execute optimistic retry lock
geo_uploads Gauge 14.1 Number of uploads on primary url
geo_uploads_synced Gauge 14.1 Number of uploads synced on secondary url
geo_uploads_failed Gauge 14.1 Number of syncable uploads failed to sync on secondary url
geo_uploads_registry Gauge 14.1 Number of uploads in the registry url
geo_uploads_checksum_total Gauge 14.6 Number of uploads tried to checksum on primary url
geo_uploads_checksummed Gauge 14.6 Number of uploads successfully checksummed on primary url
geo_uploads_checksum_failed Gauge 14.6 Number of uploads failed to calculate the checksum on primary url
geo_uploads_verification_total Gauge 14.6 Number of uploads verifications tried on secondary url
geo_uploads_verified Gauge 14.6 Number of uploads verified on secondary url
geo_uploads_verification_failed Gauge 14.6 Number of uploads verifications failed on secondary url
geo_container_repositories Gauge 15.4 Number of container repositories on primary url
geo_container_repositories_synced Gauge 15.4 Number of container repositories synced on secondary url
geo_container_repositories_failed Gauge 15.4 Number of syncable container repositories failed to sync on secondary url
geo_container_repositories_registry Gauge 15.4 Number of container repositories in the registry url
gitlab_sli:rails_request_apdex:total Counter 14.4 The number of request-apdex measurements, more information the development documentation endpoint_id, feature_category, request_urgency
gitlab_sli:rails_request_apdex:success_total Counter 14.4 The number of successful requests that met the target duration for their urgency. Divide by gitlab_sli:rails_requests_apdex:total to get a success ratio endpoint_id, feature_category, request_urgency
geo_ci_secure_files Gauge 15.3 Number of secure files on primary url
geo_ci_secure_files_checksum_total Gauge 15.3 Number of secure files tried to checksum on primary url
geo_ci_secure_files_checksummed Gauge 15.3 Number of secure files successfully checksummed on primary url
geo_ci_secure_files_checksum_failed Gauge 15.3 Number of secure files failed to calculate the checksum on primary url
geo_ci_secure_files_synced Gauge 15.3 Number of syncable secure files synced on secondary url
geo_ci_secure_files_failed Gauge 15.3 Number of syncable secure files failed to sync on secondary url
geo_ci_secure_files_registry Gauge 15.3 Number of secure files in the registry url
geo_ci_secure_files_verification_total Gauge 15.3 Number of secure files verifications tried on secondary url
geo_ci_secure_files_verified Gauge 15.3 Number of secure files verified on secondary url
geo_ci_secure_files_verification_failed Gauge 15.3 Number of secure files verifications failed on secondary url

Database load balancing metrics (PREMIUM SELF)

The following metrics are available:

Metric Type Since Description Labels
db_load_balancing_hosts Gauge 12.3 Current number of load balancing hosts
sidekiq_load_balancing_count Counter 13.11 Sidekiq jobs using load balancing with data consistency set to :sticky or :delayed queue, boundary, external_dependencies, feature_category, job_status, urgency, data_consistency, load_balancing_strategy
gitlab_transaction_caught_up_replica_pick_count_total Counter 14.1 Number of search attempts for caught up replica result

Database partitioning metrics (PREMIUM SELF)

The following metrics are available:

Metric Type Since Description
db_partitions_present Gauge 13.4 Number of database partitions present
db_partitions_missing Gauge 13.4 Number of database partitions currently expected, but not present

Connection pool metrics

These metrics record the status of the database connection pools, and the metrics all have these labels:

  • class - the Ruby class being recorded.
    • ActiveRecord::Base is the main database connection.
    • Geo::TrackingBase is the connection to the Geo tracking database, if enabled.
  • host - the host name used to connect to the database.
  • port - the port used to connect to the database.
Metric Type Since Description
gitlab_database_connection_pool_size Gauge 13.0 Total connection pool capacity
gitlab_database_connection_pool_connections Gauge 13.0 Current connections in the pool
gitlab_database_connection_pool_busy Gauge 13.0 Connections in use where the owner is still alive
gitlab_database_connection_pool_dead Gauge 13.0 Connections in use where the owner is not alive
gitlab_database_connection_pool_idle Gauge 13.0 Connections not in use
gitlab_database_connection_pool_waiting Gauge 13.0 Threads currently waiting on this queue

Ruby metrics

Some basic Ruby runtime metrics are available:

Metric Type Since Description
ruby_gc_duration_seconds Counter 11.1 Time spent by Ruby in GC
ruby_gc_stat_... Gauge 11.1 Various metrics from GC.stat
ruby_gc_stat_ext_heap_fragmentation Gauge 15.2 Degree of Ruby heap fragmentation as live objects versus eden slots (range 0 to 1)
ruby_file_descriptors Gauge 11.1 File descriptors per process
ruby_sampler_duration_seconds Counter 11.1 Time spent collecting stats
ruby_process_cpu_seconds_total Gauge 12.0 Total amount of CPU time per process
ruby_process_max_fds Gauge 12.0 Maximum number of open file descriptors per process
ruby_process_resident_memory_bytes Gauge 12.0 Memory usage by process (RSS/Resident Set Size)
ruby_process_unique_memory_bytes Gauge 13.0 Memory usage by process (USS/Unique Set Size)
ruby_process_proportional_memory_bytes Gauge 13.0 Memory usage by process (PSS/Proportional Set Size)
ruby_process_start_time_seconds Gauge 12.0 UNIX timestamp of process start time

Puma Metrics

Metric Type Since Description
puma_workers Gauge 12.0 Total number of workers
puma_running_workers Gauge 12.0 Number of booted workers
puma_stale_workers Gauge 12.0 Number of old workers
puma_running Gauge 12.0 Number of running threads
puma_queued_connections Gauge 12.0 Number of connections in that worker's "to do" set waiting for a worker thread
puma_active_connections Gauge 12.0 Number of threads processing a request
puma_pool_capacity Gauge 12.0 Number of requests the worker is capable of taking right now
puma_max_threads Gauge 12.0 Maximum number of worker threads
puma_idle_threads Gauge 12.0 Number of spawned threads which are not processing a request
puma_killer_terminations_total Gauge 12.0 Number of workers terminated by PumaWorkerKiller

Redis metrics

These client metrics are meant to complement Redis server metrics. These metrics are broken down per Redis instance. These metrics all have a storage label which indicates the Redis instance (cache, shared_state, and so on).

Metric Type Since Description
gitlab_redis_client_exceptions_total Counter 13.2 Number of Redis client exceptions, broken down by exception class
gitlab_redis_client_requests_total Counter 13.2 Number of Redis client requests
gitlab_redis_client_requests_duration_seconds Histogram 13.2 Redis request latency, excluding blocking commands

Metrics shared directory

The GitLab Prometheus client requires a directory to store metrics data shared between multi-process services. Those files are shared among all instances running under Puma server. The directory must be accessible to all running Puma's processes, or metrics can't function correctly.

This directory's location is configured using environment variable prometheus_multiproc_dir. For best performance, create this directory in tmpfs.

If GitLab is installed using Omnibus GitLab and tmpfs is available, then GitLab configures the metrics directory for you.