info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
---
# Merge Request Performance Guidelines
Each new introduced merge request **should be performant by default**.
To ensure a merge request does not negatively impact performance of GitLab
_every_ merge request **should** adhere to the guidelines outlined in this
document. There are no exceptions to this rule unless specifically discussed
with and agreed upon by backend maintainers and performance specialists.
It's also highly recommended that you read the following guides:
read this section on [how to prepare the merge request for a database review](../database_review.md#how-to-prepare-the-merge-request-for-a-database-review).
**Summary:** a merge request **should not** increase the total number of executed SQL
queries unless absolutely necessary.
The total number of queries executed by the code modified or added by a merge request
must not increase unless absolutely necessary. When building features it's
entirely possible you need some extra queries, but you should try to keep
this at a minimum.
As an example, say you introduce a feature that updates a number of database
rows with the same value. It may be very tempting (and easy) to write this using
the following pseudo code:
```ruby
objects_to_update.each do |object|
object.some_field = some_value
object.save
end
```
This means running one query for every object to update. This code can
easily overload a database given enough rows to update or many instances of this
code running in parallel. This particular problem is known as the
["N+1 query problem"](https://guides.rubyonrails.org/active_record_querying.html#eager-loading-associations). You can write a test with [QueryRecorder](../database/query_recorder.md) to detect this and prevent regressions.
In this particular case the workaround is fairly easy:
This uses ActiveRecord's `update_all` method to update all rows in a single
query. This in turn makes it much harder for this code to overload a database.
## Use read replicas when possible
In a DB cluster we have many read replicas and one primary. A classic use of scaling the DB is to have read-only actions be performed by the replicas. We use [load balancing](../../administration/postgresql/database_load_balancing.md) to distribute this load. This allows for the replicas to grow as the pressure on the DB grows.
By default, queries use read-only replicas, but due to
[primary sticking](../../administration/postgresql/database_load_balancing.md#primary-sticking), GitLab uses the
primary for some time and reverts to secondaries after they have either caught up or after 30 seconds.
Doing this can lead to a considerable amount of unnecessary load on the primary.
To prevent switching to the primary [merge request 56849](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/56849) introduced the
`without_sticky_writes` block. Typically, this method can be applied to prevent primary stickiness
after a trivial or insignificant write which doesn't affect the following queries in the same session.
To learn when a usage timestamp update can lead the session to stick to the primary and how to
prevent it by using `without_sticky_writes`, see [merge request 57328](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/57328)
As a counterpart of the `without_sticky_writes` utility,
`use_replicas_for_read_queries`. This method forces all read-only queries inside its block to read
replicas regardless of the current primary stickiness.
This utility is reserved for cases where queries can tolerate replication lag.
Internally, our database load balancer classifies the queries based on their main statement (`select`, `update`, `delete`, and so on). When in doubt, it redirects the queries to the primary database. Hence, there are some common cases the load balancer sends the queries to the primary unnecessarily:
- Custom queries (via `exec_query`, `execute_statement`, `execute`, and so on)
- Read-only transactions
- In-flight connection configuration set
- Sidekiq background jobs
After the above queries are executed, GitLab
[sticks to the primary](../../administration/postgresql/database_load_balancing.md#primary-sticking).
To make the inside queries prefer using the replicas,
CTEs have been effectively used as an optimization fence in many simpler cases,
such as this [example](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/43242#note_61416277).
Beginning in PostgreSQL 12, CTEs are inlined then [optimized by default](https://paquier.xyz/postgresql-2/postgres-12-with-materialize/).
Keeping the old behavior requires marking CTEs with the keyword `MATERIALIZED`.
When building CTE statements, use the `Gitlab::SQL::CTE` class [introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/56976) in GitLab 13.11.
By default, this `Gitlab::SQL::CTE` class forces materialization through adding the `MATERIALIZED` keyword for PostgreSQL 12 and higher.
`Gitlab::SQL::CTE` automatically omits materialization when PostgreSQL 11 is running
(this behavior is implemented using a custom Arel node `Gitlab::Database::AsWithMaterialized` under the surface).
WARNING:
Upgrading to GitLab 14.0 requires PostgreSQL 12 or higher.
## Cached Queries
**Summary:** a merge request **should not** execute duplicated cached queries.
Rails provides an [SQL Query Cache](../cached_queries.md#cached-queries-guidelines),
used to cache the results of database queries for the duration of the request.
See [why cached queries are considered bad](../cached_queries.md#why-cached-queries-are-considered-bad) and
[how to detect them](../cached_queries.md#how-to-detect-cached-queries).
The code introduced by a merge request, should not execute multiple duplicated cached queries.
The total number of the queries (including cached ones) executed by the code modified or added by a merge request
should not increase unless absolutely necessary.
The number of executed queries (including cached queries) should not depend on
Counters should always be truncated. It means that we don't want to present
the exact number over some threshold. The reason for that is for the cases where we want
to calculate exact number of items, we effectively need to filter each of them for
the purpose of knowing the exact number of items matching.
From ~UX perspective it's often acceptable to see that you have over 1000+ pipelines,
instead of that you have 40000+ pipelines, but at a tradeoff of loading page for 2s longer.
An example of this pattern is the list of pipelines and jobs. We truncate numbers to `1000+`,
but we show an accurate number of running pipelines, which is the most interesting information.
There's a helper method that can be used for that purpose - `NumbersHelper.limited_counter_with_delimiter` -
that accepts an upper limit of counting rows.
In some cases it's desired that badge counters are loaded asynchronously.
This can speed up the initial page load and give a better user experience overall.
## Usage of feature flags
Each feature that has performance critical elements or has a known performance deficiency
needs to come with feature flag to disable it.
The feature flag makes our team more happy, because they can monitor the system and
quickly react without our users noticing the problem.
Performance deficiencies should be addressed right away after we merge initial
changes.
Read more about when and how feature flags should be used in
[Feature flags in GitLab development](https://about.gitlab.com/handbook/product-development-flow/feature-flag-lifecycle/#feature-flags-in-gitlab-development).
## Storage
We can consider the following types of storages:
- **Local temporary storage** (very-very short-term storage) This type of storage is system-provided storage, like a `/tmp` folder.
This is the type of storage that you should ideally use for all your temporary tasks.
The fact that each node has its own temporary storage makes scaling significantly easier.
This storage is also very often SSD-based, thus is significantly faster.
The local storage can easily be configured for the application with
the usage of `TMPDIR` variable.
- **Shared temporary storage** (short-term storage) This type of storage is network-based temporary storage,
usually run with a common NFS server. As of Feb 2020, we still use this type of storage
for most of our implementations. Even though this allows the above limit to be significantly larger,
it does not really mean that you can use more. The shared temporary storage is shared by
all nodes. Thus, the job that uses significant amount of that space or performs a lot
of operations creates a contention on execution of all other jobs and request
across the whole application, this can easily impact stability of the whole GitLab.
Be respectful of that.
- **Shared persistent storage** (long-term storage) This type of storage uses
shared network-based storage (for example, NFS). This solution is mostly used by customers running small
installations consisting of a few nodes. The files on shared storage are easily accessible,
but any job that is uploading or downloading data can create a serious contention to all other jobs.
This is also an approach by default used by Omnibus.
- **Object-based persistent storage** (long term storage) this type of storage uses external
services like [AWS S3](https://en.wikipedia.org/wiki/Amazon_S3). The Object Storage
can be treated as infinitely scalable and redundant. Accessing this storage usually requires
downloading the file to manipulate it. The Object Storage can be considered as an ultimate
solution, as by definition it can be assumed that it can handle unlimited concurrent uploads
and downloads of files. This is also ultimate solution required to ensure that application can
run in containerized deployments (Kubernetes) at ease.
### Temporary storage
The storage on production nodes is really sparse. The application should be built
in a way that accommodates running under very limited temporary storage.
You can expect the system on which your code runs has a total of `1G-10G`
of temporary storage. However, this storage is really shared across all
jobs being run. If your job requires to use more than `100MB` of that space
you should reconsider the approach you have taken.
Whatever your needs are, you should clearly document if you need to process files.
If you require more than `100MB`, consider asking for help from a maintainer
to work with you to possibly discover a better solution.
#### Local temporary storage
The usage of local storage is a desired solution to use,
especially since we work on deploying applications to Kubernetes clusters.
When you would like to use `Dir.mktmpdir`? In a case when you want for example
to extract/create archives, perform extensive manipulation of existing data, and so on.
```ruby
Dir.mktmpdir('designs') do |path|
# do manipulation on path
# the path will be removed once
# we go out of the block
end
```
#### Shared temporary storage
The usage of shared temporary storage is required if your intent
is to persistent file for a disk-based storage, and not Object Storage.
[Workhorse direct upload](../uploads/index.md#direct-upload) when accepting file
can write it to shared storage, and later GitLab Rails can perform a move operation.
The move operation on the same destination is instantaneous.
The system instead of performing `copy` operation just re-attaches file into a new place.
Since this introduces extra complexity into application, you should only try
to re-use well established patterns (for example, `ObjectStorage` concern) instead of re-implementing it.
The usage of shared temporary storage is otherwise deprecated for all other usages.
### Persistent storage
#### Object Storage
It is required that all features holding persistent files support saving data
to Object Storage. Having a persistent storage in the form of shared volume across nodes
is not scalable, as it creates a contention on data access all nodes.
GitLab offers the [ObjectStorage concern](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/uploaders/object_storage.rb)
that implements a seamless support for Shared and Object Storage-based persistent storage.
#### Data access
Each feature that accepts data uploads or allows to download them needs to use
[Workhorse direct upload](../uploads/index.md#direct-upload). It means that uploads needs to be
saved directly to Object Storage by Workhorse, and all downloads needs to be served
by Workhorse.
Performing uploads/downloads via Puma is an expensive operation,
as it blocks the whole processing slot (thread) for the duration of the upload.
Performing uploads/downloads via Puma also has a problem where the operation
can time out, which is especially problematic for slow clients. If clients take a long time
to upload/download the processing slot might be killed due to request processing