---
stage: Enablement
group: Distribution
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---

# Run multiple Sidekiq processes **(FREE SELF)**

GitLab allows you to start multiple Sidekiq processes.
Each process can consume a dedicated set of queues. This ensures that
certain queues always have dedicated workers, no matter the number of
jobs that need to be processed.

NOTE:
The information in this page applies only to Omnibus GitLab.

## Available Sidekiq queues

For a list of the existing Sidekiq queues, check the following files:

- [Queues for both GitLab Community and Enterprise Editions](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/workers/all_queues.yml)
- [Queues for GitLab Enterprise Editions only](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/all_queues.yml)

Each entry in the above files represents a queue on which Sidekiq processes
can be started.

## Start multiple processes

> - [Introduced](https://gitlab.com/gitlab-org/omnibus-gitlab/-/merge_requests/4006) in GitLab 12.10, starting multiple processes with Sidekiq cluster.
> - [Sidekiq cluster moved](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/181) to GitLab Free in 12.10.
> - [Sidekiq cluster became default](https://gitlab.com/gitlab-org/omnibus-gitlab/-/merge_requests/4140) in GitLab 13.0.

To start multiple processes:

1. In `/etc/gitlab/gitlab.rb`, use the `sidekiq['queue_groups']` array setting to
   specify how many processes to create using `sidekiq-cluster` and which queues
   they should handle. Each item in the array equates to one additional Sidekiq
   process, and values in each item determine the queues it works on.

   For example, the following setting creates three Sidekiq processes, one to run on
   `elastic_commit_indexer`, one to run on `mailers`, and one process running on all queues:

   ```ruby
   sidekiq['queue_groups'] = [
     "elastic_commit_indexer",
     "mailers",
     "*"
   ]
   ```

   To have an additional Sidekiq process handle multiple queues, add multiple
   queue names to its item, delimited by commas. For example:

   ```ruby
   sidekiq['queue_groups'] = [
     "elastic_commit_indexer, elastic_association_indexer",
     "mailers",
     "*"
   ]
   ```

   [In GitLab 12.9](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/26594) and
   later, the special queue name `*` means all queues. This starts two
   processes, each handling all queues:

   ```ruby
   sidekiq['queue_groups'] = [
     "*",
     "*"
   ]
   ```

   `*` cannot be combined with concrete queue names: `*, mailers`
   handles only the `mailers` queue.

   When `sidekiq-cluster` is only running on a single node, make sure that at least
   one process is running on all queues using `*`. This ensures a process
   automatically picks up jobs in queues created in the future,
   including queues that have dedicated processes.

   If `sidekiq-cluster` is running on more than one node, you can also use
   [`--negate`](#negate-settings) and list all the queues that are already being
   processed.

1. Save the file and reconfigure GitLab for the changes to take effect:

   ```shell
   sudo gitlab-ctl reconfigure
   ```
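
As a rough mental model of how `queue_groups` items map to processes, the following sketch expands each item into the queues one process would handle. `expand_queue_groups` is a hypothetical helper written for this illustration and is not part of GitLab or Omnibus. It only mirrors the rules stated above: commas separate queue names, a lone `*` means all queues, and `*` mixed with concrete names is ignored.

```ruby
# Illustrative only: `expand_queue_groups` is a hypothetical helper,
# not a GitLab or Omnibus API.
def expand_queue_groups(queue_groups)
  queue_groups.map do |item|
    names = item.split(',').map(&:strip)
    concrete = names - ['*']
    # A lone `*` means all queues. When `*` is mixed with concrete names,
    # only the concrete names are handled.
    concrete.empty? ? :all_queues : concrete
  end
end

groups = [
  "elastic_commit_indexer, elastic_association_indexer",
  "mailers",
  "*"
]

# One line per Sidekiq process that the setting would create.
expand_queue_groups(groups).each_with_index do |queues, index|
  puts "process #{index}: #{queues.inspect}"
end
```

Each element of the returned array corresponds to one Sidekiq process, so the length of the array is the number of processes started.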

To view the Sidekiq processes in GitLab:

1. On the top bar, select **Menu >** **{admin}** **Admin**.
1. On the left sidebar, select **Monitoring > Background Jobs**.

## Negate settings

To have the Sidekiq process work on every queue **except** the ones
you list, set `sidekiq['negate']` to `true`. In this example, we exclude all
import-related jobs from a Sidekiq node:

1. Edit `/etc/gitlab/gitlab.rb` and add:

   ```ruby
   sidekiq['negate'] = true
   sidekiq['queue_selector'] = true
   sidekiq['queue_groups'] = [
     "feature_category=importers"
   ]
   ```

1. Save the file and reconfigure GitLab for the changes to take effect:

   ```shell
   sudo gitlab-ctl reconfigure
   ```
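
Conceptually, negation turns the listed queues into an exclusion list. The following sketch shows that idea with plain queue names rather than a worker matching query. The queue names and the `selected_queues` helper are hypothetical and exist only for this illustration.

```ruby
# Illustrative only: `selected_queues` is a hypothetical helper.
# Returns the queues a process would handle, given every known queue
# and the queues listed in its group.
def selected_queues(all_queues, listed, negate:)
  negate ? all_queues - listed : all_queues & listed
end

# Hypothetical queue names.
all_queues = %w[github_importer mailers post_receive]

p selected_queues(all_queues, %w[github_importer], negate: true)
p selected_queues(all_queues, %w[github_importer], negate: false)
```

With `negate: true` the process takes everything except the listed queue. With `negate: false` it takes only the listed queue.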

## Queue selector

> - [Introduced](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/45) in GitLab 12.8.
> - [Sidekiq cluster, including queue selector, moved](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/181) to GitLab Free in 12.10.
> - [Renamed from `experimental_queue_selector` to `queue_selector`](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/147) in GitLab 13.6.

In addition to selecting queues by name, as above, the `queue_selector` option
allows queue groups to be selected in a more general way using a [worker matching
query](extra_sidekiq_routing.md#worker-matching-query). After `queue_selector`
is set, all `queue_groups` must follow the aforementioned syntax.

In `/etc/gitlab/gitlab.rb`:

```ruby
sidekiq['enable'] = true
sidekiq['queue_selector'] = true
sidekiq['queue_groups'] = [
  # Run all non-CPU-bound queues that are high urgency
  'resource_boundary!=cpu&urgency=high',
  # Run all continuous integration and pages queues that are not high urgency
  'feature_category=continuous_integration,pages&urgency!=high',
  # Run all queues
  '*'
]
```
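
To make the query examples above easier to read, the following sketch evaluates a query against the attributes of a single worker. It is a simplified illustration, not GitLab's implementation: `matches_query?` and the worker attribute hash are hypothetical, and only the operators that appear in the example above (`=`, `!=`, `,` between values, `&` between terms, and `*`) are handled.

```ruby
# Illustrative only: `matches_query?` is a hypothetical helper.
# `worker` is a Hash of attribute name => value, both strings.
def matches_query?(query, worker)
  # `*` selects every queue.
  return true if query.strip == '*'

  # Terms joined by `&` must all match.
  query.split('&').all? do |term|
    # Split on the first `!=` or `=`; `!=` negates the comparison.
    key, operator, values = term.strip.partition(/!?=/)
    # Values separated by `,` are alternatives for the same attribute.
    matched = values.split(',').include?(worker[key].to_s)
    operator == '!=' ? !matched : matched
  end
end

# Hypothetical worker attributes.
worker = {
  'feature_category' => 'pages',
  'urgency' => 'low',
  'resource_boundary' => 'unknown'
}

puts matches_query?('resource_boundary!=cpu&urgency=high', worker)
puts matches_query?('feature_category=continuous_integration,pages&urgency!=high', worker)
puts matches_query?('*', worker)
```

In this sketch the first query does not select the worker because its urgency is not `high`, while the second and third do.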

## Ignore all import queues

When [importing from GitHub](../../user/project/import/github.md) or
other sources, Sidekiq might use all of its resources to perform those
operations. To set up two separate `sidekiq-cluster` processes, where
one only processes imports and the other processes all other queues:

1. Edit `/etc/gitlab/gitlab.rb` and add:

   ```ruby
   sidekiq['enable'] = true
   sidekiq['queue_selector'] = true
   sidekiq['queue_groups'] = [
     "feature_category=importers",
     "feature_category!=importers"
   ]
   ```

1. Save the file and reconfigure GitLab for the changes to take effect:

   ```shell
   sudo gitlab-ctl reconfigure
   ```

## Number of threads

Each process defined under `sidekiq` starts with a
number of threads that equals the number of queues, plus one spare thread.
For example, a process that handles the `process_commit` and `post_receive`
queues uses three threads in total.
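
The arithmetic can be sketched as follows. `thread_count` is a hypothetical helper for illustration only, and it assumes the queue group is written as a comma-separated list of queue names.

```ruby
# Illustrative only: `thread_count` is a hypothetical helper.
# One thread per listed queue, plus one spare thread.
def thread_count(queue_group)
  queue_group.split(',').map(&:strip).reject(&:empty?).size + 1
end

puts thread_count("process_commit,post_receive") # prints 3
```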

## Manage concurrency

When setting the maximum concurrency, keep in mind this normally should
not exceed the number of CPU cores available. The values in the examples
below are arbitrary and not particular recommendations.

Each thread requires a Redis connection, so adding threads may increase Redis
latency and potentially cause client timeouts. See the [Sidekiq documentation
about Redis](https://github.com/mperham/sidekiq/wiki/Using-Redis) for more
details.

### When running Sidekiq cluster (default)

Running Sidekiq cluster is the default in GitLab 13.0 and later.

1. Edit `/etc/gitlab/gitlab.rb` and add:

   ```ruby
   sidekiq['min_concurrency'] = 15
   sidekiq['max_concurrency'] = 25
   ```

1. Save the file and reconfigure GitLab for the changes to take effect:

   ```shell
   sudo gitlab-ctl reconfigure
   ```

`min_concurrency` and `max_concurrency` are independent; one can be set without
the other. Setting `min_concurrency` to `0` disables the limit.

For each queue group, let `N` be one more than the number of queues. The
concurrency factor is set to:

1. `N`, if it's between `min_concurrency` and `max_concurrency`.
1. `max_concurrency`, if `N` exceeds this value.
1. `min_concurrency`, if `N` is less than this value.

If `min_concurrency` is equal to `max_concurrency`, then this value is used
regardless of the number of queues.

When `min_concurrency` is greater than `max_concurrency`, it is treated as
being equal to `max_concurrency`.
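
The rules above amount to clamping `N` between the two limits. The following is a minimal sketch of that calculation, not GitLab's own code: `effective_concurrency` is a hypothetical helper, and it assumes `max_concurrency` is a positive number.

```ruby
# Illustrative only: `effective_concurrency` is a hypothetical helper.
# Assumes max_concurrency is positive.
def effective_concurrency(queue_count, min_concurrency, max_concurrency)
  # N is one more than the number of queues in the group.
  n = queue_count + 1
  # A minimum above the maximum is treated as equal to the maximum.
  min = [min_concurrency, max_concurrency].min
  n.clamp(min, max_concurrency)
end

puts effective_concurrency(2, 15, 25)  # N = 3 is below the minimum
puts effective_concurrency(19, 15, 25) # N = 20 is within the limits
puts effective_concurrency(30, 15, 25) # N = 31 is above the maximum
```

With a `min_concurrency` of `0`, the lower bound never applies in this sketch, because `N` is always at least `1`.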

### When running a single Sidekiq process

Running a single Sidekiq process is the default in GitLab 12.10 and earlier.

WARNING:
Running Sidekiq directly was removed in GitLab
[14.0](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/240).

1. Edit `/etc/gitlab/gitlab.rb` and add:

   ```ruby
   sidekiq['cluster'] = false
   sidekiq['concurrency'] = 25
   ```

1. Save the file and reconfigure GitLab for the changes to take effect:

   ```shell
   sudo gitlab-ctl reconfigure
   ```

This sets the concurrency (number of threads) for the Sidekiq process.

## Modify the check interval

To modify the check interval for the additional Sidekiq processes:

1. Edit `/etc/gitlab/gitlab.rb` and add:

   ```ruby
   sidekiq['interval'] = 5
   ```

1. Save the file and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.

This tells the additional processes how often to check for enqueued jobs.

## Troubleshoot using the CLI

WARNING:
It's recommended to use `/etc/gitlab/gitlab.rb` to configure the Sidekiq processes.
If you experience a problem, you should contact GitLab support. Use the command
line at your own risk.

For debugging purposes, you can start extra Sidekiq processes by using the command
`/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster`. This command
takes arguments using the following syntax:

```shell
/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster [QUEUE,QUEUE,...] [QUEUE, ...]
```

Each separate argument denotes a group of queues that have to be processed by a
Sidekiq process. Multiple queues can be processed by the same process by
separating them with a comma instead of a space.

Instead of a queue, a queue namespace can also be provided, to have the process
automatically listen on all queues in that namespace without needing to
explicitly list all the queue names. For more information about queue namespaces,
see the relevant section in the
[Sidekiq style guide](../../development/sidekiq_style_guide.md#queue-namespaces).

For example, say you want to start 2 extra processes: one to process the
`process_commit` queue, and one to process the `post_receive` queue. This can be
done as follows:

```shell
/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster process_commit post_receive
```

If you instead want to start one process processing both queues, you'd use the
following syntax:

```shell
/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster process_commit,post_receive
```

If you want to have one Sidekiq process dealing with the `process_commit` and
`post_receive` queues, and one process to process the `gitlab_shell` queue,
you'd use the following:

```shell
/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster process_commit,post_receive gitlab_shell
```
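
The relationship between queue groups and command-line arguments can be sketched in a few lines. `cluster_command` is a hypothetical helper for illustration only: it joins the queues of each group with commas and separates the groups with spaces, as in the examples above. It only builds the command string and does not run it.

```ruby
# Illustrative only: `cluster_command` is a hypothetical helper.
# `groups` is an array of queue groups; each group is an array of queue names.
def cluster_command(groups, binary)
  # Queues in the same group are joined by commas (one process per group).
  args = groups.map { |queues| queues.join(',') }
  # Groups are separated by spaces (one argument per process).
  ([binary] + args).join(' ')
end

binary = '/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster'
puts cluster_command([%w[process_commit post_receive], %w[gitlab_shell]], binary)
```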

### Monitor the `sidekiq-cluster` command

The `sidekiq-cluster` command does not terminate once it has started the desired
number of Sidekiq processes. Instead, the process continues running and
forwards any signals to the child processes. This makes it easy to stop all
Sidekiq processes: you send a signal to the `sidekiq-cluster` process
instead of sending it to the individual processes.

If the `sidekiq-cluster` process crashes or receives a `SIGKILL`, the child
processes terminate themselves after a few seconds. This ensures you don't
end up with zombie Sidekiq processes.

All of this makes monitoring the processes fairly easy. Hook up
`sidekiq-cluster` to your supervisor of choice (for example, runit) and you're good to
go.

If a child process dies, the `sidekiq-cluster` command signals all remaining
processes to terminate, then terminates itself. This removes the need for
`sidekiq-cluster` to re-implement complex process monitoring/restarting code.
Instead, make sure your supervisor restarts the `sidekiq-cluster`
process whenever necessary.

### PID files

The `sidekiq-cluster` command can store its PID in a file. By default no PID
file is written, but this can be changed by passing the `--pidfile` option to
`sidekiq-cluster`. For example:

```shell
/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster --pidfile /var/run/gitlab/sidekiq_cluster.pid process_commit
```

Keep in mind that the PID file contains the PID of the `sidekiq-cluster`
command and not the PID(s) of the started Sidekiq processes.
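
A script that needs the PID of the `sidekiq-cluster` command could read the file along these lines. This is a sketch under stated assumptions: `read_cluster_pid` is a hypothetical helper, the path is the one from the example above, and the file is assumed to contain only the PID as a decimal number.

```ruby
# Illustrative only: `read_cluster_pid` is a hypothetical helper.
# Returns the PID stored in the file, or nil when the file is missing
# or does not contain a number.
def read_cluster_pid(path)
  Integer(File.read(path).strip, 10)
rescue Errno::ENOENT, ArgumentError
  nil
end

pid = read_cluster_pid('/var/run/gitlab/sidekiq_cluster.pid')
puts pid ? "PID file reports #{pid}" : 'no usable PID file found'
```

Because the file holds the PID of `sidekiq-cluster` itself, a signal sent to that PID (for example with Ruby's `Process.kill`) reaches the parent, which forwards it to the Sidekiq processes it started.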

### Environment

The Rails environment can be set by passing the `--environment` flag to the
`sidekiq-cluster` command, or by setting `RAILS_ENV` to a non-empty value. The
default value can be found in `/opt/gitlab/etc/gitlab-rails/env/RAILS_ENV`.