---
stage: Enablement
group: Memory
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---

# Sidekiq MemoryKiller

The GitLab Rails application code suffers from memory leaks. For web requests
this problem is made manageable using
[`puma-worker-killer`](https://github.com/schneems/puma_worker_killer), which
restarts a Puma worker process if it exceeds a memory limit. The Sidekiq
MemoryKiller applies the same approach to the Sidekiq processes used by GitLab
to process background jobs.

Unlike puma-worker-killer, which is enabled by default for all
installations of GitLab 13.0 and later, the Sidekiq MemoryKiller is enabled by default
_only_ for Omnibus packages. The reason for this is that the MemoryKiller
relies on runit to restart Sidekiq after a memory-induced shutdown, and GitLab
installations from source do not all use runit or an equivalent.

With the default settings, the MemoryKiller causes a Sidekiq restart no
more often than once every 15 minutes, with the restart causing about one
minute of delay for incoming background jobs.

Some background jobs rely on long-running external processes. To ensure these
are cleanly terminated when Sidekiq is restarted, each Sidekiq process should be
run as a process group leader (for example, using `chpst -P`). If using Omnibus or the
`bin/background_jobs` script with `runit` installed, this is handled for you.
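
To illustrate why process group leadership matters, the following is a minimal Python sketch, not part of GitLab. It starts a child process in its own session (which also makes it a process group leader) and then signals the whole group, so any helper processes the child spawned would receive the signal too. The function names `start_as_group_leader` and `terminate_group` are hypothetical, and `sleep 60` stands in for a long-running worker. It assumes a POSIX system.

```python
import os
import signal
import subprocess


def start_as_group_leader(cmd):
    """Start cmd in a new session, which also makes it a process group leader."""
    # start_new_session=True makes the child call setsid() before exec.
    return subprocess.Popen(cmd, start_new_session=True)


def terminate_group(proc, timeout=5.0):
    """Send SIGTERM to every process in the group led by proc, then reap proc."""
    os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
    return proc.wait(timeout=timeout)


if __name__ == "__main__":
    worker = start_as_group_leader(["sleep", "60"])
    # The child's process group ID equals its own PID when it leads the group.
    print("group leader:", os.getpgid(worker.pid) == worker.pid)
    print("exit status:", terminate_group(worker))
```

Signalling the group rather than the single PID is what lets external processes started by a job be cleaned up together with their parent.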

## Configuring the MemoryKiller

The MemoryKiller is controlled using environment variables.

- `SIDEKIQ_DAEMON_MEMORY_KILLER`: defaults to 1. When set to 0, the MemoryKiller
  works in _legacy_ mode. Otherwise, the MemoryKiller works in _daemon_ mode.

  In _legacy_ mode, the MemoryKiller checks the Sidekiq process RSS
  ([Resident Set Size](https://github.com/mperham/sidekiq/wiki/Memory#rss))
  after each job.

  In _daemon_ mode, the MemoryKiller checks the Sidekiq process RSS every 3 seconds
  (defined by `SIDEKIQ_MEMORY_KILLER_CHECK_INTERVAL`).

- `SIDEKIQ_MEMORY_KILLER_MAX_RSS` (KB): if this variable is set, and its value is greater
  than 0, the MemoryKiller is enabled. Otherwise the MemoryKiller is disabled.

  `SIDEKIQ_MEMORY_KILLER_MAX_RSS` defines the allowed RSS of the Sidekiq process.

  In _legacy_ mode, if the Sidekiq process exceeds the allowed RSS, an irreversible
  delayed graceful restart is triggered. The restart of Sidekiq happens
  after `SIDEKIQ_MEMORY_KILLER_GRACE_TIME` seconds.

  In _daemon_ mode, if the Sidekiq process exceeds the allowed RSS for longer than
  `SIDEKIQ_MEMORY_KILLER_GRACE_TIME`, the graceful restart is triggered. If the
  Sidekiq process goes below the allowed RSS within `SIDEKIQ_MEMORY_KILLER_GRACE_TIME`,
  the restart is aborted.

  The default value for Omnibus packages is set
  [in the Omnibus GitLab
  repository](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/files/gitlab-cookbooks/gitlab/attributes/default.rb).

- `SIDEKIQ_MEMORY_KILLER_HARD_LIMIT_RSS` (KB): used in _daemon_ mode. If the Sidekiq
  process RSS (expressed in kilobytes) exceeds `SIDEKIQ_MEMORY_KILLER_HARD_LIMIT_RSS`,
  an immediate graceful restart of Sidekiq is triggered.

- `SIDEKIQ_MEMORY_KILLER_CHECK_INTERVAL`: used in _daemon_ mode to define how
  often to check the process RSS. Defaults to 3 seconds.

- `SIDEKIQ_MEMORY_KILLER_GRACE_TIME`: defaults to 900 seconds (15 minutes).
  The usage of this variable is described as part of `SIDEKIQ_MEMORY_KILLER_MAX_RSS`.

- `SIDEKIQ_MEMORY_KILLER_SHUTDOWN_WAIT`: defaults to 30 seconds. This defines the
  maximum time allowed for all Sidekiq jobs to finish. No new jobs are accepted
  during that time, and the process exits as soon as all jobs finish.

  If jobs do not finish during that time, the MemoryKiller interrupts all currently
  running jobs by sending `SIGTERM` to the Sidekiq process.

  If Sidekiq does not perform the hard shutdown/restart itself,
  the Sidekiq process is forcefully terminated after
  `Sidekiq.options[:timeout] + 2` seconds. An external supervision mechanism
  (for example, runit) must restart Sidekiq afterwards.
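
The _daemon_ mode rules described above can be sketched as a small simulation. This is an illustrative Python model, not the Ruby code GitLab runs. The names `Settings`, `load_settings`, and `simulate_daemon` are hypothetical. The environment variable names and the defaults of 3 and 900 seconds come from the list above. Treating an unset variable as `0`, and `0` as "no hard limit", is an assumption made for this sketch.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    max_rss_kb: int         # SIDEKIQ_MEMORY_KILLER_MAX_RSS; <= 0 disables the MemoryKiller
    hard_limit_rss_kb: int  # SIDEKIQ_MEMORY_KILLER_HARD_LIMIT_RSS; 0 = unset (assumption)
    check_interval_s: int   # SIDEKIQ_MEMORY_KILLER_CHECK_INTERVAL
    grace_time_s: int       # SIDEKIQ_MEMORY_KILLER_GRACE_TIME


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables from env, falling back to the documented defaults."""
    def as_int(name, default):
        return int(env.get(name, default))

    return Settings(
        max_rss_kb=as_int("SIDEKIQ_MEMORY_KILLER_MAX_RSS", 0),
        hard_limit_rss_kb=as_int("SIDEKIQ_MEMORY_KILLER_HARD_LIMIT_RSS", 0),
        check_interval_s=as_int("SIDEKIQ_MEMORY_KILLER_CHECK_INTERVAL", 3),
        grace_time_s=as_int("SIDEKIQ_MEMORY_KILLER_GRACE_TIME", 900),
    )


def simulate_daemon(settings, rss_samples_kb):
    """Return (elapsed_seconds, reason) for the first restart trigger, or None.

    Each sample is one RSS reading, taken check_interval_s seconds apart.
    """
    if settings.max_rss_kb <= 0:
        return None  # MemoryKiller disabled
    over_since = None
    for i, rss in enumerate(rss_samples_kb):
        now = i * settings.check_interval_s
        if settings.hard_limit_rss_kb > 0 and rss > settings.hard_limit_rss_kb:
            return (now, "hard_limit")  # immediate graceful restart
        if rss > settings.max_rss_kb:
            if over_since is None:
                over_since = now  # start of the grace period
            elif now - over_since > settings.grace_time_s:
                return (now, "grace_time_exceeded")
        else:
            over_since = None  # back under the limit: pending restart is aborted
    return None
```

For example, `simulate_daemon(Settings(1000, 2000, 3, 9), [900, 2500])` reports a hard-limit restart at the second check, while a series that dips back below `max_rss_kb` before the grace time elapses triggers no restart.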