Pirate Praveen d85b82979f New upstream version 12.10.0

2020-04-22 19:07:51 +05:30

18 KiB

type
reference, concepts

Scaling

GitLab supports a number of scaling options to ensure that your self-managed instance is able to scale out to meet your organization's needs when scaling up a single-box GitLab installation is no longer practical or feasible.

Please consult our high availability documentation if your organization requires fault tolerance and redundancy features, such as automatic database system failover.

GitLab components and scaling instructions

Here's a list of components directly provided by Omnibus GitLab or installed as part of a source installation and their configuration instructions for scaling.

Component	Description	Configuration instructions
PostgreSQL	Database	PostgreSQL configuration
Redis	Key/value store for fast data lookup and caching	Redis configuration
GitLab application services	Unicorn/Puma, Workhorse, GitLab Shell - serves front-end requests (UI, API, Git over HTTP/SSH)	GitLab app scaling configuration
PgBouncer	Database connection pooler	PgBouncer configuration (PREMIUM ONLY)
Sidekiq	Asynchronous/background jobs	Sidekiq configuration
Gitaly	Provides access to Git repositories	Gitaly configuration
Prometheus and Grafana	GitLab environment monitoring	Monitoring node for scaling

Third-party services used for scaling

Here's a list of third-party services you may require as part of scaling GitLab. These services can be provided by numerous applications or vendors, and further advice is given on how to select the right one for your organization's needs.

Component	Description	Configuration instructions
Load balancer(s)	Handles load balancing, typically when you have multiple GitLab application services nodes	Load balancer configuration
Object storage service	Recommended store for shared data objects	Cloud Object Storage configuration
NFS	Shared disk storage service. Can be used as an alternative for Gitaly or Object Storage. Required for GitLab Pages	NFS configuration

Reference architectures

1 to 1,000 users: A single-node Omnibus setup with frequent backups. Refer to the Single-node installation section below.
1,000 to 50,000+ users: A scaled-out Omnibus installation with multiple servers, with or without high-availability components applied.
- To decide on the level of availability, refer to our Availability page.

Single-node installation

This solution is appropriate for many teams that have a single server at their disposal. With automatic backup of the GitLab repositories, configuration, and the database, this can be an optimal solution if you don't have strict availability requirements.

You can also optionally configure GitLab to use an external PostgreSQL service or an external object storage service for added performance and reliability at a relatively low complexity cost.

Multi-node installation (scaled out for availability)

This solution is appropriate for teams that are starting to scale out when scaling up is no longer meeting their needs. In this configuration, additional application nodes will handle frontend traffic, with a load balancer in front to distribute traffic across those nodes. Meanwhile, each application node connects to a shared file server and PostgreSQL and Redis services on the back end.

The additional application servers add limited fault tolerance to your GitLab instance. As long as one application node is online and capable of handling the instance's usage load, your team's productivity will not be interrupted. Having multiple application nodes also enables zero-downtime updates.

In this section we'll detail the Reference Architectures that can support large numbers of users. These were built, tested and verified by our Quality and Support teams.

Testing was done with our GitLab Performance Tool at specific coded workloads, and the throughputs used for testing were calculated based on sample customer data. We test each endpoint type with the following number of requests per second (RPS) per 1000 users:

API: 20 RPS
Web: 2 RPS
Git: 2 RPS

NOTE: Depending on your workflow, the recommended reference architectures below may need to be adapted. Your workload is influenced by factors such as, but not limited to, how active your users are, how much automation you use, mirroring, and repository and change size. Additionally, the memory values shown are those of the GCP machine types. On other cloud vendors, a best-effort like-for-like machine type can be used.

2,000 user configuration

Supported users (approximate): 2,000
Test RPS rates: API: 40 RPS, Web: 4 RPS, Git: 4 RPS
Known issues: List of known performance issues

Service	Nodes	Configuration¹	GCP type	AWS type²
GitLab Rails³	3	8 vCPU, 7.2GB Memory	n1-highcpu-8	c5.2xlarge
PostgreSQL	3	2 vCPU, 7.5GB Memory	n1-standard-2	m5.large
PgBouncer	3	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
Gitaly⁴ ⁵ ⁶	X	4 vCPU, 15GB Memory	n1-standard-4	m5.xlarge
Redis⁷	3	2 vCPU, 7.5GB Memory	n1-standard-2	m5.large
Consul + Sentinel⁷	3	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
Sidekiq	4	2 vCPU, 7.5GB Memory	n1-standard-2	m5.large
Cloud Object Storage⁸	-	-	-	-
NFS Server⁵ ⁶	1	4 vCPU, 3.6GB Memory	n1-highcpu-4	c5.xlarge
Monitoring node	1	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
External load balancing node⁹	1	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
Internal load balancing node⁹	1	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large

5,000 user configuration

Supported users (approximate): 5,000
Test RPS rates: API: 100 RPS, Web: 10 RPS, Git: 10 RPS
Known issues: List of known performance issues

Service	Nodes	Configuration¹	GCP type	AWS type²
GitLab Rails³	3	16 vCPU, 14.4GB Memory	n1-highcpu-16	c5.4xlarge
PostgreSQL	3	2 vCPU, 7.5GB Memory	n1-standard-2	m5.large
PgBouncer	3	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
Gitaly⁴ ⁵ ⁶	X	8 vCPU, 30GB Memory	n1-standard-8	m5.2xlarge
Redis⁷	3	2 vCPU, 7.5GB Memory	n1-standard-2	m5.large
Consul + Sentinel⁷	3	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
Sidekiq	4	2 vCPU, 7.5GB Memory	n1-standard-2	m5.large
Cloud Object Storage⁸	-	-	-	-
NFS Server⁵ ⁶	1	4 vCPU, 3.6GB Memory	n1-highcpu-4	c5.xlarge
Monitoring node	1	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
External load balancing node⁹	1	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
Internal load balancing node⁹	1	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large

10,000 user configuration

Supported users (approximate): 10,000
Test RPS rates: API: 200 RPS, Web: 20 RPS, Git: 20 RPS
Known issues: List of known performance issues

Service	Nodes	Configuration¹	GCP type	AWS type²
GitLab Rails³	3	32 vCPU, 28.8GB Memory	n1-highcpu-32	c5.9xlarge
PostgreSQL	3	4 vCPU, 15GB Memory	n1-standard-4	m5.xlarge
PgBouncer	3	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
Gitaly⁴ ⁵ ⁶	X	16 vCPU, 60GB Memory	n1-standard-16	m5.4xlarge
Redis⁷ - Cache	3	4 vCPU, 15GB Memory	n1-standard-4	m5.xlarge
Redis⁷ - Queues / Shared State	3	4 vCPU, 15GB Memory	n1-standard-4	m5.xlarge
Redis Sentinel⁷ - Cache	3	1 vCPU, 1.7GB Memory	g1-small	t2.small
Redis Sentinel⁷ - Queues / Shared State	3	1 vCPU, 1.7GB Memory	g1-small	t2.small
Consul	3	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
Sidekiq	4	4 vCPU, 15GB Memory	n1-standard-4	m5.xlarge
Cloud Object Storage⁸	-	-	-	-
NFS Server⁵ ⁶	1	4 vCPU, 3.6GB Memory	n1-highcpu-4	c5.xlarge
Monitoring node	1	4 vCPU, 3.6GB Memory	n1-highcpu-4	c5.xlarge
External load balancing node⁹	1	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
Internal load balancing node⁹	1	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large

25,000 user configuration

Supported users (approximate): 25,000
Test RPS rates: API: 500 RPS, Web: 50 RPS, Git: 50 RPS
Known issues: List of known performance issues

Service	Nodes	Configuration¹	GCP type	AWS type²
GitLab Rails³	5	32 vCPU, 28.8GB Memory	n1-highcpu-32	c5.9xlarge
PostgreSQL	3	8 vCPU, 30GB Memory	n1-standard-8	m5.2xlarge
PgBouncer	3	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
Gitaly⁴ ⁵ ⁶	X	32 vCPU, 120GB Memory	n1-standard-32	m5.8xlarge
Redis⁷ - Cache	3	4 vCPU, 15GB Memory	n1-standard-4	m5.xlarge
Redis⁷ - Queues / Shared State	3	4 vCPU, 15GB Memory	n1-standard-4	m5.xlarge
Redis Sentinel⁷ - Cache	3	1 vCPU, 1.7GB Memory	g1-small	t2.small
Redis Sentinel⁷ - Queues / Shared State	3	1 vCPU, 1.7GB Memory	g1-small	t2.small
Consul	3	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
Sidekiq	4	4 vCPU, 15GB Memory	n1-standard-4	m5.xlarge
Cloud Object Storage⁸	-	-	-	-
NFS Server⁵ ⁶	1	4 vCPU, 3.6GB Memory	n1-highcpu-4	c5.xlarge
Monitoring node	1	4 vCPU, 3.6GB Memory	n1-highcpu-4	c5.xlarge
External load balancing node⁹	1	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
Internal load balancing node⁹	1	4 vCPU, 3.6GB Memory	n1-highcpu-4	c5.xlarge

50,000 user configuration

Supported users (approximate): 50,000
Test RPS rates: API: 1000 RPS, Web: 100 RPS, Git: 100 RPS
Known issues: List of known performance issues

Service	Nodes	Configuration¹	GCP type	AWS type²
GitLab Rails³	12	32 vCPU, 28.8GB Memory	n1-highcpu-32	c5.9xlarge
PostgreSQL	3	16 vCPU, 60GB Memory	n1-standard-16	m5.4xlarge
PgBouncer	3	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
Gitaly⁴ ⁵ ⁶	X	64 vCPU, 240GB Memory	n1-standard-64	m5.16xlarge
Redis⁷ - Cache	3	4 vCPU, 15GB Memory	n1-standard-4	m5.xlarge
Redis⁷ - Queues / Shared State	3	4 vCPU, 15GB Memory	n1-standard-4	m5.xlarge
Redis Sentinel⁷ - Cache	3	1 vCPU, 1.7GB Memory	g1-small	t2.small
Redis Sentinel⁷ - Queues / Shared State	3	1 vCPU, 1.7GB Memory	g1-small	t2.small
Consul	3	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
Sidekiq	4	4 vCPU, 15GB Memory	n1-standard-4	m5.xlarge
Cloud Object Storage⁸	-	-	-	-
NFS Server⁵ ⁶	1	4 vCPU, 3.6GB Memory	n1-highcpu-4	c5.xlarge
Monitoring node	1	4 vCPU, 3.6GB Memory	n1-highcpu-4	c5.xlarge
External load balancing node⁹	1	2 vCPU, 1.8GB Memory	n1-highcpu-2	c5.large
Internal load balancing node⁹	1	8 vCPU, 7.2GB Memory	n1-highcpu-8	c5.2xlarge
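
As a rough illustration of how the tables above might be used, the sketch below looks up the GitLab Rails row for a given user count. The node counts and machine types are copied from the tables. The selection rule (pick the smallest configuration whose supported user count covers the target) is an assumption, and `RailsTier` and `smallest_tier_for` are hypothetical names.

```python
from typing import NamedTuple, Optional


class RailsTier(NamedTuple):
    users: int     # approximate supported users
    nodes: int     # number of GitLab Rails nodes
    gcp_type: str  # GCP machine type from the table
    aws_type: str  # AWS suggestion from the table (not tested or validated)


# GitLab Rails rows copied from the five configuration tables above.
RAILS_TIERS = [
    RailsTier(2_000, 3, "n1-highcpu-8", "c5.2xlarge"),
    RailsTier(5_000, 3, "n1-highcpu-16", "c5.4xlarge"),
    RailsTier(10_000, 3, "n1-highcpu-32", "c5.9xlarge"),
    RailsTier(25_000, 5, "n1-highcpu-32", "c5.9xlarge"),
    RailsTier(50_000, 12, "n1-highcpu-32", "c5.9xlarge"),
]


def smallest_tier_for(users: int) -> Optional[RailsTier]:
    """Return the smallest listed tier covering `users`, or None if none does."""
    for tier in RAILS_TIERS:  # tiers are sorted by ascending user count
        if users <= tier.users:
            return tier
    return None


if __name__ == "__main__":
    print(smallest_tier_for(3_000))
    print(smallest_tier_for(60_000))
```

A result of `None` means the target exceeds the largest configuration listed here. For up to 1,000 users, the single-node installation described earlier may already be sufficient.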

1. The architectures were built and tested with the Intel Xeon E5 v3 (Haswell) CPU platform on GCP. On different hardware you may find that adjustments, either lower or higher, are required for your CPU or node counts. For more information, a Sysbench benchmark of the CPU can be found here.
2. AWS-equivalent configurations are rough suggestions and may change in the future. They have not yet been tested and validated.
3. In our architectures we run each GitLab Rails node using the Puma webserver, with the number of workers set to 90% of available CPUs and 4 threads.
4. Gitaly node requirements are dependent on customer data, specifically the number of projects and their sizes. We recommend 2 nodes as an absolute minimum for HA environments, and at least 4 nodes should be used when supporting 50,000 or more users. We also recommend that each Gitaly node should store no more than 5TB of data and have the number of gitaly-ruby workers set to 20% of available CPUs. Additional nodes should be considered in conjunction with a review of expected data size and spread, based on the recommendations above.
5. NFS can be used as an alternative for both repository data (replacing Gitaly) and object storage, but this isn't typically recommended for performance reasons. Note, however, that it is required for GitLab Pages.
6. We strongly recommend that any Gitaly and/or NFS nodes are set up with SSD disks rather than HDD, with a throughput of at least 8,000 IOPS for read operations and 2,000 IOPS for write operations, as these components have heavy I/O. These IOPS values are recommended only as a starting point, as they may be adjusted higher or lower over time depending on the scale of your environment's workload. If you're running the environment on a cloud provider, you may need to refer to their documentation on how to configure IOPS correctly.
7. The recommended Redis setup differs depending on the size of the architecture. For smaller architectures (up to 5,000 users) we suggest one Redis cluster for all classes, with Redis Sentinel hosted alongside Consul. For larger architectures (10,000 users or more) we suggest running one Redis cluster for the Cache class and another for the Queues and Shared State classes. We also recommend running a separate Redis Sentinel cluster for each Redis cluster.
8. For data objects such as LFS, uploads, artifacts, etc., we recommend a Cloud Object Storage service over NFS where possible, due to better performance and availability.
9. Our architectures have been tested and validated with HAProxy as the load balancer. Other reputable load balancers with similar feature sets should also work, but be aware that these aren't validated.
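
The Puma and Gitaly notes above contain a few simple sizing rules that can be written down as arithmetic. The sketch below is illustrative only: the function names are hypothetical, rounding down to whole workers is an assumption (the notes do not say how fractional values are handled), and the Gitaly node count only reflects the stated minimums and the 5TB-per-node guideline, not a full review of data size and spread.

```python
import math

MAX_TB_PER_GITALY_NODE = 5  # "no more than 5TB of data" per Gitaly node


def puma_workers(vcpus: int) -> int:
    """90% of available CPUs, rounded down (rounding is an assumption)."""
    return max(1, vcpus * 9 // 10)


def gitaly_ruby_workers(vcpus: int) -> int:
    """20% of available CPUs, rounded down (rounding is an assumption)."""
    return max(1, vcpus * 2 // 10)


def min_gitaly_nodes(users: int, repo_data_tb: float) -> int:
    """Lower bound on Gitaly nodes for an HA environment.

    Uses the stated minimums (2 nodes, or 4 nodes at 50,000+ users) and
    the 5TB-per-node storage guideline, whichever is larger.
    """
    minimum = 4 if users >= 50_000 else 2
    by_storage = math.ceil(repo_data_tb / MAX_TB_PER_GITALY_NODE)
    return max(minimum, by_storage)


if __name__ == "__main__":
    print(puma_workers(8))               # 8 vCPU Rails node
    print(gitaly_ruby_workers(16))       # 16 vCPU Gitaly node
    print(min_gitaly_nodes(10_000, 12))  # 10,000 users, 12TB of repositories
```

The results are starting points only; as the notes say, real requirements depend on the number of projects and their sizes.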
