debian-mirror-gitlab/doc/architecture/blueprints/cloud_native_gitlab_pages/index.md
2021-01-30 21:13:32 +05:30

comments: false
description: Making GitLab Pages a Cloud Native application - architecture blueprint.

GitLab Pages New Architecture

GitLab Pages is an important component of the GitLab product. It is mostly used to serve static content, and has a limited set of well-defined responsibilities. Despite that, it has become a blocker for the GitLab.com Kubernetes migration.

Cloud Native and the adoption of Kubernetes have been recognised by GitLab as one of the top two tailwinds helping us grow faster as a company behind the project.

This effort is described in more detail in the infrastructure team handbook page.

GitLab Pages is tightly coupled with NFS, and in order to unblock the Kubernetes migration a significant change to GitLab Pages' architecture is required. This is ongoing work that we started more than a year ago. This blueprint explains why it is important and what the roadmap is.

How GitLab Pages Works

GitLab Pages is a daemon, written in Go, designed to serve static content.

Initially, GitLab Pages was designed to store static content on a local shared block storage (NFS) in a hierarchical group > project directory structure. Each directory, representing a project, was supposed to contain a configuration file and the static content that the GitLab Pages daemon would read and serve.

graph LR
  A(GitLab Rails) -- Writes new pages deployment --> B[(NFS)]
  C(GitLab Pages) -. Reads static content .-> B
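
The legacy layout described above can be sketched as a simple path mapping. This is only an illustration, not the daemon's actual code: the `legacyPaths` function, the `/srv/pages` mount point, and the `public` content directory name are assumptions made for this sketch; only the `config.json` file name is mentioned in this blueprint.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// projectPaths holds the two on-disk locations the legacy design relied on.
type projectPaths struct {
	Config  string // per-project configuration file
	Content string // directory with the static files to serve
}

// legacyPaths maps a group and a project onto the shared storage tree.
// The "public" directory name is an assumption made for this sketch.
func legacyPaths(root, group, project string) projectPaths {
	dir := filepath.Join(root, group, project)
	return projectPaths{
		Config:  filepath.Join(dir, "config.json"),
		Content: filepath.Join(dir, "public"),
	}
}

func main() {
	// "/srv/pages" is a hypothetical NFS mount point.
	p := legacyPaths("/srv/pages", "my-group", "my-project")
	fmt.Println(p.Config)
	fmt.Println(p.Content)
}
```

Every lookup in this model touches the shared filesystem, which is why both the writer (GitLab Rails) and the reader (GitLab Pages) need the same mount.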

This initial design has become outdated for a few reasons, NFS coupling being one of them, and we decided to replace it with a more "decoupled service"-like architecture. The new architecture that we are working on is described in this blueprint.

NFS coupling

In 2017, we experienced serious problems scaling our NFS infrastructure. We even tried, unsuccessfully, to replace NFS with CephFS.

Since then it has become apparent that the cost of operating and maintaining an NFS cluster is significant, and that if we ever decide to migrate to Kubernetes we need to decouple GitLab from shared local storage and NFS:

  1. NFS might be a single point of failure
  2. NFS can only be reliably scaled vertically
  3. Moving to Kubernetes means increasing the number of mount points by an order of magnitude
  4. NFS depends on an extremely reliable network, which can be difficult to provide in a Kubernetes environment
  5. Storing customer data on NFS involves additional security risks

Moving GitLab to Kubernetes without NFS decoupling would result in an explosion of complexity and maintenance cost, and an enormous negative impact on availability.

New GitLab Pages Architecture

  • GitLab Pages is going to source domains' configuration from GitLab's internal API, instead of reading config.json files from a local shared storage.
  • GitLab Pages is going to serve static content from Object Storage.

graph TD
  A(User) -- Pushes pages deployment --> B{GitLab}
  C((GitLab Pages)) -. Reads configuration from API .-> B
  C -. Reads static content .-> D[(Object Storage)]
  C -- Serves static content --> E(Visitors)

This new architecture is also briefly described in the blog post.
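
To make the first part of the new design more concrete, here is a minimal sketch of resolving a domain's configuration over HTTP instead of reading a file from shared storage. It is not the real GitLab internal API: the `apiSource` type, the `/lookup?host=` path, the `domainConfig` fields, and the example hosts are all hypothetical, and a local `httptest` server stands in for GitLab.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// domainConfig is a hypothetical, simplified shape of an API response.
type domainConfig struct {
	Host      string `json:"host"`
	ObjectURL string `json:"object_url"` // where the site's content lives
}

// apiSource resolves a domain through an HTTP API rather than the filesystem.
type apiSource struct {
	baseURL string
	client  *http.Client
}

// Resolve asks the API for the configuration of a single host.
// The "/lookup?host=" path is invented for this sketch.
func (s *apiSource) Resolve(host string) (*domainConfig, error) {
	resp, err := s.client.Get(s.baseURL + "/lookup?host=" + url.QueryEscape(host))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("lookup %q: unexpected status %d", host, resp.StatusCode)
	}
	var cfg domainConfig
	if err := json.NewDecoder(resp.Body).Decode(&cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

// newFakeAPI starts an in-process stand-in for the API that knows one domain.
func newFakeAPI() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		host := r.URL.Query().Get("host")
		if host != "docs.example.com" {
			http.NotFound(w, r)
			return
		}
		json.NewEncoder(w).Encode(domainConfig{
			Host:      host,
			ObjectURL: "https://objects.example.com/docs/",
		})
	}))
}

func main() {
	api := newFakeAPI()
	defer api.Close()

	src := &apiSource{baseURL: api.URL, client: api.Client()}
	cfg, err := src.Resolve("docs.example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.Host, "->", cfg.ObjectURL)
}
```

The point of the sketch is the dependency direction: the daemon needs only network access to the API and to object storage, with no shared mount between it and GitLab.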

Iterations

  1. ✓ Redesign GitLab Pages configuration source to use GitLab's API
  2. ✓ Evaluate performance and build reliable caching mechanisms
  3. ✓ Incrementally roll out the new source on GitLab.com
  4. ✓ Make GitLab Pages API domains config source enabled by default
  5. Enable experimentation with different servings through feature flags
  6. Triangulate object store serving design through meaningful experiments
  7. Design pages migration mechanisms that can work incrementally
  8. Gradually migrate towards object storage serving on GitLab.com
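
Iteration 2 mentions caching. One common approach, shown here only as a sketch and not as the mechanism GitLab Pages actually uses, is a small time-to-live cache in front of the configuration lookup. The `ttlCache` type and its `lookup` callback are hypothetical names introduced for this example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// entry is a cached value together with its expiry time.
type entry struct {
	value   string
	expires time.Time
}

// ttlCache memoizes lookups for a fixed duration.
type ttlCache struct {
	mu     sync.Mutex
	ttl    time.Duration
	now    func() time.Time // injectable clock
	lookup func(host string) (string, error)
	items  map[string]entry
}

func newTTLCache(ttl time.Duration, lookup func(string) (string, error)) *ttlCache {
	return &ttlCache{ttl: ttl, now: time.Now, lookup: lookup, items: make(map[string]entry)}
}

// Get returns the cached value while it is fresh and calls lookup otherwise.
// Errors are not cached, so a failed lookup is retried on the next call.
// For simplicity the lock is held during lookup, which serializes callers.
func (c *ttlCache) Get(host string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.items[host]; ok && c.now().Before(e.expires) {
		return e.value, nil
	}
	v, err := c.lookup(host)
	if err != nil {
		return "", err
	}
	c.items[host] = entry{value: v, expires: c.now().Add(c.ttl)}
	return v, nil
}

func main() {
	calls := 0
	cache := newTTLCache(time.Minute, func(host string) (string, error) {
		calls++ // stands in for a round trip to the API
		return "config-for-" + host, nil
	})
	cache.Get("docs.example.com")
	cache.Get("docs.example.com")
	fmt.Println("upstream calls:", calls) // prints "upstream calls: 1"
}
```

A production cache would also need to bound its size and decide how to behave when the API is unavailable; those choices are outside the scope of this sketch.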

A GitLab Pages Architecture epic with a detailed roadmap is also available.

Who

Proposal:

| Role | Who |
|------|-----|
| Author | Grzegorz Bizon |
| Architecture Evolution Coach | Kamil Trzciński |
| Engineering Leader | Daniel Croft |
| Domain Expert | Grzegorz Bizon |
| Domain Expert | Vladimir Shushlin |
| Domain Expert | Jaime Martinez |

DRIs:

| Role | Who |
|------|-----|
| Product | Jackie Porter |
| Leadership | Daniel Croft |
| Engineering | Kamil Trzciński |

Domain Experts:

| Role | Who |
|------|-----|
| Domain Expert | Kamil Trzciński |
| Domain Expert | Grzegorz Bizon |
| Domain Expert | Vladimir Shushlin |
| Domain Expert | Jaime Martinez |
| Domain Expert | Krasimir Angelov |