---
status: implemented
creation-date: "2019-05-16"
authors: [ "@grzesiek" ]
coach: [ "@ayufan", "@grzesiek" ]
approvers: [ "@ogolowinski", "@dcroft", "@vshushlin" ]
owning-stage: "~devops::release"
participating-stages: []
---
# GitLab Pages New Architecture

GitLab Pages is an important component of the GitLab product. It is mostly
used to serve static content, and has a limited set of well-defined
responsibilities. Nevertheless, it has become a blocker for the
GitLab.com Kubernetes migration.

GitLab has recognised Cloud Native and the adoption of Kubernetes as
one of the two biggest tailwinds helping us grow faster as a
company behind the project.
This effort is described in more detail
[in the infrastructure team handbook page](https://about.gitlab.com/handbook/engineering/infrastructure/production/kubernetes/gitlab-com/).

GitLab Pages is tightly coupled with NFS, and unblocking the Kubernetes
migration requires a significant change to GitLab Pages' architecture. This
is ongoing work that we started more than a year ago. This blueprint
explains why the work is important and what the roadmap is.

## How GitLab Pages Works

GitLab Pages is a daemon designed to serve static content, written in
[Go](https://go.dev/).

Initially, GitLab Pages was designed to store static content on local
shared storage (NFS) in a hierarchical group > project directory
structure. Each directory, representing a project, was supposed to contain a
configuration file and the static content that the GitLab Pages daemon was
supposed to read and serve.

```mermaid
graph LR
A(GitLab Rails) -- Writes new pages deployment --> B[(NFS)]
C(GitLab Pages) -. Reads static content .-> B
```
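
The directory-based lookup described above can be illustrated with a minimal sketch. It is not the GitLab Pages implementation (which is written in Go). The function name `resolve_on_disk`, the `public` content directory, and the `index.html` fallback are assumptions made for this example only.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_on_disk(root: Path, group: str, project: str, url_path: str) -> Optional[Path]:
    """Map a request path to a file under <root>/<group>/<project>/public.

    The "public" directory name is an assumption made for this sketch.
    """
    public_dir = (root / group / project / "public").resolve()
    candidate = (public_dir / url_path.lstrip("/")).resolve()
    # Reject paths that escape the project's content directory.
    if candidate != public_dir and public_dir not in candidate.parents:
        return None
    # Directory requests fall back to an index page (assumed behaviour).
    if candidate.is_dir():
        candidate = candidate / "index.html"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        public = Path(tmp) / "my-group" / "my-project" / "public"
        public.mkdir(parents=True)
        (public / "index.html").write_text("<h1>Hello</h1>")
        print(resolve_on_disk(Path(tmp), "my-group", "my-project", "/"))
```

Every lookup in this model touches the shared filesystem, which is why the daemon's availability is tied to the availability of the NFS mount.
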
This initial design has become outdated for a few reasons, NFS coupling
being one of them, and we decided to replace it with a more "decoupled
service"-like architecture. The new architecture that we are working on is
described in this blueprint.

## NFS coupling

In 2017, we experienced serious problems scaling our NFS infrastructure. We
even tried to replace NFS with
[CephFS](https://docs.ceph.com/docs/master/cephfs/), unsuccessfully.

Since that time it has become apparent that the cost of operating and
maintaining an NFS cluster is significant and that, if we ever decide to
migrate to Kubernetes,
[we need to decouple GitLab from a shared local storage and NFS](https://gitlab.com/gitlab-org/gitlab-pages/-/issues/426#note_375646396).

1. NFS might be a single point of failure
1. NFS can only be reliably scaled vertically
1. Moving to Kubernetes means increasing the number of mount points by an order
   of magnitude
1. NFS depends on an extremely reliable network, which can be difficult to
   provide in a Kubernetes environment
1. Storing customer data on NFS involves additional security risks

Moving GitLab to Kubernetes without NFS decoupling would result in an explosion
of complexity and maintenance cost, and an enormous negative impact on
availability.

## New GitLab Pages Architecture

- GitLab Pages sources domains' configuration from the GitLab internal
  API, instead of reading `config.json` files from a local shared storage.
- GitLab Pages serves static content from Object Storage.

```mermaid
graph TD
A(User) -- Pushes pages deployment --> B{GitLab}
C((GitLab Pages)) -. Reads configuration from API .-> B
C -. Reads static content .-> D[(Object Storage)]
C -- Serves static content --> E(Visitors)
```
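
The request flow in the diagram above can be sketched as follows. This is a hypothetical illustration, not the GitLab Pages code: `DomainConfig`, `PagesServer`, the `object_prefix` field, and the two injected callables are invented names, and plain dictionaries stand in for the internal API and the object store.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class DomainConfig:
    """Hypothetical lookup result; the real API payload is not specified here."""
    domain: str
    object_prefix: str  # assumed: where this site's files live in object storage


class PagesServer:
    """Serves files using an API lookup and an object store, with no local disk."""

    def __init__(
        self,
        lookup_domain: Callable[[str], Optional[DomainConfig]],
        read_object: Callable[[str], Optional[bytes]],
    ) -> None:
        self._lookup_domain = lookup_domain  # stands in for the internal API
        self._read_object = read_object      # stands in for object storage

    def serve(self, host: str, path: str) -> Tuple[int, bytes]:
        config = self._lookup_domain(host)
        if config is None:
            return 404, b"domain not found"
        relative = path.lstrip("/")
        # Directory-style requests fall back to an index page (assumed behaviour).
        if relative == "" or relative.endswith("/"):
            relative += "index.html"
        body = self._read_object(config.object_prefix + relative)
        if body is None:
            return 404, b"file not found"
        return 200, body
```

Because both dependencies are injected, the daemon holds no state on a shared filesystem, which is the property the Kubernetes migration needs.
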
This new architecture is also briefly described in
[this blog post](https://about.gitlab.com/blog/2020/08/03/how-gitlab-pages-uses-the-gitlab-api-to-serve-content/).

## Iterations

1. ✓ Redesign GitLab Pages configuration source to use the GitLab API
1. ✓ Evaluate performance and build reliable caching mechanisms
1. ✓ Incrementally roll out the new source on GitLab.com
1. ✓ Make GitLab Pages API domains configuration source enabled by default
1. Enable experimentation with different servings through feature flags
1. Triangulate object store serving design through meaningful experiments
1. Design pages migration mechanisms that can work incrementally
1. Gradually migrate towards object storage serving on GitLab.com

A detailed roadmap is also available in the
[GitLab Pages Architecture](https://gitlab.com/groups/gitlab-org/-/epics/1316)
epic.
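
The iteration on caching mentioned in the list above can be illustrated with a small time-to-live cache placed in front of a domain configuration lookup. This is only a sketch under stated assumptions: the `TTLCache` class, its parameters, and the 30-second TTL in the usage note are invented for illustration and do not describe the caching that GitLab Pages actually implements.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Caches results of `fetch` per key for `ttl_seconds` (hypothetical helper)."""

    def __init__(
        self,
        fetch: Callable[[Hashable], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # e.g. a call to the configuration API
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        # Serve from cache while the entry has not yet expired.
        if entry is not None and now < entry[0]:
            return entry[1]
        # Otherwise fetch a fresh value and remember when it expires.
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

For example, `TTLCache(lookup, ttl_seconds=30.0).get("docs.example.com")` would call `lookup` at most once per 30 seconds for that host. A short TTL trades some staleness after a new deployment for far fewer API requests per page view.
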