Praefect
NOTE: Praefect is an experimental service and is for testing purposes only at this time.
Praefect is an optional reverse-proxy for Gitaly to manage a cluster of Gitaly nodes for high availability through replication. If a Gitaly node becomes unavailable, it will be possible to fail over to a warm Gitaly replica.
The first minimal version will support:
- Eventual consistency of the secondary replicas.
- Manual fail over from the primary to the secondary.
Follow the HA Gitaly epic for updates and roadmap.
Omnibus
Architecture
The most common architecture for Praefect is simplified in the diagram below:
graph TB
GitLab --> Praefect;
Praefect --- PostgreSQL;
Praefect --> Gitaly1;
Praefect --> Gitaly2;
Praefect --> Gitaly3;
Here, GitLab is the collection of clients that can request Git operations.
Praefect itself doesn't store data. Instead, it connects to three Gitaly storage nodes: Gitaly-1, Gitaly-2, and Gitaly-3.
In order to keep track of replication state, Praefect relies on a PostgreSQL database. This database is a single point of failure so you should use a highly available PostgreSQL server for this. GitLab itself needs a HA PostgreSQL server too, so you could optionally co-locate the Praefect SQL database on the PostgreSQL server you use for the rest of GitLab.
Praefect may be enabled on its own node or can be run on the GitLab server. In the example below we will use a separate server, but the optimal configuration for Praefect is still being determined.
Praefect will handle all Gitaly RPC requests to its child nodes. However, the child nodes will still need to communicate with the GitLab server via its internal API for authentication purposes.
Setup
In this setup guide we will start by configuring Praefect, then its child Gitaly nodes, and lastly the GitLab server configuration.
Secrets
We need to manage the following secrets and make them match across hosts:
- GITLAB_SHELL_SECRET_TOKEN: this is used by Git hooks to make callback HTTP API requests to GitLab when accepting a Git push. This secret is shared with GitLab Shell for legacy reasons.
- PRAEFECT_EXTERNAL_TOKEN: repositories hosted on your Praefect cluster can only be accessed by Gitaly clients that carry this token.
- PRAEFECT_INTERNAL_TOKEN: this token is used for replication traffic inside your Praefect cluster. This is distinct from PRAEFECT_EXTERNAL_TOKEN because Gitaly clients must not be able to access internal nodes of the Praefect cluster directly; that could lead to data loss.
- PRAEFECT_SQL_PASSWORD: this password is used by Praefect to connect to PostgreSQL.
We will note in the instructions below where these secrets are required.
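One way to produce these values is to generate them once on a trusted machine and copy them into the configuration of each host. The sketch below is an illustration only and is not part of the Omnibus tooling: the generate_secrets helper and the 32-byte length are assumptions. It uses Ruby's standard SecureRandom library and checks that the external and internal tokens differ.

```ruby
require 'securerandom'

# Names of the secrets listed above.
SECRET_NAMES = %w[
  GITLAB_SHELL_SECRET_TOKEN
  PRAEFECT_EXTERNAL_TOKEN
  PRAEFECT_INTERNAL_TOKEN
  PRAEFECT_SQL_PASSWORD
].freeze

# Hypothetical helper: returns a hash of secret name => random hex string.
# The 32-byte length is an assumption, not a documented requirement.
def generate_secrets(bytes = 32)
  SECRET_NAMES.to_h { |name| [name, SecureRandom.hex(bytes)] }
end

secrets = generate_secrets

# The internal and external tokens must be distinct.
if secrets['PRAEFECT_EXTERNAL_TOKEN'] == secrets['PRAEFECT_INTERNAL_TOKEN']
  raise 'external and internal tokens must differ'
end

secrets.each { |name, value| puts "#{name}=#{value}" }
```

Store the printed values somewhere safe; every host that needs a given secret must receive exactly the same string.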
Network addresses
- POSTGRESQL_SERVER_ADDRESS: the host name or IP address of your PostgreSQL server.
PostgreSQL
To set up a Praefect cluster you need a highly available PostgreSQL server. You need PostgreSQL 9.6 or newer. Praefect needs to have a SQL user with the right to create databases.
In the instructions below we assume you have administrative access to your PostgreSQL server via psql, as the postgres user. Depending on your environment, you may also be able to do this via the web interface of your cloud platform, or via your configuration management system.
First open a psql session as the postgres user:
/opt/gitlab/embedded/bin/psql -h POSTGRESQL_SERVER_ADDRESS -U postgres -d template1
Once you are connected, run the following command. Replace PRAEFECT_SQL_PASSWORD with the actual (random) password you generated for the praefect SQL user:
CREATE ROLE praefect WITH LOGIN CREATEDB PASSWORD 'PRAEFECT_SQL_PASSWORD';
\q
Now connect as the praefect user to create the database. This has the side effect of verifying that you have access:
/opt/gitlab/embedded/bin/psql -h POSTGRESQL_SERVER_ADDRESS -U praefect -d template1
Once you have connected as the praefect user, run:
CREATE DATABASE praefect_production WITH ENCODING=UTF8;
\q
Praefect
On the Praefect node we disable all other services, including Gitaly. We list each Gitaly node that will be connected to Praefect as a member of the praefect hash in praefect['virtual_storages'].
In the example below, the Gitaly nodes are named gitaly-N. Note that one node is designated as primary by setting primary to true.
If you are using an unencrypted connection to PostgreSQL, set praefect['database_sslmode'] to 'disable'.
If you are using an encrypted connection with a client certificate, praefect['database_sslcert'] and praefect['database_sslkey'] will need to be set. If you are using a custom CA, also set praefect['database_sslrootcert']:
# /etc/gitlab/gitlab.rb on praefect server
# Avoid running unnecessary services on the Praefect server
postgresql['enable'] = false
redis['enable'] = false
nginx['enable'] = false
prometheus['enable'] = false
grafana['enable'] = false
unicorn['enable'] = false
sidekiq['enable'] = false
gitlab_workhorse['enable'] = false
gitaly['enable'] = false
# Prevent database connections during 'gitlab-ctl reconfigure'
gitlab_rails['rake_cache_clear'] = false
gitlab_rails['auto_migrate'] = false
praefect['enable'] = true
# Make Praefect accept connections on all network interfaces. You must use
# firewalls to restrict access to this address/port.
praefect['listen_addr'] = '0.0.0.0:2305'
# Replace PRAEFECT_EXTERNAL_TOKEN with a real secret
praefect['auth_token'] = 'PRAEFECT_EXTERNAL_TOKEN'
# Replace each instance of PRAEFECT_INTERNAL_TOKEN below with a real
# secret, distinct from PRAEFECT_EXTERNAL_TOKEN.
# Name of storage hash must match storage name in git_data_dirs on GitLab server.
praefect['virtual_storages'] = {
'praefect' => {
'gitaly-1' => {
# Replace GITALY_URL_OR_IP below with the real address to connect to.
'address' => 'tcp://GITALY_URL_OR_IP:8075',
'token' => 'PRAEFECT_INTERNAL_TOKEN',
'primary' => true
},
'gitaly-2' => {
# Replace GITALY_URL_OR_IP below with the real address to connect to.
'address' => 'tcp://GITALY_URL_OR_IP:8075',
'token' => 'PRAEFECT_INTERNAL_TOKEN'
},
'gitaly-3' => {
# Replace GITALY_URL_OR_IP below with the real address to connect to.
'address' => 'tcp://GITALY_URL_OR_IP:8075',
'token' => 'PRAEFECT_INTERNAL_TOKEN'
}
}
}
# Replace POSTGRESQL_SERVER_ADDRESS below with the real IP address or host name of the database.
praefect['database_host'] = 'POSTGRESQL_SERVER_ADDRESS'
praefect['database_port'] = 5432
praefect['database_user'] = 'praefect'
# Replace PRAEFECT_SQL_PASSWORD below with a real password of the database.
praefect['database_password'] = 'PRAEFECT_SQL_PASSWORD'
praefect['database_dbname'] = 'praefect_production'
# Uncomment the line below if you do not want to use an encrypted
# connection to PostgreSQL
# praefect['database_sslmode'] = 'disable'
# Uncomment and modify these lines if you are using a TLS client
# certificate to connect to PostgreSQL
# praefect['database_sslcert'] = '/path/to/client-cert'
# praefect['database_sslkey'] = '/path/to/client-key'
# Uncomment and modify this line if your PostgreSQL server uses a custom
# CA
# praefect['database_sslrootcert'] = '/path/to/rootcert'
Replace POSTGRESQL_SERVER_ADDRESS, PRAEFECT_EXTERNAL_TOKEN, PRAEFECT_INTERNAL_TOKEN, and PRAEFECT_SQL_PASSWORD with their respective values.
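Before reconfiguring, it can help to sanity-check the praefect['virtual_storages'] hash. The sketch below is a standalone Ruby script, not something Omnibus runs: validate_virtual_storages is a hypothetical helper, the checks (exactly one primary per virtual storage, an address and a token on every node) are assumptions drawn from the example above, and the addresses and tokens are made up.

```ruby
# Hypothetical helper: returns a list of problems found in a hash shaped like
# praefect['virtual_storages'] in the example configuration.
def validate_virtual_storages(virtual_storages)
  problems = []
  virtual_storages.each do |storage_name, nodes|
    # Assumption: each virtual storage designates exactly one primary node.
    primaries = nodes.select { |_name, node| node['primary'] == true }.keys
    unless primaries.size == 1
      problems << "#{storage_name}: expected exactly one primary, found #{primaries.size}"
    end

    # Every node needs an address to dial and a token to authenticate with.
    nodes.each do |node_name, node|
      %w[address token].each do |key|
        value = node[key].to_s
        problems << "#{storage_name}/#{node_name}: missing #{key}" if value.empty?
      end
    end
  end
  problems
end

example = {
  'praefect' => {
    'gitaly-1' => { 'address' => 'tcp://10.0.0.1:8075', 'token' => 'secret', 'primary' => true },
    'gitaly-2' => { 'address' => 'tcp://10.0.0.2:8075', 'token' => 'secret' },
    'gitaly-3' => { 'address' => 'tcp://10.0.0.3:8075', 'token' => 'secret' }
  }
}

problems = validate_virtual_storages(example)
puts(problems.empty? ? 'virtual_storages looks consistent' : problems)
```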
Save the file and reconfigure Praefect:
sudo gitlab-ctl reconfigure
After you reconfigure, verify that Praefect can reach PostgreSQL:
sudo -u git /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml sql-ping
If the check fails, make sure you have followed the steps correctly. If you edit /etc/gitlab/gitlab.rb, remember to run sudo gitlab-ctl reconfigure again before trying the sql-ping command.
Gitaly
Next we will configure each Gitaly server assigned to Praefect. Configuration for these is the same as a normal standalone Gitaly server, except that we use storage names and auth tokens from Praefect instead of GitLab.
Below is an example configuration for gitaly-1. The only difference for the other Gitaly nodes is the storage name under git_data_dirs.
Note that gitaly['auth_token'] matches the token value listed under praefect['virtual_storages'] on the Praefect node.
# /etc/gitlab/gitlab.rb on gitaly node inside praefect cluster
# Avoid running unnecessary services on the Gitaly server
postgresql['enable'] = false
redis['enable'] = false
nginx['enable'] = false
prometheus['enable'] = false
grafana['enable'] = false
unicorn['enable'] = false
sidekiq['enable'] = false
gitlab_workhorse['enable'] = false
prometheus_monitoring['enable'] = false
# Prevent database connections during 'gitlab-ctl reconfigure'
gitlab_rails['rake_cache_clear'] = false
gitlab_rails['auto_migrate'] = false
# Replace GITLAB_SHELL_SECRET_TOKEN below with real secret
gitlab_shell['secret_token'] = 'GITLAB_SHELL_SECRET_TOKEN'
# Configure the gitlab-shell API callback URL. Without this, `git push` will
# fail. This can be your 'front door' GitLab URL or an internal load
# balancer.
# Possible values could be: 'http://10.23.101.53', 'https://gitlab.example.com',
# etc. Replace GITLAB_SERVER_ADDRESS with the proper value and change the scheme
# to 'https' if you use an encrypted connection.
gitlab_rails['internal_api_url'] = 'http://GITLAB_SERVER_ADDRESS'
# Replace PRAEFECT_INTERNAL_TOKEN below with a real secret.
gitaly['auth_token'] = 'PRAEFECT_INTERNAL_TOKEN'
# Make Gitaly accept connections on all network interfaces. You must use
# firewalls to restrict access to this address/port.
# Comment out following line if you only want to support TLS connections
gitaly['listen_addr'] = "0.0.0.0:8075"
git_data_dirs({
# Update this to the name of this Gitaly server which will be later
# exposed in the UI under "Admin area > Gitaly"
"gitaly-1" => {
"path" => "/var/opt/gitlab/git-data"
}
})
Replace GITLAB_SHELL_SECRET_TOKEN and PRAEFECT_INTERNAL_TOKEN with their respective values.
For more information on Gitaly server configuration, see our Gitaly documentation.
When finished editing the configuration file for each Gitaly server, run the reconfigure command to put changes into effect:
sudo gitlab-ctl reconfigure
When all Gitaly servers are configured, you can run the Praefect connection checker to verify Praefect can connect to all Gitaly servers in the Praefect config. This can be done by running the following command on the Praefect server:
sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dial-nodes
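If the connection checker reports a failure, a plain TCP check can help separate network problems from authentication problems. The sketch below is only an illustration and does not replace dial-nodes: reachable? is a hypothetical helper, the two-second timeout is arbitrary, and the address is made up. It only tests that the port accepts connections, not that Gitaly answers or that the token is accepted.

```ruby
require 'socket'
require 'uri'

# Hypothetical helper: true if a TCP connection to the address succeeds.
# Expects addresses in the 'tcp://HOST:PORT' form used in virtual_storages.
def reachable?(address, timeout = 2)
  uri = URI.parse(address)
  Socket.tcp(uri.host, uri.port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Refused, timed out, unresolvable host, and similar failures.
  false
end

# Replace this made-up address with the addresses of your own Gitaly nodes.
nodes = {
  'gitaly-1' => 'tcp://127.0.0.1:8075'
}

nodes.each do |name, address|
  status = reachable?(address) ? 'reachable' : 'unreachable'
  puts "#{name} (#{address}): #{status}"
end
```

Run it from the Praefect server, since that is the host that has to reach the Gitaly nodes.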
GitLab
When Praefect is running, it should be exposed as a storage to GitLab. This is done by setting git_data_dirs. Assuming the default storage is present, there should be two storages available to GitLab:
# /etc/gitlab/gitlab.rb on gitlab server
# Replace PRAEFECT_URL_OR_IP below with real address Praefect can be accessed at.
# Replace PRAEFECT_EXTERNAL_TOKEN below with real secret.
git_data_dirs({
"default" => {
"path" => "/var/opt/gitlab/git-data"
},
"praefect" => {
"gitaly_address" => "tcp://PRAEFECT_URL_OR_IP:2305",
"gitaly_token" => 'PRAEFECT_EXTERNAL_TOKEN'
}
})
# Replace GITLAB_SHELL_SECRET_TOKEN below with real secret
gitlab_shell['secret_token'] = 'GITLAB_SHELL_SECRET_TOKEN'
# Possible values could be: 'http://10.23.101.53', 'https://gitlab.example.com',
# etc. Replace GITLAB_SERVER_ADDRESS with the proper value and change the scheme
# to 'https' if you use an encrypted connection. For more information, refer
# to https://docs.gitlab.com/omnibus/settings/configuration.html#configuring-the-external-url-for-gitlab
external_url "http://GITLAB_SERVER_ADDRESS"
Replace GITLAB_SHELL_SECRET_TOKEN and PRAEFECT_EXTERNAL_TOKEN with their respective values.
Note that the storage name used is the same as the name of the virtual storage (the praefect key in praefect['virtual_storages']) set on the Praefect node.
Save your changes and reconfigure GitLab:
sudo gitlab-ctl reconfigure
Run sudo gitlab-rake gitlab:gitaly:check to confirm that GitLab can reach Praefect.
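The settings on the three kinds of hosts have to agree with each other. As a rough illustration, the Ruby sketch below compares plain hashes that mirror the example configuration files. check_cluster_config and the shape of its arguments are hypothetical, and all values are made up. It checks three relationships described in this guide: the GitLab storage name matches a Praefect virtual storage, the GitLab gitaly_token matches praefect['auth_token'], and each Gitaly node's gitaly['auth_token'] matches the token Praefect holds for that node.

```ruby
# Hypothetical helper: compares settings copied from the three gitlab.rb files
# and returns a list of mismatches.
def check_cluster_config(gitlab:, praefect:, gitaly_nodes:)
  problems = []

  gitlab[:git_data_dirs].each do |storage, settings|
    next unless settings.key?('gitaly_address') # skip local storages such as "default"

    unless praefect[:virtual_storages].key?(storage)
      problems << "storage '#{storage}' is not a Praefect virtual storage"
    end
    unless settings['gitaly_token'] == praefect[:auth_token]
      problems << "storage '#{storage}': gitaly_token differs from praefect['auth_token']"
    end
  end

  praefect[:virtual_storages].each_value do |nodes|
    nodes.each do |name, node|
      unless gitaly_nodes[name] == node['token']
        problems << "#{name}: gitaly['auth_token'] differs from the Praefect node token"
      end
    end
  end

  problems
end

gitlab = { git_data_dirs: {
  'default'  => { 'path' => '/var/opt/gitlab/git-data' },
  'praefect' => { 'gitaly_address' => 'tcp://10.0.0.10:2305', 'gitaly_token' => 'external' }
} }
praefect = { auth_token: 'external', virtual_storages: {
  'praefect' => { 'gitaly-1' => { 'token' => 'internal', 'primary' => true },
                  'gitaly-2' => { 'token' => 'internal' } }
} }
gitaly_nodes = { 'gitaly-1' => 'internal', 'gitaly-2' => 'internal' } # node name => auth_token

puts check_cluster_config(gitlab: gitlab, praefect: praefect, gitaly_nodes: gitaly_nodes).inspect
```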
Testing Praefect
To test Praefect, first set it as the default storage node for new projects using Admin Area > Settings > Repository > Repository storage. Next, create a new project and check the "Initialize repository with a README" box.
If you receive an error, check /var/log/gitlab/gitlab-rails/production.log.
Here are common errors and potential causes:
- 500 response code
  - ActionView::Template::Error (7:permission denied)
    - praefect['auth_token'] and gitlab_rails['gitaly_token'] do not match on the GitLab server.
  - Unable to save project. Error: 7:permission denied
    - Secret token in praefect['virtual_storages'] on the Praefect server does not match the value in gitaly['auth_token'] on one or more Gitaly servers.
- 503 response code
  - GRPC::Unavailable (14:failed to connect to all addresses)
    - GitLab was unable to reach Praefect.
  - GRPC::Unavailable (14:all SubCons are in TransientFailure...)
    - Praefect cannot reach one or more of its child Gitaly nodes. Try running the Praefect connection checker to diagnose.
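When scanning production.log, the errors listed above can be matched mechanically. The following Ruby sketch is illustrative only: likely_cause is a hypothetical helper, the cause texts are paraphrased from the list above, and matching on fixed substrings is an assumption about how the messages appear in the log.

```ruby
# Error fragments from production.log mapped to the likely causes listed above.
LIKELY_CAUSES = {
  'ActionView::Template::Error (7:permission denied)' =>
    "praefect['auth_token'] and gitlab_rails['gitaly_token'] do not match",
  'Unable to save project. Error: 7:permission denied' =>
    "a Praefect node token does not match gitaly['auth_token'] on a Gitaly server",
  '14:failed to connect to all addresses' =>
    'GitLab was unable to reach Praefect',
  '14:all SubCons are in TransientFailure' =>
    'Praefect cannot reach one or more of its child Gitaly nodes'
}.freeze

# Hypothetical helper: returns the likely cause for a log line, or nil.
def likely_cause(log_line)
  LIKELY_CAUSES.each do |fragment, cause|
    return cause if log_line.include?(fragment)
  end
  nil
end

line = 'GRPC::Unavailable (14:failed to connect to all addresses)'
puts likely_cause(line) || 'no known cause'
```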