status | creation-date | authors | coach | approvers | owning-stage | participating-stages
---|---|---|---|---|---|---
proposed | 2022-11-23 | | @DylanGriffith | | ~devops::release |
View and manage resources deployed by GitLab Agent For Kubernetes
Summary
As part of the GitLab Kubernetes Dashboard epic, users want to view and manage the resources deployed by GitLab Agent For Kubernetes. Users should be able to interact with these resources through the GitLab UI, such as the Environment Index/Details pages.
This blueprint describes how the association between GitLab Environments and these resources is established and how the domain models interact with each other.
Motivation
Goals
- The proposed architecture can be used in GitLab Kubernetes Dashboard.
- The proposed architecture can be used in Organization-level Environment dashboard.
- The cluster resources and events can be visualized per GitLab Environment, that is, in an environment-specific view scoped to the resources managed either directly or indirectly by a deployment commit.
- Support both GitOps mode and CI Access mode.
- NOTE: At the moment, we focus on the solution for CI Access mode. GitOps mode will have significant architectural changes outside of this blueprint, such as Flux switching and Manifest projects outside of the Agent configuration project. In order to derisk potential rework, we'll revisit the GitOps mode after these upstream changes have been settled.
Non-Goals
- The design details of GitLab Kubernetes Dashboard and Organization-level Environment dashboard.
- Support Environment/Deployment features that rely on GitLab CI/CD pipelines, such as Protected Environments, Deployment Approvals, Deployment safety, and Environment rollback. These features are already available in CI Access mode, but they are not available in GitOps mode.
Proposal
Overview
- GitLab Environment and Agent-managed Resource Group have a 1-to-1 relationship.
- Agent-managed Resource Group tracks all resources produced by the connected agent. This includes not only resources written in manifest files but also subsequently generated resources (e.g. Pods created by a Deployment manifest file).
- Agent-managed Resource Group renders a dependency graph, such as Deployment => ReplicaSet => Pod. This provides an ArgoCD-style resource view.
- Agent-managed Resource Group has a Resource Health status that represents a summary of resource statuses, such as Healthy, Progressing or Degraded.
flowchart LR
subgraph Kubernetes["Kubernetes"]
subgraph ResourceGroupProduction["ResourceGroup"]
direction LR
ResourceGroupProductionService(["Service"])
ResourceGroupProductionDeployment(["Deployment"])
ResourceGroupProductionPod1(["Pod1"])
ResourceGroupProductionPod2(["Pod2"])
end
subgraph ResourceGroupStaging["ResourceGroup"]
direction LR
ResourceGroupStagingService(["Service"])
ResourceGroupStagingDeployment(["Deployment"])
ResourceGroupStagingPod1(["Pod1"])
ResourceGroupStagingPod2(["Pod2"])
end
end
subgraph GitLab
subgraph Organization
subgraph Project
environment1["production environment"]
environment2["staging environment"]
end
end
end
environment1 --- ResourceGroupProduction
environment2 --- ResourceGroupStaging
ResourceGroupProductionService -.- ResourceGroupProductionDeployment
ResourceGroupProductionDeployment -.- ResourceGroupProductionPod1
ResourceGroupProductionDeployment -.- ResourceGroupProductionPod2
ResourceGroupStagingService -.- ResourceGroupStagingDeployment
ResourceGroupStagingDeployment -.- ResourceGroupStagingPod1
ResourceGroupStagingDeployment -.- ResourceGroupStagingPod2
Existing components and relationships
- GitLab Project and GitLab Environment have a 1-to-many relationship.
- GitLab Project and Agent have a 1-to-many direct relationship. Only one project can own a specific agent.
- GitOps mode
  - GitLab Project and Agent do NOT have a many-to-many indirect relationship yet. This will be supported in Manifest projects outside of the Agent configuration project.
  - Agent and Agent-managed Resource Group have a 1-to-1 relationship. Inventory IDs are used to group Kubernetes resources. This might be changed in Flux switching.
- CI Access mode
  - GitLab Project and Agent have a many-to-many indirect relationship. The project owning the agent can share the access with other projects. (NOTE: Technically, only running jobs inside the project are allowed to access the cluster due to job-token authentication.)
  - Agent and Agent-managed Resource Group do NOT have a relationship yet.
Issues
- Agent-managed Resource Group should have environment ID as the foreign key, which must be unique across resource groups.
- Agent-managed Resource Group should have parameters that define how to group resources in the associated cluster. For example, namespace, label and inventory-id (GitOps mode only) can be passed as parameters.
- Agent-managed Resource Group should be able to fetch all relevant resources, including both default resource kinds and other Custom Resources.
- Agent-managed Resource Group should be aware of the dependency graph.
- Agent-managed Resource Group should be able to compute Resource Health status from the associated resources.
Example: Pull-based deployment (GitOps mode)
NOTE: At the moment, we focus on the solution for CI Access mode. GitOps mode will have significant architectural changes outside of this blueprint, such as Flux switching and Manifest projects outside of the Agent configuration project. In order to derisk potential rework, we'll revisit the GitOps mode after these upstream changes have been settled.
Example: Push-based deployment (CI access mode)
This is an example of how the architecture works in push-based deployment. The feature is documented as CI access mode.
flowchart LR
subgraph ProductionKubernetes["Production Kubernetes"]
subgraph ResourceGroupProductionFrontend["ResourceGroup"]
direction LR
ResourceGroupProductionFrontendService(["Service"])
ResourceGroupProductionFrontendDeployment(["Deployment"])
ResourceGroupProductionFrontendPod1(["Pod1"])
ResourceGroupProductionFrontendPod2(["Pod2"])
end
subgraph ResourceGroupProductionBackend["ResourceGroup"]
direction LR
ResourceGroupProductionBackendService(["Service"])
ResourceGroupProductionBackendDeployment(["Deployment"])
ResourceGroupProductionBackendPod1(["Pod1"])
ResourceGroupProductionBackendPod2(["Pod2"])
end
subgraph ResourceGroupProductionPrometheus["ResourceGroup"]
direction LR
ResourceGroupProductionPrometheusService(["Service"])
ResourceGroupProductionPrometheusDeployment(["Deployment"])
ResourceGroupProductionPrometheusPod1(["Pod1"])
ResourceGroupProductionPrometheusPod2(["Pod2"])
end
end
subgraph GitLab
subgraph Organization
subgraph OperationGroup
subgraph AgentManagementProject
AgentManagementAgentProduction["Production agent"]
AgentManagementManifestFiles["Kubernetes Manifest Files"]
AgentManagementEnvironmentProductionPrometheus["production prometheus environment"]
AgentManagementPipelines["CI/CD pipelines"]
end
end
subgraph DevelopmentGroup
subgraph FrontendAppProject
FrontendAppCode["VueJS"]
FrontendDockerfile["Dockerfile"]
end
subgraph BackendAppProject
BackendAppCode["Golang"]
BackendDockerfile["Dockerfile"]
end
subgraph DeploymentProject
DeploymentManifestFiles["Kubernetes Manifest Files"]
DeploymentPipelines["CI/CD pipelines"]
DeploymentEnvironmentProductionFrontend["production frontend environment"]
DeploymentEnvironmentProductionBackend["production backend environment"]
end
end
end
end
DeploymentEnvironmentProductionFrontend --- ResourceGroupProductionFrontend
DeploymentEnvironmentProductionBackend --- ResourceGroupProductionBackend
AgentManagementEnvironmentProductionPrometheus --- ResourceGroupProductionPrometheus
ResourceGroupProductionFrontendService -.- ResourceGroupProductionFrontendDeployment
ResourceGroupProductionFrontendDeployment -.- ResourceGroupProductionFrontendPod1
ResourceGroupProductionFrontendDeployment -.- ResourceGroupProductionFrontendPod2
ResourceGroupProductionBackendService -.- ResourceGroupProductionBackendDeployment
ResourceGroupProductionBackendDeployment -.- ResourceGroupProductionBackendPod1
ResourceGroupProductionBackendDeployment -.- ResourceGroupProductionBackendPod2
ResourceGroupProductionPrometheusService -.- ResourceGroupProductionPrometheusDeployment
ResourceGroupProductionPrometheusDeployment -.- ResourceGroupProductionPrometheusPod1
ResourceGroupProductionPrometheusDeployment -.- ResourceGroupProductionPrometheusPod2
AgentManagementAgentProduction -- Shared with --- DeploymentProject
DeploymentPipelines -- "Deploy" --> ResourceGroupProductionFrontend
DeploymentPipelines -- "Deploy" --> ResourceGroupProductionBackend
AgentManagementPipelines -- "Deploy" --> ResourceGroupProductionPrometheus
Further details
Multi-Project Deployment Pipelines
The microservice project setup can be improved by Multi-Project Deployment Pipelines:
- Deployment Project can behave as the shared deployment engine for any upstream application projects and environments.
- Environments can be created within the application projects, which gives developers more visibility of environments.
- Deployment Project can be managed under the Operator group, which gives more segregation of duties.
- Users don't need to set up RBAC to restrict CI/CD jobs.
- This is especially helpful for dynamic environments, such as Review Apps.
flowchart LR
subgraph ProductionKubernetes["Production Kubernetes"]
subgraph ResourceGroupProductionFrontend["ResourceGroup"]
direction LR
ResourceGroupProductionFrontendService(["Service"])
ResourceGroupProductionFrontendDeployment(["Deployment"])
ResourceGroupProductionFrontendPod1(["Pod1"])
ResourceGroupProductionFrontendPod2(["Pod2"])
end
subgraph ResourceGroupProductionBackend["ResourceGroup"]
direction LR
ResourceGroupProductionBackendService(["Service"])
ResourceGroupProductionBackendDeployment(["Deployment"])
ResourceGroupProductionBackendPod1(["Pod1"])
ResourceGroupProductionBackendPod2(["Pod2"])
end
subgraph ResourceGroupProductionPrometheus["ResourceGroup"]
direction LR
ResourceGroupProductionPrometheusService(["Service"])
ResourceGroupProductionPrometheusDeployment(["Deployment"])
ResourceGroupProductionPrometheusPod1(["Pod1"])
ResourceGroupProductionPrometheusPod2(["Pod2"])
end
end
subgraph GitLab
subgraph Organization
subgraph OperationGroup
subgraph DeploymentProject
DeploymentAgentProduction["Production agent"]
DeploymentManifestFiles["Kubernetes Manifest Files"]
DeploymentEnvironmentProductionPrometheus["production prometheus environment"]
DeploymentPipelines["CI/CD pipelines"]
end
end
subgraph DevelopmentGroup
subgraph FrontendAppProject
FrontendDeploymentPipelines["CI/CD pipelines"]
FrontendEnvironmentProduction["production environment"]
end
subgraph BackendAppProject
BackendDeploymentPipelines["CI/CD pipelines"]
BackendEnvironmentProduction["production environment"]
end
end
end
end
FrontendEnvironmentProduction --- ResourceGroupProductionFrontend
BackendEnvironmentProduction --- ResourceGroupProductionBackend
DeploymentEnvironmentProductionPrometheus --- ResourceGroupProductionPrometheus
ResourceGroupProductionFrontendService -.- ResourceGroupProductionFrontendDeployment
ResourceGroupProductionFrontendDeployment -.- ResourceGroupProductionFrontendPod1
ResourceGroupProductionFrontendDeployment -.- ResourceGroupProductionFrontendPod2
ResourceGroupProductionBackendService -.- ResourceGroupProductionBackendDeployment
ResourceGroupProductionBackendDeployment -.- ResourceGroupProductionBackendPod1
ResourceGroupProductionBackendDeployment -.- ResourceGroupProductionBackendPod2
ResourceGroupProductionPrometheusService -.- ResourceGroupProductionPrometheusDeployment
ResourceGroupProductionPrometheusDeployment -.- ResourceGroupProductionPrometheusPod1
ResourceGroupProductionPrometheusDeployment -.- ResourceGroupProductionPrometheusPod2
FrontendDeploymentPipelines -- "Trigger downstream pipeline" --> DeploymentProject
BackendDeploymentPipelines -- "Trigger downstream pipeline" --> DeploymentProject
DeploymentPipelines -- "Deploy" --> ResourceGroupProductionFrontend
DeploymentPipelines -- "Deploy" --> ResourceGroupProductionBackend
View all Agent-managed Resource Groups on production environment
At the group level, we can aggregate all environments that match a specific tier, for example, listing all environments with the production tier from subsequent projects.
This is useful to see all Agent-managed Resource Groups on production environments.
The following diagram exemplifies the relationship between a GitLab group and Kubernetes resources:
flowchart LR
subgraph Kubernetes["Kubernetes"]
subgraph ResourceGroupProduction["ResourceGroup"]
direction LR
ResourceGroupProductionService(["Service"])
ResourceGroupProductionDeployment(["Deployment"])
ResourceGroupProductionPod1(["Pod1"])
ResourceGroupProductionPod2(["Pod2"])
end
subgraph ResourceGroupStaging["ResourceGroup"]
direction LR
ResourceGroupStagingService(["Service"])
ResourceGroupStagingDeployment(["Deployment"])
ResourceGroupStagingPod1(["Pod1"])
ResourceGroupStagingPod2(["Pod2"])
end
end
subgraph GitLab
subgraph Organization
OrganizationProduction["All resources on production"]
subgraph Frontend project
FrontendEnvironmentProduction["production environment"]
end
subgraph Backend project
BackendEnvironmentProduction["production environment"]
end
end
end
FrontendEnvironmentProduction --- ResourceGroupProduction
BackendEnvironmentProduction --- ResourceGroupStaging
ResourceGroupProductionService -.- ResourceGroupProductionDeployment
ResourceGroupProductionDeployment -.- ResourceGroupProductionPod1
ResourceGroupProductionDeployment -.- ResourceGroupProductionPod2
ResourceGroupStagingService -.- ResourceGroupStagingDeployment
ResourceGroupStagingDeployment -.- ResourceGroupStagingPod1
ResourceGroupStagingDeployment -.- ResourceGroupStagingPod2
OrganizationProduction --- FrontendEnvironmentProduction
OrganizationProduction --- BackendEnvironmentProduction
A few notes:
- In the future, we'd have more granular filters for resource search. For example, if there are two environments, production/us-region and production/eu-region, in each project, we could show only the resources in the US region at the group level. This could be achievable by query filtering in PostgreSQL or label/namespace filtering in Kubernetes.
- Please see Add dynamically populated organization-level environments page for more information.
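As a rough illustration of the tier and region filtering described above, the following sketch filters an in-memory list of environments. The Environment class, its fields, and the filter_environments helper are hypothetical names introduced only for this example; the blueprint does not specify where the filtering happens (PostgreSQL or Kubernetes).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Environment:
    """Hypothetical, simplified stand-in for a GitLab Environment record."""
    project: str
    name: str  # e.g. "production/us-region"
    tier: str  # e.g. "production" or "staging"


def filter_environments(
    environments: List[Environment],
    tier: str,
    name_suffix: Optional[str] = None,
) -> List[Environment]:
    """Keep environments of the given tier, optionally narrowed by a name suffix."""
    selected = [e for e in environments if e.tier == tier]
    if name_suffix is not None:
        selected = [e for e in selected if e.name.endswith(name_suffix)]
    return selected


environments = [
    Environment("frontend", "production/us-region", "production"),
    Environment("frontend", "production/eu-region", "production"),
    Environment("backend", "production/us-region", "production"),
    Environment("backend", "staging", "staging"),
]

# Group-level view limited to production environments in the US region.
us_production = filter_environments(environments, "production", "/us-region")
```

Matching on the environment name is only one option; a label or namespace filter on the Kubernetes side would achieve a similar result without relying on naming conventions.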
Design and implementation details
NOTE: The following solution might be only applicable for CI Access mode. GitOps mode will have significant architectural changes outside of this blueprint, such as Flux switching and Manifest projects outside of the Agent configuration project. In order to derisk potential rework, we'll revisit the GitOps mode after these upstream changes have been settled.
Associate Environment with Agent
As a preliminary step, we allow users to explicitly define "which deployment job" uses "which agent" and deploys to "which namespace". The following keywords are supported in .gitlab-ci.yml.
- environment:kubernetes:agent ... Defines which agent the deployment job uses. It can select the appropriate context from the KUBE_CONFIG.
- environment:kubernetes:namespace ... Defines which namespace the deployment job deploys to. It injects the KUBE_NAMESPACE predefined variable into the job. This keyword already exists.
Here is an example of .gitlab-ci.yml:
deploy-production:
  environment:
    name: production
    kubernetes:
      agent: path/to/agent/repository:agent-name
      namespace: default
  script:
    - helm --context="$KUBE_CONTEXT" --namespace="$KUBE_NAMESPACE" upgrade --install
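The agent value in the example above combines a project path and an agent name separated by a colon. A minimal sketch of how such a reference could be split is shown below; parse_agent_reference is a hypothetical helper for illustration, not an existing GitLab function.

```python
from typing import Tuple


def parse_agent_reference(reference: str) -> Tuple[str, str]:
    """Split "<project path>:<agent name>" on the last colon.

    Hypothetical helper: the blueprint only shows the format of the value,
    not how GitLab parses it.
    """
    project_path, separator, agent_name = reference.rpartition(":")
    if not separator or not project_path or not agent_name:
        raise ValueError(f"expected '<project path>:<agent name>', got {reference!r}")
    return project_path, agent_name


project_path, agent_name = parse_agent_reference("path/to/agent/repository:agent-name")
```

Splitting on the last colon keeps the whole project path intact, since the agent name is the final component of the reference.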
When a deployment job is created, GitLab persists the relationship between the specified agent, namespace and deployment job. If the CI job is NOT authorized to access the agent (please refer to Clusters::Agents::FilterAuthorizationsService for more details), this relationship isn't recorded. This process happens in Deployments::CreateForBuildService. The database table schema is:
agent_deployments:
- deployment_id (bigint/FK/NOT NULL/Unique)
- agent_id (bigint/FK/NOT NULL)
- kubernetes_namespace (character varying(255)/NOT NULL)
To identify the associated agent for a specific environment, environment.last_deployment.agent can be used in Rails.
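The recording rule and the lookup described above can be modeled in memory as follows. This is a sketch under stated assumptions, not the Rails implementation: AgentDeploymentStore and its methods are hypothetical names, the authorization check is reduced to a boolean, and the "last deployment" is taken to be the final ID in a list ordered from oldest to newest.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AgentDeployment:
    """Mirrors the columns of the agent_deployments table."""
    deployment_id: int
    agent_id: int
    kubernetes_namespace: str


class AgentDeploymentStore:
    """Hypothetical in-memory stand-in for the agent_deployments table."""

    def __init__(self) -> None:
        self._by_deployment: Dict[int, AgentDeployment] = {}

    def record(self, deployment_id: int, agent_id: int,
               namespace: str, authorized: bool) -> Optional[AgentDeployment]:
        # Unauthorized jobs leave no row behind.
        if not authorized:
            return None
        # deployment_id is unique: one row per deployment.
        if deployment_id in self._by_deployment:
            raise ValueError(f"deployment {deployment_id} is already recorded")
        row = AgentDeployment(deployment_id, agent_id, namespace)
        self._by_deployment[deployment_id] = row
        return row

    def agent_for_environment(self, deployment_ids: List[int]) -> Optional[AgentDeployment]:
        # Only the last deployment decides which agent is associated.
        if not deployment_ids:
            return None
        return self._by_deployment.get(deployment_ids[-1])


store = AgentDeploymentStore()
store.record(deployment_id=1, agent_id=10, namespace="default", authorized=True)
skipped = store.record(deployment_id=2, agent_id=11, namespace="default", authorized=False)
```

In this model, an environment whose latest deployment was not authorized has no associated agent, even if an earlier deployment recorded one.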
Fetch resources through user_access
When a user visits an environment page, GitLab frontend fetches the environment via GraphQL. Frontend additionally fetches the associated agent ID and namespace through the deployment relationship, which is tracked by the agent_deployments table.
Here is an example of GraphQL query:
{
  project(fullPath: "group/project") {
    id
    environment(name: "<environment-name>") {
      slug
      lastDeployment(status: SUCCESS) {
        agent {
          id
          kubernetesNamespace
        }
      }
    }
  }
}
GitLab frontend authenticates/authorizes the user access with the browser cookie. If the access is forbidden, frontend shows the following error message: You don't have access to an agent that deployed to this environment. Please contact agent administrator if you are allowed in "user_access" in agent config file. See <troubleshooting-doc-link>.
After the user has gained access to the agent, GitLab frontend fetches the list of available API resources in the Kubernetes cluster and fetches the resources with the following parameters:
- namespace ... #{environment.lastDeployment.agent.kubernetesNamespace}
- labels ... app.gitlab.com/project_id=#{project.id} AND app.gitlab.com/environment_slug=#{environment.slug}
If no resources are found, it is likely that the users have not embedded these labels into their resources. In this case, frontend shows the following warning message: There are no resources found for the environment. Do resources have GitLab preserved labels? See <troubleshooting-doc-link>.
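To make the namespace and label parameters above concrete, the sketch below builds a Kubernetes label selector, where a comma between requirements means AND. The function names are hypothetical, and the path shown is the plain Kubernetes API path for listing Pods; how the request is routed through the agent is outside the scope of this example.

```python
from urllib.parse import urlencode


def build_resource_query(namespace: str, project_id: int, environment_slug: str) -> dict:
    """Combine the namespace and the two labels into list-request parameters."""
    label_selector = ",".join([
        f"app.gitlab.com/project_id={project_id}",
        f"app.gitlab.com/environment_slug={environment_slug}",
    ])
    return {"namespace": namespace, "labelSelector": label_selector}


def pods_path(query: dict) -> str:
    """Build the Kubernetes API path for listing Pods that match the query."""
    encoded = urlencode({"labelSelector": query["labelSelector"]})
    return f"/api/v1/namespaces/{query['namespace']}/pods?{encoded}"


query = build_resource_query("default", 1, "production")
path = pods_path(query)
```

The same selector can be reused for every resource kind returned by the API resource list, so only the path segment for the kind needs to change.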
Dependency graph
- GitLab frontend uses Owner References to identify the dependencies between resources. These are embedded in resources as the metadata.ownerReferences field.
- For the resources that don't have owner references, we can use Well-Known Labels, Annotations and Taints as a complement. For example, EndpointSlice doesn't have metadata.ownerReferences, but has kubernetes.io/service-name as a reference to the parent Service resource.
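The two rules above can be sketched as a small graph builder over resources represented as plain dictionaries. The blueprint places this logic in the frontend; Python is used here only to illustrate the idea, and build_dependency_graph is a hypothetical name. The sketch assumes each resource carries metadata.uid and metadata.name.

```python
from collections import defaultdict
from typing import Dict, List

SERVICE_NAME_LABEL = "kubernetes.io/service-name"


def build_dependency_graph(resources: List[dict]) -> Dict[str, List[str]]:
    """Map each parent UID to the UIDs of its direct children."""
    # Index Services by (namespace, name) for the label-based fallback.
    services = {
        (r["metadata"].get("namespace"), r["metadata"]["name"]): r["metadata"]["uid"]
        for r in resources
        if r["kind"] == "Service"
    }
    children = defaultdict(list)
    for resource in resources:
        meta = resource["metadata"]
        owners = meta.get("ownerReferences") or []
        if owners:
            # Primary rule: follow metadata.ownerReferences.
            for owner in owners:
                children[owner["uid"]].append(meta["uid"])
            continue
        # Fallback rule: use the well-known service-name label.
        service_name = (meta.get("labels") or {}).get(SERVICE_NAME_LABEL)
        parent_uid = services.get((meta.get("namespace"), service_name))
        if parent_uid is not None:
            children[parent_uid].append(meta["uid"])
    return dict(children)


resources = [
    {"kind": "Deployment", "metadata": {"uid": "d1", "name": "web", "namespace": "default"}},
    {"kind": "ReplicaSet", "metadata": {"uid": "rs1", "name": "web-abc", "namespace": "default",
                                        "ownerReferences": [{"kind": "Deployment", "name": "web", "uid": "d1"}]}},
    {"kind": "Pod", "metadata": {"uid": "p1", "name": "web-abc-1", "namespace": "default",
                                 "ownerReferences": [{"kind": "ReplicaSet", "name": "web-abc", "uid": "rs1"}]}},
    {"kind": "Service", "metadata": {"uid": "s1", "name": "web", "namespace": "default"}},
    {"kind": "EndpointSlice", "metadata": {"uid": "es1", "name": "web-xyz", "namespace": "default",
                                           "labels": {SERVICE_NAME_LABEL: "web"}}},
]
graph = build_dependency_graph(resources)
```

Resources with neither an owner reference nor a recognized label simply become roots of the graph.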
Health status of resources
- GitLab frontend computes the status summary from the fetched resources, similar to ArgoCD's Resource Health, for example Healthy, Progressing, Degraded and Suspended. The formula is TBD.
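Since the formula is TBD, the following is only one possible aggregation, shown to illustrate the idea of a summary status: the least healthy status among the resources wins. The severity order and the summarize_health name are assumptions made for this sketch, not a decision of this blueprint.

```python
from typing import Iterable, Optional

# Assumed order from most healthy to least healthy (hypothetical).
SEVERITY = {"Healthy": 0, "Suspended": 1, "Progressing": 2, "Degraded": 3}


def summarize_health(statuses: Iterable[str]) -> Optional[str]:
    """Return the least healthy status, or None when there are no resources."""
    worst = None
    for status in statuses:
        if status not in SEVERITY:
            raise ValueError(f"unknown health status: {status!r}")
        if worst is None or SEVERITY[status] > SEVERITY[worst]:
            worst = status
    return worst


summary = summarize_health(["Healthy", "Progressing", "Healthy"])
```

A "worst status wins" rule keeps the summary easy to explain in the UI, at the cost of hiding how many resources are affected.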