info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
For this architecture, this diagram shows a request to `agentk 3, Pod1` for the list of pods:
```mermaid
sequenceDiagram
GitLab->>+kas1: Get list of running<br/>Pods from agentk<br/>with agent_id=3
Note right of kas1: kas1 checks for<br/>agent connected with agent_id=3.<br/>It does not.<br/>Queries Redis
kas1->>+Redis: Get list of connected agents<br/>with agent_id=3
Redis-->-kas1: List of connected agents<br/>with agent_id=3
Note right of kas1: kas1 picks a specific agentk instance<br/>to address and talks to<br/>the corresponding kas instance,<br/>specifying which agentk instance<br/>to route the request to.
kas1->>+kas2: Get the list of running Pods<br/>from agentk 3, Pod1
kas2->>+agentk 3 Pod1: Get list of Pods
agentk 3 Pod1->>-kas2: Get list of Pods
kas2-->>-kas1: List of running Pods<br/>from agentk 3, Pod1
kas1-->>-GitLab: List of running Pods<br/>from agentk with agent_id=3
```
Each `kas` instance tracks the agents connected to it in Redis. For each agent, it
stores a serialized protobuf object with information about the agent. When an agent
disconnects, `kas` removes all corresponding information from Redis. For both events,
`kas` publishes a notification to a Redis [pub-sub channel](https://redis.io/topics/pubsub).
Each agent, while logically a single entity, can have multiple replicas (multiple pods)
in a cluster. `kas` accommodates that and records per-replica (generally per-connection)
information. Each open `GetConfiguration()` streaming request is given
a unique identifier which, combined with agent ID, identifies an `agentk` instance.
gRPC can keep multiple TCP connections open for a single target host. `agentk` only
runs one `GetConfiguration()` streaming request. `kas` uses that connection, and
doesn't see idle TCP connections because they are handled by the gRPC framework.
Each `kas` instance provides information to Redis, so other `kas` instances can discover and access it.
Information is stored in Redis with an [expiration time](https://redis.io/commands/expire),
to expire information for `kas` instances that become unavailable. To prevent
information from expiring too quickly, `kas` periodically updates the expiration time
for valid entries. Before terminating, `kas` cleans up the information it adds into Redis.
When `kas` must atomically update multiple data structures in Redis, it uses
[transactions](https://redis.io/topics/transactions) to ensure data consistency.
Grouped data items must have the same expiration time.
In addition to the existing `agentk -> kas` gRPC endpoint, `kas` exposes two new,
separate gRPC endpoints for GitLab and for `kas -> kas` requests. Each endpoint
is a separate network listener, making it easier to control network access to endpoints
and allowing separate configuration for each endpoint.
Databases, like PostgreSQL, aren't used because the data is transient, with no need
to reliably persist it.
### `GitLab : kas` external endpoint
GitLab authenticates with `kas` using JWT and the same shared secret used by the
`kas -> GitLab` communication. The JWT issuer should be `gitlab` and the audience
should be `gitlab-kas`.
When accessed through this endpoint, `kas` plays the role of request router.
If a request from GitLab comes but no connected agent can handle it, `kas` blocks
and waits for a suitable agent to connect to it or to another `kas` instance. It
stops waiting when the client disconnects, or when some long timeout happens, such
as client timeout. `kas` is notified of new agent connections through a
[pub-sub channel](https://redis.io/topics/pubsub) to avoid frequent polling.
When a suitable agent connects, `kas` routes the request to it.
### `kas : kas` internal endpoint
This endpoint is an implementation detail, an internal API, and should not be used
by any other system. It's protected by JWT using a secret, shared among all `kas`
instances. No other system must have access to this secret.
When accessed through this endpoint, `kas` uses the request itself to determine
which `agentk` to send the request to. It prevents request cycles by only following
the instructions in the request, rather than doing discovery. It's the responsibility
of the `kas` receiving the request from the _external_ endpoint to retry and re-route
requests. This method ensures a single central component for each request can determine
how a request is routed, rather than distributing the decision across several `kas` instances.