---
type: reference, dev
stage: none
group: Development
info: "See the Technical Writers assigned to Development Guidelines: https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments-to-development-guidelines"
---

# Use ChatOps to enable and disable feature flags

NOTE:
This document explains how to contribute to the development of the GitLab product.
If you want to use feature flags to show and hide functionality in your own applications,
view [this feature flags information](../../operations/feature_flags.md) instead.

To turn features behind feature flags on or off in any of the
GitLab-provided environments, like staging and production, you need to
have access to the [ChatOps](../chatops_on_gitlabcom.md) bot. The ChatOps bot
is currently running on the ops instance, which is different from
[GitLab.com](https://gitlab.com) or `dev.gitlab.org`.

Follow the ChatOps document to [request access](../chatops_on_gitlabcom.md#requesting-access).

After you are added to the project, test whether your access has propagated
by running:

```shell
/chatops run feature --help
```

## Rolling out changes

When the changes are deployed to the environments, it is time to start
rolling out the feature to our users. The exact procedure for rolling out a
change is unspecified, as this can vary from change to change. However, in
general we recommend rolling out changes incrementally, instead of enabling them
for everybody right away. We also recommend that you _not_ enable a feature
_before_ the code is deployed.
This allows you to separate rolling out a feature from a deploy, making it
easier to measure the impact of both separately.

The GitLab feature library (using
[Flipper](https://github.com/jnunemaker/flipper), and covered in the
[Feature flags process](https://about.gitlab.com/handbook/product-development-flow/feature-flag-lifecycle/) guide) supports rolling out changes to users
for a percentage of the time. This in turn can be controlled using [GitLab ChatOps](../../ci/chatops/index.md).

For an up-to-date list of feature flag commands, see
[the source code](https://gitlab.com/gitlab-com/chatops/blob/master/lib/chatops/commands/feature.rb).
All the examples in that file must be preceded by
`/chatops run`.

If you get the error "Whoops! This action is not allowed. This incident
will be reported.", your Slack account is not allowed to
change feature flags or you do not have access.

### Enabling a feature for pre-production testing

As a first step in a feature rollout, you should enable the feature on
`staging.gitlab.com`
and `dev.gitlab.org`.

These two environments have different scopes.
`dev.gitlab.org` is a production CE environment that has internal GitLab Inc.
traffic and is used for some development and other related work.
`staging.gitlab.com` has a smaller subset of the GitLab.com database and repositories
and does not have regular traffic. Staging is an EE instance and can give you
a (very) rough estimate of how your feature will look and behave on GitLab.com.
Both of these instances are connected to Sentry, so make sure you check the projects
there for any exceptions while testing your feature after enabling the feature flag.

For these pre-production environments, we strongly encourage running the command in
`#staging`, `#production`, or `#chatops-ops-test`, for improved visibility.

To enable a feature for 25% of the time, run the following in Slack:

```shell
/chatops run feature set new_navigation_bar 25 --random --dev
/chatops run feature set new_navigation_bar 25 --random --staging
```

### Enabling a feature for GitLab.com

When a feature has successfully been
[enabled on a pre-production](#enabling-a-feature-for-pre-production-testing)
environment and verified as safe and working, you can roll out the
change to GitLab.com (production).

If a feature is [deprecated](../../update/deprecations.md), do not enable the flag.

#### Communicate the change

Some feature flag changes on GitLab.com should be communicated with
parts of the company. The developer responsible needs to determine
whether this is necessary and the appropriate level of communication.
This depends on the feature and what sort of impact it might have.

Guidelines:

- Consider notifying `#support_gitlab-com` beforehand, so that if the feature has any side effects on user experience, they can mitigate them and disable the feature flag to reduce the impact.
- If the feature meets the requirements for creating a [Change Management](https://about.gitlab.com/handbook/engineering/infrastructure/change-management/#feature-flags-and-the-change-management-process) issue, create a Change Management issue per [criticality guidelines](https://about.gitlab.com/handbook/engineering/infrastructure/change-management/#change-request-workflows).
- For simple, low-risk, easily reverted features, proceed and [enable the feature in `#production`](#process).
- For support requests to toggle feature flags for specific groups or projects, follow the process outlined in the [support workflows](https://about.gitlab.com/handbook/support/workflows/saas_feature_flags.html).

#### Process

When you enable a feature flag rollout, the system automatically blocks the
ChatOps command from succeeding if there are active `~"severity::1"` or `~"severity::2"`
incidents or in-progress change issues, for example:

```shell
/chatops run feature set gitaly_lfs_pointers_pipeline true
- Production checks fail!
- active incidents
2021-06-29 Canary deployment failing QA tests
```

Before enabling a feature flag, verify that you are not violating any [Production Change Lock periods](https://about.gitlab.com/handbook/engineering/infrastructure/change-management/#production-change-lock-pcl) and are in compliance with the [Feature flags and the Change Management Process](https://about.gitlab.com/handbook/engineering/infrastructure/change-management/#feature-flags-and-the-change-management-process).

The following `/chatops` commands should be performed in the Slack
`#production` channel.

When you begin to enable the feature, link to the relevant
feature flag rollout issue within a Slack thread of the first `/chatops`
command you make so people can understand the change if they need to.

To enable a feature for 25% of the time, run the following in Slack:

```shell
/chatops run feature set new_navigation_bar 25 --random
```

This sets a feature flag to `true` based on the following formula:

```ruby
feature_flag_state = rand < (25 / 100.0)
```

This enables the feature for GitLab.com, with `new_navigation_bar` being the
name of the feature.
This command does *not* enable the feature for 25% of the total users.
Instead, when the feature is checked with `enabled?`, it returns `true` 25% of the time.

To enable a feature for 25% of actors such as users, projects, or groups,
run the following in Slack:

```shell
/chatops run feature set some_feature 25 --actors
```

This sets a feature flag to `true` based on the following formula:

```ruby
feature_flag_state = Zlib.crc32("some_feature<Actor>:#{actor.id}") % (100 * 1_000) < 25 * 1_000
# where <Actor> is `User`, `Group`, or `Project`, and `actor` is an instance of that class
```
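
As a rough illustration of the formula above, the following standalone Ruby sketch reproduces the percentage-of-actors calculation outside of GitLab. The `Actor` struct and the `enabled_for_actor?` helper are hypothetical names invented for this example and are not part of the GitLab `Feature` API. The hashing key simply follows the formula shown above.

```ruby
require 'zlib'

# Hypothetical stand-in for a User, Group, or Project record.
Actor = Struct.new(:type, :id)

# Mirrors the formula above: hash the feature name and actor identity,
# then compare the resulting bucket against the rollout percentage.
def enabled_for_actor?(feature_name, actor, percentage)
  key = "#{feature_name}#{actor.type}:#{actor.id}"
  Zlib.crc32(key) % (100 * 1_000) < percentage * 1_000
end

user = Actor.new('User', 42)

# The same actor always lands in the same bucket, so repeated checks agree.
first_check  = enabled_for_actor?('some_feature', user, 25)
second_check = enabled_for_actor?('some_feature', user, 25)
puts first_check == second_check # prints "true"
```

Because an actor's bucket is fixed for a given feature name, raising the percentage in this model only adds actors: an actor that is enabled at 25% stays enabled at 50%.
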

During development, based on the nature of the feature, an actor choice
should be made.

For user focused features:

```ruby
Feature.enabled?(:feature_cool_avatars, current_user)
```

For group or namespace level features:

```ruby
Feature.enabled?(:feature_cooler_groups, group)
```

For project level features:

```ruby
Feature.enabled?(:feature_ice_cold_projects, project)
```

If you are not certain what percentages to use, use the following steps:

1. 25%
1. 50%
1. 75%
1. 100%

Between every step, wait a little while and monitor the
appropriate graphs on <https://dashboards.gitlab.net>. The exact time to wait
may differ. For some features a few minutes is enough, while for others you may
want to wait several hours or even days. This is entirely up to you. Just make
sure it is clearly communicated to your team, and to the Production team if you
anticipate any potential problems.

Feature gates can also be actor based. For example, a feature could first be
enabled for only the `gitlab` project. The project is passed by supplying a
`--project` flag:

```shell
/chatops run feature set --project=gitlab-org/gitlab some_feature true
```

You can use the `--user` option to enable a feature flag for a specific user:

```shell
/chatops run feature set --user=myusername some_feature true
```

If you would like to gather feedback internally first,
feature flags scoped to a user can also be enabled
for GitLab team members with the `gitlab_team_members`
[feature group](index.md#feature-groups):

```shell
/chatops run feature set --feature-group=gitlab_team_members some_feature true
```

You can use the `--group` flag to enable a feature flag for a specific group:

```shell
/chatops run feature set --group=gitlab-org some_feature true
```

The `--group` flag does not work with user namespaces. To enable a feature flag for a
generic namespace (including groups), use `--namespace`:

```shell
/chatops run feature set --namespace=gitlab-org some_feature true
/chatops run feature set --namespace=myusername some_feature true
```

Actor-based gates are applied before percentages. For example, considering the
`group/project` as `gitlab-org/gitlab` and a given example feature as `some_feature`, if
you run these two commands:

```shell
/chatops run feature set --project=gitlab-org/gitlab some_feature true
/chatops run feature set some_feature 25 --actors
```

Then `some_feature` is enabled for 25% of actors and is also always enabled when interacting with
`gitlab-org/gitlab`. This is a good idea if the feature flag development makes use of group
actors.

```ruby
Feature.enabled?(:some_feature, group)
```
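
One way to picture why both commands take effect is that a flag counts as enabled when any of its gates is open. The sketch below is a simplified toy model under that assumption. `ToyFeature`, its methods, and the string actor keys are hypothetical and do not reflect the real implementation.

```ruby
require 'set'
require 'zlib'

# Toy model of a single feature flag with an actor gate and a
# percentage-of-actors gate. All names here are hypothetical.
class ToyFeature
  def initialize(name)
    @name = name
    @enabled_actors = Set.new  # filled by, for example, --project=...
    @percentage_of_actors = 0  # set by, for example, `25 --actors`
  end

  def enable_actor(actor_key)
    @enabled_actors << actor_key
  end

  def enable_percentage_of_actors(percentage)
    @percentage_of_actors = percentage
  end

  # The flag is on if ANY gate is open: the explicit actor gate is
  # consulted first, then the percentage-of-actors gate.
  def enabled?(actor_key)
    return true if @enabled_actors.include?(actor_key)

    Zlib.crc32("#{@name}#{actor_key}") % (100 * 1_000) < @percentage_of_actors * 1_000
  end
end

feature = ToyFeature.new('some_feature')
feature.enable_actor('Project:gitlab-org/gitlab')
feature.enable_percentage_of_actors(25)

# Always true, because the explicit actor gate is open for this project.
puts feature.enabled?('Project:gitlab-org/gitlab')
```
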

Multiple actors can be passed together in a comma-separated form:

```shell
/chatops run feature set --project=gitlab-org/gitlab,example-org/example-project some_feature true
/chatops run feature set --group=gitlab-org,example-org some_feature true
/chatops run feature set --namespace=gitlab-org,example-org some_feature true
```

Lastly, to verify that the feature is deemed stable in as many cases as possible,
you should fully roll out the feature by enabling the flag **globally** by running:

```shell
/chatops run feature set some_feature true
```

This changes the feature flag state to be **enabled** always, which overrides the
existing gates (for example, `--group=gitlab-org`) in the above processes.

If an actor-based feature gate is present, switching the
`default_enabled` attribute of the YAML definition from `false` to `true`
does not have any effect. The feature gate must be deleted first.

For example, a feature flag is set via ChatOps:

```shell
/chatops run feature set --project=gitlab-org/gitlab some_feature true
```

When the `default_enabled` attribute in the YAML definition is switched to
`true`, the feature gate must be deleted to have the desired effect:

```shell
/chatops run feature delete some_feature
```
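
The interaction between a persisted gate and `default_enabled` can be sketched as follows. This is a deliberately simplified model that assumes the YAML default is only consulted when no record for the flag exists in the database. `ToyFlagStore` and its methods are hypothetical names, not the GitLab `Feature` API.

```ruby
# Toy model: the YAML default applies only while no record for the flag
# is persisted. All names are hypothetical.
class ToyFlagStore
  def initialize(yaml_defaults)
    @yaml_defaults = yaml_defaults # { flag_name => default_enabled }
    @records = {}                  # { flag_name => [enabled actor keys] }
  end

  # Models `/chatops run feature set --project=<actor> <flag> true`.
  def enable_for(flag, actor)
    (@records[flag] ||= []) << actor
  end

  # Models `/chatops run feature delete <flag>`.
  def delete(flag)
    @records.delete(flag)
  end

  # Models editing `default_enabled` in the YAML definition.
  def set_default(flag, value)
    @yaml_defaults[flag] = value
  end

  def enabled?(flag, actor)
    return @records[flag].include?(actor) if @records.key?(flag)

    @yaml_defaults.fetch(flag, false)
  end
end

store = ToyFlagStore.new({ 'some_feature' => false })
store.enable_for('some_feature', 'gitlab-org/gitlab')
store.set_default('some_feature', true)

# The persisted gate still decides, so other projects stay disabled.
puts store.enabled?('some_feature', 'example-org/example-project') # prints "false"

store.delete('some_feature')
# With the record gone, the YAML default applies.
puts store.enabled?('some_feature', 'example-org/example-project') # prints "true"
```
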
##### Percentage of actors vs percentage of time rollouts

If you want to make sure a feature is always on or off for users, use a **Percentage of actors**
rollout. Avoid using percentage of _time_ rollouts in this case.

A percentage of _time_ rollout can introduce inconsistent behavior when `Feature.enabled?`
is used multiple times in the code because the feature flag value is randomized each time
`Feature.enabled?` is called on your code path.
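
The difference can be illustrated with a small simulation. The helpers below are hypothetical and only mimic the two rollout formulas shown earlier in this document. The seeded `Random` instance is an assumption made to keep the run repeatable.

```ruby
require 'zlib'

# Hypothetical helpers that mimic the two rollout formulas shown earlier.
def time_gate_open?(percentage, rng)
  rng.rand < (percentage / 100.0)
end

def actor_gate_open?(feature_name, actor_key, percentage)
  Zlib.crc32("#{feature_name}#{actor_key}") % (100 * 1_000) < percentage * 1_000
end

rng = Random.new(1234) # fixed seed so the run is repeatable
requests = 1_000

# Count simulated requests in which two checks on the same code path disagree.
time_mismatches = requests.times.count { time_gate_open?(25, rng) != time_gate_open?(25, rng) }

actor_mismatches = requests.times.count do |i|
  key = "User:#{i}"
  actor_gate_open?('some_feature', key, 25) != actor_gate_open?('some_feature', key, 25)
end

puts "percentage of time:   #{time_mismatches} of #{requests} requests saw mixed results"
puts "percentage of actors: #{actor_mismatches} of #{requests} requests saw mixed results"
```

In this model the actor-based count is always zero, because the result depends only on the feature name and the actor, while the time-based check draws a new random number on every call.
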
##### Disabling feature flags

To disable a feature flag that has been globally enabled, run:

```shell
/chatops run feature set some_feature false
```

To disable a feature flag that has been enabled for a specific project, run:

```shell
/chatops run feature set --project=gitlab-org/gitlab some_feature false
```

You cannot selectively disable feature flags for a specific project/group/user without applying a [specific method of implementing](controls.md#selectively-disable-by-actor) the feature flags.

If a feature flag is disabled via ChatOps, that takes precedence over the `default_enabled` value in the YAML. In other words, you could have a feature enabled for on-premise installations but not for GitLab.com.

#### Selectively disable by actor

By default you cannot selectively disable a feature flag by actor.

```shell
# This will not work how you would expect.
/chatops run feature set some_feature true
/chatops run feature set --project=gitlab-org/gitlab some_feature false
```

However, if you add two feature flags, you can write your conditional statement in such a way that the equivalent selective disable is possible.

```ruby
Feature.enabled?(:a_feature, project) && Feature.disabled?(:a_feature_override, project)
```

```shell
# This will enable a feature flag globally, except for gitlab-org/gitlab
/chatops run feature set a_feature true
/chatops run feature set --project=gitlab-org/gitlab a_feature_override true
```
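
To see how the two-flag condition evaluates, here is a minimal sketch built on a toy in-memory flag store. `ToyFlags` and `feature_active?` are hypothetical names used only for illustration. The store is not the GitLab `Feature` class and models just a global gate and per-actor gates.

```ruby
require 'set'

# Toy in-memory flag store used only to illustrate the two-flag pattern.
class ToyFlags
  def initialize
    @global = Set.new # flags enabled for everyone
    @actors = Hash.new { |hash, flag| hash[flag] = Set.new } # flag => enabled actors
  end

  # Without an actor the flag is enabled globally, otherwise only for that actor.
  def enable(flag, actor = nil)
    if actor
      @actors[flag] << actor
    else
      @global << flag
    end
  end

  def enabled?(flag, actor)
    @global.include?(flag) || @actors[flag].include?(actor)
  end

  def disabled?(flag, actor)
    !enabled?(flag, actor)
  end
end

# Same condition as above, written against the toy store.
def feature_active?(flags, project)
  flags.enabled?(:a_feature, project) && flags.disabled?(:a_feature_override, project)
end

flags = ToyFlags.new
flags.enable(:a_feature)                               # feature set a_feature true
flags.enable(:a_feature_override, 'gitlab-org/gitlab') # feature set --project=... a_feature_override true

puts feature_active?(flags, 'gitlab-org/gitlab')           # prints "false"
puts feature_active?(flags, 'example-org/example-project') # prints "true"
```
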
#### Percentage-based actor selection

When using the percentage rollout of actors on multiple feature flags, the actors for each feature flag are selected separately.

For example, the following feature flags are enabled for a certain percentage of actors:

```plaintext
/chatops run feature set feature-set-1 25 --actors
/chatops run feature set feature-set-2 25 --actors
```

If a project A has `:feature-set-1` enabled, there is no guarantee that project A also has `:feature-set-2` enabled.

For more detail, see [This is how percentages work in Flipper](https://www.hackwithpassion.com/this-is-how-percentages-work-in-flipper/).
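
A short sketch can make this concrete. It assumes the percentage-of-actors formula shown earlier in this document, where the feature name is part of the hashed key. The `in_rollout?` helper and the project IDs are hypothetical.

```ruby
require 'zlib'

# Hypothetical helper following the percentage-of-actors formula shown
# earlier in this document. The feature name is part of the hashed key.
def in_rollout?(feature_name, actor_key, percentage)
  Zlib.crc32("#{feature_name}#{actor_key}") % (100 * 1_000) < percentage * 1_000
end

project_keys = (1..1_000).map { |id| "Project:#{id}" }

set_1 = project_keys.select { |key| in_rollout?('feature-set-1', key, 25) }
set_2 = project_keys.select { |key| in_rollout?('feature-set-2', key, 25) }

puts "feature-set-1 only: #{(set_1 - set_2).size}"
puts "feature-set-2 only: #{(set_2 - set_1).size}"
puts "both flags:         #{(set_1 & set_2).size}"
```

Because each flag name produces a different hash for the same project, the two selected groups only partly overlap in this model.
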
### Verifying metrics after enabling feature flag

After turning on the feature flag, you need to [monitor the relevant graphs](https://about.gitlab.com/handbook/engineering/monitoring/) between each step:

1. Go to [`dashboards.gitlab.net`](https://dashboards.gitlab.net).
1. Turn on the `feature-flag`.
1. Watch `Latency: Apdex` for services that might be impacted by your change
   (like `sidekiq service`, `api service` or `web service`). Then check out more in-depth
   dashboards by selecting `Service Overview Dashboards` and choosing a dashboard that might
   be related to your change.

In this illustration, you can see that the Apdex score started to decline after the feature flag was enabled at `09:46`. The feature flag was then deactivated at `10:31`, and the service returned to the original value:

![Feature flag metrics](../img/feature-flag-metrics.png)

### Feature flag change logging

#### ChatOps level

Any feature flag change that affects GitLab.com (production) via [ChatOps](https://gitlab.com/gitlab-com/chatops)
is automatically logged in an issue.

The issue is created in the
[gl-infra/feature-flag-log](https://gitlab.com/gitlab-com/gl-infra/feature-flag-log/-/issues?scope=all&state=closed)
project, and it will at minimum log the Slack handle of the person enabling
a feature flag, the time, and the name of the flag being changed.

The issue is then also posted to the GitLab internal
[Grafana dashboard](https://dashboards.gitlab.net/) as an annotation
marker to make the change even more visible.

Changes to the issue format can be submitted in the
[ChatOps project](https://gitlab.com/gitlab-com/chatops).

#### Instance level

Any feature flag change that affects any GitLab instance is automatically logged in
[features_json.log](../../administration/logs/index.md#features_jsonlog).
You can search the change history in [Kibana](https://about.gitlab.com/handbook/support/workflows/kibana.html).
You can also access the feature flag change history for GitLab.com [in Kibana](https://log.gprd.gitlab.net/goto/d060337c017723084c6d97e09e591fc6).

## Cleaning up

A feature flag should be removed as soon as it is no longer needed. Each additional
feature flag in the codebase increases the complexity of the application
and reduces confidence in our testing suite covering all possible combinations.
Additionally, a feature flag overwritten in some of the environments can result
in undefined and untested system behavior.

`development` type feature flags should have a short life-cycle because their purpose
is for rolling out a persistent change. `development` feature flags that are older
than 2 milestones are reported to engineering managers. The
[report tool](https://gitlab.com/gitlab-org/gitlab-feature-flag-alert) runs on a
monthly basis. For example, see [the report for December 2021](https://gitlab.com/gitlab-org/quality/triage-reports/-/issues/5480).

If a `development` feature flag is still present in the codebase after 6 months, we should
take one of the following actions:

- Enable the feature flag by default and remove it.
- Convert it to an instance, group, or project setting.
- Revert the changes if it's still disabled and not needed anymore.

To remove a feature flag, open **one merge request** to make the changes. In the MR:

1. Add the ~"feature flag" label so release managers are aware of the removal.
1. If the merge request has to be backported into the current version, follow the
   [patch release runbook](https://gitlab.com/gitlab-org/release/docs/-/blob/master/general/patch/engineers.md) process.
   See [the feature flag process](https://about.gitlab.com/handbook/product-development-flow/feature-flag-lifecycle/#including-a-feature-behind-feature-flag-in-the-final-release)
   for further details.
1. Remove all references to the feature flag from the codebase, including tests.
1. Remove the YAML definition for the feature from the repository.

Once the above MR has been merged, you should:

1. [Clean up the feature flag from all environments](#cleanup-chatops) with `/chatops run feature delete some_feature`.
1. Close the rollout issue for the feature flag after the feature flag is removed from the codebase.

### Cleanup ChatOps

When a feature gate has been removed from the codebase, the feature
record still exists in the database that the flag was deployed to.
The record can be deleted once the MR is deployed to each environment:

```shell
/chatops run feature delete some_feature --dev
/chatops run feature delete some_feature --staging
```

Then, you can delete it from production after the MR is deployed to prod:

```shell
/chatops run feature delete some_feature
```