info: "See the Technical Writers assigned to Development Guidelines: https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments-to-development-guidelines"
This document contains descriptions and guidelines for addressing security
vulnerabilities commonly identified in the GitLab codebase. They are intended
to help developers identify potential security vulnerabilities early, with the
goal of reducing the number of vulnerabilities released over time.
**Contributing**
If you would like to contribute to one of the existing documents, or add
guidelines for a new vulnerability type, please open an MR! Please try to
include links to examples of the vulnerability found, and link to any resources
used in defined mitigations. If you have questions or when ready for a review,
please ping `gitlab-com/gl-security/appsec`.
## Permissions
### Description
Application permissions are used to determine who can access what and what actions they can perform.
For more information about the permission model at GitLab, please see [the GitLab permissions guide](permissions.md) or the [EE docs on permissions](../../ee/user/permissions.md).
### Impact
Improper permission handling can have significant impacts on the security of an application.
Some situations may reveal [sensitive data](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/477) or allow a malicious actor to perform [harmful actions](https://gitlab.com/gitlab-org/gitlab/-/issues/8180).
A common vulnerability when permission checks are missing is called [IDOR](https://owasp.org/www-project-web-security-testing-guide/latest/4-Web_Application_Security_Testing/05-Authorization_Testing/04-Testing_for_Insecure_Direct_Object_References) for Insecure Direct Object References.
Each time you implement a new feature/endpoint, whether it is at UI, API or GraphQL level.
### Mitigations
**Start by writing tests** around permissions: unit and feature specs should both include tests based around permissions
- Fine-grained, nitty-gritty specs for permissions are good: it is ok to be verbose here
- Make assertions based on the actors and objects involved: can a user or group or XYZ perform this action on this object?
- Consider defining them upfront with stakeholders, particularly for the edge cases
- Do not forget **abuse cases**: write specs that **make sure certain things can't happen**
- A lot of specs are making sure things do happen and coverage percentage doesn't take into account permissions as same piece of code is used.
- Make assertions that certain actors cannot perform actions
- Naming convention to ease auditability: to be defined, e.g. a subfolder containing those specific permission tests or a `#permissions` block
Be careful to **also test [visibility levels](https://gitlab.com/gitlab-org/gitlab-foss/-/blob/master/doc/development/permissions.md#feature-specific-permissions)** and not only project access rights.
Some example of well implemented access controls and tests:
Unlike other programming languages (e.g. Perl or Python) Regular Expressions are matching multi-line by default in Ruby. Consider the following example in Python:
The Python example will output an empty array (`[]`) as the matcher considers the whole string `foo\nbar` including the newline (`\n`). In contrast Ruby's Regular Expression engine acts differently:
The output of this example is `#<MatchData "bar">`, as Ruby treats the input `text` line by line. In order to match the whole __string__ the Regex anchors `\A` and `\z` should be used.
if params[:ip] =~ /^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/
render :text => `ping -c 4 #{params[:ip]}`
else
render :text => "Invalid IP"
end
end
end
```
Here `params[:ip]` should not contain anything else but numbers and dots. However this restriction can be easily bypassed as the Regex anchors `^` and `$` are being used. Ultimately this leads to a shell command injection in `ping -c 4 #{params[:ip]}` by using newlines in `params[:ip]`.
#### Mitigation
In most cases the anchors `\A` for beginning of text and `\z` for end of text should be used instead of `^` and `$`.
Consider the following example application, which defines a check using a regular expression. A user entering `user@aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!.com` as the email on a form will hang the web server.
- [The impact of regular expression denial of service (ReDoS) in practice: an empirical study at the ecosystem scale](https://people.cs.vt.edu/~davisjam/downloads/publications/DavisCoghlanServantLee-EcosystemREDOS-ESECFSE18.pdf). This research paper discusses approaches to automatically detect ReDoS vulnerabilities.
- [Freezing the web: A study of ReDoS vulnerabilities in JavaScript-based web servers](https://www.usenix.org/system/files/conference/usenixsecurity18/sec18-staicu.pdf). Another research paper about detecting ReDoS vulnerabilities.
- Reading internal services, including cloud service metadata.
- The latter can be a serious problem, because an attacker can obtain keys that allow control of the victim's cloud infrastructure. (This is also a good reason
- When the application makes any outbound connection
### Mitigations
In order to mitigate SSRF vulnerabilities, it is necessary to validate the destination of the outgoing request, especially if it includes user-supplied information.
The preferred SSRF mitigations within GitLab are:
1. Only connect to known, trusted domains/IP addresses.
1. Use the [GitLab::HTTP](#gitlab-http-library) library
The [GitLab::HTTP](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/http.rb) wrapper library has grown to include mitigations for all of the GitLab-known SSRF vectors. It is also configured to respect the
`Outbound requests` options that allow instance administrators to block all internal connections, or limit the networks to which connections can be made.
In some cases, it has been possible to configure GitLab::HTTP as the HTTP
For situations in which an allowlist or GitLab:HTTP cannot be used, it will be necessary to implement mitigations directly in the feature. It is best to validate the destination IP addresses themselves, not just domain names, as DNS can be controlled by the attacker. Below are a list of mitigations that should be implemented.
There are many tricks to bypass common SSRF validations. If feature-specific mitigations are necessary, they should be reviewed by the AppSec team, or a developer who has worked on SSRF mitigations previously.
Cross site scripting (XSS) is an issue where malicious JavaScript code gets injected into a trusted web application and executed in a client's browser. The input is intended to be data, but instead gets treated as code by the browser.
XSS issues are commonly classified in three categories, by their delivery method:
The injected client-side code is executed on the victim's browser in the context of their current session. This means the attacker could perform any same action the victim would normally be able to do through a browser. The attacker would also have the ability to:
- perform actions that lead to data loss/theft or account takeover
Much of the impact is contingent upon the function of the application and the capabilities of the victim's session. For further impact possibilities, please check out [the beef project](https://beefproject.com/).
### When to consider?
When user submitted data is included in responses to end users, which is just about anywhere.
For any and all input fields, ensure to define expectations on the type/format of input, the contents, <iclass="fa fa-youtube-play youtube"aria-hidden="true"></i> [size limits](https://youtu.be/2VFavqfDS6w?t=7582), the context in which it will be output. It's important to work with both security and product teams to determine what is considered acceptable input.
- Validate the <iclass="fa fa-youtube-play youtube"aria-hidden="true"></i> [input size limits](https://youtu.be/2VFavqfDS6w?t=7582).
- Validate the input using an <iclass="fa fa-youtube-play youtube"aria-hidden="true"></i> [allowlist approach](https://youtu.be/2VFavqfDS6w?t=7816) to only allow characters through which you are expecting to receive for the field.
- When adding redirects or links to a user-controlled URL, ensure that the scheme is HTTP or HTTPS. Allowing other schemes like `javascript://` can lead to XSS and other security issues.
Note that denylists should be avoided, as it is near impossible to block all [variations of XSS](https://owasp.org/www-community/xss-filter-evasion-cheatsheet).
Once you've [determined when and where](#setting-expectations) the user submitted data will be output, it's important to encode it based on the appropriate context. For example:
- Content placed inside HTML elements need to be [HTML entity encoded](https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html#rule-1---html-escape-before-inserting-untrusted-data-into-html-element-content).
- Content placed into a JSON response needs to be [JSON encoded](https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html#rule-31---html-escape-json-values-in-an-html-context-and-read-the-data-with-jsonparse).
- Content placed inside <iclass="fa fa-youtube-play youtube"aria-hidden="true"></i> [HTML URL GET parameters](https://youtu.be/2VFavqfDS6w?t=3494) need to be [URL-encoded](https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html#rule-5---url-escape-before-inserting-untrusted-data-into-html-url-parameter-values)
-<iclass="fa fa-youtube-play youtube"aria-hidden="true"></i> [Additional contexts may require context-specific encoding](https://youtu.be/2VFavqfDS6w?t=2341).
#### XSS mitigation and prevention in JavaScript and Vue
- When updating the content of an HTML element using JavaScript, mark user-controlled values as `textContent` or `nodeValue` instead of `innerHTML`.
- Avoid using `v-html` with user-controlled data, use [`v-safe-html`](https://gitlab-org.gitlab.io/gitlab-ui/?path=/story/directives-safe-html-directive--default) instead.
- Consider using [`gl-sprintf`](../../ee/development/i18n/externalization.md#interpolation) to interpolate translated strings securely.
- Avoid `__()` with translations that contain user-controlled values.
- When working with `postMessage`, ensure the `origin` of the message is allowlisted.
- Consider using the [Safe Link Directive](https://gitlab-org.gitlab.io/gitlab-ui/?path=/story/directives-safe-link-directive--default) to generate secure hyperlinks by default.
Path Traversal vulnerabilities grant attackers access to arbitrary directories and files on the server that is executing an application, including data, code or credentials.
### Impact
Path Traversal attacks can lead to multiple critical and high severity issues, like arbitrary file read, remote code execution or information disclosure.
- After validating the user supplied input, it should be appended to the base directory and the path should be canonicalized using the file system API.
This outputs `1` followed by the content of `/etc/passwd`.
## GitLab Internal Authorization
### Introduction
There are some cases where `users` passed in the code is actually referring to a `DeployToken`/`DeployKey` entity instead of a real `User`, because of the code below in **`/lib/api/api_guard.rb`**
```ruby
def find_user_from_sources
strong_memoize(:find_user_from_sources) do
deploy_token_from_request ||
find_user_from_bearer_token ||
find_user_from_job_token ||
user_from_warden
end
end
```
### Past Vulnerable Code
In some scenarios such as [this one](https://gitlab.com/gitlab-org/gitlab/-/issues/237795), user impersonation is possible because a `DeployToken` ID can be used in place of a `User` ID. This happened because there was no check on the line with `Gitlab::Auth::CurrentUserMode.bypass_session!(user.id)`. In this case, the `id` is actually a `DeployToken` ID instead of a `User` ID.
```ruby
def find_current_user!
user = find_user_from_sources
return unless user
# Sessions are enforced to be unavailable for API calls, so ignore them for admin mode
Gitlab::Auth::CurrentUserMode.bypass_session!(user.id) if Feature.enabled?(:user_mode_in_session)
unless api_access_allowed?(user)
forbidden!(api_access_denied_message(user))
end
```
### Best Practices
In order to prevent this from happening, it is recommended to use the method `user.is_a?(User)` to make sure it returns `true` when we are expecting to deal with a `User` object. This could prevent the ID confusion from the method `find_user_from_sources` mentioned above. Below code snippet shows the fixed code after applying the best practice to the vulnerable code above.
```ruby
def find_current_user!
user = find_user_from_sources
return unless user
if user.is_a?(User) && Feature.enabled?(:user_mode_in_session)
# Sessions are enforced to be unavailable for API calls, so ignore them for admin mode