info: "See the Technical Writers assigned to Development Guidelines: https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments-to-development-guidelines"
This document contains descriptions and guidelines for addressing security
vulnerabilities commonly identified in the GitLab codebase. They are intended
to help developers identify potential security vulnerabilities early, with the
goal of reducing the number of vulnerabilities released over time.
**Contributing**
If you would like to contribute to one of the existing documents, or add
guidelines for a new vulnerability type, please open an MR! Please try to
include links to examples of the vulnerability found, and link to any resources
used in defined mitigations. If you have questions, or when you are ready for a review, please ping `gitlab-com/gl-security/appsec`.
## Permissions
### Description
Application permissions are used to determine who can access what and what actions they can perform.
For more information about the permission model at GitLab, please see [the GitLab permissions guide](permissions.md) or the [EE docs on permissions](../../ee/user/permissions.md).
### Impact
Improper permission handling can have significant impacts on the security of an application.
Some situations may reveal [sensitive data](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/477) or allow a malicious actor to perform [harmful actions](https://gitlab.com/gitlab-org/gitlab/-/issues/8180).
A common vulnerability when permission checks are missing is [IDOR](https://owasp.org/www-project-web-security-testing-guide/latest/4-Web_Application_Security_Testing/05-Authorization_Testing/04-Testing_for_Insecure_Direct_Object_References), or Insecure Direct Object Reference.
Be careful to **also test [visibility levels](https://gitlab.com/gitlab-org/gitlab-foss/-/blob/master/doc/development/permissions.md#feature-specific-permissions)** and not only project access rights.
Some examples of well-implemented access controls and tests:
## Regular Expressions guidelines

### Anchors / Multi line

Unlike other programming languages (for example, Perl or Python), regular expressions match multi-line by default in Ruby: the anchors `^` and `$` match the beginning and end of each line rather than of the whole string. Consider the following example in Python:
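A minimal Python sketch of such a check, using `re.findall` with the default (single-line) anchor semantics:

```python
import re

text = "foo\nbar"

# Without re.MULTILINE, ^ and $ anchor to the start and end of the whole string,
# so "bar" on the second line does not match.
matches = re.findall("^bar$", text)
print(matches)
```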
The Python example will output an empty array (`[]`), as the matcher considers the whole string `foo\nbar`, including the newline (`\n`). In contrast, Ruby's regular expression engine acts differently:
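A minimal Ruby sketch of the same match:

```ruby
text = "foo\nbar"

# In Ruby, ^ and $ match at line boundaries, so "bar" on the second line matches.
p text.match(/^bar$/)
```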
The output of this example is `#<MatchData "bar">`, as Ruby treats the input `text` line by line. In order to match the whole __string__ the Regex anchors `\A` and `\z` should be used.
Consider this example, which tries to validate an IP address before passing it to a shell command:

```ruby
class PingController < ApplicationController
  def ping
    if params[:ip] =~ /^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/
      render :text => `ping -c 4 #{params[:ip]}`
    else
      render :text => "Invalid IP"
    end
  end
end
```
Here `params[:ip]` is supposed to contain only numbers and dots. However, this restriction can be easily bypassed because the Regex anchors `^` and `$` are used. Ultimately this leads to a shell command injection in `ping -c 4 #{params[:ip]}` by sending newlines in `params[:ip]`.
#### Mitigation
In most cases the anchors `\A` for beginning of text and `\z` for end of text should be used instead of `^` and `$`.
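Applied to the `ping` example above, a minimal sketch of the corrected check looks like this:

```ruby
# \A and \z anchor the pattern to the whole string, so a payload such as
# "8.8.8.8\nrm -rf /" no longer passes the validation.
if params[:ip] =~ /\A\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\z/
  render :text => `ping -c 4 #{params[:ip]}`
else
  render :text => "Invalid IP"
end
```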
### Denial of Service (ReDoS) / Catastrophic Backtracking

Regular expressions with nested quantifiers or overlapping alternations can take exponential time to evaluate on crafted input, resulting in a regular expression denial of service (ReDoS). Consider the following example application, which defines a check using a regular expression. A user entering `user@aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!.com` as the email on a form will hang the web server.
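A minimal sketch of such a vulnerable check (class and attribute names are illustrative); the nested quantifier `([a-zA-Z0-9]+)+` backtracks catastrophically when the trailing `\.com` fails to match:

```ruby
class Email < ApplicationRecord
  # Evil regex: the nested quantifier explores an exponential number of ways to
  # split a long run of letters once ".com" cannot be matched after it.
  DOMAIN_MATCH = Regexp.new('([a-zA-Z0-9]+)+\.com')

  validate :domain_matches

  private

  def domain_matches
    errors.add(:email, 'does not match') unless email =~ DOMAIN_MATCH
  end
end
```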
- [The impact of regular expression denial of service (ReDoS) in practice: an empirical study at the ecosystem scale](https://people.cs.vt.edu/~davisjam/downloads/publications/DavisCoghlanServantLee-EcosystemREDOS-ESECFSE18.pdf). This research paper discusses approaches to automatically detect ReDoS vulnerabilities.
- [Freezing the web: A study of ReDoS vulnerabilities in JavaScript-based web servers](https://www.usenix.org/system/files/conference/usenixsecurity18/sec18-staicu.pdf). Another research paper about detecting ReDoS vulnerabilities.
## Server Side Request Forgery (SSRF)

### Description

A [Server-side Request Forgery (SSRF)](https://www.hackerone.com/application-security/how-server-side-request-forgery-ssrf) is an attack in which an attacker is able to coerce an application into making an outbound request to an endpoint the attacker would not otherwise be able to reach, such as an internal service.
### Impact

The impact of an SSRF attack varies, depending on what the application server can communicate with and how much of the payload the attacker controls. Examples include:

- Reading internal services, including cloud service metadata.
- The latter can be a serious problem, because an attacker can obtain keys that allow control of the victim's cloud infrastructure. (This is also a good reason to grant such credentials as little privilege as possible.)
### When to consider

- When the application makes any outbound connection
### Mitigations
In order to mitigate SSRF vulnerabilities, it is necessary to validate the destination of the outgoing request, especially if it includes user-supplied information.
The preferred SSRF mitigations within GitLab are:
1. Only connect to known, trusted domains/IP addresses.
1. Use the [GitLab::HTTP](#gitlab-http-library) library.
#### GitLab HTTP Library

The [GitLab::HTTP](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/http.rb) wrapper library has grown to include mitigations for all of the GitLab-known SSRF vectors. It is also configured to respect the
`Outbound requests` options that allow instance administrators to block all internal connections, or limit the networks to which connections can be made.
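A minimal sketch of the intended usage (`user_supplied_url` is illustrative; the `Gitlab::HTTP` interface mirrors HTTParty):

```ruby
# The wrapper applies the SSRF mitigations and the instance-level
# "Outbound requests" settings; blocked URLs raise an error instead of
# being fetched.
response = Gitlab::HTTP.get(user_supplied_url)
```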
In some cases, it has been possible to configure GitLab::HTTP as the HTTP client for third-party libraries so that their outbound requests are covered by the same mitigations.
[`Gitlab::UrlBlocker`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/url_blocker.rb) can be used to validate that a
provided URL meets a set of constraints. Importantly, when `dns_rebind_protection` is `true`, the method returns a known-safe URI where the hostname
has been replaced with an IP address. This prevents DNS rebinding attacks, because the DNS record has been resolved. However, if we ignore this returned
value, we **will not** be protected against DNS rebinding.
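A sketch of the intended usage (option names and the exact return shape depend on the GitLab version):

```ruby
# validate! resolves the hostname and returns a known-safe URI whose host has
# been replaced by the resolved IP address, together with the original hostname.
# The returned URI is the one that must be used for the outgoing request.
uri, original_hostname = Gitlab::UrlBlocker.validate!(
  user_supplied_url,
  schemes: %w[http https],
  dns_rebind_protection: true
)
```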
Ignoring the returned value is easy to do when working with validators such as the `AddressableUrlValidator` (called with `validates :url, addressable_url: {opts}` or `public_url: {opts}`).
Validation errors are only raised when validations are called, for example when a record is created or saved. If we ignore the value returned by the validation
when persisting the record, **we need to recheck** its validity before using it. You can learn more about [Time of Check to Time of Use bugs](#time-of-check-to-time-of-use-bugs) in a later section of this document.
There are many tricks to bypass common SSRF validations. If feature-specific mitigations are necessary, they should be reviewed by the AppSec team, or a developer who has worked on SSRF mitigations previously.
- To mitigate DNS rebinding attacks, validate and use the first IP address received.
See [`url_blocker_spec.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/lib/gitlab/url_blocker_spec.rb) for examples of SSRF payloads. See [time of check to time of use bugs](#time-of-check-to-time-of-use-bugs) to learn more about DNS rebinding's class of bug.
## XSS guidelines

### Description

Cross site scripting (XSS) is an issue where malicious JavaScript code gets injected into a trusted web application and executed in a client's browser. The input is intended to be data, but instead gets treated as code by the browser.
XSS issues are commonly classified in three categories, by their delivery method:

- Persistent (stored) XSS
- Reflected XSS
- DOM-based XSS
### Impact

The injected client-side code is executed on the victim's browser in the context of their current session. This means the attacker could perform the same actions the victim would normally be able to perform through a browser. The attacker would also have the ability to:
- perform actions that lead to data loss/theft or account takeover
Much of the impact is contingent upon the function of the application and the capabilities of the victim's session. For further impact possibilities, please check out [the BeEF project](https://beefproject.com/).
For a demonstration of the impact on GitLab with a realistic attack scenario, see [this video on the GitLab Unfiltered channel](https://www.youtube.com/watch?v=t4PzHNycoKo) (internal, it requires being logged in with the GitLab Unfiltered account).
### Mitigation

#### Setting expectations

For any and all input fields, ensure to define expectations on the type/format of input, the contents, the <i class="fa fa-youtube-play youtube" aria-hidden="true"></i> [size limits](https://youtu.be/2VFavqfDS6w?t=7582), and the context in which it will be output. It's important to work with both security and product teams to determine what is considered acceptable input.
- Validate the <i class="fa fa-youtube-play youtube" aria-hidden="true"></i> [input size limits](https://youtu.be/2VFavqfDS6w?t=7582).
- Validate the input using an <i class="fa fa-youtube-play youtube" aria-hidden="true"></i> [allowlist approach](https://youtu.be/2VFavqfDS6w?t=7816) to only allow the characters you expect to receive for the field.
- When adding redirects or links to a user-controlled URL, ensure that the scheme is HTTP or HTTPS, as sketched below. Allowing other schemes like `javascript://` can lead to XSS and other security issues.
Note that denylists should be avoided, as it is near impossible to block all [variations of XSS](https://owasp.org/www-community/xss-filter-evasion-cheatsheet).
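For the user-controlled URL case above, a minimal sketch of such a scheme check (the helper name is illustrative):

```ruby
require 'uri'

def safe_redirect_url(url)
  uri = URI.parse(url)

  # Only http and https are allowed; schemes such as javascript: or data: are rejected.
  return unless %w[http https].include?(uri.scheme&.downcase)

  uri.to_s
rescue URI::InvalidURIError
  nil
end
```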
Once you've [determined when and where](#setting-expectations) the user-submitted data will be output, it's important to encode it based on the appropriate context (see the encoding sketch after this list). For example:
- Content placed inside HTML elements needs to be [HTML entity encoded](https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html#rule-1---html-escape-before-inserting-untrusted-data-into-html-element-content).
- Content placed into a JSON response needs to be [JSON encoded](https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html#rule-31---html-escape-json-values-in-an-html-context-and-read-the-data-with-jsonparse).
- Content placed inside <i class="fa fa-youtube-play youtube" aria-hidden="true"></i> [HTML URL GET parameters](https://youtu.be/2VFavqfDS6w?t=3494) needs to be [URL-encoded](https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html#rule-5---url-escape-before-inserting-untrusted-data-into-html-url-parameter-values).
- <i class="fa fa-youtube-play youtube" aria-hidden="true"></i> [Additional contexts may require context-specific encoding](https://youtu.be/2VFavqfDS6w?t=2341).
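A minimal Ruby sketch of these encodings, using standard-library helpers; the output context determines which one applies:

```ruby
require 'erb'
require 'json'
require 'cgi'

user_input = %q{<script>alert(1)</script>}

ERB::Util.html_escape(user_input) # for content inside HTML elements
user_input.to_json                # for values embedded in a JSON response
CGI.escape(user_input)            # for values placed in URL GET parameters
```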
#### XSS mitigation and prevention in JavaScript and Vue
- When updating the content of an HTML element using JavaScript, mark user-controlled values as `textContent` or `nodeValue` instead of `innerHTML`.
- Avoid using `v-html` with user-controlled data, use [`v-safe-html`](https://gitlab-org.gitlab.io/gitlab-ui/?path=/story/directives-safe-html-directive--default) instead.
- Consider using [`gl-sprintf`](../../ee/development/i18n/externalization.md#interpolation) to interpolate translated strings securely.
- Avoid `__()` with translations that contain user-controlled values.
- When working with `postMessage`, ensure the `origin` of the message is allowlisted.
- Consider using the [Safe Link Directive](https://gitlab-org.gitlab.io/gitlab-ui/?path=/story/directives-safe-link-directive--default) to generate secure hyperlinks by default.
## Path Traversal guidelines

### Description

Path Traversal vulnerabilities grant attackers access to arbitrary directories and files on the server that is executing an application, including data, code or credentials.
### Impact
Path Traversal attacks can lead to multiple critical and high severity issues, like arbitrary file read, remote code execution or information disclosure.
- After validating the user supplied input, it should be appended to the base directory and the path should be canonicalized using the file system API, as sketched below.
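A minimal Ruby sketch of that approach (directory and helper names are illustrative):

```ruby
BASE_DIR = "/var/opt/uploads"

def resolve_upload_path(user_filename)
  # Canonicalize the joined path, then verify it is still inside BASE_DIR.
  path = File.expand_path(user_filename, BASE_DIR)

  raise ArgumentError, "path escapes the base directory" unless
    path.start_with?(BASE_DIR + File::SEPARATOR)

  path
end
```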
## TLS minimum recommended version

As we have [moved away from supporting TLS 1.0 and 1.1](https://about.gitlab.com/blog/2018/10/15/gitlab-to-deprecate-older-tls/), you must use TLS 1.2 and above.
We recommend using the ciphers that Mozilla is providing in their [recommended SSL configuration generator](https://ssl-config.mozilla.org/#server=go&version=1.17&config=intermediate&guideline=5.6) for TLS 1.2:
- `ECDHE-ECDSA-AES128-GCM-SHA256`
- `ECDHE-RSA-AES128-GCM-SHA256`
- `ECDHE-ECDSA-AES256-GCM-SHA384`
- `ECDHE-RSA-AES256-GCM-SHA384`
- `ECDHE-ECDSA-CHACHA20-POLY1305`
- `ECDHE-RSA-CHACHA20-POLY1305`
And the following cipher suites (according to the [RFC 8446](https://datatracker.ietf.org/doc/html/rfc8446#appendix-B.4)) for TLS 1.3:
- `TLS_AES_128_GCM_SHA256`
- `TLS_AES_256_GCM_SHA384`
- `TLS_CHACHA20_POLY1305_SHA256`
*Note*: **Golang** does [not support](https://github.com/golang/go/blob/go1.17/src/crypto/tls/cipher_suites.go#L676) all cipher suites with TLS 1.3.
##### Implementation examples
##### TLS 1.3
For TLS 1.3, **Golang** only supports [3 cipher suites](https://github.com/golang/go/blob/go1.17/src/crypto/tls/cipher_suites.go#L676), as such we only need to set the TLS version:
```golang
cfg := &tls.Config{
MinVersion: tls.VersionTLS13,
}
```
For **Ruby**, you can use [HTTParty](https://github.com/jnunemaker/httparty) and specify the TLS 1.3 version as well as ciphers:

Whenever possible, this plain-HTTParty example should be **avoided** for security purposes in favor of [GitLab::HTTP](#gitlab-http-library), which accepts the same options and adds SSRF protections:
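A minimal sketch, assuming HTTParty's `ssl_version` and `ciphers` options:

```ruby
# Plain HTTParty call -- avoid where possible:
response = HTTParty.get('https://gitlab.com', ssl_version: :TLSv1_3, ciphers: ['TLS_AES_128_GCM_SHA256'])

# Preferred: the GitLab::HTTP wrapper accepts the same options.
response = Gitlab::HTTP.get('https://gitlab.com', ssl_version: :TLSv1_3, ciphers: ['TLS_AES_128_GCM_SHA256'])
```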
##### TLS 1.2

**Golang** does support multiple cipher suites that we do not want to use with TLS 1.2. We need to explicitly list authorized ciphers:
```golang
func secureCipherSuites() []uint16 {
	return []uint16{
		tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
		tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
		tls.TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
		tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
		tls.TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,
		tls.TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,
	}
}
```
And then use `secureCipherSuites()` in `tls.Config`:
```golang
tls.Config{
(...),
CipherSuites: secureCipherSuites(),
MinVersion: tls.VersionTLS12,
(...),
}
```
This example was taken from [the GitLab agent](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/871b52dc700f1a66f6644fbb1e78a6d463a6ff83/internal/tool/tlstool/tlstool.go#L72).
For **Ruby**, you can again use [HTTParty](https://github.com/jnunemaker/httparty), this time specifying the TLS 1.2 version alongside the recommended ciphers:
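A minimal sketch, mirroring the TLS 1.3 example above:

```ruby
response = HTTParty.get(
  'https://gitlab.com',
  ssl_version: :TLSv1_2,
  ciphers: ['ECDHE-ECDSA-AES128-GCM-SHA256', 'ECDHE-RSA-AES128-GCM-SHA256',
            'ECDHE-ECDSA-AES256-GCM-SHA384', 'ECDHE-RSA-AES256-GCM-SHA384',
            'ECDHE-ECDSA-CHACHA20-POLY1305', 'ECDHE-RSA-CHACHA20-POLY1305']
)
```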
## GitLab Internal Authorization

There are some cases where a `user` passed in the code actually refers to a `DeployToken`/`DeployKey` entity instead of a real `User`, because of the code below in **`/lib/api/api_guard.rb`**:
```ruby
def find_user_from_sources
strong_memoize(:find_user_from_sources) do
deploy_token_from_request ||
find_user_from_bearer_token ||
find_user_from_job_token ||
user_from_warden
end
end
```
### Past Vulnerable Code
In some scenarios such as [this one](https://gitlab.com/gitlab-org/gitlab/-/issues/237795), user impersonation is possible because a `DeployToken` ID can be used in place of a `User` ID. This happened because there was no check on the line with `Gitlab::Auth::CurrentUserMode.bypass_session!(user.id)`. In this case, the `id` is actually a `DeployToken` ID instead of a `User` ID.
```ruby
def find_current_user!
  user = find_user_from_sources
  return unless user

  # Sessions are enforced to be unavailable for API calls, so ignore them for admin mode
  Gitlab::Auth::CurrentUserMode.bypass_session!(user.id) if Gitlab::CurrentSettings.admin_mode
  # ...
end
```
In order to prevent this from happening, it is recommended to use the method `user.is_a?(User)` to make sure it returns `true` when we are expecting to deal with a `User` object. This could prevent the ID confusion from the method `find_user_from_sources` mentioned above. The code snippet below shows the fixed code after applying this best practice to the vulnerable code above.
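A sketch of that guard (not necessarily the exact upstream fix):

```ruby
def find_current_user!
  user = find_user_from_sources
  return unless user

  # Sessions are enforced to be unavailable for API calls, so ignore them for admin mode.
  # Only a real User may bypass the session: a DeployToken ID must never be passed
  # to bypass_session!.
  if user.is_a?(User) && Gitlab::CurrentSettings.admin_mode
    Gitlab::Auth::CurrentUserMode.bypass_session!(user.id)
  end
  # ...
end
```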
## Guidelines when defining missing methods with metaprogramming
Metaprogramming is a way to define methods **at runtime**, instead of at the time of writing and deploying the code. It is a powerful tool, but can be dangerous if we allow untrusted actors (like users) to define their own arbitrary methods. For example, imagine we accidentally let an attacker overwrite an access control method to always return true! It can lead to many classes of vulnerabilities such as access control bypass, information disclosure, arbitrary file reads, and remote code execution.
Key methods to watch out for are `method_missing`, `define_method`, `delegate`, and similar methods.
### Insecure metaprogramming example
This example is adapted from an example submitted by [@jobert](https://hackerone.com/jobert?type=user) through our HackerOne bug bounty program.
Thank you for your contribution!
Before Ruby 2.5.1, you could implement delegators using the `delegate` or `method_missing` methods. For example:
```ruby
class User
def initialize(attributes)
@options = OpenStruct.new(attributes)
end
def is_admin?
name.eql?("Sid") # Note - never do this!
end
def method_missing(method, *args)
@options.send(method, *args)
end
end
```
When a method was called on a `User` instance that didn't exist, it passed it along to the `@options` instance variable.
Because the `is_admin?` method is already defined on the class, its behavior is not overridden when passing `is_admin?` to the initializer.
This class can be refactored to use the `Forwardable` method and `def_delegators`:
```ruby
class User
extend Forwardable
def initialize(attributes)
@options = OpenStruct.new(attributes)
self.class.instance_eval do
def_delegators :@options, *attributes.keys
end
end
def is_admin?
name.eql?("Sid") # Note - never do this!
end
end
```
It might seem like this example has the same behavior as the first code example. However, there's one crucial difference: **because the delegators are meta-programmed after the class is loaded, it can overwrite existing methods**:
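For example, a crafted attributes hash can now redefine `is_admin?` (a sketch based on the report this example was adapted from):

```ruby
# The "is_admin?" key becomes a delegator that shadows the real method.
User.new({ name: 'Jeeves', 'is_admin?' => true }).is_admin?
# => true
```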
## Working with archive files

Working with archive files like `zip`, `tar`, `jar`, `war`, `cpio`, `apk`, `rar` and `7z` presents an area where potentially critical security vulnerabilities can sneak into an application.
### Zip Slip
In 2018, the security company Snyk [released a blog post](https://snyk.io/research/zip-slip-vulnerability) describing research into a widespread and critical vulnerability present in many libraries and applications which allows an attacker to overwrite arbitrary files on the server file system which, in many cases, can be leveraged to achieve remote code execution. The vulnerability was dubbed Zip Slip.
A Zip Slip vulnerability happens when an application extracts an archive without validating and sanitizing the filenames inside the archive for directory traversal sequences that change the file location when the file is extracted.
Example malicious file names:
- `../../etc/passwd`
- `../../root/.ssh/authorized_keys`
- `../../etc/gitlab/gitlab.rb`
If a vulnerable application extracts an archive file with any of these file names, the attacker can overwrite these files with arbitrary content.
### Insecure archive extraction examples
#### Ruby
For zip files, the [rubyzip](https://rubygems.org/gems/rubyzip) Ruby gem is already patched against the Zip Slip vulnerability and will refuse to extract files that try to perform directory traversal, so for this vulnerable example we will extract a `tar.gz` file with `Gem::Package::TarReader`:
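A minimal sketch of such a vulnerable extraction (file paths are illustrative):

```ruby
# Vulnerable tar.gz extraction example!
require 'fileutils'
require 'rubygems/package'
require 'zlib'

tar_extract = Gem::Package::TarReader.new(Zlib::GzipReader.open("/tmp/uploaded.tar.gz"))
tar_extract.rewind

tar_extract.each do |entry|
  next unless entry.file?

  # Oops! We blindly use the entry file name, so a name like "../../etc/passwd"
  # escapes the intended destination directory.
  destination = "/tmp/extracted/#{entry.full_name}"

  FileUtils.mkdir_p(File.dirname(destination))
  File.write(destination, entry.read)
end
```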
#### Best practices

Always expand the destination file path by resolving all potential directory traversals and other sequences that can alter the path, and refuse extraction if the final destination path does not start with the intended destination directory.
##### Ruby
```ruby
# tar.gz extraction example with protection against Zip Slip attacks.
# Resolve each entry name against the destination directory and refuse
# extraction if the resulting path escapes that directory.
def safe_destination(filename, destination_dir)
  destination = File.expand_path(filename, destination_dir)

  raise "filename is outside of destination directory" unless
    destination.start_with?(destination_dir + "/")

  destination
end
```
```ruby
# zip extraction example using rubyzip with built-in protection against Zip Slip attacks.
require 'zip'
Zip::File.open("/tmp/uploaded.zip") do |zip_file|
zip_file.each do |entry|
# Extract entry to /tmp/extracted directory.
entry.extract("/tmp/extracted")
end
end
```
##### Go
You are encouraged to use the secure archive utilities provided by [LabSec](https://gitlab.com/gitlab-com/gl-security/appsec/labsec) which will handle Zip Slip and other types of vulnerabilities for you. The LabSec utilities are also context aware which makes it possible to cancel or timeout extractions:
### Symlink attacks

Symlink attacks make it possible for an attacker to read the contents of arbitrary files on the server of a vulnerable application. While this is a high-severity vulnerability that can often lead to remote code execution and other critical vulnerabilities, it is only exploitable in scenarios where a vulnerable application accepts archive files from the attacker and somehow displays the extracted contents back to the attacker without any validation or sanitization of symbolic links inside the archive.
### Insecure archive symlink extraction examples
#### Ruby
For zip files, the [rubyzip](https://rubygems.org/gems/rubyzip) Ruby gem is already patched against symlink attacks as it simply ignores symbolic links, so for this vulnerable example we will extract a `tar.gz` file with `Gem::Package::TarReader`:
STDERR.puts("archive file does not exist or is not readable")
exit(false)
end
tar_extract.rewind
# Loop over each entry and output file contents
tar_extract.each do |entry|
next if entry.directory?
# Oops! We don't check if the file is actually a symbolic link to a potentially sensitive file.
puts entry.read
end
```
#### Go
```golang
// printZipContents INSECURELY prints contents of files in a zip file.
func printZipContents(src string) error {
r, err := zip.OpenReader(src)
if err != nil {
return err
}
defer r.Close()
// Loop over each entry and output file contents
for _, f := range r.File {
if f.FileInfo().IsDir() {
continue
}
rc, err := f.Open()
if err != nil {
return err
}
defer rc.Close()
// Oops! We don't check if the file is actually a symbolic link to a potentially sensitive file.
buf, err := ioutil.ReadAll(rc)
if err != nil {
return err
}
fmt.Println(string(buf))
}
return nil
}
```
#### Best practices
Always check the type of the archive entry before reading the contents and ignore entries that are not plain files. If you absolutely must support symbolic links, ensure that they only point to files inside the archive and nowhere else.
##### Ruby
```ruby
# tar.gz extraction example with protection against symlink attacks.
require 'rubygems/package'
require 'zlib'

file_path = "/tmp/uploaded.tar.gz"

unless File.readable?(file_path)
  STDERR.puts("archive file does not exist or is not readable")
  exit(false)
end

tar_extract = Gem::Package::TarReader.new(Zlib::GzipReader.open(file_path))
tar_extract.rewind

# Loop over each entry and output file contents
tar_extract.each do |entry|
  next if entry.directory?
  # By skipping symbolic links entirely, we are sure they can't cause any trouble!
  next if entry.symlink?

  puts entry.read
end
```
##### Go
You are encouraged to use the secure archive utilities provided by [LabSec](https://gitlab.com/gitlab-com/gl-security/appsec/labsec) which will handle Zip Slip and symlink vulnerabilities for you. The LabSec utilities are also context aware which makes it possible to cancel or timeout extractions.
In case the LabSec utilities do not fit your needs, here is an example for extracting a zip file with protection against symlink attacks:
```golang
// printZipContents prints contents of files in a zip file with protection against symlink attacks.
func printZipContents(src string) error {
	r, err := zip.OpenReader(src)
	if err != nil {
		return err
	}
	defer r.Close()
	// Loop over each entry and output file contents
	for _, f := range r.File {
		// By skipping all irregular file types (including symbolic links), we are sure they can't cause any trouble!
		if !f.Mode().IsRegular() {
			continue
		}
		rc, err := f.Open()
		if err != nil {
			return err
		}
		buf, err := ioutil.ReadAll(rc)
		rc.Close()
		if err != nil {
			return err
		}
		fmt.Println(string(buf))
	}
	return nil
}
```
## Time of check to time of use bugs

### Description

Time of check to time of use (TOCTOU) bugs are often seen in environments which allow multi-threading and concurrency, like filesystems and distributed web applications; they are a type of race condition. TOCTOU also occurs when state is checked and stored, and then, after a period of time, that state is relied on without re-checking its accuracy and/or validity.
### Examples
**Example 1:** you have a model which accepts a URL as input. When the model is created you verify that the URL's host resolves to a public IP address, to prevent attackers making internal network calls. But DNS records can change ([DNS rebinding](#server-side-request-forgery-ssrf)). An attacker updates the DNS record to `127.0.0.1`, and when your code resolves the URL's host it results in sending a potentially malicious request to a server on the internal network. The property was valid at the "time of check", but invalid and malicious at "time of use".
A GitLab-specific example can be found in [this issue](https://gitlab.com/gitlab-org/gitlab/-/issues/214401) where, although `Gitlab::UrlBlocker.validate!` was called, the returned value was not used. This made it vulnerable to a TOCTOU bug and an SSRF protection bypass through [DNS rebinding](#server-side-request-forgery-ssrf). The fix was to [use the validated IP address](https://gitlab.com/gitlab-org/gitlab/-/commit/7af8abd4df9a98f7a1ae7c4ec9840d0a7a8c684d).
**Example 2:** you have a feature which schedules jobs. When the user schedules the job, they have permission to do so. But imagine if, between the time they schedule the job and the time it is run, their permissions are restricted. Unless you re-check permissions at time of use, you could inadvertently allow unauthorized activity.
**Example 3:** you need to fetch a remote file, and perform a `HEAD` request to get and validate the content length and content type. When you subsequently make a `GET` request, though, the file delivered is a different size or different file type. (This is stretching the definition of TOCTOU, but things _have_ changed between time of check and time of use).
**Example 4:** you allow users to upvote a comment if they haven't already. The server is multi-threaded, and you aren't using transactions or an applicable database index. By repeatedly clicking upvote in quick succession a malicious user is able to add multiple upvotes: the requests arrive at the same time, the checks run in parallel and confirm that no upvote exists yet, and so each upvote is written to the database.
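For the last example, the usual fix is to enforce the rule in the database rather than only in application code; a sketch (table and model names are illustrative):

```ruby
# Migration: at most one upvote per (comment, user), enforced by the database.
add_index :upvotes, [:comment_id, :user_id], unique: true

# A concurrent duplicate insert now violates the unique index instead of
# silently creating a second upvote.
def upvote!(comment, user)
  Upvote.create!(comment: comment, user: user)
rescue ActiveRecord::RecordNotUnique
  # Another request already recorded this upvote; treat it as a no-op.
end
```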