debian-mirror-gitlab/doc/development/go_guide/index.md
2022-08-13 15:12:31 +05:30

22 KiB

stage group info
none unassigned To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments

Go standards and style guidelines

This document describes various guidelines and best practices for GitLab projects using the Go language.

Overview

GitLab is built on top of Ruby on Rails, but we're also using Go for projects where it makes sense. Go is a very powerful language, with many advantages, and is best suited for projects with a lot of IO (disk/network access), HTTP requests, parallel processing, and so on. Since we have both Ruby on Rails and Go at GitLab, we should evaluate carefully which of the two is best for the job.

This page aims to define and organize our Go guidelines, based on our various experiences. Several projects were started with different standards and they can still have specifics. They are described in their respective README.md or PROCESS.md files.

Go language versions

The Go upgrade documentation provides an overview of how GitLab manages and ships Go binary support.

If a GitLab component requires a newer version of Go, please follow the upgrade process to ensure no customer, team, or component is adversely impacted.

Sometimes, individual projects must also manage builds with multiple versions of Go.

Dependency Management

Go uses a source-based strategy for dependency management. Dependencies are downloaded as source from their source repository. This differs from the more common artifact-based strategy where dependencies are downloaded as artifacts from a package repository that is separate from the dependency's source repository.

Go did not have first-class support for version management prior to 1.11. That version introduced Go modules and the use of semantic versioning. Go 1.12 introduced module proxies, which can serve as an intermediate between clients and source version control systems, and checksum databases, which can be used to verify the integrity of dependency downloads.

See Dependency Management in Go for more details.

Code Review

We follow the common principles of Go Code Review Comments.

Reviewers and maintainers should pay attention to:

  • defer functions: ensure the presence when needed, and after err check.
  • Inject dependencies as parameters.
  • Void structs when marshaling to JSON (generates null instead of []).

Security

Security is our top priority at GitLab. During code reviews, we must take care of possible security breaches in our code:

  • XSS when using text/template
  • CSRF Protection using Gorilla
  • Use a Go version without known vulnerabilities
  • Don't leak secret tokens
  • SQL injections

Remember to run SAST and Dependency Scanning (ULTIMATE) on your project (or at least the gosec analyzer), and to follow our Security requirements.

Web servers can take advantages of middlewares like Secure.

Finding a reviewer

Many of our projects are too small to have full-time maintainers. That's why we have a shared pool of Go reviewers at GitLab. To find a reviewer, use the "Go" section of the "GitLab" project on the Engineering Projects page in the handbook.

To add yourself to this list, add the following to your profile in the team.yml file and ask your manager to review and merge.

projects:
  gitlab: reviewer go

Code style and format

  • Avoid global variables, even in packages. By doing so you introduce side effects if the package is included multiple times.

  • Use goimports before committing. goimports is a tool that automatically formats Go source code using Gofmt, in addition to formatting import lines, adding missing ones and removing unreferenced ones.

    Most editors/IDEs allow you to run commands before/after saving a file, you can set it up to run goimports so that it's applied to every file when saving.

  • Place private methods below the first caller method in the source file.

Automatic linting

All Go projects should include these GitLab CI/CD jobs:

lint:
  image: registry.gitlab.com/gitlab-org/gitlab-build-images:golangci-lint-alpine
  stage: test
  script:
    # Use default .golangci.yml file from the image if one is not present in the project root.
    - '[ -e .golangci.yml ] || cp /golangci/.golangci.yml .'
    # Write the code coverage report to gl-code-quality-report.json
    # and print linting issues to stdout in the format: path/to/file:line description
    # remove `--issues-exit-code 0` or set to non-zero to fail the job if linting issues are detected
    - golangci-lint run --issues-exit-code 0 --out-format code-climate | tee gl-code-quality-report.json | jq -r '.[] | "\(.location.path):\(.location.lines.begin) \(.description)"'
  artifacts:
    reports:
      codequality: gl-code-quality-report.json
    paths:
      - gl-code-quality-report.json

Including a .golangci.yml in the root directory of the project allows for configuration of golangci-lint. All options for golangci-lint are listed in this example.

Once recursive includes become available, you can share job templates like this analyzer.

Go GitLab linter plugins are maintained in the gitlab-org/language-tools/go/linters namespace.

Dependencies

Dependencies should be kept to the minimum. The introduction of a new dependency should be argued in the merge request, as per our Approval Guidelines. Both License Scanning (ULTIMATE) and Dependency Scanning (ULTIMATE) should be activated on all projects to ensure new dependencies security status and license compatibility.

Modules

In Go 1.11 and later, a standard dependency system is available behind the name Go Modules. It provides a way to define and lock dependencies for reproducible builds. It should be used whenever possible.

When Go Modules are in use, there should not be a vendor/ directory. Instead, Go automatically downloads dependencies when they are needed to build the project. This is in line with how dependencies are handled with Bundler in Ruby projects, and makes merge requests easier to review.

In some cases, such as building a Go project for it to act as a dependency of a CI run for another project, removing the vendor/ directory means the code must be downloaded repeatedly, which can lead to intermittent problems due to rate limiting or network failures. In these circumstances, you should cache the downloaded code between.

There was a bug on modules checksums in Go versions earlier than v1.11.4, so make sure to use at least this version to avoid checksum mismatch errors.

ORM

We don't use object-relational mapping libraries (ORMs) at GitLab (except ActiveRecord in Ruby on Rails). Projects can be structured with services to avoid them. pgx should be enough to interact with PostgreSQL databases.

Migrations

In the rare event of managing a hosted database, it's necessary to use a migration system like ActiveRecord is providing. A simple library like Journey, designed to be used in postgres containers, can be deployed as long-running pods. New versions deploy a new pod, migrating the data automatically.

Testing

Testing frameworks

We should not use any specific library or framework for testing, as the standard library provides already everything to get started. If there is a need for more sophisticated testing tools, the following external dependencies might be worth considering in case we decide to use a specific library or framework:

Subtests

Use subtests whenever possible to improve code readability and test output.

Better output in tests

When comparing expected and actual values in tests, use testify/require.Equal, testify/require.EqualError, testify/require.EqualValues, and others to improve readability when comparing structs, errors, large portions of text, or JSON documents:

type TestData struct {
    // ...
}

func FuncUnderTest() TestData {
    // ...
}

func Test(t *testing.T) {
    t.Run("FuncUnderTest", func(t *testing.T) {
        want := TestData{}
        got := FuncUnderTest()

        require.Equal(t, want, got) // note that expected value comes first, then comes the actual one ("diff" semantics)
    })
}

Table-Driven Tests

Using Table-Driven Tests is generally good practice when you have multiple entries of inputs/outputs for the same function. Below are some guidelines one can follow when writing table-driven test. These guidelines are mostly extracted from Go standard library source code. Keep in mind it's OK not to follow these guidelines when it makes sense.

Defining test cases

Each table entry is a complete test case with inputs and expected results, and sometimes with additional information such as a test name to make the test output easily readable.

Contents of the test case

  • Ideally, each test case should have a field with a unique identifier to use for naming subtests. In the Go standard library, this is commonly the name string field.
  • Use want/expect/actual when you are specifying something in the test case that is used for assertion.

Variable names

  • Each table-driven test map/slice of struct can be named tests.
  • When looping through tests the anonymous struct can be referred to as tt or tc.
  • The description of the test can be referred to as name/testName/tn.

Benchmarks

Programs handling a lot of IO or complex operations should always include benchmarks, to ensure performance consistency over time.

Error handling

Adding context

Adding context before you return the error can be helpful, instead of just returning the error. This allows developers to understand what the program was trying to do when it entered the error state making it much easier to debug.

For example:

// Wrap the error
return nil, fmt.Errorf("get cache %s: %w", f.Name, err)

// Just add context
return nil, fmt.Errorf("saving cache %s: %v", f.Name, err)

A few things to keep in mind when adding context:

  • Decide if you want to expose the underlying error to the caller. If so, use %w, if not, you can use %v.
  • Don't use words like failed, error, didn't. As it's an error, the user already knows that something failed and this might lead to having strings like failed xx failed xx failed xx. Explain what failed instead.
  • Error strings should not be capitalized or end with punctuation or a newline. You can use golint to check for this.

Naming

  • When using sentinel errors they should always be named like ErrXxx.
  • When creating a new error type they should always be named like XxxError.

Checking Error types

  • To check error equality don't use ==. Use errors.Is instead (for Go versions >= 1.13).
  • To check if the error is of a certain type don't use type assertion, use errors.As instead (for Go versions >= 1.13).

References for working with errors

CLIs

Every Go program is launched from the command line. cli is a convenient package to create command line apps. It should be used whether the project is a daemon or a simple CLI tool. Flags can be mapped to environment variables directly, which documents and centralizes at the same time all the possible command line interactions with the program. Don't use os.GetEnv, it hides variables deep in the code.

Daemons

Logging

The usage of a logging library is strongly recommended for daemons. Even though there is a log package in the standard library, we generally use Logrus. Its plugin ("hooks") system makes it a powerful logging library, with the ability to add notifiers and formatters at the logger level directly.

Structured (JSON) logging

Every binary ideally must have structured (JSON) logging in place as it helps with searching and filtering the logs. At GitLab we use structured logging in JSON format, as all our infrastructure assumes that. When using Logrus you can turn on structured logging simply by using the build in JSON formatter. This follows the same logging type we use in our Ruby applications.

How to use Logrus

There are a few guidelines one should follow when using the Logrus package:

  • When printing an error use WithError. For example, logrus.WithError(err).Error("Failed to do something").
  • Since we use structured logging we can log fields in the context of that code path, such as the URI of the request using WithField or WithFields. For example, logrus.WithField("file", "/app/go").Info("Opening dir"). If you have to log multiple keys, always use WithFields instead of calling WithField more than once.

Tracing and Correlation

LabKit is a place to keep common libraries for Go services. Currently it's vendored into two projects: Workhorse and Gitaly, and it exports two main (but related) pieces of functionality:

This gives us a thin abstraction over underlying implementations that is consistent across Workhorse, Gitaly, and, in future, other Go servers. For example, in the case of gitlab.com/gitlab-org/labkit/tracing we can switch from using Opentracing directly to using Zipkin or Gokit's own tracing wrapper without changes to the application code, while still keeping the same consistent configuration mechanism (that is, the GITLAB_TRACING environment variable).

Context

Since daemons are long-running applications, they should have mechanisms to manage cancellations, and avoid unnecessary resources consumption (which could lead to DDoS vulnerabilities). Go Context should be used in functions that can block and passed as the first parameter.

Dockerfiles

Every project should have a Dockerfile at the root of their repository, to build and run the project. Since Go program are static binaries, they should not require any external dependency, and shells in the final image are useless. We encourage Multistage builds:

  • They let the user build the project with the right Go version and dependencies.
  • They generate a small, self-contained image, derived from Scratch.

Generated Docker images should have the program at their Entrypoint to create portable commands. That way, anyone can run the image, and without parameters it displays its help message (if cli has been used).

Secure Team standards and style guidelines

The following are some style guidelines that are specific to the Secure Team.

Code style and format

Use goimports -local gitlab.com/gitlab-org before committing. goimports is a tool that automatically formats Go source code using Gofmt, in addition to formatting import lines, adding missing ones and removing unreferenced ones. By using the -local gitlab.com/gitlab-org option, goimports groups locally referenced packages separately from external ones. See the imports section of the Code Review Comments page on the Go wiki for more details. Most editors/IDEs allow you to run commands before/after saving a file, you can set it up to run goimports -local gitlab.com/gitlab-org so that it's applied to every file when saving.

Initializing slices

If initializing a slice, provide a capacity where possible to avoid extra allocations.

Do Don't
s2 := make([]string, 0, size)
for _, val := range s1 {
    s2 = append(s2, val)
}
var s2 []string
for _, val := range s1 {
    s2 = append(s2, val)
}

If no capacity is passed to make when creating a new slice, append will continuously resize the slice's backing array if it cannot hold the values. Providing the capacity ensures that allocations are kept to a minimum. It is recommended that the prealloc golanci-lint rule automatically check for this.

Analyzer Tests

The conventional Secure analyzer has a convert function that converts SAST/DAST scanner reports into GitLab Security Reports. When writing tests for the convert function, we should make use of test fixtures using a testdata directory at the root of the analyzer's repository. The testdata directory should contain two subdirectories: expect and reports. The reports directory should contain sample SAST/DAST scanner reports which are passed into the convert function during the test setup. The expect directory should contain the expected GitLab Security Report that the convert returns. See Secret Detection for an example.

If the scanner report is small, less than 35 lines, then feel free to inline the report rather than use a testdata directory.

Test Diffs

The go-cmp package should be used when comparing large structs in tests. It makes it possible to output a specific diff where the two structs differ, rather than seeing the whole of both structs printed out in the test logs. Here is a small example:

package main

import (
  "reflect"
  "testing"

  "github.com/google/go-cmp/cmp"
)

type Foo struct {
  Desc  Bar
  Point Baz
}

type Bar struct {
  A string
  B string
}

type Baz struct {
  X int
  Y int
}

func TestHelloWorld(t *testing.T) {
  want := Foo{
    Desc:  Bar{A: "a", B: "b"},
    Point: Baz{X: 1, Y: 2},
  }

  got := Foo{
    Desc:  Bar{A: "a", B: "b"},
    Point: Baz{X: 2, Y: 2},
  }

  t.Log("reflect comparison:")
  if !reflect.DeepEqual(got, want) {
    t.Errorf("Wrong result. want:\n%v\nGot:\n%v", want, got)
  }

  t.Log("cmp comparison:")
  if diff := cmp.Diff(want, got); diff != "" {
    t.Errorf("Wrong result. (-want +got):\n%s", diff)
  }
}

The output demonstrates why go-cmp is far superior when comparing large structs. Even though you could spot the difference with this small difference, it quickly gets unwieldy as the data grows.

  main_test.go:36: reflect comparison:
  main_test.go:38: Wrong result. want:
      {{a b} {1 2}}
      Got:
      {{a b} {2 2}}
  main_test.go:41: cmp comparison:
  main_test.go:43: Wrong result. (-want +got):
        main.Foo{
              Desc: {A: "a", B: "b"},
              Point: main.Baz{
      -               X: 1,
      +               X: 2,
                      Y: 2,
              },
        }

Return to Development documentation.