File Storage in GitLab
We use the CarrierWave gem to handle file upload, storage, and retrieval.
File uploading is used in many places, grouped here by context:
- System
  - Instance Logo (logo visible in sign in/sign up pages)
  - Header Logo (one displayed in the navigation bar)
- Group
  - Group avatars
- User
  - User avatars
  - User snippet attachments
- Project
  - Project avatars
  - Issues/MR/Notes Markdown attachments
  - Issues/MR/Notes Legacy Markdown attachments
  - CI Artifacts (archive, metadata, trace)
  - LFS Objects
  - Merge request diffs
Disk storage
GitLab started out saving everything on local disk. Although directory locations have changed from previous versions, they are still not 100% standardized. The current locations are listed below:
Description | In DB? | Relative path (from CarrierWave.root) | Uploader class | model_type |
---|---|---|---|---|
Instance logo | yes | uploads/-/system/appearance/logo/:id/:filename | AttachmentUploader | Appearance |
Header logo | yes | uploads/-/system/appearance/header_logo/:id/:filename | AttachmentUploader | Appearance |
Group avatars | yes | uploads/-/system/group/avatar/:id/:filename | AvatarUploader | Group |
User avatars | yes | uploads/-/system/user/avatar/:id/:filename | AvatarUploader | User |
User snippet attachments | yes | uploads/-/system/personal_snippet/:id/:random_hex/:filename | PersonalFileUploader | Snippet |
Project avatars | yes | uploads/-/system/project/avatar/:id/:filename | AvatarUploader | Project |
Issues/MR/Notes Markdown attachments | yes | uploads/:project_path_with_namespace/:random_hex/:filename | FileUploader | Project |
Issues/MR/Notes Legacy Markdown attachments | no | uploads/-/system/note/attachment/:id/:filename | AttachmentUploader | Note |
CI Artifacts (CE) | yes | shared/artifacts/:disk_hash[0..1]/:disk_hash[2..3]/:disk_hash/:year_:month_:date/:job_id/:job_artifact_id (:disk_hash is SHA256 digest of project_id) | JobArtifactUploader | Ci::JobArtifact |
LFS Objects (CE) | yes | shared/lfs-objects/:hex/:hex/:object_hash | LfsObjectUploader | LfsObject |
External merge request diffs | yes | shared/external-diffs/merge_request_diffs/mr-:parent_id/diff-:id | ExternalDiffUploader | MergeRequestDiff |
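The CI Artifacts path in the table can be sketched in plain Ruby to show how the :disk_hash segments are derived. This is only an illustration, not GitLab's implementation: the helper name `artifact_relative_path` and its keyword arguments are hypothetical, and the zero-padded date format is an assumption about how :year_:month_:date is rendered.

```ruby
require "digest"
require "date"

# Hypothetical helper (not part of GitLab): builds the relative CI artifact
# path following the layout listed in the table.
def artifact_relative_path(project_id:, job_id:, job_artifact_id:, created_on:)
  # :disk_hash is the SHA256 hex digest of the project ID.
  disk_hash = Digest::SHA256.hexdigest(project_id.to_s)

  File.join(
    "shared/artifacts",
    disk_hash[0..1],                  # first two hex characters
    disk_hash[2..3],                  # next two hex characters
    disk_hash,                        # full digest
    created_on.strftime("%Y_%m_%d"),  # assumed zero-padded :year_:month_:date
    job_id.to_s,
    job_artifact_id.to_s
  )
end

puts artifact_relative_path(
  project_id: 1, job_id: 42, job_artifact_id: 7, created_on: Date.new(2018, 1, 15)
)
```

Because the digest depends only on the project ID, the path does not change when a project is renamed or moved to another namespace.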
CI Artifacts and LFS Objects behave differently in CE and EE. In CE they inherit from GitlabUploader, while in EE they inherit from ObjectStorage and store files in an S3 API-compatible object store.
Issues/MR/Notes Markdown attachments take a different approach when the Hashed Storage layout is used: instead of basing the path on the mutable variable :project_path_with_namespace, the path can use the hash of the project ID, provided the project has migrated to the new approach (introduced in 10.2).
Note: We provide an all-in-one rake task to migrate all uploads to object storage in one go. If a new Uploader class or model type is introduced, make sure you add a rake task invocation corresponding to it to the category list.
Path segments
Files are stored at multiple locations and use different path schemes.
All the GitlabUploader derived classes should comply with this path segment schema:
| GitlabUploader
| ----------------------- + ------------------------- + --------------------------------- + -------------------------------- |
| `<gitlab_root>/public/` | `uploads/-/system/`       | `user/avatar/:id/`                | `:filename`                      |
| ----------------------- + ------------------------- + --------------------------------- + -------------------------------- |
| `CarrierWave.root`      | `GitlabUploader.base_dir` | `GitlabUploader#dynamic_segment`  | `CarrierWave::Uploader#filename` |
|                         | `CarrierWave::Uploader#store_dir`                             |                                  |
| FileUploader
| ----------------------- + ------------------------- + --------------------------------- + -------------------------------- |
| `<gitlab_root>/shared/` | `artifacts/`              | `:year_:month/:id`                | `:filename`                      |
| `<gitlab_root>/shared/` | `snippets/`               | `:secret/`                        | `:filename`                      |
| ----------------------- + ------------------------- + --------------------------------- + -------------------------------- |
| `CarrierWave.root`      | `GitlabUploader.base_dir` | `GitlabUploader#dynamic_segment`  | `CarrierWave::Uploader#filename` |
|                         | `CarrierWave::Uploader#store_dir`                             |                                  |
|                         |                           | `FileUploader#upload_path`                                           |
| ObjectStorage::Concern (store = remote)
| ----------------------- + ------------------------- + ----------------------------------- + -------------------------------- |
| `<bucket_name>`         | <ignored>                 | `user/avatar/:id/`                  | `:filename`                      |
| ----------------------- + ------------------------- + ----------------------------------- + -------------------------------- |
| `#fog_dir`              | `GitlabUploader.base_dir` | `GitlabUploader#dynamic_segment`    | `CarrierWave::Uploader#filename` |
|                         |                           | `ObjectStorage::Concern#store_dir`  |                                  |
|                         |                           | `ObjectStorage::Concern#upload_path`                                   |
The RecordsUploads::Concern concern will create an Upload entry for every file stored by a GitlabUploader, persisting the dynamic parts of the path using GitlabUploader#dynamic_path. You may then use the Upload#build_uploader method to manipulate the file.
Object Storage
Including ObjectStorage::Concern in a GitlabUploader derived class lets you enable object storage for that uploader. To enable it, you need to either 1) include RecordsUploads::Concern and prepend ObjectStorage::Extension::RecordsUploads, or 2) mount the uploader and create a new field named <mount>_store.
The CarrierWave::Uploader#store_dir is overridden to:
- GitlabUploader.base_dir + GitlabUploader.dynamic_segment when the store is LOCAL
- GitlabUploader.dynamic_segment when the store is REMOTE (the bucket name is used to namespace)
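The store-dependent override described above can be sketched with a toy class in plain Ruby. `SketchStore` and `SketchAvatarUploader` are stand-ins invented for this illustration and do not inherit from CarrierWave or GitlabUploader; the segment values are taken from the user avatar row of the path segment schema.

```ruby
# Illustrative store identifiers; the symbols are assumptions for this sketch.
module SketchStore
  LOCAL = :local
  REMOTE = :remote
end

# Toy stand-in for a GitlabUploader derived class.
class SketchAvatarUploader
  attr_reader :store, :model_id, :filename

  def initialize(store:, model_id:, filename:)
    @store = store
    @model_id = model_id
    @filename = filename
  end

  # Fixed prefix, only meaningful for local storage.
  def self.base_dir
    "uploads/-/system"
  end

  # The part of the path that depends on the record.
  def dynamic_segment
    File.join("user", "avatar", model_id.to_s)
  end

  # LOCAL: base_dir + dynamic_segment. REMOTE: dynamic_segment only,
  # because the bucket already namespaces the objects.
  def store_dir
    case store
    when SketchStore::LOCAL then File.join(self.class.base_dir, dynamic_segment)
    when SketchStore::REMOTE then dynamic_segment
    else raise ArgumentError, "unknown store: #{store.inspect}"
    end
  end

  def upload_path
    File.join(store_dir, filename)
  end
end

local = SketchAvatarUploader.new(store: SketchStore::LOCAL, model_id: 5, filename: "me.png")
remote = SketchAvatarUploader.new(store: SketchStore::REMOTE, model_id: 5, filename: "me.png")

puts local.upload_path   # uploads/-/system/user/avatar/5/me.png
puts remote.upload_path  # user/avatar/5/me.png
```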
Using ObjectStorage::Extension::RecordsUploads
Note: this concern will automatically include RecordsUploads::Concern if not already included.
The ObjectStorage::Concern uploader will search for the matching Upload to select the correct object store. The Upload is mapped using #store_dirs + identifier for each store (LOCAL/REMOTE).
class SongUploader < GitlabUploader
  include RecordsUploads::Concern
  include ObjectStorage::Concern
  prepend ObjectStorage::Extension::RecordsUploads
  # ...
end
class Thing < ActiveRecord::Base
  # CarrierWave's macro for attaching an uploader to a model attribute
  mount_uploader :theme, SongUploader # we have a great theme song!
  # ...
end
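The lookup described in this section, matching an Upload against the store directory plus identifier of each store, can be sketched without Rails. `SketchUpload` and `resolve_store` are hypothetical names for this illustration, the symbols stand in for the LOCAL/REMOTE store values, and the real concern queries the database rather than an in-memory array.

```ruby
# Illustrative stand-in for the Upload model: only the persisted path and store.
SketchUpload = Struct.new(:path, :store)

# Returns the store of the Upload whose path matches one of the candidate
# paths (store dir + identifier, one per store), or nil when none matches.
def resolve_store(uploads, store_dirs, identifier)
  candidate_paths = store_dirs.values.map { |dir| File.join(dir, identifier) }
  match = uploads.find { |upload| candidate_paths.include?(upload.path) }
  match&.store
end

store_dirs = {
  local: "uploads/-/system/user/avatar/5",
  remote: "user/avatar/5"
}
uploads = [
  SketchUpload.new("user/avatar/5/me.png", :remote),
  SketchUpload.new("uploads/-/system/user/avatar/9/other.png", :local)
]

p resolve_store(uploads, store_dirs, "me.png")      # :remote
p resolve_store(uploads, store_dirs, "missing.png") # nil
```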
Using a mounted uploader
The ObjectStorage::Concern will query the model.<mount>_store attribute to select the correct object store.
This column must be present in the model schema.
class SongUploader < GitlabUploader
  include ObjectStorage::Concern
  # ...
end
class Thing < ActiveRecord::Base
  attr_reader :theme_store # this is an ActiveRecord attribute
  # CarrierWave's macro for attaching an uploader to a model attribute
  mount_uploader :theme, SongUploader # we have a great theme song!
  # Fall back to local storage when the column is NULL
  def theme_store
    super || ObjectStorage::Store::LOCAL
  end
  # ...
end
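As a rough illustration of the uploader side of this arrangement, the sketch below reads the `<mount>_store` attribute by name and falls back to local storage when it is unset. `MountSketchStore`, `SketchThing`, and `store_for` are hypothetical names used only here; a plain Struct replaces the ActiveRecord model, and the symbols stand in for the real store constants.

```ruby
# Illustrative store identifiers; the symbols are assumptions for this sketch.
module MountSketchStore
  LOCAL = :local
  REMOTE = :remote
end

# Stand-in for a model with a theme_store column.
SketchThing = Struct.new(:theme_store)

# Reads "<mount>_store" from the model and defaults to LOCAL when it is nil.
def store_for(model, mount)
  model.public_send(:"#{mount}_store") || MountSketchStore::LOCAL
end

p store_for(SketchThing.new(MountSketchStore::REMOTE), :theme) # :remote
p store_for(SketchThing.new(nil), :theme)                      # :local
```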