info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
---
# Working with diffs
We rely on different sources to present diffs. These include:
- Gitaly service
- Database (through `merge_request_diff_files`)
- Redis (cached highlighted diffs)
## Deep Dive
<!-- vale gitlab.Spelling = NO -->
In January 2019, Oswaldo Ferreira hosted a Deep Dive (GitLab team members only:
Everything covered in this deep dive was accurate as of GitLab 11.7, and while specific details may
have changed since then, it should still serve as a good introduction.
## Architecture overview
### Merge request diffs
When refreshing a merge request (pushing to a source branch, force-pushing to target branch, or if the target branch now contains any commits from the MR)
we fetch the comparison information using `Gitlab::Git::Compare`, which fetches `base` and `head` data using Gitaly and diff between them through
`Gitlab::Git::Diff.between`.
The diffs fetching process _limits_ single file diff sizes and the overall size of the whole diff through a series of constant values. Raw diff files are
then persisted on `merge_request_diff_files` table.
Even though diffs larger than 10% of the value of `ApplicationSettings#diff_max_patch_bytes` are collapsed,
we still keep them on PostgreSQL. However, diff files larger than defined _safety limits_
(see the [Diff limits section](#diff-limits)) are _not_ persisted in the database.
In order to present diffs information on the merge request diffs page, we:
1. Fetch all diff files from database `merge_request_diff_files`
1. Fetch the _old_ and _new_ file blobs in batch to:
- Highlight old and new file content
- Know which viewer it should use for each file (text, image, deleted, etc)
- Know if the file content changed
- Know if it was stored externally
- Know if it had storage errors
1. If the diff file is cacheable (text-based), it's cached on Redis
using `Gitlab::Diff::FileCollection::MergeRequestDiff`
### Note diffs
When commenting on a diff (any comparison), we persist a truncated diff version
on `NoteDiffFile` (which is associated with the actual `DiffNote`). So instead
of hitting the repository every time we need the diff of the file, we:
1. Check whether we have the `NoteDiffFile#diff` persisted and use it
1. Otherwise, if it's a current MR revision, use the persisted
`MergeRequestDiffFile#diff`
1. In the last scenario, go the repository and fetch the diff
## Diff limits
As explained above, we limit single diff files and the size of the whole diff. There are scenarios where we collapse the diff file,
and cases where the diff file is not presented at all, and the user is guided to the Blob view.
### Diff collection limits
Limits that act onto all diff files collection. Files number, lines number and files size are considered.