Creating a new buffered reader for every part of the blame can miss
lines, as it will read and buffer bytes that the next buffered reader
will not get.
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
During the refactoring of the git module, I found there were some
strange operations. This PR tries to fix 2 of them
1. The empty argument `--` in repo_attribute.go, which was introduced by
#16773. It seems unnecessary because nothing else would be added later.
2. The complex git service logic in repo/http.go.
* Before: the `hasAccess` only allow `service == "upload-pack" ||
service == "receive-pack"`
* After: unrelated code is removed. No need to call ToTrustedCmdArgs
anymore.
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
This PR follows #21535 (and replace #22592)
## Review without space diff
https://github.com/go-gitea/gitea/pull/22678/files?diff=split&w=1
## Purpose of this PR
1. Make git module command completely safe (risky user inputs won't be
passed as argument option anymore)
2. Avoid low-level mistakes like
https://github.com/go-gitea/gitea/pull/22098#discussion_r1045234918
3. Remove deprecated and dirty `CmdArgCheck` function, hide the `CmdArg`
type
4. Simplify code when using git command
## The main idea of this PR
* Move the `git.CmdArg` to the `internal` package, then no other package
except `git` could use it. Then developers could never do
`AddArguments(git.CmdArg(userInput))` any more.
* Introduce `git.ToTrustedCmdArgs`, it's for user-provided and already
trusted arguments. It's only used in a few cases, for example: use git
arguments from config file, help unit test with some arguments.
* Introduce `AddOptionValues` and `AddOptionFormat`, they make code more
clear and simple:
* Before: `AddArguments("-m").AddDynamicArguments(message)`
* After: `AddOptionValues("-m", message)`
* -
* Before: `AddArguments(git.CmdArg(fmt.Sprintf("--author='%s <%s>'",
sig.Name, sig.Email)))`
* After: `AddOptionFormat("--author='%s <%s>'", sig.Name, sig.Email)`
## FAQ
### Why these changes were not done in #21535 ?
#21535 is mainly a search&replace, it did its best to not change too
much logic.
Making the framework better needs a lot of changes, so this separate PR
is needed as the second step.
### The naming of `AddOptionXxx`
According to git's manual, the `--xxx` part is called `option`.
### How can it guarantee that `internal.CmdArg` won't be not misused?
Go's specification guarantees that. Trying to access other package's
internal package causes compilation error.
And, `golangci-lint` also denies the git/internal package. Only the
`git/command.go` can use it carefully.
### There is still a `ToTrustedCmdArgs`, will it still allow developers
to make mistakes and pass untrusted arguments?
Generally speaking, no. Because when using `ToTrustedCmdArgs`, the code
will be very complex (see the changes for examples). Then developers and
reviewers can know that something might be unreasonable.
### Why there was a `CmdArgCheck` and why it's removed?
At the moment of #21535, to reduce unnecessary changes, `CmdArgCheck`
was introduced as a hacky patch. Now, almost all code could be written
as `cmd := NewCommand(); cmd.AddXxx(...)`, then there is no need for
`CmdArgCheck` anymore.
### Why many codes for `signArg == ""` is deleted?
Because in the old code, `signArg` could never be empty string, it's
either `-S[key-id]` or `--no-gpg-sign`. So the `signArg == ""` is just
dead code.
---------
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
- Remove code that isn't being used.
Found this is my stash from a few weeks ago, not sure how I found this
in the first place.
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Provide a new type to make it easier to parse a ref name.
Actually, it's picked up from #21937, to make the origin PR lighter.
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Change all license headers to comply with REUSE specification.
Fix #16132
Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
Although git does expect that author names should be of the form: `NAME
<EMAIL>` some users have been able to create commits with: `<EMAIL>`
Fix #21900
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: delvh <dev.lh@web.de>
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Fix #20456
At some point during the 1.17 cycle abbreviated refishs to issue
branches started breaking. This is likely due serious inconsistencies in
our management of refs throughout Gitea - which is a bug needing to be
addressed in a different PR. (Likely more than one)
We should try to use non-abbreviated `fullref`s as much as possible.
That is where a user has inputted a abbreviated `refish` we should add
`refs/heads/` if it is `branch` etc. I know people keep writing and
merging PRs that remove prefixes from stored content but it is just
wrong and it keeps causing problems like this. We should only remove the
prefix at the time of
presentation as the prefix is the only way of knowing umambiguously and
permanently if the `ref` is referring to a `branch`, `tag` or `commit` /
`SHA`. We need to make it so that every ref has the appropriate prefix,
and probably also need to come up with some definitely unambiguous way
of storing `SHA`s if they're used in a `ref` or `refish` field. We must
not store a potentially
ambiguous `refish` as a `ref`. (Especially when referring a `tag` -
there is no reason why users cannot create a `branch` with the same
short name as a `tag` and vice versa and any attempt to prevent this
will fail. You can even create a `branch` and a
`tag` that matches the `SHA` pattern.)
To that end in order to fix this bug, when parsing issue templates check
the provided `Ref` (here a `refish` because almost all users do not know
or understand the subtly), if it does not start with `refs/` add the
`BranchPrefix` to it. This allows people to make their templates refer
to a `tag` but not to a `SHA` directly. (I don't think that is
particularly unreasonable but if people disagree I can make the `refish`
be checked to see if it matches the `SHA` pattern.)
Next we need to handle the issue links that are already written. The
links here are created with `git.RefURL`
Here we see there is a bug introduced in #17551 whereby the provided
`ref` argument can be double-escaped so we remove the incorrect external
escape. (The escape added in #17551 is in the right place -
unfortunately I missed that the calling function was doing the wrong
thing.)
Then within `RefURL()` we check if an unprefixed `ref` (therefore
potentially a `refish`) matches the `SHA` pattern before assuming that
is actually a `commit` - otherwise is assumed to be a `branch`. This
will handle most of the problem cases excepting the very unusual cases
where someone has deliberately written a `branch` to look like a `SHA1`.
But please if something is called a `ref` or interpreted as a `ref` make
it a full-ref before storing or using it. By all means if something is a
`branch` assume the prefix is removed but always add it back in if you
are using it as a `ref`. Stop storing abbreviated `branch` names and
`tag` names - which are `refish` as a `ref`. It will keep on causing
problems like this.
Fix #20456
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
The doctor check `storages` currently only checks the attachment
storage. This PR adds some basic garbage collection functionality for
the other types of storage.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
There was a bug introduced in #21352 due to a change of behaviour caused
by #19280. This causes a panic on running the default doctor checks
because the panic introduced by #19280 assumes that the only way
opts.StdOut and opts.Stderr can be set in RunOpts is deliberately.
Unfortunately, when running a git.Command the provided RunOpts can be
set, therefore if you share a common set of RunOpts these two values can
be set by the previous commands.
This PR stops using common RunOpts for the commands in that doctor check
but secondly stops RunCommand variants from changing the provided
RunOpts.
Signed-off-by: Andrew Thornton <art27@cantab.net>
A lot of our code is repeatedly testing if individual errors are
specific types of Not Exist errors. This is repetitative and unnecesary.
`Unwrap() error` provides a common way of labelling an error as a
NotExist error and we can/should use this.
This PR has chosen to use the common `io/fs` errors e.g.
`fs.ErrNotExist` for our errors. This is in some ways not completely
correct as these are not filesystem errors but it seems like a
reasonable thing to do and would allow us to simplify a lot of our code
to `errors.Is(err, fs.ErrNotExist)` instead of
`package.IsErr...NotExist(err)`
I am open to suggestions to use a different base error - perhaps
`models/db.ErrNotExist` if that would be felt to be better.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: delvh <dev.lh@web.de>
After some discussion, introduce a new slice `brokenArgs` to make
`gitCmd.Run()` return errors if any dynamic argument is invalid.
Co-authored-by: delvh <dev.lh@web.de>
We should only log CheckPath errors if they are not simply due to
context cancellation - and we should add a little more context to the
error message.
Fix #20709
Signed-off-by: Andrew Thornton <art27@cantab.net>
Close #20315 (fix the panic when parsing invalid input), Speed up #20231 (use ls-tree without size field)
Introduce ListEntriesRecursiveFast (ls-tree without size) and ListEntriesRecursiveWithSize (ls-tree with size)
Using `append(args, strings.Fields(arg)...)` is dangerous, it may
generate incorrect results.
For example: `arg1 "the dangerous"` will be splitted to 3 arguments:
`arg1`, `"the`, `dangerous"`. In some cases the incorrect arguments may
lead to security problems.
This fixes #5709 and #17316 by changing the order of listed branches
and tags to show the ones with latest commits atop.
It's achieved with changing underlying "show-ref" git command with
"for-each-ref" as suggested in https://stackoverflow.com/a/5188364
Also, it's passing format string so the output matches "show-ref"
command output.
close #5709
close #17316
A testing cleanup.
This pull request replaces `os.MkdirTemp` with `t.TempDir`. We can use the `T.TempDir` function from the `testing` package to create temporary directory. The directory created by `T.TempDir` is automatically removed when the test and all its subtests complete.
This saves us at least 2 lines (error check, and cleanup) on every instance, or in some cases adds cleanup that we forgot.
Reference: https://pkg.go.dev/testing#T.TempDir
```go
func TestFoo(t *testing.T) {
// before
tmpDir, err := os.MkdirTemp("", "")
require.NoError(t, err)
defer os.RemoveAll(tmpDir)
// now
tmpDir := t.TempDir()
}
```
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
When setting.Git.DisablePartialClone is set to false then the web server will add filter support to web http. It does this by using`-c` command arguments but this will not work on gitea serv as the upload-pack and receive-pack commands do not support this.
Instead we move these options into the .gitconfig instead.
Fix #20400
Signed-off-by: Andrew Thornton <art27@cantab.net>
When migrating add several more important sanity checks:
* SHAs must be SHAs
* Refs must be valid Refs
* URLs must be reasonable
Signed-off-by: Andrew Thornton <art27@cantab.net>
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: techknowlogick <matti@mdranta.net>
* Set no-tags in git fetch on compare
In the compare endpoint the git fetch is restricted to a certain branch however,
this does not completely prevent tag acquisition/pollution as git fetch will collect
any tags on that branch.
This causes pollution of the tag namespace and could cause confusion by users.
This PR adds `--no-tags` to the `git fetch` call.
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Update modules/git/repo_compare.go
* Update modules/git/repo_compare.go
Signed-off-by: Andrew Thornton <art27@cantab.net>
The use of `--follow` makes getting these commits very slow on large repositories
as it results in searching the whole commit tree for a blob.
Now as nice as the results of `--follow` are, I am uncertain whether it is really
of sufficient importance to keep around.
Fix #20764
Signed-off-by: Andrew Thornton <art27@cantab.net>
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
* merge `CheckLFSVersion` into `InitFull` (renamed from `InitWithSyncOnce`)
* remove the `Once` during git init, no data-race now
* for doctor sub-commands, `InitFull` should only be called in initialization stage
Co-authored-by: zeripath <art27@cantab.net>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
This enables git.Command's Run to optionally use the given context directly so its deadline will be respected. Otherwise, it falls back to the previous behavior of using the supplied timeout or a default timeout value of 360 seconds.
repo's serviceRPC() calls now use the context's deadline (which is unset/unlimited) instead of the default 6-minute timeout. This means that large repo clones will no longer arbitrarily time out on the upload-pack step, and pushes can take longer than 6 minutes on the receive-pack step.
Fixes #20680
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
* Add latest commit's SHA to content response
- When requesting the contents of a filepath, add the latest commit's
SHA to the requested file.
- Resolves #12840
* Add swagger
* Fix NPE
* Fix tests
* Hook into LastCommitCache
* Move AddLastCommitCache to a common nogogit and gogit file
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Prevent NPE
Co-authored-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
The LastCommitCache code is a little complex and there is unnecessary
duplication between the gogit and nogogit variants.
This PR adds the LastCommitCache as a field to the git.Repository and
pre-creates it in the ReferencesGit helpers etc. There has been some
simplification and unification of the variant code.
Signed-off-by: Andrew Thornton <art27@cantab.net>
When viewing a subdirectory and the latest commit to that directory in
the table, the commit status icon incorrectly showed the status of the
HEAD commit instead of the latest for that directory.
* Prevent context deadline error propagation in GetCommitsInfo
Although `WalkGitLog` tries to test for `context.DeadlineExceededErr`
there is a small chance that the error will propagate to the reader
before it is recognised. This will cause the error to propagate up to
`renderDirectoryFiles` and cause a http status 500.
Here we check that the error passed is a `DeadlineExceededErr` via error.Is
Fix #20329
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Add git.HOME_PATH
* add legacy file check
* Apply suggestions from code review
Co-authored-by: zeripath <art27@cantab.net>
* pass env GNUPGHOME to git command, move the existing .gitconfig to new home, make the fix for 1.17rc more clear.
* set git.HOME_PATH for docker images to default HOME
* Revert "set git.HOME_PATH for docker images to default HOME"
This reverts commit f120101ddc267cef74e4f4b92c783d5fc8e275a1.
* force Gitea to use a stable GNUPGHOME directory
* extra check to ensure only process dir or symlink for legacy files
* refactor variable name
* The legacy dir check (for 1.17-rc1) could be removed with 1.18 release, since users should have upgraded from 1.17-rc to 1.17-stable
* Update modules/git/git.go
Co-authored-by: Steven Kriegler <61625851+justusbunsi@users.noreply.github.com>
* remove initFixGitHome117rc
* Update git.go
* Update docs/content/doc/advanced/config-cheat-sheet.en-us.md
Co-authored-by: zeripath <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: Steven Kriegler <61625851+justusbunsi@users.noreply.github.com>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Add fetch.writeCommitGraph to gitconfig to ensure that a commit-graph will be written
on git fetch calls.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Allow git push to work when networked file systems with mixed
ownership are used with Gitea docker images >= 1.16.6 or Gitea
binaries running alongside git versions published after 04/2022.
There are circumstances independent of Gitea (networked file systems
with various permission systems) by which the git repositories managed
by Gitea may have mixed owners. It is not a behavior that Gitea have
control over nor is it a problem as long as the permissions for Gitea to
operate are correct. Gitea instances have been operating under these
conditions for a number of years.
It is detected as a potential security risk ( see
GHSA-vw2c-22j4-2fh2
) by the most recent git versions. However, Gitea always runs git
commands with a current directory matching the repository on
which it operates. That makes Gitea immune from this security problem
and it is safe to ignore the mixed owner permission check.
This gitconfig modification is done on a file dedicated to the user
exclusively used by Gitea.
Fixes: #19455
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
Co-authored-by: zeripath <art27@cantab.net>
* clean git support for ver < 2.0
* fine tune tests for markup (which requires git module)
* remove unnecessary comments
* try to fix tests
* try test again
* use const for GitVersionRequired instead of var
* try to fix integration test
* Refactor CheckAttributeReader to make a *git.Repository version
* update document for commit signing with Gitea's internal gitconfig
* update document for commit signing with Gitea's internal gitconfig
Co-authored-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
* Fix GetNote
* Only log errors if the error is not ErrNotExist
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Andrew Thornton <art27@cantab.net>
When Gitea is running as PID 1 git will occassionally orphan child processes leading
to (defunct) processes. This PR simply sets Setpgid to true on these child processes
meaning that these defunct processes will also be correctly reaped.
Fix #19077
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Fix indention
Signed-off-by: kolaente <k@knt.li>
* Add option to merge a pr right now without waiting for the checks to succeed
Signed-off-by: kolaente <k@knt.li>
* Fix lint
Signed-off-by: kolaente <k@knt.li>
* Add scheduled pr merge to tables used for testing
Signed-off-by: kolaente <k@knt.li>
* Add status param to make GetPullRequestByHeadBranch reusable
Signed-off-by: kolaente <k@knt.li>
* Move "Merge now" to a seperate button to make the ui clearer
Signed-off-by: kolaente <k@knt.li>
* Update models/scheduled_pull_request_merge.go
Co-authored-by: 赵智超 <1012112796@qq.com>
* Update web_src/js/index.js
Co-authored-by: 赵智超 <1012112796@qq.com>
* Update web_src/js/index.js
Co-authored-by: 赵智超 <1012112796@qq.com>
* Re-add migration after merge
* Fix frontend lint
* Fix version compare
* Add vendored dependencies
* Add basic tets
* Make sure the api route is capable of scheduling PRs for merging
* Fix comparing version
* make vendor
* adopt refactor
* apply suggestion: User -> Doer
* init var once
* Fix Test
* Update templates/repo/issue/view_content/comments.tmpl
* adopt
* nits
* next
* code format
* lint
* use same name schema; rm CreateUnScheduledPRToAutoMergeComment
* API: can not create schedule twice
* Add TestGetBranchNamesForSha
* nits
* new go routine for each pull to merge
* Update models/pull.go
Co-authored-by: a1012112796 <1012112796@qq.com>
* Update models/scheduled_pull_request_merge.go
Co-authored-by: a1012112796 <1012112796@qq.com>
* fix & add renaming sugestions
* Update services/automerge/pull_auto_merge.go
Co-authored-by: a1012112796 <1012112796@qq.com>
* fix conflict relicts
* apply latest refactors
* fix: migration after merge
* Update models/error.go
Co-authored-by: delvh <dev.lh@web.de>
* Update options/locale/locale_en-US.ini
Co-authored-by: delvh <dev.lh@web.de>
* Update options/locale/locale_en-US.ini
Co-authored-by: delvh <dev.lh@web.de>
* adapt latest refactors
* fix test
* use more context
* skip potential edgecases
* document func usage
* GetBranchNamesForSha() -> GetRefsBySha()
* start refactoring
* ajust to new changes
* nit
* docu nit
* the great check move
* move checks for branchprotection into own package
* resolve todo now ...
* move & rename
* unexport if posible
* fix
* check if merge is allowed before merge on scheduled pull
* debugg
* wording
* improve SetDefaults & nits
* NotAllowedToMerge -> DisallowedToMerge
* fix test
* merge files
* use package "errors"
* merge files
* add string names
* other implementation for gogit
* adapt refactor
* more context for models/pull.go
* GetUserRepoPermission use context
* more ctx
* use context for loading pull head/base-repo
* more ctx
* more ctx
* models.LoadIssueCtx()
* models.LoadIssueCtx()
* Handle pull_service.Merge in one DB transaction
* add TODOs
* next
* next
* next
* more ctx
* more ctx
* Start refactoring structure of old pull code ...
* move code into new packages
* shorter names ... and finish **restructure**
* Update models/branches.go
Co-authored-by: zeripath <art27@cantab.net>
* finish UpdateProtectBranch
* more and fix
* update datum
* template: use "svg" helper
* rename prQueue 2 prPatchCheckerQueue
* handle automerge in queue
* lock pull on git&db actions ...
* lock pull on git&db actions ...
* add TODO notes
* the regex
* transaction in tests
* GetRepositoryByIDCtx
* shorter table name and lint fix
* close transaction bevore notify
* Update models/pull.go
* next
* CheckPullMergable check all branch protections!
* Update routers/web/repo/pull.go
* CheckPullMergable check all branch protections!
* Revert "PullService lock via pullID (#19520)" (for now...)
This reverts commit 6cde7c9159a5ea75a10356feb7b8c7ad4c434a9a.
* Update services/pull/check.go
* Use for a repo action one database transaction
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: delvh <dev.lh@web.de>
* Update services/issue/status.go
Co-authored-by: delvh <dev.lh@web.de>
* Update services/issue/status.go
Co-authored-by: delvh <dev.lh@web.de>
* use db.WithTx()
* gofmt
* make pr.GetDefaultMergeMessage() context aware
* make MergePullRequestForm.SetDefaults context aware
* use db.WithTx()
* pull.SetMerged only with context
* fix deadlock in `test-sqlite\#TestAPIBranchProtection`
* dont forget templates
* db.WithTx allow to set the parentCtx
* handle db transaction in service packages but not router
* issue_service.ChangeStatus just had caused another deadlock :/
it has to do something with how notification package is handled
* if we merge a pull in one database transaktion, we get a lock, because merge infoce internal api that cant handle open db sessions to the same repo
* ajust to current master
* Apply suggestions from code review
Co-authored-by: delvh <dev.lh@web.de>
* dont open db transaction in router
* make generate-swagger
* one _success less
* wording nit
* rm
* adapt
* remove not needed test files
* rm less diff & use attr in JS
* ...
* Update services/repository/files/commit.go
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
* ajust db schema for PullAutoMerge
* skip broken pull refs
* more context in error messages
* remove webUI part for another pull
* remove more WebUI only parts
* API: add CancleAutoMergePR
* Apply suggestions from code review
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
* fix lint
* Apply suggestions from code review
* cancle -> cancel
Co-authored-by: delvh <dev.lh@web.de>
* change queue identifyer
* fix swagger
* prevent nil issue
* fix and dont drop error
* as per @zeripath
* Update integrations/git_test.go
Co-authored-by: delvh <dev.lh@web.de>
* Update integrations/git_test.go
Co-authored-by: delvh <dev.lh@web.de>
* more declarative integration tests (dedup code)
* use assert.False/True helper
Co-authored-by: 赵智超 <1012112796@qq.com>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: delvh <dev.lh@web.de>
Co-authored-by: zeripath <art27@cantab.net>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
There is a potential rare race possible whereby the c.running channel could
be closed twice. Looking at the code I do not see a need for this c.running
channel and therefore I think we can remove this. (I think the c.running
might have been some attempt to prevent a hang but the use of os.Pipes should
prevent that.)
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: 6543 <6543@obermui.de>
If an `os/exec.Command` is passed non `*os.File` as an input/output, go
will create `os.Pipe`s and wait for their closure in `cmd.Wait()`. If
the code following this is responsible for closing `io.Pipe`s or other
handlers then on process death from context cancellation the `Wait` can
hang.
There are two possible solutions:
1. use `os.Pipe` as the input/output as `cmd.Wait` does not wait for these.
2. create a goroutine waiting on the context cancellation that will close the inputs.
This PR provides the second option - which is a simpler change that can
be more easily backported.
Closes #19448
Signed-off-by: Andrew Thornton <art27@cantab.net>
Follows #19266, #8553, Close #18553, now there are only three `Run..(&RunOpts{})` functions.
* before: `stdout, err := RunInDir(path)`
* now: `stdout, _, err := RunStdString(&git.RunOpts{Dir:path})`
This addresses https://github.com/go-gitea/gitea/issues/18352
It aims to improve performance (and resource use) of the `SyncReleasesWithTags` operation for pull-mirrors.
For large repositories with many tags, `SyncReleasesWithTags` can be a costly operation (taking several minutes to complete). The reason is two-fold:
1. on sync, every upstream repo tag is compared (for changes) against existing local entries in the release table to ensure that they are up-to-date.
2. the procedure for getting _each tag_ involves a series of git operations
```bash
git show-ref --tags -- v8.2.4477
git cat-file -t 29ab6ce9f36660cffaad3c8789e71162e5db5d2f
git cat-file -p 29ab6ce9f36660cffaad3c8789e71162e5db5d2f
git rev-list --count 29ab6ce9f36660cffaad3c8789e71162e5db5d2f
```
of which the `git rev-list --count` can be particularly heavy.
This PR optimizes performance for pull-mirrors. We utilize the fact that a pull-mirror is always identical to its upstream and rebuild the entire release table on every sync and use a batch `git for-each-ref .. refs/tags` call to retrieve all tags in one go.
For large mirror repos, with hundreds of annotated tags, this brings down the duration of the sync operation from several minutes to a few seconds. A few unscientific examples run on my local machine:
- https://github.com/spring-projects/spring-boot (223 tags)
- before: `0m28,673s`
- after: `0m2,244s`
- https://github.com/kubernetes/kubernetes (890 tags)
- before: `8m00s`
- after: `0m8,520s`
- https://github.com/vim/vim (13954 tags)
- before: `14m20,383s`
- after: `0m35,467s`
I added a `foreachref` package which contains a flexible way of specifying which reference fields are of interest (`git-for-each-ref(1)`) and to produce a parser for the expected output. These could be reused in other places where `for-each-ref` is used. I'll add unit tests for those if the overall PR looks promising.
This follows
* https://github.com/go-gitea/gitea/issues/18553
Introduce `RunWithContextString` and `RunWithContextBytes` to help the refactoring. Add related unit tests. They keep the same behavior to save stderr into err.Error() as `RunInXxx` before.
Remove `RunInDirTimeoutPipeline` `RunInDirTimeoutFullPipeline` `RunInDirTimeout` `RunInDirTimeoutEnv` `RunInDirPipeline` `RunInDirFullPipeline` `RunTimeout`, `RunInDirTimeoutEnvPipeline`, `RunInDirTimeoutEnvFullPipeline`, `RunInDirTimeoutEnvFullPipelineFunc`.
Then remaining `RunInDir` `RunInDirBytes` `RunInDirWithEnv` can be easily refactored in next PR with a simple search & replace:
* before: `stdout, err := RunInDir(path)`
* next: `stdout, _, err := RunWithContextString(&git.RunContext{Dir:path})`
Other changes:
1. When `timeout <= 0`, use default. Because `timeout==0` is meaningless and could cause bugs. And now many functions becomes more simple, eg: `GitGcRepos` 9 lines to 1 line. `Fsck` 6 lines to 1 line.
2. Only set defaultCommandExecutionTimeout when the option `setting.Git.Timeout.Default > 0`
Strangely #19038 appears to relate to an issue whereby a tag appears to
be listed in `git show-ref --tags` but then does not appear when `git
show-ref --tags -- short_name` is called.
As a solution though I propose to stop the second call as it is
unnecessary and only likely to cause problems.
I've also noticed that the tags calls are wildly inefficient and aren't using the common cat-files - so these have been added.
I've also noticed that the git commit-graph is not being written on mirroring - so I've also added writing this to the migration which should improve mirror rendering somewhat.
Fix #19038
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
The git command by default adds a number of global arguments. These are not
helpful to be displayed in the process manager and so should be skipped for
default process descriptions.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Make SKIP_TLS_VERIFY apply to git data migrations too through adding the `-c http.sslVerify=false` option to the git clone command.
Fix #18998
Signed-off-by: Andrew Thornton <art27@cantab.net>
It appears possible that there could be a hang due to unread data from the
repo-attribute command pipes. This PR simply closes these during the defer.
Signed-off-by: Andrew Thornton <art27@cantab.net>
This code adds a simple endpoint to apply patches to repositories and
branches on gitea. This is then used along with the conflicting checking
code in #18004 to provide a basic implementation of cherry-pick revert.
Now because the buttons necessary for cherry-pick and revert have
required us to create a dropdown next to the Browse Source button
I've also implemented Create Branch and Create Tag operations.
Fix #3880
Fix #17986
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Ensure git tag tests and other create test repos in tmpdir
There are a few places where tests appear to reuse testing repos which
causes random CI failures.
This PR simply changes these tests to ensure that cloning always happens
into new temporary directories.
Fix #18444
* Change log root for integration tests to use the REPO_TEST_DIR
There is a potential race in the drone integration tests whereby test-mysql etc
will start writing to log files causing make test-check fail.
Fix #18077
Signed-off-by: Andrew Thornton <art27@cantab.net>
- Pass the Global command args into serviceRPC.
- Fixes error with partial cloning.
- Add partial clone test
- Include diff
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
This PR continues the work in #17125 by progressively ensuring that git
commands run within the request context.
This now means that the if there is a git repo already open in the context it will be used instead of reopening it.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Git will and can pack references into packfiles and therefore if you write/read the
files directly you will get false results. Instead you should use update-ref and
show-ref. To that end I have created three new functions in git/repo_commit.go that
will do this correctly.
Related #17191
Signed-off-by: Andrew Thornton <art27@cantab.net>
There are repeated panics in tests due to TestRepository_GetTag failing
to run properly. This happens when we attempt to reset the internal
repo for a tag which has failed to load. The problem is - the panic that
this is causing is preventing us from finding what the real error is.
This PR simply moves the failure out so we have a chance to see what
really is failing.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
* Prevent off-by-one error on comments on newly appended lines
There was a bug in CutDiffAroundLine whereby if a file without a terminal new line
has a patch which appends lines to it and a comment is placed on one of those lines
the comment diff will be a line out of place.
This fixes CutDiffAroundLine to simply ignore the missing terminal newline - however,
we should really improve this rendering to add a marker to say that there was a
previously missing terminal newline.
Fix #17875
Signed-off-by: Andrew Thornton <art27@cantab.net>
The current TestPatch conflict code uses a plain git apply which does not properly
account for 3-way merging. However, we can improve things using `git read-tree -m` to
do a three-way merge then follow the algorithm used in merge-one-file. We can also use
`--patience` and/or `--histogram` to generate a nicer diff for applying patches too.
Fix #13679
Fix #6417
Signed-off-by: Andrew Thornton <art27@cantab.net>
This PR contains multiple fixes. The most important of which is:
* Prevent hang in git cat-file if the repository is not a valid repository
Unfortunately it appears that if git cat-file is run in an invalid
repository it will hang until stdin is closed. This will result in
deadlocked /pulls pages and dangling git cat-file calls if a broken
repository is tried to be reviewed or pulls exists for a broken
repository.
Fix #14734
Fix #9271
Fix #16113
Otherwise there are a few small other fixes included which this PR was initially intending to fix:
* Fix panic on partial compares due to missing PullRequestWorkInProgressPrefixes
* Fix links on pulls pages due to regression from #17551 - by making most /issues routes match /pulls too - Fix #17983
* Fix links on feeds pages due to another regression from #17551 but also fix issue with syncing tags - Fix #17943
* Add missing locale entries for oauth group claims
* Prevent NPEs if ColorFormat is called on nil users, repos or teams.
The current implementation of checkBranchName is highly inefficient
involving opening the repository, the listing all of the branch names
checking them individually before then using using opened repo to get
the tags.
This PR avoids this by simply walking the references from show-ref
instead of opening the repository (in the nogogit case).
Signed-off-by: Andrew Thornton <art27@cantab.net>
This PR registers requests with the process manager and manages hierarchy within the processes.
Git repos are then associated with a context, (usually the request's context) - with sub commands using this context as their base context.
Signed-off-by: Andrew Thornton <art27@cantab.net>