There is a potential rare race possible whereby the c.running channel could
be closed twice. Looking at the code I do not see a need for this c.running
channel and therefore I think we can remove this. (I think the c.running
might have been some attempt to prevent a hang but the use of os.Pipes should
prevent that.)
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: 6543 <6543@obermui.de>
If an `os/exec.Command` is passed non `*os.File` as an input/output, go
will create `os.Pipe`s and wait for their closure in `cmd.Wait()`. If
the code following this is responsible for closing `io.Pipe`s or other
handlers then on process death from context cancellation the `Wait` can
hang.
There are two possible solutions:
1. use `os.Pipe` as the input/output as `cmd.Wait` does not wait for these.
2. create a goroutine waiting on the context cancellation that will close the inputs.
This PR provides the second option - which is a simpler change that can
be more easily backported.
Closes#19448
Signed-off-by: Andrew Thornton <art27@cantab.net>
Follows #19266, #8553, Close#18553, now there are only three `Run..(&RunOpts{})` functions.
* before: `stdout, err := RunInDir(path)`
* now: `stdout, _, err := RunStdString(&git.RunOpts{Dir:path})`
This addresses https://github.com/go-gitea/gitea/issues/18352
It aims to improve performance (and resource use) of the `SyncReleasesWithTags` operation for pull-mirrors.
For large repositories with many tags, `SyncReleasesWithTags` can be a costly operation (taking several minutes to complete). The reason is two-fold:
1. on sync, every upstream repo tag is compared (for changes) against existing local entries in the release table to ensure that they are up-to-date.
2. the procedure for getting _each tag_ involves a series of git operations
```bash
git show-ref --tags -- v8.2.4477
git cat-file -t 29ab6ce9f36660cffaad3c8789e71162e5db5d2f
git cat-file -p 29ab6ce9f36660cffaad3c8789e71162e5db5d2f
git rev-list --count 29ab6ce9f36660cffaad3c8789e71162e5db5d2f
```
of which the `git rev-list --count` can be particularly heavy.
This PR optimizes performance for pull-mirrors. We utilize the fact that a pull-mirror is always identical to its upstream and rebuild the entire release table on every sync and use a batch `git for-each-ref .. refs/tags` call to retrieve all tags in one go.
For large mirror repos, with hundreds of annotated tags, this brings down the duration of the sync operation from several minutes to a few seconds. A few unscientific examples run on my local machine:
- https://github.com/spring-projects/spring-boot (223 tags)
- before: `0m28,673s`
- after: `0m2,244s`
- https://github.com/kubernetes/kubernetes (890 tags)
- before: `8m00s`
- after: `0m8,520s`
- https://github.com/vim/vim (13954 tags)
- before: `14m20,383s`
- after: `0m35,467s`
I added a `foreachref` package which contains a flexible way of specifying which reference fields are of interest (`git-for-each-ref(1)`) and to produce a parser for the expected output. These could be reused in other places where `for-each-ref` is used. I'll add unit tests for those if the overall PR looks promising.
This follows
* https://github.com/go-gitea/gitea/issues/18553
Introduce `RunWithContextString` and `RunWithContextBytes` to help the refactoring. Add related unit tests. They keep the same behavior to save stderr into err.Error() as `RunInXxx` before.
Remove `RunInDirTimeoutPipeline` `RunInDirTimeoutFullPipeline` `RunInDirTimeout` `RunInDirTimeoutEnv` `RunInDirPipeline` `RunInDirFullPipeline` `RunTimeout`, `RunInDirTimeoutEnvPipeline`, `RunInDirTimeoutEnvFullPipeline`, `RunInDirTimeoutEnvFullPipelineFunc`.
Then remaining `RunInDir` `RunInDirBytes` `RunInDirWithEnv` can be easily refactored in next PR with a simple search & replace:
* before: `stdout, err := RunInDir(path)`
* next: `stdout, _, err := RunWithContextString(&git.RunContext{Dir:path})`
Other changes:
1. When `timeout <= 0`, use default. Because `timeout==0` is meaningless and could cause bugs. And now many functions becomes more simple, eg: `GitGcRepos` 9 lines to 1 line. `Fsck` 6 lines to 1 line.
2. Only set defaultCommandExecutionTimeout when the option `setting.Git.Timeout.Default > 0`
Strangely #19038 appears to relate to an issue whereby a tag appears to
be listed in `git show-ref --tags` but then does not appear when `git
show-ref --tags -- short_name` is called.
As a solution though I propose to stop the second call as it is
unnecessary and only likely to cause problems.
I've also noticed that the tags calls are wildly inefficient and aren't using the common cat-files - so these have been added.
I've also noticed that the git commit-graph is not being written on mirroring - so I've also added writing this to the migration which should improve mirror rendering somewhat.
Fix#19038
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
The git command by default adds a number of global arguments. These are not
helpful to be displayed in the process manager and so should be skipped for
default process descriptions.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Make SKIP_TLS_VERIFY apply to git data migrations too through adding the `-c http.sslVerify=false` option to the git clone command.
Fix#18998
Signed-off-by: Andrew Thornton <art27@cantab.net>
It appears possible that there could be a hang due to unread data from the
repo-attribute command pipes. This PR simply closes these during the defer.
Signed-off-by: Andrew Thornton <art27@cantab.net>
This code adds a simple endpoint to apply patches to repositories and
branches on gitea. This is then used along with the conflicting checking
code in #18004 to provide a basic implementation of cherry-pick revert.
Now because the buttons necessary for cherry-pick and revert have
required us to create a dropdown next to the Browse Source button
I've also implemented Create Branch and Create Tag operations.
Fix#3880Fix#17986
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Ensure git tag tests and other create test repos in tmpdir
There are a few places where tests appear to reuse testing repos which
causes random CI failures.
This PR simply changes these tests to ensure that cloning always happens
into new temporary directories.
Fix#18444
* Change log root for integration tests to use the REPO_TEST_DIR
There is a potential race in the drone integration tests whereby test-mysql etc
will start writing to log files causing make test-check fail.
Fix#18077
Signed-off-by: Andrew Thornton <art27@cantab.net>
- Pass the Global command args into serviceRPC.
- Fixes error with partial cloning.
- Add partial clone test
- Include diff
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
This PR continues the work in #17125 by progressively ensuring that git
commands run within the request context.
This now means that the if there is a git repo already open in the context it will be used instead of reopening it.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Git will and can pack references into packfiles and therefore if you write/read the
files directly you will get false results. Instead you should use update-ref and
show-ref. To that end I have created three new functions in git/repo_commit.go that
will do this correctly.
Related #17191
Signed-off-by: Andrew Thornton <art27@cantab.net>
There are repeated panics in tests due to TestRepository_GetTag failing
to run properly. This happens when we attempt to reset the internal
repo for a tag which has failed to load. The problem is - the panic that
this is causing is preventing us from finding what the real error is.
This PR simply moves the failure out so we have a chance to see what
really is failing.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
* Prevent off-by-one error on comments on newly appended lines
There was a bug in CutDiffAroundLine whereby if a file without a terminal new line
has a patch which appends lines to it and a comment is placed on one of those lines
the comment diff will be a line out of place.
This fixes CutDiffAroundLine to simply ignore the missing terminal newline - however,
we should really improve this rendering to add a marker to say that there was a
previously missing terminal newline.
Fix#17875
Signed-off-by: Andrew Thornton <art27@cantab.net>
The current TestPatch conflict code uses a plain git apply which does not properly
account for 3-way merging. However, we can improve things using `git read-tree -m` to
do a three-way merge then follow the algorithm used in merge-one-file. We can also use
`--patience` and/or `--histogram` to generate a nicer diff for applying patches too.
Fix#13679Fix#6417
Signed-off-by: Andrew Thornton <art27@cantab.net>
This PR contains multiple fixes. The most important of which is:
* Prevent hang in git cat-file if the repository is not a valid repository
Unfortunately it appears that if git cat-file is run in an invalid
repository it will hang until stdin is closed. This will result in
deadlocked /pulls pages and dangling git cat-file calls if a broken
repository is tried to be reviewed or pulls exists for a broken
repository.
Fix#14734Fix#9271Fix#16113
Otherwise there are a few small other fixes included which this PR was initially intending to fix:
* Fix panic on partial compares due to missing PullRequestWorkInProgressPrefixes
* Fix links on pulls pages due to regression from #17551 - by making most /issues routes match /pulls too - Fix#17983
* Fix links on feeds pages due to another regression from #17551 but also fix issue with syncing tags - Fix#17943
* Add missing locale entries for oauth group claims
* Prevent NPEs if ColorFormat is called on nil users, repos or teams.
The current implementation of checkBranchName is highly inefficient
involving opening the repository, the listing all of the branch names
checking them individually before then using using opened repo to get
the tags.
This PR avoids this by simply walking the references from show-ref
instead of opening the repository (in the nogogit case).
Signed-off-by: Andrew Thornton <art27@cantab.net>
This PR registers requests with the process manager and manages hierarchy within the processes.
Git repos are then associated with a context, (usually the request's context) - with sub commands using this context as their base context.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Use check attribute code to check the assigned language of a file and send that in to
chroma as a hint for the language of the file.
Signed-off-by: Andrew Thornton <art27@cantab.net>
There are multiple places where Gitea does not properly escape URLs that it is building and there are multiple places where it builds urls when there is already a simpler function available to use this.
This is an extensive PR attempting to fix these issues.
1. The first commit in this PR looks through all href, src and links in the Gitea codebase and has attempted to catch all the places where there is potentially incomplete escaping.
2. Whilst doing this we will prefer to use functions that create URLs over recreating them by hand.
3. All uses of strings should be directly escaped - even if they are not currently expected to contain escaping characters. The main benefit to doing this will be that we can consider relaxing the constraints on user names and reponames in future.
4. The next commit looks at escaping in the wiki and re-considers the urls that are used there. Using the improved escaping here wiki files containing '/'. (This implementation will currently still place all of the wiki files the root directory of the repo but this would not be difficult to change.)
5. The title generation in feeds is now properly escaped.
6. EscapePound is no longer needed - urls should be PathEscaped / QueryEscaped as necessary but then re-escaped with Escape when creating html with locales Signed-off-by: Andrew Thornton <art27@cantab.net>
Signed-off-by: Andrew Thornton <art27@cantab.net>
There is a slight race in checking of a context deadline exceed in #16467
which leads to a 500 on the repository page.
The solution is to check the error coming back from `*LogNameStatusRepoParser.Next()`
and if it is the `ContextDeadlineExceeded` break from the loop.
Fix#17314
Signed-off-by: Andrew Thornton <art27@cantab.net>
core.protectNTFS protects NTFS from files which may be difficult to remove or interact
with using the win32 api, however, it also appears to prevent such files from
being entered into the git indexes - fundamentally causing breakages with PRs that
affect these files. However, deliberately setting this to false may cause security
issues due to the remain sparse checkout of files in the merge pipeline.
The only sensible option therefore is to provide an optional setting which admins
could set which would forcibly switch this off if they are affected by this issue.
Fix#17092
Signed-off-by: Andrew Thornton <art27@cantab.net>
- Update default branch if needed
- Update protected branch if needed
- Update all not merged pull request base branch name
- Rename git branch
- Record this rename work and auto redirect for old branch on ui
Signed-off-by: a1012112796 <1012112796@qq.com>
Co-authored-by: delvh <dev.lh@web.de>
One of the biggest reasons for slow repository browsing is that we wait
until last commit information has been generated for all files in the
repository.
This PR proposes deferring this generation to a new POST endpoint that
does the look up outside of the main page request.
Signed-off-by: Andrew Thornton <art27@cantab.net>
This PR changes the compare page to make the "..." in the between branches a clickable
link. This changes the comparison type from "..." to "..". Similarly it makes the
initial compare icon clickable to switch the head and base branches.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
* Ignore Sync errors on pipes when doing `CheckAttributeReader.CheckPath`
* apply env patch
* Drop the Sync and fix a number of issues with the Close function
Signed-off-by: Andrew Thornton <art27@cantab.net>
* add logs for DBIndexer and CheckPath
* Fix some more closing bugs
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Add test case for language_stats
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Update modules/indexer/stats/db.go
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: 6543 <6543@obermui.de>
Some people still appear to report unclosed cat-files. This PR simply adds the caller
to the process descriptor for the CatFileBatch and CatFileBatchCheck calls.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Fixes#16381
Note that changes to unprotected files via the web editor still cannot be pushed directly to the protected branch. I could easily add such support for edits and deletes if needed. But for adding, uploading or renaming unprotected files, it is not trivial.
* Extract & Move GetAffectedFiles to modules/git
When the external context is cancelled it is possible for the
GitLogReader to not itself be Closed.
This PR does three things:
1. Instead of adding a plain defer it wraps the `g.Close` in a func as
`g` may change.
2. It adds the missing explicit g.Close - although the defer fix makes
this unnecessary.
3. It passes down the external context as the base context for the
GitLogReader meaning that the cancellation of the external context will
pass down automatically.
Fix#17007
Signed-off-by: Andrew Thornton <art27@cantab.net>
Replaces #16262
Replaces #16250
Replaces #14833
This PR first implements a `git check-attr` pipe reader - using `git check-attr --stdin -z --cached` - taking account of the change in the output format in git 1.8.5 and creates a helper function to read a tree into a temporary index file for that pipe reader.
It then wires this in to the language stats helper and into the git diff generation.
Files which are marked generated will be folded by default.
Fixes#14786Fixes#12653
* make sure headGitRepo is closed on err too
* refactor
* Fix git.Blob.DataAsync(): exec cancel since we already read all bytes (close pipe since we return a NopCloser)
* Add proxy settings and support for migration and webhook
* Fix default value
* Add newline for example ini
* Add lfs proxy support
* Fix lint
* Follow @zeripath's review
* Fix git clone
* Fix test
* missgin http requests for proxy
* use empty
Co-authored-by: zeripath <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: zeripath <art27@cantab.net>
* Add info about list endpoints to CONTRIBUTING.md
* Let all list endpoints return X-Total-Count header
* Add TODOs for GetCombinedCommitStatusByRef
* Fix models/issue_stopwatch.go
* Rrefactor models.ListDeployKeys
* Introduce helper func and use them for SetLinkHeader related func
* Fix modified files list in webhooks when there is a space
There is an unfortunate bug with GetCommitFileStatus where files with
spaces are misparsed and split at the space.
There is a second bug because modern gits detect renames meaning that
this function no longer works correctly.
There is a third bug in that merge commits don't have their modified
files detected correctly.
Fix#15865
Signed-off-by: Andrew Thornton <art27@cantab.net>
Following the merging of https://github.com/go-git/go-git/pull/330 we
can now add a setting to avoid go-git reading and caching large objects.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Use `..` instead of `...` with `rev-list`. In combination with #16282 the receiver can get the correct commit. The behaviour is now like Github.
fixes#11802
There is a bug with last commit cache recursive cache where the last
commit information that refers to the current tree itself will cause a
panic due to its path ("") not being included in the expected tree entry
paths.
This PR fixes this by skipping the missing entry.
Fix#16290
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
remove log() func from gogs times and switch to proper logging
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Andrew Thornton <art27@cantab.net>
* Fix 500 Error with branch and tag sharing the same name #15592
Fixed 500 error while create Pull request when there are more
than one sources (branch, tag) with the same name
Fix#15592
Signed-off-by: Viktor Yakovchuk <viktor@yakovchuk.net>
* fix logging
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Co-authored-by: 6543 <6543@obermui.de>
* More efficiently parse shas for shaPostProcessor
The shaPostProcessor currently repeatedly calls git rev-parse --verify on both backends
which is fine if there is only one thing that matches a sha - however if there are
multiple things then this becomes wildly inefficient.
This PR provides functions for both backends which are much faster to use.
Fix#16092
* Add ShaExistCache to RenderContext
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
* Improve get last commit using git log --name-status
git log --name-status -c provides information about the diff between a
commit and its parents. Using this and adjusting the algorithm to use
the first change to a path allows for a much faster generation of commit
info.
There is a subtle change in the results generated but this will cause
the results to more closely match those from elsewhere.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Co-authored-by: Lauris BH <lauris@nix.lv>
* Make modules/context.Context a context.Context
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Simplify context calls
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Set the base context for requests to the HammerContext
Signed-off-by: Andrew Thornton <art27@cantab.net>
* pass context into get-last-commit
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Make commit_info cancellable
Signed-off-by: Andrew Thornton <art27@cantab.net>
* use context as context
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
* Added type sniffer.
* Switched content detection from base to typesniffer.
* Added GuessContentType to Blob.
* Moved image info logic to client.
Added support for SVG images in diff.
* Restore old blocked svg behaviour.
* Added missing image formats.
* Execute image diff only when container is visible.
* add margin to spinner
* improve BIN tag on image diffs
* Default to render view.
* Show image diff on incomplete diff.
Co-authored-by: silverwind <me@silverwind.io>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
Unfortunately some old repositories can have tags with empty Tagger, Commit
or Author. Go-Git variants will always have empty values for these whereas
the native git variant leaves them at nil. The simplest solution is just to
always have these set to empty Signatures.
v156 migration also makes the incorrect assumption that these cannot be empty.
Therefore add some handling to this and add logging and adjust broken
logging elsewhere in this migration.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
* Improve performance when there are multiple commits in the last commit cache
* read refs directly if we can
Signed-off-by: Andrew Thornton <art27@cantab.net>