- The ambiguous character detection is an important security feature to
combat against sourcebase attacks (https://trojansource.codes/).
- However there are a few problems with the feature as it stands
today (i) it's apparantly an big performance hitter, it's twice as slow
as syntax highlighting (ii) it contains false positives, because it's
reporting valid problems but not valid within the context of a
programming language (ambiguous charachters in code comments being a
prime example) that can lead to security issues (iii) charachters from
certain languages always being marked as ambiguous. It's a lot of effort
to fix the aforementioned issues.
- Therefore, make it configurable in which context the ambiguous
character detection should be run, this avoids running detection in all
contexts such as file views, but still enable it in commits and pull
requests diffs where it matters the most. Ideally this also becomes an
per-repository setting, but the code architecture doesn't allow for a
clean implementation of that.
- Adds unit test.
- Adds integration tests to ensure that the contexts and instance-wide
is respected (and that ambigious charachter detection actually work in
different places).
- Ref: https://codeberg.org/forgejo/forgejo/pulls/2395#issuecomment-1575547
- Ref: https://codeberg.org/forgejo/forgejo/issues/564
Follow:
* #23574
* Remove all ".tooltip[data-content=...]"
Major changes:
* Remove "tooltip" class, use "[data-tooltip-content=...]" instead of
".tooltip[data-content=...]"
* Remove legacy `data-position`, it's dead code since last Fomantic
Tooltip -> Tippy Tooltip refactoring
* Rename reaction attribute from `data-content` to
`data-reaction-content`
* Add comments for some `data-content`: `{{/* used by the form */}}`
* Remove empty "ui" class
* Use "text color" for SVG icons (a few)
Change all license headers to comply with REUSE specification.
Fix #16132
Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
The recovery, API, Web and package frameworks all create their own HTML
Renderers. This increases the memory requirements of Gitea
unnecessarily with duplicate templates being kept in memory.
Further the reloading framework in dev mode for these involves locking
and recompiling all of the templates on each load. This will potentially
hide concurrency issues and it is inefficient.
This PR stores the templates renderer in the context and stores this
context in the NormalRoutes, it then creates a fsnotify.Watcher
framework to watch files.
The watching framework is then extended to the mailer templates which
were previously not being reloaded in dev.
Then the locales are simplified to a similar structure.
Fix #20210
Fix #20211
Fix #20217
Signed-off-by: Andrew Thornton <art27@cantab.net>
This PR rewrites the invisible unicode detection algorithm to more
closely match that of the Monaco editor on the system. It provides a
technique for detecting ambiguous characters and relaxes the detection
of combining marks.
Control characters are in addition detected as invisible in this
implementation whereas they are not on monaco but this is related to
font issues.
Close #19913
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Don't treat BOM escape sequence as hidden character.
- BOM sequence is a common non-harmfull escape sequence, it shouldn't be
shown as hidden character.
- Follows GitHub's behavior.
- Resolves #18837
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
There is a potential panic due to a mistaken resetting of the length parameter when
multibyte characters go over a read boundary.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Fix #17514
Given the comments I've adjusted this somewhat. The numbers of characters detected are increased and include things like the use of U+300 to make à instead of à and non-breaking spaces.
There is a button which can be used to escape the content to show it.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Gwyneth Morgan <gwymor@tilde.club>
Co-authored-by: silverwind <me@silverwind.io>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>