1. fix a performance regression when using line-by-line highlighting
* the root cause is that chroma's `lexers.Get` is slow and a lexer cache
is missing during recent changes
2. clarify the chroma lexer detection behavior
* now we fully manage our logic to detect lexer, and handle overriding
problems, everything is fully under control
3. clarify "code analyze" behavior, now only 2 usages:
* only use file name and language to detect lexer (very fast), mainly
for "diff" page which contains a lot of files
* if no lexer is detected by file name and language, use code content to
detect again (slow), mainly for "view file" or "blame" page, which can
get best result
4. fix git diff bug, it caused "broken pipe" error for large diff files
* Fix#35252
* Fix#35999
* Improve diff rendering, don't add unnecessary "added"/"removed" tags for a full-line change
* Also fix a "space trimming" bug in #36539 and add tests
* Use chroma "SQL" lexer instead of "MySQL" to workaround a bug (35999)
Fix#33358, fix#21970
This adds a step in the `GitDiffForRender` that does syntax highlighting for the
entire file and then only references lines from that syntax highlighted
code. This allows things like multi-line comments to be syntax
highlighted correctly.
---------
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
Change all license headers to comply with REUSE specification.
Fix#16132
Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
The behaviour of `PreventSurroundingPre` has changed in
https://github.com/alecthomas/chroma/pull/618 so that apparently it now
causes line wrapper tags to be no longer emitted, but we need some form
of indication to split the HTML into lines, so I did what
https://github.com/yuin/goldmark-highlighting/pull/33 did and added the
`nopWrapper`.
Maybe there are more elegant solutions but for some reason, just
splitting the HTML string on `\n` did not work.
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
Use Unicode placeholders to replace HTML tags and HTML entities first, then do diff, then recover the HTML tags and HTML entities. Now the code diff with highlight has stable behavior, and won't emit broken tags.