fix: graphemes are not correctly highlighted

Graphemes don't all have the same width, not even in a monospace font.
For Latin characters it usually works fine to assume the same width. But emojis
and Japanese or Chinese characters have different widths. There are even some
ultra wide characters like 𒐫 or ﷽. There are also so-called 'half-width'
characters. E.g. the Japanese 'a' can be ア (full-width) or ｱ (half-width).

Fixed by actually computing the width of each grapheme and using pixels.
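
To illustrate the width differences described above, here is a minimal sketch that approximates widths in terminal-style cells using Python's standard `unicodedata` module. It is not the code from this commit: the commit measures pixels, which requires font metrics from the GUI toolkit, and the helper names `cell_width` and `text_width` are made up for this example. The sketch also works per code point rather than per grapheme cluster.

```python
import unicodedata


def cell_width(char: str) -> int:
    """Approximate width of a single code point in cells (hypothetical helper)."""
    if unicodedata.combining(char):
        return 0  # combining marks attach to the previous character
    if unicodedata.east_asian_width(char) in ("W", "F"):
        return 2  # wide and full-width forms
    return 1  # everything else, including half-width forms


def text_width(text: str) -> int:
    """Sum of the approximate cell widths of all code points in text."""
    return sum(cell_width(c) for c in text)
```

Because this approximation can never report more than two cells per code point, it cannot capture ultra wide characters, which is one reason to measure the rendered width in pixels instead.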
2025-03-23 21:00:53 +01:00
parent 21b2da1e69
commit 61132d242f
6 changed files with 181 additions and 86 deletions

@@ -136,7 +136,7 @@ class LogFileModel:
         return re.match(r"\w", char) is not None

     def prune_cache(self, range_start: int, range_end: int):
-        print(f"cache size: {len(self._line_cache.keys())}")
+        # print(f"cache size: {len(self._line_cache.keys())}")
         for key in list(self._line_cache.keys()):
             line = self._line_cache[key]
             if range_start > line.byte_end() or line.byte_offset() > range_end:
@@ -171,7 +171,7 @@ class LogFileModel:
                 new_offset = f.tell()
                 if 0 <= range_end < new_offset:
                     break
-                line = Line(offset, new_offset, line_bytes.decode("utf8", errors="ignore"))
+                line = Line(offset, new_offset, line_bytes.decode("utf8", errors="ignore"), line_bytes)
                 if previous_line_is_complete: # only cache lines when we know they are complete
                     self._line_cache[offset] = line
                 offset = new_offset
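
The diff does not show why `Line` now also receives the raw `line_bytes`. One thing the raw bytes preserve, sketched below as an assumption rather than the author's stated reason, is the exact byte length: decoding with `errors="ignore"` silently drops invalid bytes, so the decoded text alone cannot always be mapped back to byte offsets in the file. The helper name `dropped_byte_count` is hypothetical.

```python
def dropped_byte_count(line_bytes: bytes) -> int:
    """Number of bytes lost when decoding as UTF-8 with errors="ignore"."""
    text = line_bytes.decode("utf8", errors="ignore")
    # Re-encoding the decoded text only yields the bytes that survived decoding.
    return len(line_bytes) - len(text.encode("utf8"))
```

For a line containing a stray `0xff` byte, for example `b"caf\xc3\xa9 \xff ok"`, the function reports one dropped byte, while a valid line reports zero.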