Commit Graph

9 Commits

Author SHA1 Message Date
zeripath bc4764ffc6
Detect truncated utf-8 characters at the end of content as still representing utf-8 (#19773)
Our character detection algorithm can potentially incorrectly detect utf-8 as iso-8859-x
if there is a truncated character at the end of the partially read file.

This PR changes the detection algorithm to truncated utf8 characters at the end of the
buffer.

Fix #19743

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-05-21 14:06:24 +01:00
6543 54e9ee37a7
format with gofumpt (#18184)
* gofumpt -w -l .

* gofumpt -w -l -extra .

* Add linter

* manual fix

* change make fmt
2022-01-20 18:46:10 +01:00
KN4CK3R f99d50fc9f
Read expected buffer size (#17409)
* Read expected buffer size.

* Changed name.
2021-10-24 22:12:43 +01:00
Eng Zer Jun f2e7d5477f
refactor: move from io/ioutil to io and os package (#17109)
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

Co-authored-by: techknowlogick <techknowlogick@gitea.io>
2021-09-22 13:38:34 +08:00
Lunny Xiao 9d99f6ab19
Refactor renders (#15175)
* Refactor renders

* Some performance optimization

* Fix comment

* Transform reader

* Fix csv test

* Fix test

* Fix tests

* Improve optimaziation

* Fix test

* Fix test

* Detect file encoding with reader

* Improve optimaziation

* reduce memory usage

* improve code

* fix build

* Fix test

* Fix for go1.15

* Fix render

* Fix comment

* Fix lint

* Fix test

* Don't use NormalEOF when unnecessary

* revert change on util.go

* Apply suggestions from code review

Co-authored-by: zeripath <art27@cantab.net>

* rename function

* Take NormalEOF back

Co-authored-by: zeripath <art27@cantab.net>
2021-04-19 18:25:08 -04:00
zeripath a1ad188326
Fix chardet test and add ordering option (#11621)
* Fix chardet test and add ordering option

Signed-off-by: Andrew Thornton <art27@cantab.net>

* minor fixes

Signed-off-by: Andrew Thornton <art27@cantab.net>

* remove log

Signed-off-by: Andrew Thornton <art27@cantab.net>

* remove log2

Signed-off-by: Andrew Thornton <art27@cantab.net>

* only iterate through top results

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Update docs/content/doc/advanced/config-cheat-sheet.en-us.md

* slight restructure of for loop

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: techknowlogick <techknowlogick@gitea.io>
2020-06-02 19:20:19 -03:00
Antoine GIRARD 81a52442a1 deps: update and fix chardet import (#9351) 2019-12-14 02:15:48 +02:00
guillep2k 2628b15ee3 Fix utf8 tests (#8192)
* Prevent compiler environment from making the tests fail

* Remove unused function

* Pass lint
2019-09-21 13:01:34 -04:00
guillep2k 5a44be627c Convert files to utf-8 for indexing (#7814)
* Convert files to utf-8 for indexing

* Move utf8 functions to modules/base

* Bump repoIndexerLatestVersion to 3

* Add tests for base/encoding.go

* Changes to pass gosimple

* Move UTF8 funcs into new modules/charset package
2019-08-15 20:07:28 +08:00