Address EditorConfig violations#158550
Conversation
|
Some changes occurred in HTML/CSS/JS. |
|
Thanks for the pull request, and welcome! The Rust Project is excited to review your changes, and you should hear from @jdno (or someone else) some time within the next two weeks. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
Why was this reviewer chosen?The reviewer was selected based on:
|
| # For example, the indentation in lists depend on the length of the list marker | ||
| # (such as '- ' or '100. '), at least in the GitHub Markdown flavor: | ||
| # https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#nested-lists | ||
| # Therefore, pressing tab should insert a single space to encourage situative accurate indentations. |
There was a problem hiding this comment.
It is maybe a matter of taste if inserting only one space with a tab is a nice solution for this or not. Another one would be to keep the default indent_size = 4 for Markdown files and ignore them when checking for EditorConfig violations.
| # these specific files have no trailing newline to test on these | ||
| [tests/ui/{lint/unused_parens_multibyte_recovery.rs,parser/{macro/macro-missing-right-paren.rs,missing_right_paren.rs}}] | ||
| insert_final_newline = false | ||
| [tests/{pretty/issue-74745.rs,ui/{parser/issues/issue-{62524,68730,58094-missing-right-square-bracket}.rs,type/issue-91268.rs}}] |
There was a problem hiding this comment.
These files all have the name pattern issue-*.rs. Using the pattern would reduce the maintenance effort, if new such files without final newlines gets added to the project. However, not all files with that pattern are without a final newline.
| tests/run-make-cargo/thumb-none-qemu/example/memory.x | ||
| src/bootstrap/mk/Makefile.in | ||
| src/etc/rust_analyzer_eglot.el | ||
| src/librustdoc/html/static/fonts/README.txt |
There was a problem hiding this comment.
The content of this README looks almost like Markdown. Only some formatting bugs may need to be fixed. Should that be changed to a Markdown file? If the content and file ending would be changed, it would not be necessary to ignore it here.
| library/std/src/sys/pal/sgx/abi/entry.S | ||
| tests/pretty/block-comment-wchar.pp | ||
| tests/run-make-cargo/thumb-none-qemu/example/memory.x | ||
| src/bootstrap/mk/Makefile.in |
There was a problem hiding this comment.
This Makefile uses two styles of indentation. First, for targets of a Makefile you need to use tab characters for indentation. Otherwise, it is syntactically wrong. Second, there is some further code at the beginning of this file and there 2 spaces are in use for indentation. I do not know if using only tab character for indentation would be fine for this file or not.
There was a problem hiding this comment.
Is that "according to POSIX" or "according to what is accepted by GNU Make"? Are you quite sure we are otherwise compatible with versions of make that are not GNU Make and that this is the only thing holding us back?
There was a problem hiding this comment.
Most of the *.check files had no final newline. Although, tests (./x test) still pass on my machine after adding these final newlines, I want to ask if you may prefer to have no final newlines in these files. If so I could remove them from all the *.check files and set insert_final_newline = false for them.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
In tidy we ignore all test files for style lints. At the same time it is still useful to not include it in the editorconfig ignore to increase the chance of new or changed tests having a proper format. Maybe the test changes should be reverted? |
| # these specific files have an UTF-8 BOM to test with that charset | ||
| [tests/ui/codemap_tests/utf8-bom.rs] | ||
| charset = utf-8-bom | ||
| # these specific files have CRLF line endings to test on them | ||
| [tests/{rustdoc-ui/intra-doc/warning-crlf.rs,ui/{frontmatter/frontmatter-crlf.rs,lexer/lexer-crlf-*.rs}}] | ||
| end_of_line = crlf | ||
| # these specific files have an UTF-8 BOM and CRLF line endings to test on these | ||
| [tests/ui/{include-macros/data.bin,json/json-bom-plus-crlf*.rs}] | ||
| charset = utf-8-bom | ||
| end_of_line = crlf |
There was a problem hiding this comment.
This is now a third location (next to .gitattributes and // ignore-tidy-cr in the test itself) where CRLF containing tests are placed. This is bound to go out of sync: #157184
| @@ -1,4 +1,4 @@ | |||
| <?xml version="1.0" encoding="utf-8"?> | |||
| <?xml version="1.0" encoding="utf-8"?> | |||
There was a problem hiding this comment.
What changed here? Github is showing mojibake in the original file. Was that a UTF-8 BOM or something?
There was a problem hiding this comment.
This file had the charset utf-8-bom instead of utf-8. Therefore, I removed the BOM, which is not visible in the diff here. (I do not see any mojibake here. GitHub shows me no visible difference in this line.)
This could be an approach, but the options |
|
Editor autoformat can also result in unintended changes as well as editor defaults that already insert a final newline and trim trailing whitespace like I have myself. You have to be careful what changes you commit either way. Unsetting |
|
You are right, better would be to set |
This comment has been minimized.
This comment has been minimized.
As Cargo.lock and yarn.lock are normally created automatically, we do not want using a different format when editing it manually.
The auxiliary vector files are binary files.
They use all two spaces for indentation. The intrinsic.natvis differed in two ways from the others: - It had a UTF-8 BOM. - It had two multiline comments in a different format. (No new line after <!--)
There are two file extensions for YAML and the '.yaml' currently also use two spaces for indentation.
All JSON and JSON5 files in the repostiory use two spaces for indentation.
Mostly fixing the indentation size (use 4 spaces everywhere and not in some file 2 and some 4 for the same situations). Furthermore, insert final newline for one of them.
In several file types the mixing of different formats (mostly indent_size) seems to be intentional.
All other Cargo.toml files already use 4 spaces for indentation.
There is often alignment which "breaks" the indentation rules. Furthermore, some files use 4 spaces and some 2 spaces, but these indentation errors cannot be separated from the alignments.
All three nix file use 2 spaces for indentation.
The most shell scripts use 4 spaces for indentation. The set of scripts using (only) 2 spaces is smaller.
When git creates or modifies this file, it uses tab characters.
./x test failed early due to too long line
Only one of them had already a final newline. All tests still pass after these insertions (executed with ./x test).
Adjust shell scripts, without .sh extension, to 4 spaces, ignore remaining mixed files and adjust EditorConfig for some files.
Remove all added final newlines as test like tests/ui/argfile/commandline-argfile-badutf8.rs fail in the CI job aarch64-gnu-llvm-21-1 Therefore, set insert_final_newline = false for all the args files for the tests as they seem to have no final newline intentional.
Removing the final newlines in this two files did not undo all changes in these files, which lead to similar errors as before.
4a53e21 to
b269a27
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
☔ The latest upstream changes (presumably #158169) made this pull request unmergeable. Please resolve the merge conflicts by rebasing. |
There was a problem hiding this comment.
I do not think we should adopt ecformat_check. It does not seem helpful to have a tool that promises to speculate constantly as to the "true identity" of all files added to our repo and edit them so they conform. Our tidy checks, as currently implemented, govern specific directories in specific ways for the precise reason that bringing the entire repo "in line" is not that helpful.
| [tests/ui/{frontmatter/frontmatter-whitespace-3.rs,parser/shebang/shebang-space.rs}] | ||
| [tests/ui/{frontmatter/{fence-whitespace-trailing-1.rs,frontmatter-whitespace-3.rs},parser/shebang/shebang-space.rs}] | ||
| trim_trailing_whitespace = false | ||
| # these specific files have an UTF-8 BOM to test with that charset |
There was a problem hiding this comment.
The use of the article "a" or "an" is governed by the sound, and the next sound remains the same regardless of whether of whether you pronounce it /ˈju.tɪ.ɪˈf/ (or similar) or /ˈjunɪˌkoʊd/ (transformation format...)
| # these specific files have an UTF-8 BOM to test with that charset | |
| # these specific files have a UTF-8 BOM to test with that charset |
| # You should NOT modify these files with your editor, | ||
| # but if you do it anyway use the correct formatting. | ||
| [Cargo.lock] | ||
| indent_size = 1 | ||
| [yarn.lock] | ||
| indent_size = 2 |
There was a problem hiding this comment.
I do not believe this indentation is a stable detail that is mandated as part of the format of these lockfiles, so no, I do not think there should be an admonishment to "use the correct formatting" as it can become incorrect with a small diff to Cargo or Yarn and there is no real reason to prevent such.
| indent_size = 1 | ||
|
|
||
| [*.yml] | ||
| [*.{json,json5,yml,yaml,askama,ll,natvis,nix}] |
There was a problem hiding this comment.
If we were to add such an indentation rule, I do not think it should group all these unrelated kinds of files such that they would all get changed together. This opinion generalizes across other changes this PR makes, as it enthusiastically uses expansion rules in ways that don't necessarily align with natural groupings.
| library/std/src/sys/pal/sgx/abi/entry.S | ||
| tests/pretty/block-comment-wchar.pp | ||
| tests/run-make-cargo/thumb-none-qemu/example/memory.x | ||
| src/bootstrap/mk/Makefile.in |
There was a problem hiding this comment.
Is that "according to POSIX" or "according to what is accepted by GNU Make"? Are you quite sure we are otherwise compatible with versions of make that are not GNU Make and that this is the only thing holding us back?
| jnz .Lelf_entry_call | ||
|
|
||
| .Lelf_exit: | ||
| .Lelf_exit: |
There was a problem hiding this comment.
Please do not change so much as a single character of an assembly file if there is no technical need and you are not willing to verify the SGX assembly still works as-intended on the platform it is used on, which is not something our test suite covers.
View all comments
This Pull Request is part of an academic research project. More preciously it is for my master's thesis, more background information here.
Definition: EditorConfig violations
The goal of this Pull Request is to address all EditorConfig violations in the project. With EditorConfig you have set for example
insert_final_newline = truefor most files, but there were files without a final newline. I call this an EditorConfig violation as a file content violates the respective EditorConfig property. They can have different kind of impacts. For example trailing spaces, which get remove with the next save, may only make the next diff / Pull Request to this file less readable. Some other may lead to bugs, such as adding final newlines to test files, which are not allowed to have them.How to address them?
To check for such violations and fix them I developed the CLI tool ecformat. However, there are different ways to address the violations:
insert_final_newline = falsefor them, which avoids automatic insert of a final newline on save.To do so
ecformat fixcan be helpful.ecformat checkfinds. I added a.ecformat_ignorefile for this. It has the common ignore syntax such as a.gitignore.To use such ignore files, provide the name of them with an CLI switch,
ecformat check --ignore-file .ecformat_ignore. (The.gitignorefiles are automatically considered inside a git repository.) There are different reasons to ignore files:indent_size = 4set, but to align some code with the previous line an indentation / alignment uses for example 5 spaces. Detecting such cases as alignment and not as indentation needs language specific parsing, which is out of scope for a generic EditorConfig utility. Such files need to be ignored fully byecformatas it currently (version 0.2.0) has no support to ignore only a single property for some files.Changes in this Pull Request
For this Pull Request, I addressed all violations in the project in the way, which looks the most correct one to me. (I focused on the Properties of the current EditorConfig specification 0.17.2.)
If I fixed file contents, I manually reviewed all of them, making adjustments in the
ecformat fixchanges were necessary. Furthermore,ecformatdetects the charset wrong for some files. Therefore, I would suggest disabling this property for whole project, with--disable-charsetor-c. (I plan to improve the charset detection in a later version.)If I picked the wrong way to address the violations for some files, please let me know with a respective Review comment.
Then I will address it in the correct way. I add Review comments myself for special parts, which are especially worthy of discussion. To support the Review of this big Pull Request, I tried to group the changes in meaningful commits with respective commit messages. Therefore, may take a look at the commit history.
Avoid EditorConfig violations in the future
Such EditorConfig violations can creep in again. Possible reasons can be that the editor of some contributors has no EditorConfig support, or it would be required to install a respective plugin for the support.
If you want to avoid them in the future, I would suggest integrating a respective utility. For example as part of the CI, some pre-commit scripts or similar places. One possibility is to use
ecformatwithecformat check --disable-charset --ignore-file .ecformat_ignore. Another one is editorconfig-checker, which is a more sophisticated tool without the possibility to fix violations. It provides more configuration options, such as ignoring violations in single lines with marker comments.Let me know if you are interested using one of them. I could add one in this Pull Request or a follow-up Pull Request.
If you not consider to use
ecformatin your project, the.ecformat_ignorefile is maybe not worth to merge into main. I can remove it if desired.