Commit 2292e5f

committed
update doc
1 parent 15dfe10 commit 2292e5f

1 file changed

Lines changed: 45 additions & 4 deletions

Rev2-Notes.md

@@ -4,18 +4,59 @@
44

55
The first version looked at a window of tokens and rule context information to predict four things: injecting newlines, injecting whitespace, alignment, and indentation. It works surprisingly well, but it makes no guarantee that things will line up properly. For example, we had to add a new feature so that `{...}` had both curlies on the same line if the statements in the block were on the same line.
66

7-
The new approach is to look at subtree structures and identify the most common patterns.
7+
The new approach is to look at subtree structures and identify the most common patterns. The idea is to figure out what whitespace, if any, to inject between siblings of a parse subtree. As we walk down the tree, we track `currentColumn`. We process each subtree only after processing all of its children, so that we know, for example, whether a list got broken across lines.
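The children-first processing order can be sketched as a post-order walk. This is a minimal illustration only: the `Node` class, the `newlineBefore` flag, and `splitAcrossLines` are hypothetical names introduced here, not part of the notes or of any existing implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parse-tree node (illustrative names only).
class Node {
    final String label;          // rule name or token text
    final boolean newlineBefore; // assumed flag: a newline precedes this node's first token
    final List<Node> children = new ArrayList<>();
    boolean splitAcrossLines;    // filled in bottom-up by PostOrderSketch.process

    Node(String label, boolean newlineBefore) {
        this.label = label;
        this.newlineBefore = newlineBefore;
    }

    Node add(Node child) {
        children.add(child);
        return this;
    }
}

class PostOrderSketch {
    // Visit all children first, then the subtree root, so the root can see
    // whether any child ended up spanning several lines.
    static void process(Node t, List<String> visitOrder) {
        boolean split = false;
        for (int i = 0; i < t.children.size(); i++) {
            Node c = t.children.get(i);
            process(c, visitOrder);
            // A newline before the first child is the parent's concern,
            // so only newlines between siblings count as a split here.
            if (c.splitAcrossLines || (i > 0 && c.newlineBefore)) {
                split = true;
            }
        }
        t.splitAcrossLines = split;
        visitOrder.add(t.label);
    }
}
```

By the time `process` reaches a `formalParameters` node, the `splitAcrossLines` flag of its `formalParameterList` child is already set, which is the kind of information a feature such as `formalParameterList-split-lines` would need.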
88

99
## Token dependencies
1010

11-
Here is an example where we want the `}` to line up with the `void`, but those tokens are in a subtree. On the other hand, we can always ask whether or not the last token for a subtree, `}` here, aligns with another token.
11+
Here is a simple Java method definition:
1212

1313
```java
1414
void f(int i, int j) {
1515
}
1616
```
1717

18-
<img src="images/method-def.png" width=400>
18+
and associated parse tree:
19+
20+
<img src="images/method-def.png" width=500>
21+
22+
The parentheses around the parameter list are codependent: the decisions to inject whitespace after the `(` and before the `)` often depend on whether the `formalParameterList` child gets split across multiple lines. In this case, the parameters are all on a single line, so we might train:
23+
24+
| Features | Prediction |
25+
| ------------- |:-------------:|
26+
| (root=formalParameters, `(`, formalParameterList, formalParameterList-same-line) | none |
27+
| (root=formalParameters, formalParameterList, formalParameterList-same-line, `)`) | none |
28+
29+
If we decide to split the parameters across lines, that would not by itself force whitespace after the `(` or before the `)`; e.g.,
30+
31+
```java
32+
void f(int i,
33+
int j) {
34+
}
35+
```
36+
37+
But we might also see examples like this:
38+
39+
```java
40+
void f(
41+
int i,
42+
int j
43+
) {
44+
}
45+
```
46+
47+
| Features | Prediction |
48+
| ------------- |:-------------:|
49+
| (root=formalParameters, `(`, formalParameterList, formalParameterList-split-lines) | inject \n, indent |
50+
| (root=formalParameters, formalParameterList, formalParameterList-split-lines, `)`) | inject \n, no indent |
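One way to read these tables is as exemplars in a lookup keyed by the feature tuple, where the prediction is whatever was seen most often for that tuple. The sketch below assumes that reading. `PatternTable`, `train`, and `predict` are hypothetical names, and simple majority voting is an assumption standing in for whatever model is actually used.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical exemplar table: feature tuple -> (prediction -> times seen).
class PatternTable {
    private final Map<List<String>, Map<String, Integer>> counts = new HashMap<>();

    // Record one training exemplar.
    void train(List<String> features, String prediction) {
        counts.computeIfAbsent(features, k -> new HashMap<>())
              .merge(prediction, 1, Integer::sum);
    }

    // Return the most common prediction for this exact feature tuple,
    // or null if the tuple was never seen (assumed behavior; ties are
    // broken arbitrarily).
    String predict(List<String> features) {
        Map<String, Integer> seen = counts.get(features);
        if (seen == null) {
            return null;
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : seen.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }
}
```

Training on the tuple `(root=formalParameters, (, formalParameterList, formalParameterList-split-lines)` with `inject \n, indent` more often than with `none` would make `predict` return `inject \n, indent` for that tuple.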
51+
52+
When processing the `formalParameterList` child, we decide only how to align it, not whether to indent it. Indentation is the decision of `formalParameters`. We treat the output of `formalParameterList` almost like one big character.
53+
54+
This implies that `formalParameterList` does not know its precise starting column. We would feed it the column of the `(` so it can make a decision, but then we might indent it. That implies that we don't get a string back but rather a `V` (BOX terminology for vertical alignment) operator on that child.
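To make the "operator rather than string" idea concrete, here is a rough sketch in which the child returns its rows without committing to a column and the parent chooses the column later. `VBox` and `placeAt` are hypothetical names for illustration. This is not the BOX language itself, only an assumed simplification of its vertical operator.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical vertical box: rows stacked on top of each other, with the
// starting column left open until the parent decides where to put it.
class VBox {
    final List<String> rows;

    VBox(String... rows) {
        this.rows = Arrays.asList(rows);
    }

    // Render the rows so that every row after the first starts at `column`.
    // The first row is assumed to be emitted by the caller at that column already.
    String placeAt(int column) {
        StringBuilder pad = new StringBuilder();
        for (int i = 0; i < column; i++) {
            pad.append(' ');
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < rows.size(); i++) {
            if (i > 0) {
                out.append('\n').append(pad);
            }
            out.append(rows.get(i));
        }
        return out.toString();
    }
}
```

With this shape, `formalParameters` can take `new VBox("int i,", "int j")` and either place it right after the `(` or on a fresh indented line, without `formalParameterList` having to be re-run.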
55+
56+
old stuff:
57+
58+
Here is an example where we want the `}` to line up with the `void`, but those tokens are in a subtree. On the other hand, we can always ask whether or not the last token for a subtree, `}` here, aligns with another token.
59+
1960

2061
| Features | Prediction |
2162
| ------------- |:-------------:|
@@ -83,7 +124,7 @@ For Java, such as:
83124
}
84125
```
85126

86-
<img src="images/method-body.png" width=400>
127+
<img src="images/method-body.png" width=500>
87128

88129
We get:
89130
