Question
I just tried the new text block feature in Java 13 and encountered a small issue.
I have read this article from JAXenter.
The closing triple quotation marks will affect the format.
String query = """
select firstName,
lastName,
email
from User
where id= ?
""";
System.out.println("SQL or JPL like query string :\n" + query);
The format above works well. Relative to the closing delimiter ("""), the multi-line string keeps the spaces before each line.
But when I compared the following two text blocks, they looked the same in the console output, yet they were not equal, even after calling stripIndent().
String hello = """
    Hello,
    Java 13
    """;
String hello2 = """
        Hello,
        Java 13
    """;
System.out.println("Hello1:\n" + hello);
System.out.println("Hello2:\n" + hello);
System.out.println("hello is equals hello2:" + hello.equals(hello2));
System.out.println("hello is equals hello2 after stripIndent():" + hello.stripIndent().equals(hello2.stripIndent()));
The output console is like:
hello is equals hello2:false
hello is equals hello2 after stripIndent():false
I am not sure what is wrong. Or is this how text blocks are designed to work?
Update: I just printed hello2 after stripIndent():
System.out.println("hello2 after stripIndent():\n" + hello2.stripIndent());
The white space before each line is NOT removed by stripIndent, contrary to what I expected.
Update: After reading the related Javadoc, I think that once the text block is compiled, the left indentation of its lines should already have been stripped. What, then, is the purpose of stripIndent for a text block? I understand its use on a normal string.
The complete code is here.
Answer 1:
There is a concept of incidental white space.
JEP 355: Text Blocks (Preview)
Compile-time processing
A text block is a constant expression of type String, just like a string literal. However, unlike a string literal, the content of a text block is processed by the Java compiler in three distinct steps:
Line terminators in the content are translated to LF (\u000A). The purpose of this translation is to follow the principle of least surprise when moving Java source code across platforms.
Incidental white space surrounding the content, introduced to match the indentation of Java source code, is removed.
Escape sequences in the content are interpreted. Performing interpretation as the final step means developers can write escape sequences such as \n without them being modified or deleted by earlier steps.
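The three steps can be observed directly. The following is a minimal sketch, assuming a JDK where text blocks are a standard feature (Java 15 or later, or Java 13/14 with --enable-preview). The class and field names are made up for illustration and do not come from the JEP.

```java
class TextBlockSteps {
    // Content indented to match the surrounding code. The closing delimiter
    // sits at the same column, so all 12 leading spaces are incidental.
    static final String BLOCK = """
            select firstName,
                   lastName
            from User
            """;

    // The same string written with ordinary string literals.
    static final String LITERAL =
            "select firstName,\n"
            + "       lastName\n"
            + "from User\n";

    // Escapes are interpreted after stripping, so the \t below is still the
    // two characters '\' and 't' when indentation is measured, and the tab
    // it produces is not treated as incidental white space.
    static final String ESCAPED = """
            \tindented
            plain
            """;

    public static void main(String[] args) {
        System.out.println(BLOCK.equals(LITERAL));                  // true
        System.out.println(ESCAPED.equals("\tindented\nplain\n"));  // true
    }
}
```

Note that the 7 extra spaces before lastName are kept: only the indentation shared by every line is removed.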
...
Incidental white space
Here is the HTML example using dots to visualize the spaces that the developer added for indentation:
String html = """
..............<html>
..............    <body>
..............        <p>Hello, world</p>
..............    </body>
..............</html>
..............""";
Since the opening delimiter is generally positioned to appear on the same line as the statement or expression which consumes the text block, there is no real significance to the fact that 14 visualized spaces start each line. Including those spaces in the content would mean the text block denotes a string different from the one denoted by the concatenated string literals. This would hurt migration, and be a recurring source of surprise: it is overwhelmingly likely that the developer does not want those spaces in the string. Also, the closing delimiter is generally positioned to align with the content, which further suggests that the 14 visualized spaces are insignificant.
...
Accordingly, an appropriate interpretation for the content of a text block is to differentiate incidental white space at the start and end of each line, from essential white space. The Java compiler processes the content by removing incidental white space to yield what the developer intended.
Your assumption that
Hello,
Java 13
<empty line>
equals
....Hello,
....Java 13
<empty line>
is inaccurate since those are essential white spaces and they will not be removed by either the compiler or String#stripIndent.
To make it clear, let's keep representing an incidental white space as a dot.
String hello = """
....Hello,
....Java 13
....""";
String hello2 = """
    Hello,
    Java 13
""";
Let's print them.
Hello,
Java 13
<empty line>
    Hello,
    Java 13
<empty line>
Let's call String#stripIndent on both and print the results.
Hello,
Java 13
<empty line>
    Hello,
    Java 13
<empty line>
To understand why nothing has changed, we need to look into the documentation.
String#stripIndent
Returns a string whose value is this string, with incidental white space removed from the beginning and end of every line.
Then, the minimum indentation (min) is determined as follows. For each non-blank line (as defined by isBlank()), the leading white space characters are counted. The leading white space characters on the last line are also counted even if blank. The min value is the smallest of these counts.
For each non-blank line, min leading white space characters are removed, and any trailing white space characters are removed. Blank lines are replaced with the empty string.
For both Strings, the minimum indentation is 0.
Hello,        // 0
Java 13       // 0    min(0, 0, 0) = 0
<empty line>  // 0
    Hello,    // 4
    Java 13   // 4    min(4, 4, 0) = 0
<empty line>  // 0
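The quoted rule and the counts above can be reproduced with a small hand-written version of the algorithm. This is only a sketch of the documented behavior, not the JDK source: it assumes LF-only line terminators, and the names StripIndentSketch, leadingWhitespace, minIndent and stripIndent are hypothetical.

```java
class StripIndentSketch {
    // Number of leading white space characters on a line.
    static int leadingWhitespace(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        return i;
    }

    // Smallest leading-white-space count over the non-blank lines,
    // plus the last line even if it is blank.
    static int minIndent(String s) {
        String[] lines = s.split("\n", -1); // -1 keeps a trailing empty line
        int min = Integer.MAX_VALUE;
        for (int i = 0; i < lines.length; i++) {
            boolean isLast = (i == lines.length - 1);
            if (isLast || !lines[i].isBlank()) {
                min = Math.min(min, leadingWhitespace(lines[i]));
            }
        }
        return min; // always set, because the last line is always counted
    }

    // Removes min leading characters and any trailing white space from
    // non-blank lines; blank lines become empty.
    static String stripIndent(String s) {
        int min = minIndent(s);
        String[] lines = s.split("\n", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append('\n');
            }
            if (!lines[i].isBlank()) {
                out.append(lines[i].substring(min).stripTrailing());
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String hello = "Hello,\nJava 13\n";
        String hello2 = "    Hello,\n    Java 13\n";
        System.out.println(minIndent(hello));                    // 0
        System.out.println(minIndent(hello2));                   // 0
        System.out.println(stripIndent(hello2).equals(hello2));  // true
    }
}
```

Because both strings end with a line terminator, the last line is empty and contributes a count of 0, which is why nothing is removed from hello2.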
String#stripIndent gives developers access to a Java version of the re-indentation algorithm used by the compiler.
JEP 355
The re-indentation algorithm will be normative in The Java Language Specification. Developers will have access to it via String::stripIndent, a new instance method.
Specification for JEP 355
The string represented by a text block is not the literal sequence of characters in the content. Instead, the string represented by a text block is the result of applying the following transformations to the content, in order:
Line terminators are normalized to the ASCII LF character (...)
Incidental white space is removed, as if by execution of String::stripIndent on the characters in the content.
Escape sequences are interpreted, as in a string literal.
Answer 2:
TLDR. Your example strings are not equal and it is correct that Java tells you that they are not equal.
Consider reading a description of the String.stripIndent method.
Here is a paraphrase from a jaxenter.com post:
The stripIndent method removes whitespace in front of multi-line strings that all lines have in common, i.e. moves the entire text to the left without changing the formatting.
Note the words "that all lines have in common".
Now, apply "that all lines have in common" to the following literal string:
String hello2 = """
Hello,
First, notice that the final line of this example has zero spaces.
Next, notice that all other lines of this example have non-zero spaces.
"""; // <--- This is a line in the text block.
The key takeaway is "0 != 3".
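The effect of the closing delimiter's line can be checked with the two strings from the question. This is a minimal sketch, assuming Java 15 or later (or Java 13/14 with --enable-preview); the class name ClosingDelimiterDemo and the exact column numbers are illustrative only.

```java
class ClosingDelimiterDemo {
    // Closing delimiter aligned with the content: every line, including the
    // last one, starts with the same 12 spaces, so all of them are stripped.
    static final String HELLO = """
            Hello,
            Java 13
            """;

    // Closing delimiter 4 columns to the left of the content: only the 8
    // spaces shared by all three lines are stripped, so 4 remain.
    static final String HELLO2 = """
            Hello,
            Java 13
        """;

    public static void main(String[] args) {
        System.out.println(HELLO.equals("Hello,\nJava 13\n"));           // true
        System.out.println(HELLO2.equals("    Hello,\n    Java 13\n"));  // true
        System.out.println(HELLO.equals(HELLO2));                        // false
        // The last line of HELLO2 is empty, so stripIndent() finds a
        // minimum indentation of 0 and returns an equal string.
        System.out.println(HELLO2.stripIndent().equals(HELLO2));         // true
    }
}
```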
Answer 3:
Testing with jshell:
String hello = """
    Hello,
    Java 13
    """;
hello.replace(" ", ".");
results in
"Hello,\nJava.13\n"
Note: no leading spaces at all.
String hello2 = """
        Hello,
        Java 13
    """;
hello2.replace(" ", ".");
results in
"....Hello,\n....Java.13\n"
Note that neither result has any spaces on the last line, after the final \n, so stripIndent() does not strip any spaces.
stripIndent() does the same as the compiler does with text blocks. Example:
String hello3 = ""
    + "    Hello\n"
    + "    Java13\n"
    + "  ";
hello3.stripIndent().replace(" ", ".");
results in
"..Hello\n..Java13\n"
that is, two spaces removed from all 3 lines; two spaces since the last line has 2 spaces (the other lines have more, so at most 2 spaces can be removed from all lines)
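To see that stripIndent() and the compiler agree, the same layout can be written once as a plain string and once as a text block. This is a sketch under the assumption that the plain string has 4 spaces before each word and 2 spaces on its final line; the class and field names are hypothetical, and it needs Java 15 or later (or Java 13/14 with --enable-preview).

```java
class StripIndentVsCompiler {
    // Raw content built from ordinary literals: 4 spaces before each word
    // and 2 spaces on the final line, which has no line terminator.
    static final String RAW = ""
            + "    Hello\n"
            + "    Java13\n"
            + "  ";

    // The same relative layout as a text block: the content sits 2 columns
    // to the right of the closing delimiter.
    static final String BLOCK = """
                Hello
                Java13
              """;

    public static void main(String[] args) {
        // The final line limits the common indentation, so 2 spaces remain.
        System.out.println(RAW.stripIndent().replace(" ", "."));  // ..Hello, then ..Java13
        System.out.println(RAW.stripIndent().equals(BLOCK));      // true
    }
}
```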
Source: https://stackoverflow.com/questions/58030419/how-the-intents-processed-in-a-text-blockjava-13