I thought this wouldn't be hard to do: I want to remove all empty lines (or lines containing only blanks and tabs) from a string in Java with String.replaceAll.
My regex look
I'm not a day-to-day Java programmer, so I'm surprised there isn't a simpler way to do this in the JDK than a regex.
Anyway,
s = s.replaceAll("\n+", "\n");
would be a bit simpler.
Update:
Sorry I missed that you wanted to also remove spaces and tabs.
s = s.replaceAll("\n[ \t]*\n", "\n");
Would work if you have consistent newlines. If not, you may want to consider making them consistent. E.g.:
s = s.replaceAll("[\n\r]+", "\n");
s = s.replaceAll("\n[ \t]*\n", "\n");
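One thing to watch with the two-step version: matches cannot overlap, so when two whitespace-only lines are adjacent, the second replaceAll may leave one of them behind. If Java 11 or later is available, the JDK also has a regex-free route via String.lines() and String.isBlank(). The following is only a minimal sketch; the class and method names are made up for illustration, and it assumes that joining the kept lines with \n is acceptable.

```java
import java.util.stream.Collectors;

class BlankLineRemover {
    // Hypothetical helper: drops empty and whitespace-only lines (needs Java 11+).
    static String removeBlankLines(String text) {
        return text.lines()                       // splits on \n, \r and \r\n
                .filter(line -> !line.isBlank())  // isBlank() is true for "", " ", "\t", ...
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        String input = "line 1\r\n\r\nline 3\n \t \n\nline 5\n ";
        System.out.println(removeBlankLines(input));
    }
}
```

Note that this also normalizes every line break to \n and drops a trailing line break, which may or may not be what you want.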
If you want to remove the lines in Microsoft Office, Windows, or a text editor that supports regular expressions:
1. Press <kbd>Ctrl</kbd> + <kbd>F</kbd>.
2. Check the regular expression checkbox.
3. Enter the expression ^\s*\n into the find box exactly as written.
You will see all the blank spaces in your editor disappear...
Here is some code that does not use a regex. It only needs import org.apache.commons.lang3.StringUtils;
File temporaire = new File("temp.txt");
try {
Scanner scanner = new Scanner(yourfile);
BufferedWriter bw = new BufferedWriter(new FileWriter(temporaire));
while (scanner.hasNextLine()) {
String line = StringUtils.stripEnd(scanner.nextLine(),null); // Clean blanks at the end of the line
if (StringUtils.isNotBlank(line)) {
bw.write(line); // Keep the line only if not blank
if (scanner.hasNextLine()){
// Go to next line (Win,Mac,Unix) if there is one
bw.write(System.getProperty("line.separator"));
}
}
bw.flush();
}
scanner.close();
bw.close();
yourfile.delete(); // Replace the original file with the filtered copy
temporaire.renameTo(yourfile);
}
catch (FileNotFoundException e) {
System.out.println(e.getMessage());
}
catch (IOException e) {
System.out.println(e.getMessage());
}
}
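If you would rather avoid the Commons Lang dependency, roughly the same thing can be sketched with java.nio.file alone. This is only an illustration under a few assumptions: Java 11+ (for stripTrailing), a UTF-8 file small enough to read into memory, and a made-up class and method name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

class BlankLineFileFilter {
    // Hypothetical helper: rewrites the file in place without blank lines.
    static void removeBlankLines(Path file) throws IOException {
        List<String> kept = Files.readAllLines(file).stream()
                .map(String::stripTrailing)       // clean blanks at the end of the line
                .filter(line -> !line.isEmpty())  // keep the line only if something is left
                .collect(Collectors.toList());
        // Files.write ends every line with the platform line separator.
        Files.write(file, kept);
    }

    public static void main(String[] args) throws IOException {
        removeBlankLines(Path.of(args[0]));
    }
}
```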
Bart Kiers's answer misses the edge case where the last line of the string is empty or contains only whitespace.
If you try
String text = "line 1\n\nline 3\n\n\nline 5\n "; // <-- Mind the \n plus space at the end!
String adjusted = text.replaceAll("(?m)^[ \t]*\r?\n", "");
you'll get a String that equals this
"line 1\nline 3\nline 5\n " // <-- MIND the \n plus space at the end!
as result.
I expanded Bart Kiers' answer to also cover this case.
My regex pattern is:
String pattern = "(?m)^\\s*\\r?\\n|\\r?\\n\\s*(?!.*\\r?\\n)";
A little explanation:
The first part of the pattern is basically the same as Bart Kiers'. It is fine, but it does not remove an "empty" last line or a last line containing only whitespace.
That is because a last line containing only whitespace does not end with \\r?\\n and would therefore not be matched and replaced. We need something to express this edge case. That's where the second part (after the |) comes in.
It uses a regular expression feature: negative lookahead. That's the (?!.*\\r?\\n) part of the pattern. (?! marks the beginning of the lookahead. You could read it as: match the expression before the lookahead only if it is not followed by what the lookahead describes. In our case, that is any characters (zero or more) followed by an optional carriage return and a newline: .*\\r?\\n. Since . does not match line breaks by default, the lookahead only inspects the rest of the current line. The ) closes the lookahead. The lookahead itself is not part of the match.
If I execute the following code snippet:
String pattern = "(?m)^\\s*\\r?\\n|\\r?\\n\\s*(?!.*\\r?\\n)";
String replacement = "";
String inputString =
"\n" +
"Line 2 - above line is empty without spaces\n" +
"Line 3 - next is empty without whitespaces\n" +
"\n" +
"Line 5 - next line is with whitespaces\n" +
" \n" +
"Line 7 - next 2 lines are \"empty\". First one with whitespaces.\n" +
" \r\n" +
"\n" +
"Line 10 - 3 empty lines follow. The 2nd one with whitespaces in it. One whitespace at the end of this line " +
"\n" +
" \n" +
"\n";
String ajdustedString = inputString.replaceAll(pattern, replacement);
System.out.println("inputString:");
System.out.println("+----");
System.out.println(inputString);
System.out.println("----+");
System.out.println("ajdustedString:");
System.out.println("+----");
System.out.print(ajdustedString); //MIND the "print" instead of "println"
System.out.println("|EOS"); //String to clearly mark the _E_nd _O_f the adjusted_S_tring
System.out.println("----+");
I get:
inputString:
+----

Line 2 - above line is empty without spaces
Line 3 - next is empty without whitespaces

Line 5 - next line is with whitespaces
 
Line 7 - next 2 lines are "empty". First one with whitespaces.
 

Line 10 - 3 empty lines follow. The 2nd one with whitespaces in it. One whitespace at the end of this line 
 


----+
ajdustedString:
+----
Line 2 - above line is empty without spaces
Line 3 - next is empty without whitespaces
Line 5 - next line is with whitespaces
Line 7 - next 2 lines are "empty". First one with whitespaces.
Line 10 - 3 empty lines follow. The 2nd one with whitespaces in it. One whitespace at the end of this line |EOS
----+
If you want to learn more about lookahead/lookbehind, see Regex Tutorial - Lookahead and Lookbehind Zero-Length Assertions.
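To see the lookahead at work, here is a small sketch that applies the pattern from this answer to the earlier example string. One caveat from my own reading of the pattern: if the input does not end with a line break or a blank last line, the lookahead also succeeds at the last line break in the text, so the final two lines get joined. The second pattern below is a hypothetical variant (an assumption of mine, not part of the answer) that uses \z to anchor the second alternative to the very end of the input instead.

```java
class TrailingBlankLineDemo {
    // Pattern quoted from the answer above.
    static final String ANSWER_PATTERN = "(?m)^\\s*\\r?\\n|\\r?\\n\\s*(?!.*\\r?\\n)";

    // Hypothetical variant (assumption): only strip a trailing blank line
    // when it really sits at the end of the input (\z).
    static final String ANCHORED_PATTERN = "(?m)^[ \\t]*\\r?\\n|\\r?\\n[ \\t]*\\z";

    static String strip(String text, String pattern) {
        return text.replaceAll(pattern, "");
    }

    // Makes line breaks visible when printing.
    static String show(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        String trailingBlank = "line 1\n\nline 3\n\n\nline 5\n ";
        String noTrailingBreak = "a\nb";

        System.out.println(show(strip(trailingBlank, ANSWER_PATTERN)));
        System.out.println(show(strip(noTrailingBreak, ANSWER_PATTERN)));
        System.out.println(show(strip(trailingBlank, ANCHORED_PATTERN)));
        System.out.println(show(strip(noTrailingBreak, ANCHORED_PATTERN)));
    }
}
```

Both patterns also drop a single trailing line break at the very end of the input, so add one back if your output needs it.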
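You can remove empty lines from your text using the following code:
String test = plainTextWithEmptyLines.replaceAll("[\\r\\n]+", "\n");
Here, plainTextWithEmptyLines is the string that contains the empty lines, and [\\r\\n]+ matches each run of consecutive line breaks. Each run is replaced with a single \n; replacing it with an empty string instead would join all the lines together. Lines that contain only spaces or tabs are not removed by this pattern.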
Try this:
String text = "line 1\n\nline 3\n\n\nline 5";
String adjusted = text.replaceAll("(?m)^[ \t]*\r?\n", "");
// ...
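Note that the regex [ |\t] matches a space, a tab or a pipe char!
B.t.w., the regex (?m)^\s+$ may look like a shorter alternative, but test it before relying on it: \s also matches line breaks and $ only matches right before a line break (or at the end of the input), so it can leave blank lines behind.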
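For anyone who wants to check these claims, here is a minimal sketch. The class and method names are made up for illustration; the first pattern is the one from this answer.

```java
class BlankLineRegexDemo {
    // Removes lines that are empty or contain only spaces and tabs.
    static String removeBlankLines(String text) {
        return text.replaceAll("(?m)^[ \t]*\r?\n", "");
    }

    // Makes line breaks visible when printing.
    static String show(String s) {
        return s.replace("\n", "\\n");
    }

    public static void main(String[] args) {
        String text = "line 1\n\nline 3\n\n\nline 5";
        System.out.println(show(removeBlankLines(text)));

        // Inside a character class, | is a literal pipe, not an alternation.
        System.out.println("|".matches("[ |\t]"));
        System.out.println("|".matches("[ \t]"));

        // For comparison: (?m)^\s+$ applied to a single empty line.
        System.out.println(show("line 1\n\nline 3".replaceAll("(?m)^\\s+$", "")));
    }
}
```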