Remove all empty lines

误落风尘 2020-11-27 17:22

I thought this wouldn't be hard to do, but I'm stuck: in Java, I want to remove all empty lines (including lines that contain only blanks and tabs) with String.replaceAll.

My regex look

7 answers
  • 2020-11-27 17:52

    I'm not a day-to-day Java programmer, so I'm surprised there isn't a simpler way to do this in the JDK than a regex.

    Anyway,

    s = s.replaceAll("\n+", "\n");
    

    would be a bit simpler.

    Update:

    Sorry I missed that you wanted to also remove spaces and tabs.

    s = s.replaceAll("\n[ \t]*\n", "\n");
    

    Would work if you have consistent newlines. If not, you may want to consider making them consistent. E.g.:

    s = s.replaceAll("[\n\r]+", "\n");
    s = s.replaceAll("\n[ \t]*\n", "\n");
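
On the question of a simpler JDK way: from Java 11 on, String.lines() and String.isBlank() can do this without a regex. The following is a minimal sketch under that assumption (Java 11+). The class and method names are made up for illustration, and the result always uses \n as the separator, with no trailing line break.

```java
import java.util.stream.Collectors;

class BlankLineFilter {
    // Hypothetical helper: keeps only lines that contain a non-whitespace character.
    static String removeBlankLines(String s) {
        return s.lines()                          // splits on \n, \r and \r\n
                .filter(line -> !line.isBlank())  // drops empty and whitespace-only lines
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        System.out.println(removeBlankLines("a\n \t\n\r\nb\n\n"));
    }
}
```

Because lines() already understands \n, \r and \r\n, no separate newline normalization step is needed with this approach.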
    
  • 2020-11-27 17:52

    If you want to remove the lines in Microsoft Office, on Windows, or in any text editor that supports regular-expression search:

     1. Press Ctrl + F.
     2. Check the regular expression checkbox.
     3. Enter the expression ^\s*\n into the find box exactly as shown.
    

    Replace every match with nothing and all the blank lines in your editor disappear.

  • 2020-11-27 18:03

    Here is some code that does not use a regex. Apart from the JDK it only needs org.apache.commons.lang3.StringUtils; fichier is the file to clean.

      import java.io.BufferedWriter;
      import java.io.File;
      import java.io.FileWriter;
      import java.io.IOException;
      import java.util.Scanner;
      import org.apache.commons.lang3.StringUtils;

      // The method name is only illustrative; fichier is the file to clean.
      static void removeBlankLines(File fichier) {
        File temporaire = new File("temp.txt");
        try (Scanner scanner = new Scanner(fichier);
             BufferedWriter bw = new BufferedWriter(new FileWriter(temporaire))) {
          while (scanner.hasNextLine()) {
            String line = StringUtils.stripEnd(scanner.nextLine(), null); // Clean blanks at the end of the line
            if (StringUtils.isNotBlank(line)) {
              bw.write(line); // Keep the line only if not blank
              if (scanner.hasNextLine()) {
                // Go to next line (Win, Mac, Unix) if there is one
                bw.write(System.lineSeparator());
              }
            }
          }
        }
        catch (IOException e) { // Also covers FileNotFoundException
          System.out.println(e.getMessage());
          return; // Leave the original file untouched on failure
        }
        // Both streams are closed at this point, so the temp file can replace the original
        fichier.delete();
        temporaire.renameTo(fichier);
      }
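
If pulling in Commons Lang is not an option, the same idea can be sketched with java.nio.file.Files alone. This is an illustration only: the class and method names are assumptions, the file is read and written as UTF-8 (the default for these Files methods), and the whole file is held in memory, so it suits small files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

class BlankLineFileCleaner {
    // Hypothetical helper: rewrites the file in place without blank lines.
    static void removeBlankLines(Path file) throws IOException {
        List<String> kept = Files.readAllLines(file).stream()
                .filter(line -> !line.trim().isEmpty()) // skip empty and whitespace-only lines
                .collect(Collectors.toList());
        // Files.write ends every line with the platform line separator
        Files.write(file, kept);
    }
}
```

Unlike the version above, this one keeps trailing blanks on non-blank lines; strip them in the stream if that matters.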
    
  • 2020-11-27 18:04

    Bart Kiers's answer misses the edge case where the last line of the string is empty or contains only whitespace.

    If you try

    String text = "line 1\n\nline 3\n\n\nline 5\n "; // <-- Mind the \n plus space at the end!
    String adjusted = text.replaceAll("(?m)^[ \t]*\r?\n", "");
    

    you'll get a String that equals this

    "line 1\nline 3\nline 5\n " // <-- MIND the \n plus space at the end!
    

    as result.

    I expanded Bart Kiers' answer to also cover this case.

    My regex pattern is:

    String pattern = "(?m)^\\s*\\r?\\n|\\r?\\n\\s*(?!.*\\r?\\n)";
    

    A little explanation:

    The first part of the pattern is basically the same as Bart Kiers'. It is fine, but it does not remove an "empty" last line or a last line containing only whitespace.

    That is because a last line containing just whitespace does not end with \\r?\\n and would therefore not be matched and replaced. We need something to express this edge case. That's where the second part (after the |) comes in.

    It uses a regular expression feature called negative lookahead: the (?!.*\\r?\\n) part of the pattern. (?! marks the beginning of the lookahead. You could read it as: match the expression before the lookahead only if it is not followed by whatever the lookahead describes. In our case that is any characters (zero or more) followed by an optional carriage return and a newline: .*\\r?\\n. The ) closes the lookahead. The lookahead itself is not part of the match. One caveat: the lookahead only inspects the rest of the current line, so the pattern assumes the last line with content is followed by a line break. For an input such as "a\nb" the second part also matches the only line break and joins the two lines.
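
To make the behaviour of the two alternatives easier to try out, here is a small sketch that wraps the pattern from this answer in a helper. It also tries a variant that anchors the second alternative with \z (end of input) instead of the lookahead. The variant, the class name and the method names are assumptions for illustration and not part of the original answer.

```java
import java.util.regex.Pattern;

class TrailingBlankLines {
    // Pattern from the answer: blank lines, plus a trailing line break with optional whitespace.
    static final Pattern WITH_LOOKAHEAD =
            Pattern.compile("(?m)^\\s*\\r?\\n|\\r?\\n\\s*(?!.*\\r?\\n)");

    // Assumed variant: the second alternative only matches directly before the end of input.
    static final Pattern WITH_END_ANCHOR =
            Pattern.compile("(?m)^\\s*\\r?\\n|\\r?\\n\\s*\\z");

    static String strip(Pattern pattern, String text) {
        return pattern.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String trailingBlank = "line 1\n\nline 3\n\n\nline 5\n ";
        // Both patterns drop the blank lines and the whitespace-only last line.
        System.out.println(strip(WITH_LOOKAHEAD, trailingBlank));
        System.out.println(strip(WITH_END_ANCHOR, trailingBlank));

        String noTrailingBreak = "a\nb";
        // The lookahead version removes the only line break here and yields "ab";
        // the \z version leaves the string unchanged.
        System.out.println(strip(WITH_LOOKAHEAD, noTrailingBreak));
        System.out.println(strip(WITH_END_ANCHOR, noTrailingBreak));
    }
}
```

Which form is preferable depends on whether the input is guaranteed to end with a line break.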

    If I execute the following code snippet:

    String pattern = "(?m)^\\s*\\r?\\n|\\r?\\n\\s*(?!.*\\r?\\n)";
    String replacement = "";
    String inputString =
            "\n" +
            "Line  2 - above line is empty without spaces\n" +
            "Line  3 - next is empty without whitespaces\n" +
            "\n" +
            "Line  5 - next line is with whitespaces\n" +
            "        \n" +
            "Line  7 - next 2 lines are \"empty\". First one with whitespaces.\n" +
            "        \r\n" +
            "\n" +
            "Line 10 - 3 empty lines follow. The 2nd one with whitespaces in it. One whitespace at the end of this line " +
            "\n" +
            "          \n" +
            "\n";
    
    String ajdustedString = inputString.replaceAll(pattern, replacement);
    System.out.println("inputString:");
    System.out.println("+----");
    System.out.println(inputString);
    System.out.println("----+");
    System.out.println("ajdustedString:");
    System.out.println("+----");
    System.out.print(ajdustedString); //MIND the "print" instead of "println"
    System.out.println("|EOS"); //String to clearly mark the _E_nd _O_f the adjusted_S_tring
    System.out.println("----+");
    

    I get:

    inputString:
    +----
    
    Line  2 - above line is empty without spaces
    Line  3 - next is empty without whitespaces
    
    Line  5 - next line is with whitespaces
    
    Line  7 - next 2 lines are "empty". First one with whitespaces.
    
    
    Line 10 - 3 empty lines follow. The 2nd one with whitespaces in it. One whitespace at the end of this line
    
    
    
    ----+
    ajdustedString:
    +----
    Line  2 - above line is empty without spaces
    Line  3 - next is empty without whitespaces
    Line  5 - next line is with whitespaces
    Line  7 - next 2 lines are "empty". First one with whitespaces.
    Line 10 - 3 empty lines follow. The 2nd one with whitespaces in it. One whitespace at the end of this line |EOS
    ----+
    

    If you want to learn more about lookahead and lookbehind, see Regex Tutorial - Lookahead and Lookbehind Zero-Length Assertions.

  • 2020-11-27 18:14

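    You can remove empty lines from your text using the following code:

    String test = plainTextWithEmptyLines.replaceAll("[\\r\\n]+", "\n");
    

    Here, plainTextWithEmptyLines denotes the string containing the empty lines. [\\r\\n]+ is the regex pattern that matches a run of consecutive line breaks, and each run is replaced with a single \n. (Replacing with "" instead would remove every line break and join all lines into one.) Lines that contain only spaces or tabs are not removed by this pattern.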

  • 2020-11-27 18:17

    Try this:

    String text = "line 1\n\nline 3\n\n\nline 5";
    String adjusted = text.replaceAll("(?m)^[ \t]*\r?\n", "");
    // ...
    

    Note that the regex [ |\t] matches a space, a tab or a pipe char!

    EDIT

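    B.t.w., the regex (?m)^\s+$ looks like a shorter alternative, but it stops before the line terminator, so it only strips whitespace from blank lines and leaves the lines themselves in place.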
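
A quick way to compare the two patterns from this answer is to run both on the sample string. The sketch below is only an illustration (the class and method names are assumptions). It suggests that the ^[ \t]*\r?\n form removes the blank lines, while ^\s+$ leaves line breaks behind, because $ matches before the line terminator rather than consuming it.

```java
class BlankLineRegexCheck {
    // Pattern from the code sample: consumes the blank line including its line break.
    static String withLineBreak(String text) {
        return text.replaceAll("(?m)^[ \t]*\r?\n", "");
    }

    // Pattern from the EDIT note: $ is zero-width, so the last line break is not consumed.
    static String withDollar(String text) {
        return text.replaceAll("(?m)^\\s+$", "");
    }

    public static void main(String[] args) {
        String text = "line 1\n\nline 3\n\n\nline 5";
        System.out.println("[" + withLineBreak(text) + "]");
        System.out.println("[" + withDollar(text) + "]");
    }
}
```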
