Deleting all regex instances starting with char '[' and ending with char ']' from a String

社会主义新天地 submitted on 2020-01-06 02:14:06

Question


I need to take a String and delete every substring in it that starts with the character '[' and ends with the character ']'.

I don't know how to tackle this problem. I tried converting the String to a character array, overwriting every character from each opening '[' through its closing ']' with a blank, and then converting the array back to a String with the toString() method.

My code:

char[] lyricsArray = lyricsParagraphElements.get(1).text().toCharArray();
for (int i = 0; i < lyricsArray.length; i++)
{
    if (lyricsArray[i] == '[')
    {
        lyricsArray[i] = ' ';
        for (int j = i + 1; j < lyricsArray.length; j++)
        {
            if (lyricsArray[j] == ']')
            {
                lyricsArray[j] = ' ';
                i = j + 1;
                break;
            }
            lyricsArray[j] = ' ';
        }
    }
}
String songLyrics = lyricsArray.toString();
System.out.println(songLyrics);

But when I print songLyrics I get weird output like:

[C@71bc1ae4
[C@6ed3ef1
[C@2437c6dc
[C@1f89ab83
[C@e73f9ac
[C@61064425
[C@7b1d7fff
[C@299a06ac
[C@383534aa
[C@6bc168e5

I guess there is a simple method for it. Any help will be very appreciated.

For clarification: converting "abcd[dsadsadsa]efg[adf%@1]d" into "abcdefgd".


Answer 1:


Or simply use a regular expression to replace all occurrences of \\[.*?\\] with nothing:

String songLyrics = text.replaceAll("\\[.*?\\]", "");

Where text is of course:

String text = lyricsParagraphElements.get(1).text();

What does \\[.*?\\] mean?

The first parameter of replaceAll is a string describing a regular expression. A regular expression defines a pattern to match in a string.

So let's split it up:

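\\[ matches exactly the character [. Since [ has a special meaning within a regular expression, it needs to be escaped with a backslash, and since the backslash is itself special in a Java string literal, that backslash has to be doubled (hence "escaped twice").

. matches any character (except line terminators, by default). Combined with the lazy zero-or-more operator *?, it matches as few characters as possible until it finally finds: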

\\], which matches the character ]. Note the escaping again.
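
To see why the lazy quantifier matters, here is a small sketch that runs the greedy and the lazy pattern on the sample string from the question. The class and method names (BracketRegexDemo, stripGreedy, stripLazy) are made up for illustration and are not part of the original answer.

```java
public class BracketRegexDemo {
    // Greedy: .* runs to the LAST ']' in the input, so the text between
    // two bracketed groups is removed as well.
    public static String stripGreedy(String text) {
        return text.replaceAll("\\[.*\\]", "");
    }

    // Lazy: .*? stops at the FIRST ']' after each '[', so every group
    // is removed separately.
    public static String stripLazy(String text) {
        return text.replaceAll("\\[.*?\\]", "");
    }

    public static void main(String[] args) {
        String sample = "abcd[dsadsadsa]efg[adf%@1]d";
        System.out.println(stripGreedy(sample)); // abcdd
        System.out.println(stripLazy(sample));   // abcdefgd
    }
}
```

Because . does not match line terminators by default, a bracketed group that spans several lines is left untouched by both patterns unless the DOTALL flag is enabled, for example by prefixing the pattern with (?s).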




Answer 2:


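Calling toString() on a char[] does not build a String from its characters. Arrays inherit Object.toString(), which returns the type name and hash code (for example [C@71bc1ae4), and that is what ends up in songLyrics and gets printed.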

String songLyrics = lyricsArray.toString();
System.out.println(songLyrics);

Replace the two lines above with:

String songLyrics = new String(lyricsArray);
System.out.println(songLyrics);


Another way, without converting to a char array and back to a String:

String lyricsParagraphElements = "asdasd[asd]";

String songLyrics = lyricsParagraphElements.replaceAll("\\[.*?\\]", ""); // lazy .*? so each [...] group is removed separately

System.out.println(songLyrics);





Answer 3:


You're printing a char[], and Java arrays do not override toString(). Also, a Java String is immutable, but Java has StringBuilder, which is mutable (StringBuilder.delete(int, int) can remove arbitrary substrings). You could use it like this:

String songLyrics = lyricsParagraphElements.get(1).text();
StringBuilder sb = new StringBuilder(songLyrics);
int p = 0;
while ((p = sb.indexOf("[", p)) >= 0) {
    int e = sb.indexOf("]", p + 1);
    if (e < 0) {
        break; // no closing ']' left: keep the rest as it is
    }
    // Delete "[...]" and stay at p, since the next character may be another '['.
    sb.delete(p, e + 1);
}
System.out.println(sb);



Answer 4:


You are getting "weird stuff" because you are printing the string representation of the array, not converting the array to a String.

Instead of lyricsArray.toString(), use

new String(lyricsArray);

But if you do this, you will find that you are not actually removing characters from the string, just replacing them with spaces.

Instead, you can shift all of the characters left in the array, and construct the new String only up to the right number of characters:

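int src = 0, dst = 0;
while (src < lyricsArray.length) {
  // Copy characters to the left until the next '[' (or the end of the array).
  while (src < lyricsArray.length && lyricsArray[src] != '[') {
    lyricsArray[dst++] = lyricsArray[src++];
  }
  if (src < lyricsArray.length) {
    // Skip the '[' and everything up to and including the next ']'.
    ++src;
    while (src - 1 < lyricsArray.length && lyricsArray[src - 1] != ']') {
      src++;
    }
  }
}
// Only the first dst characters hold the text that was kept.
String lyricsString = new String(lyricsArray, 0, dst);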



Answer 5:


This regex matches your case exactly:

\\[([\\w\\%\\@]+)\\]

It gets harder when your plain string contains special symbols. I can't find a shorter regex that avoids listing each special symbol explicitly. Reference: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#cg

================

I've read your new case. If the string contains "-" or any other symbol from !"#$%&'()*+,-./:;<=>?@\^_`{|}~, add it (prefixed with "\\") after \\@ in my regex.
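
As an alternative to listing every allowed symbol, a negated character class can match anything that is not the closing bracket. This is a different technique from the pattern in this answer, shown only as a sketch; the class and method names are hypothetical.

```java
public class NegatedClassDemo {
    // \[      a literal '['
    // [^\]]*  zero or more characters that are not ']'
    // \]      a literal ']'
    public static String stripBracketed(String text) {
        return text.replaceAll("\\[[^\\]]*\\]", "");
    }

    public static void main(String[] args) {
        System.out.println(stripBracketed("abcd[dsadsadsa]efg[adf%@1]d")); // abcdefgd
        System.out.println(stripBracketed("a[x-y!#$]b"));                  // ab
    }
}
```

Unlike the . wildcard, the negated class also matches line terminators, so a bracketed group that spans several lines is removed too.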



Source: https://stackoverflow.com/questions/36136162/deleting-all-regex-instances-starting-with-char-and-ending-with-char-fro
