Extracting bbcode quote using java and Android but not extracting the content within the quote tag

Submitted by 无人久伴 on 2021-02-16 15:18:09

Question


I am trying to extract the content of BBCode quote tags, but the actual output is not what I expect.

I would like to implement a BBCode parsing module that extracts the quotes and produces the desired output below. Since the number of quotes can vary, this may need a recursive method or something similar.

Input:

Testing [quote]http://www.yourube.com?watch?v=asasdsadsa [url] aisa [/url] [/quote] Testing 

Desired output:

Testing http://www.yourube.com?watch?v=asasdsadsa [url] aisa [/url] aisa Testing

Actual Output:

http://www.yourube.com?watch?v=asasdsadsa [url] aisa [/url]
http://www.yourube.com?watch?v=asasdsadsa  aisa 

Below is my code:

        String s = "[quote]http://www.yourube.com?watch?v=asasdsadsa [url] aisa [/url][/quote]";
        String t = bbcode(s);
        System.out.println(t);
        String u = bbcode2(t);
        System.out.println(u);

    public static String bbcode(String text) {
        String html = text;

        HashMap<String, String> bbMap = new HashMap<String, String>();

        bbMap.put("\\[quote\\](.+?)\\[/quote\\]", "$1");

        for (Map.Entry<String, String> entry : bbMap.entrySet()) {
            html = html.replaceAll(entry.getKey(), entry.getValue());
        }

        return html;
    }

    public static String bbcode2(String text) {
        String html = text;

        HashMap<String, String> bbMap = new HashMap<String, String>();

        bbMap.put("\\[quote\\](.+?)\\[/quote\\]", "$1");

        bbMap.put("\\[url\\](.+?)\\[/url\\]", "$1");

        for (Map.Entry<String, String> entry : bbMap.entrySet()) {
            html = html.replaceAll(entry.getKey(), entry.getValue());
        }

        return html;
    }

Answer 1:


This is the general Java regex to match pairs of BB Code tags:

\\[([^\\]]+)\\](.+?)\\[/\\1\\]

This will grab top-level matches, e.g. in [a][b] hi [/b] hello [/a][c] yo [/c], group 2 will match [b] hi [/b] hello and yo.
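To make the top-level behaviour concrete, here is a minimal sketch that applies the regex above to that sample string and collects group 2 of each match. The class name TopLevelTags and the helper topLevelContents are made up for this illustration; only the pattern comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TopLevelTags {
    // The pattern from the answer: [name] ... [/name], where \1 forces the
    // closing tag to carry the same name as the opening tag.
    static final Pattern TAG_PAIR =
            Pattern.compile("\\[([^\\]]+)\\](.+?)\\[/\\1\\]");

    // Hypothetical helper: returns group 2 (the inner text) of every
    // top-level tag pair, in left-to-right order.
    static List<String> topLevelContents(String input) {
        List<String> contents = new ArrayList<>();
        Matcher m = TAG_PAIR.matcher(input);
        while (m.find()) {
            contents.add(m.group(2));
        }
        return contents;
    }

    public static void main(String[] args) {
        String input = "[a][b] hi [/b] hello [/a][c] yo [/c]";
        for (String content : topLevelContents(input)) {
            // Quotes make the surrounding whitespace visible.
            System.out.println("'" + content + "'");
        }
    }
}
```

Note that the nested [b] pair is not reported separately here; it only appears as part of the content of [a].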


Any regex solution is, in my opinion, going to require you to use recursion (outside of the regex) to find all matches. You're going to have to find all top-level matches (add them to some array), then recursively use the same regex on each of the matches (adding them all to the same result array) until eventually no more matches can be found.

In that example you can see you'd then need to run the regex again on [b] hi [/b] hello to return the content of [b] hi [/b], which is hi.

For example, given this input:

[A] outer [B] [C] last one left [/C] middle [/B] [/A]  [A] out [B] in [/B] [/A]

First, you run the regex against that string and look at the group 2 matches:

outer [B] [C] last one left [/C] middle [/B]
out [B] in [/B]

Add those to the result array, then you run the regex against those matches and get:

 [C] last one left [/C] middle
 in

Add those to the result array, and again run it against those matches and get:

 last one left
 [no matches]

And finally you'd run it against last one left and get no more matches, so you're done.
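The level-by-level walkthrough above can be sketched as a loop that feeds each round of matches back into the regex. This is only an illustration of the walkthrough, not the answer's own code; the class name TagLevels and the method levels are assumptions introduced here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagLevels {
    static final Pattern TAG_PAIR =
            Pattern.compile("\\[([^\\]]+)\\](.+?)\\[/\\1\\]");

    // Hypothetical helper: element 0 holds the contents of the top-level
    // tags, element 1 the contents found inside those, and so on. The loop
    // stops at the first round that produces no matches.
    static List<List<String>> levels(String input) {
        List<List<String>> result = new ArrayList<>();
        List<String> current = Collections.singletonList(input);
        while (true) {
            List<String> next = new ArrayList<>();
            for (String s : current) {
                Matcher m = TAG_PAIR.matcher(s);
                while (m.find()) {
                    next.add(m.group(2));
                }
            }
            if (next.isEmpty()) {
                break;
            }
            result.add(next);
            current = next;
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "[A] outer [B] [C] last one left [/C] middle [/B] [/A]"
                + "  [A] out [B] in [/B] [/A]";
        List<List<String>> all = levels(input);
        for (int i = 0; i < all.size(); i++) {
            System.out.println("round " + (i + 1) + ": " + all.get(i));
        }
    }
}
```

For the sample input this gives three rounds, matching the three "add those to the result array" steps described above.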

Raju, if you're unfamiliar with recursion it would be very beneficial for you to stop reading at this point and attempt to solve the problem yourself - come back if you give up. That said...


A Java solution to this problem is:

public static void getAllMatches(Pattern p, String in, List<String> out) {
  Matcher m = p.matcher(in);           // get matches in input
  while (m.find()) {                   // for each match
    out.add(m.group(2));               // add match to result array
    getAllMatches(p, m.group(2), out); // call function again with match as input
  }
}

And here is a working example on ideone

ideone output:

[A]outer[B][C]last one left[/C]middle[/B][/A] [A]out[B]in[/B][/A]
-----------
- outer[B][C]last one left[/C]middle[/B]
- [C]last one left[/C]middle
- last one left
- out[B]in[/B]
- in

[quote]http://www.yourube.com?watch?v=asasdsadsa [url]aisa[/url] [/quote]
-----------
- http://www.yourube.com?watch?v=asasdsadsa [url]aisa[/url] 
- aisa
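For readers who want to run this locally rather than on ideone, here is a self-contained sketch that wraps the getAllMatches method shown earlier in a class with its imports. The class name BbCodeMatches and the constant TAG_PAIR are assumptions for this example; the method body is the one from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BbCodeMatches {
    // The general tag-pair regex from the answer.
    static final Pattern TAG_PAIR =
            Pattern.compile("\\[([^\\]]+)\\](.+?)\\[/\\1\\]");

    public static void getAllMatches(Pattern p, String in, List<String> out) {
        Matcher m = p.matcher(in);           // get matches in input
        while (m.find()) {                   // for each match
            out.add(m.group(2));             // add match to result list
            getAllMatches(p, m.group(2), out); // recurse into the match
        }
    }

    public static void main(String[] args) {
        String input = "[quote]http://www.yourube.com?watch?v=asasdsadsa "
                + "[url]aisa[/url] [/quote]";
        List<String> out = new ArrayList<>();
        getAllMatches(TAG_PAIR, input, out);
        for (String s : out) {
            System.out.println("- " + s);
        }
    }
}
```

Two limitations are worth keeping in mind. The dot in (.+?) does not match line terminators unless the pattern is compiled with Pattern.DOTALL, so quotes spanning several lines would need that flag. Also, because the match is lazy, an opening tag is paired with the nearest closing tag of the same name, so a quote nested directly inside another quote would not be paired correctly by this regex.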



Answer 2:


Not the neatest way, but here is a non-regex approach:

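// `string` is the text to scan for [quote]...[/quote] pairs.
int lastIndex = 0;
String startString = "[quote]";
String endString = "[/quote]";
int start;
int end;
while (lastIndex != -1) {
   start = string.indexOf(startString, lastIndex);
   lastIndex = start;
   if (lastIndex == -1) {
      break;                       // no further opening tag
   }
   end   = string.indexOf(endString, lastIndex);
   lastIndex = end;
   if (lastIndex == -1) {
      break;                       // opening tag without a closing tag
   }
   // length() is a method on String, and substring's end index is
   // exclusive, so passing `end` stops just before "[/quote]".
   System.out.println(string.substring(
       start + startString.length(),
       end));
}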
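The same indexOf scan can be packaged as a method that returns the quote bodies instead of printing them. This is a sketch under the assumption that quotes are not nested inside each other; the names QuoteScanner and quoteBodies are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteScanner {
    // Hypothetical helper: collects the text between each [quote] and the
    // next [/quote]. Nested quotes are not handled.
    static List<String> quoteBodies(String text) {
        final String open = "[quote]";
        final String close = "[/quote]";
        List<String> bodies = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = text.indexOf(open, from);
            if (start == -1) {
                break;                          // no further opening tag
            }
            int bodyStart = start + open.length();
            int end = text.indexOf(close, bodyStart);
            if (end == -1) {
                break;                          // unterminated quote
            }
            bodies.add(text.substring(bodyStart, end));
            from = end + close.length();        // resume after the closing tag
        }
        return bodies;
    }

    public static void main(String[] args) {
        String input = "Testing [quote]http://www.yourube.com?watch?v=asasdsadsa"
                + " [url] aisa [/url] [/quote] Testing";
        for (String body : quoteBodies(input)) {
            System.out.println(body);
        }
    }
}
```

Returning a list keeps the scanning logic separate from the output, so the caller can decide whether to print the bodies or process them further.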


Source: https://stackoverflow.com/questions/20313496/extracting-bbcode-quote-using-java-and-android-but-not-extracting-the-content-wi
