How to create article spinner regex in Java?

Submitted by こ雲淡風輕ζ on 2019-12-10 00:03:35

Question


Say for example I want to take this phrase:

{{Hello|What's Up|Howdy} {world|planet} | {Goodbye|Later} {people|citizens|inhabitants}}

and randomly make it into one of the following:

Hello world
Goodbye people
What's Up world
What's Up planet
Later citizens
etc.

The basic idea is that every pair of braces encloses any number of choices separated by "|". The program needs to go through and randomly pick one choice for each set of braces. Keep in mind that braces can be nested to any depth. I found a thread about this and tried to convert its code to Java, but it did not work. Here is the Python code that supposedly worked:

import re
from random import randint

def select(m):
    choices = m.group(1).split('|')
    return choices[randint(0, len(choices)-1)]

def spinner(s):
    r = re.compile('{([^{}]*)}')
    while True:
        s, n = r.subn(select, s)
        if n == 0: break
    return s.strip()

Here is my attempt to convert that Python code to Java.

public String generateSpun(String text){
    String spun = new String(text);
    Pattern reg = Pattern.compile("{([^{}]*)}");
    Matcher matcher = reg.matcher(spun);
    while (matcher.find()){
       spun = matcher.replaceFirst(select(matcher.group()));
    }
    return spun;
}

private String select(String m){
    String[] choices = m.split("|");
    Random random = new Random();
    int index = random.nextInt(choices.length - 1);
    return choices[index];
}

Unfortunately, when I try to test this by calling

generateAd("{{Hello|What's Up|Howdy} {world|planet} | {Goodbye|Later} {people|citizens|inhabitants}}");

in the main of my program, I get an error on the line in generateSpun where Pattern reg is declared: a PatternSyntaxException.

java.util.regex.PatternSyntaxException: Illegal repetition
{([^{}]*)}

Can someone try to create a Java method that will do what I am trying to do?


Answer 1:


Here are some of the problems with your current code:

  • You should reuse your compiled Pattern, instead of Pattern.compile every time
  • You should reuse your Random, instead of new Random every time
  • Be aware that String.split is regex-based, so you must split("\\|")
  • Be aware that curly braces in Java regex must be escaped to match literally, so Pattern.compile("\\{([^{}]*)\\}");
  • You should query group(1), not group() which defaults to group 0
  • You're using replaceFirst incorrectly; look up Matcher.appendReplacement and Matcher.appendTail instead
  • Random.nextInt(int n) has an exclusive upper bound (like many such methods in Java), so nextInt(choices.length - 1) can never pick the last choice
  • The algorithm itself actually does not handle arbitrarily nested braces properly

Note that escaping is done by preceding with \, and as a Java string literal it needs to be doubled (i.e. "\\" contains a single character, the backslash).

Attachment

  • Source code and output with above fix but no major change to algorithm
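The fixes listed above can be put together roughly as follows. This is only a sketch, not the attachment mentioned in this answer: the class name ArticleSpinner and the method name spin are made up for illustration. It keeps the Python approach of repeatedly replacing innermost brace groups until none remain, and uses StringBuffer because the StringBuilder overload of appendReplacement only exists from Java 9 on.

```java
import java.util.Random;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name; a sketch of the fixes above, not a reference implementation.
final class ArticleSpinner {
    // Compiled once and reused. Braces are escaped so they match literally;
    // the group captures an innermost {...} that contains no other braces.
    private static final Pattern INNERMOST = Pattern.compile("\\{([^{}]*)\\}");

    private final Random random; // reused for every selection

    ArticleSpinner(Random random) {
        this.random = random;
    }

    // Picks one alternative. split takes a regex, so the pipe is escaped;
    // the -1 limit keeps empty trailing choices such as "{a|}".
    private String select(String group) {
        String[] choices = group.split("\\|", -1);
        // nextInt(n) returns 0..n-1, so no "- 1" is needed here.
        return choices[random.nextInt(choices.length)];
    }

    String spin(String text) {
        String current = text;
        while (true) {
            Matcher m = INNERMOST.matcher(current); // new matcher for the updated text
            StringBuffer sb = new StringBuffer();
            boolean found = false;
            while (m.find()) {
                found = true;
                // quoteReplacement stops '$' and '\' in a choice from being
                // interpreted as group references.
                m.appendReplacement(sb, Matcher.quoteReplacement(select(m.group(1))));
            }
            if (!found) {
                break; // no brace groups left
            }
            m.appendTail(sb);
            current = sb.toString();
        }
        return current.trim();
    }

    public static void main(String[] args) {
        ArticleSpinner spinner = new ArticleSpinner(new Random());
        System.out.println(spinner.spin(
            "{{Hello|What's Up|Howdy} {world|planet} | {Goodbye|Later} {people|citizens|inhabitants}}"));
    }
}
```

Each pass resolves the current innermost groups, so the outer group only gets split once its inner groups have already been reduced to plain text. Unbalanced braces are simply left in the output.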



Answer 2:


To fix the regex, add backslashes before the outer { and }. These are meta-characters in Java regexes. However, I don't think that will result in a working program. You are modifying the variable spun after it has been bound to the regex, and I do not think the returned Matcher will reflect the updated value.
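The point about the Matcher can be checked with a small experiment. The sketch below uses a made-up class name (MatcherBindingDemo) and made-up input strings; it only shows that a Matcher keeps scanning the text it was created with, and that reset(CharSequence) is one way to point it at new input. Note also that Matcher.replaceFirst is documented to reset the matcher before it scans, which is another reason the loop in the question cannot make progress.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the inputs are illustrative only.
final class MatcherBindingDemo {
    private static final Pattern GROUP = Pattern.compile("\\{([^{}]*)\\}");

    // The matcher is created before the variable is reassigned.
    static String firstMatchAfterReassign() {
        String spun = "{a|b} and {c|d}";
        Matcher m = GROUP.matcher(spun);
        spun = "no braces here"; // rebinding the variable does not change what m scans
        return m.find() ? m.group(1) : null;
    }

    // reset(CharSequence) makes an existing matcher scan different input.
    static boolean findsAfterReset() {
        Matcher m = GROUP.matcher("{a|b}");
        m.reset("no braces here");
        return m.find();
    }

    public static void main(String[] args) {
        System.out.println(firstMatchAfterReassign()); // prints a|b
        System.out.println(findsAfterReset());         // prints false
    }
}
```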

I also don't think the Python code will work for nested choices. Have you actually tried the Python code? You say it "supposedly worked", but it would be wise to verify that before you spend a lot of time porting it to Java.




Answer 3:


Well, I just created one in PHP and Python; there is a demo at http://spin.developerscrib.com. It is at a very early stage, so it might not work as expected. The source code is on GitHub: https://github.com/razzbee/razzy-spinner




Answer 4:


Use the appendReplacement/appendTail pattern below. I did, and it works great:

Pattern p = Pattern.compile("cat");
Matcher m = p.matcher("one cat two cats in the yard");
StringBuffer sb = new StringBuffer();
while (m.find()) {
    m.appendReplacement(sb, "dog");
}
m.appendTail(sb);
System.out.println(sb.toString());

And in this method:

private String select(String m){
    String[] choices = m.split("|");
    Random random = new Random();
    int index = random.nextInt(choices.length - 1);
    return choices[index];
}

replace m.split("|") with m.split("\\|"). Otherwise, because split takes a regex, it splits between each and every character.

Also use Pattern.compile("\\{([^{}]*)\\}"); so that the braces are matched literally.
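To see the difference between the two split calls, something like the following can be run. The class name SplitDemo and the sample string are made up for illustration; Pattern.quote is shown as an alternative to escaping the pipe by hand.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class comparing escaped and unescaped pipe splitting.
final class SplitDemo {
    // Escaped pipe: splits on the literal '|' character.
    static String[] splitEscaped(String s) {
        return s.split("\\|");
    }

    // Pattern.quote builds a regex that matches its argument literally.
    static String[] splitQuoted(String s) {
        return s.split(Pattern.quote("|"));
    }

    // Unescaped pipe: as a regex, "|" is an alternation of two empty patterns,
    // so it matches the empty string and the input is broken up between characters.
    static String[] splitUnescaped(String s) {
        return s.split("|");
    }

    public static void main(String[] args) {
        String group = "Hello|Howdy";
        System.out.println(Arrays.toString(splitEscaped(group)));   // prints [Hello, Howdy]
        System.out.println(Arrays.toString(splitQuoted(group)));    // prints [Hello, Howdy]
        System.out.println(Arrays.toString(splitUnescaped(group))); // many short pieces
    }
}
```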



Source: https://stackoverflow.com/questions/3393420/how-to-create-article-spinner-regex-in-java
