Java: Split string when an uppercase letter is found

醉梦人生 2020-11-27 15:29

I think this is an easy question, but I can't find a simple solution (say, fewer than 10 lines of code) :)

I have a String such as "thisIs

7 Answers
  • 2020-11-27 15:42

    You can use a regex with a zero-width positive lookahead. It matches the position before each uppercase letter without including the letter in the delimiter:

    String s = "thisIsMyString";
    String[] r = s.split("(?=\\p{Upper})");
    

    Y(?=X) matches Y only when it is followed by X, but does not include X in the match. So (?=\\p{Upper}) matches the empty string immediately before an uppercase letter, and split uses that position as the delimiter.
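
To make the difference concrete, here is a minimal sketch comparing the lookahead with a pattern that uses the uppercase letter itself as the delimiter. The class and method names are made up for illustration and are not part of the answer above.

```java
import java.util.Arrays;

// Hypothetical demo class; names are illustrative only.
final class LookaheadSplitDemo {

    // Splits at the position before each ASCII uppercase letter.
    // The lookahead consumes nothing, so the letter starts the next piece.
    static String[] splitKeepingCapitals(String s) {
        return s.split("(?=\\p{Upper})");
    }

    // For contrast: the uppercase letter is the delimiter and is dropped.
    static String[] splitDroppingCapitals(String s) {
        return s.split("\\p{Upper}");
    }

    public static void main(String[] args) {
        // prints [this, Is, My, String]
        System.out.println(Arrays.toString(splitKeepingCapitals("thisIsMyString")));
        // prints [this, s, y, tring]
        System.out.println(Arrays.toString(splitDroppingCapitals("thisIsMyString")));
    }
}
```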

    See the Javadoc for java.util.regex.Pattern for more on Java regex syntax.

    EDIT: By the way, this doesn't work with thisIsMyÜberString, because \p{Upper} is a POSIX class that by default covers only US-ASCII letters. For non-ASCII uppercase letters, use the Unicode uppercase-letter category instead:

    String[] r = s.split("(?=\\p{Lu})");
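
A small sketch of the difference between the two classes, assuming the default regex flags (no UNICODE_CHARACTER_CLASS). The class and method names are hypothetical, and Ü is written as \u00DC so the example does not depend on the source file encoding.

```java
import java.util.Arrays;

// Hypothetical demo class; names are illustrative only.
final class UnicodeUpperSplitDemo {

    // POSIX class: with default flags it only recognises A-Z.
    static String[] splitAscii(String s) {
        return s.split("(?=\\p{Upper})");
    }

    // Unicode general category Lu: any uppercase letter.
    static String[] splitUnicode(String s) {
        return s.split("(?=\\p{Lu})");
    }

    public static void main(String[] args) {
        String s = "thisIsMy\u00DCberString"; // thisIsMyÜberString
        // [this, Is, MyÜber, String]: no split before Ü
        System.out.println(Arrays.toString(splitAscii(s)));
        // [this, Is, My, Über, String]
        System.out.println(Arrays.toString(splitUnicode(s)));
    }
}
```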
    
  • 2020-11-27 15:48
    String[] camelCaseWords = s.split("(?=[A-Z])");
    
  • 2020-11-27 15:51

    Try this, which compiles the pattern once and reuses it:

    static Pattern p = Pattern.compile("(?=\\p{Lu})");
    String[] s1 = p.split("thisIsMyFirstString");
    String[] s2 = p.split("thisIsMySecondString");
    
    ...
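
As a self-contained sketch of the same idea, the compiled pattern can live in a small utility class. The class name and the split helper are assumptions made for this example; Pattern instances are immutable and safe to share between threads, which is what makes a static constant reasonable here.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical utility class; names are illustrative only.
final class CamelCaseSplitter {

    // Compiled once, reused for every call.
    private static final Pattern BEFORE_UPPERCASE = Pattern.compile("(?=\\p{Lu})");

    static String[] split(String s) {
        return BEFORE_UPPERCASE.split(s);
    }

    public static void main(String[] args) {
        // prints [this, Is, My, First, String]
        System.out.println(Arrays.toString(split("thisIsMyFirstString")));
        // prints [this, Is, My, Second, String]
        System.out.println(Arrays.toString(split("thisIsMySecondString")));
    }
}
```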
    
  • 2020-11-27 15:55

    A simple Scala/Java suggestion that does not split inside all-uppercase tokens such as NYC:

    def splitAtMiddleUppercase(token: String): Iterator[String] = {
       val regex = """[\p{Lu}]*[^\p{Lu}]*""".r
       regex.findAllIn(token).filter(_ != "") // did not find a way not to produce empty strings in the regex. Open to suggestions.
    }
    

    Test with:

    val examples = List("catch22", "iPhone", "eReplacement", "TotalRecall", "NYC", "JGHSD87", "interÜber")
    for( example <- examples) {
       println(example + " -> "  + splitAtMiddleUppercase(example).mkString("[", ", ", "]"))
    }
    

    It produces:

        catch22 -> [catch22]
        iPhone -> [i, Phone]
        eReplacement -> [e, Replacement]
        TotalRecall -> [Total, Recall]
        NYC -> [NYC]
        JGHSD87 -> [JGHSD87]
        interÜber -> [inter, Über]
    

    Modify the regex if you also want to split at digits.
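
For readers who want the same approach in plain Java, here is a rough port using Matcher.find. The class name, the constants and the digit-aware pattern are assumptions of this sketch, not part of the answer. The digit variant is only one possible way to "cut at digits": it treats each run of digits as its own piece.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical Java port of the Scala function above; names are illustrative.
final class MiddleUppercaseSplitter {

    // Same expression as the Scala version: a run of uppercase letters
    // followed by a run of anything that is not an uppercase letter.
    static final Pattern UPPER_RUNS = Pattern.compile("[\\p{Lu}]*[^\\p{Lu}]*");

    // Assumed variant: a run of digits is a piece of its own. The digit
    // alternative comes first because the second one can match empty.
    static final Pattern UPPER_RUNS_AND_DIGITS =
            Pattern.compile("\\d+|[\\p{Lu}]*[^\\p{Lu}\\d]*");

    static List<String> split(Pattern pattern, String token) {
        List<String> parts = new ArrayList<>();
        Matcher m = pattern.matcher(token);
        while (m.find()) {
            // Both patterns can match the empty string; skip those matches.
            if (!m.group().isEmpty()) {
                parts.add(m.group());
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split(UPPER_RUNS, "TotalRecall"));        // [Total, Recall]
        System.out.println(split(UPPER_RUNS, "NYC"));                // [NYC]
        System.out.println(split(UPPER_RUNS, "catch22"));            // [catch22]
        System.out.println(split(UPPER_RUNS_AND_DIGITS, "catch22")); // [catch, 22]
    }
}
```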

  • 2020-11-27 15:59

    For anyone wondering what the pattern should look like when the String to split might start with an uppercase character:

    // requires: import java.util.Arrays;
    String s = "ThisIsMyString";
    String[] r = s.split("(?<=.)(?=\\p{Lu})");
    System.out.println(Arrays.toString(r));
    

    gives: [This, Is, My, String]
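
A runnable sketch of the lookbehind variant, with a made-up class and method name. The (?<=.) part requires at least one character before the split position, so the start of the string is never a split point. As far as I know, String.split on Java 8 and later already drops the empty leading element for a zero-width match at index 0, so the lookbehind matters mostly on older runtimes or when the pattern is reused elsewhere.

```java
import java.util.Arrays;

// Hypothetical demo class; names are illustrative only.
final class LeadingCapitalSplitDemo {

    // Splits before each uppercase letter, except at the start of the string.
    static String[] splitCamelOrPascal(String s) {
        return s.split("(?<=.)(?=\\p{Lu})");
    }

    public static void main(String[] args) {
        // prints [This, Is, My, String]
        System.out.println(Arrays.toString(splitCamelOrPascal("ThisIsMyString")));
        // prints [this, Is, My, String]
        System.out.println(Arrays.toString(splitCamelOrPascal("thisIsMyString")));
    }
}
```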

  • 2020-11-27 16:00

    This regex splits before each capital letter except one at the very start of the string, so it should work for both camelCase and ProperCase input.

    (?<=.)(?=(\\p{Upper}))
    
    TestText = Test, Text
    thisIsATest = this, Is, A, Test
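
To check the two examples above, a short sketch that prints them in the same "input = parts" form. The class and method names are hypothetical, and the capturing group from the original pattern is omitted here because split does not use it.

```java
// Hypothetical demo class; names are illustrative only.
final class ProperCaseSplitDemo {

    // Formats the split result as "input = part1, part2, ...".
    static String describe(String s) {
        String[] parts = s.split("(?<=.)(?=\\p{Upper})");
        return s + " = " + String.join(", ", parts);
    }

    public static void main(String[] args) {
        System.out.println(describe("TestText"));    // TestText = Test, Text
        System.out.println(describe("thisIsATest")); // thisIsATest = this, Is, A, Test
    }
}
```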
    