Splitting CamelCase with regex

佛祖请我去吃肉 2021-01-11 20:18

I have this code to split CamelCase using a regular expression:

Regex.Replace(input, "(?<=[a-z])([A-Z])", " $1", RegexOptions.Compiled).Trim();
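
For reference, a minimal sketch of what this call does. The class name `QuestionPattern`, the method name `Split`, and the sample inputs are illustrative assumptions, not part of the original post. The pattern only inserts a space where a lowercase letter is directly followed by an uppercase one, so an acronym stays attached to the word after it.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the one-liner from the question.
public static class QuestionPattern
{
    public static string Split(string input)
    {
        // Insert a space before every uppercase letter that follows a lowercase letter.
        return Regex.Replace(input, "(?<=[a-z])([A-Z])", " $1", RegexOptions.Compiled).Trim();
    }
}
```

With this sketch, `QuestionPattern.Split("CamelCase")` returns `Camel Case`, while `QuestionPattern.Split("ShowXYZColours")` returns `Show XYZColours`.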


        
4 Answers
  • 2021-01-11 20:49


    You can use something like this:

    (?<=[a-z])([A-Z])|(?<=[A-Z])([A-Z][a-z])
    

    Code:

    string strRegex = @"(?<=[a-z])([A-Z])|(?<=[A-Z])([A-Z][a-z])";
    Regex myRegex = new Regex(strRegex, RegexOptions.None);
    string strTargetString = @"ShowXYZColours";
    string strReplace = @" $1$2";
    
    return myRegex.Replace(strTargetString, strReplace);
    

    Output:

    Show XYZ Colours
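
The replacement string ` $1$2` works because only one of the two alternatives takes part in any given match, so the other group expands to an empty string. A minimal sketch that wraps the snippet above in a reusable method (the names `AcronymAwareSplitter` and `SplitCamelCase` are hypothetical):

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the pattern and replacement are the ones from this answer.
public static class AcronymAwareSplitter
{
    private static readonly Regex Boundary =
        new Regex(@"(?<=[a-z])([A-Z])|(?<=[A-Z])([A-Z][a-z])", RegexOptions.None);

    public static string SplitCamelCase(string input)
    {
        // Group 1: an uppercase letter that follows a lowercase letter.
        // Group 2: the start of a capitalized word that follows an uppercase run.
        return Boundary.Replace(input, " $1$2");
    }
}
```

Under these assumptions, `AcronymAwareSplitter.SplitCamelCase("ShowXYZColours")` returns `Show XYZ Colours`.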
    


  • 2021-01-11 21:02

    Using Tomalak's regex with .NET's System.Text.RegularExpressions produces an empty entry at position 0 of the resulting array, because the first alternative also matches at the very start of a PascalCase string:

    Regex.Split("ShowXYZColors", @"(?=\p{Lu}\p{Ll})|(?<=\p{Ll})(?=\p{Lu})")
    
    {string[4]}
        [0]: ""
        [1]: "Show"
        [2]: "XYZ"
        [3]: "Colors"
    

    It works for camelCase though (as opposed to PascalCase):

    Regex.Split("showXYZColors", @"(?=\p{Lu}\p{Ll})|(?<=\p{Ll})(?=\p{Lu})")
    
    {string[3]}
        [0]: "show"
        [1]: "XYZ"
        [2]: "Colors"
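
One way to drop that empty leading entry is to filter the result of `Regex.Split`. This is only a sketch, not the sole option; `PascalCaseSplitter` and `SplitWords` are hypothetical names, and it assumes empty fragments are never wanted.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper built on the pattern quoted in this answer.
public static class PascalCaseSplitter
{
    private const string Boundary = @"(?=\p{Lu}\p{Ll})|(?<=\p{Ll})(?=\p{Lu})";

    public static string[] SplitWords(string input)
    {
        // Regex.Split keeps an empty string when a match sits at position 0,
        // so discard zero-length fragments afterwards.
        return Regex.Split(input, Boundary)
                    .Where(part => part.Length > 0)
                    .ToArray();
    }
}
```

With this filter, both `"ShowXYZColors"` and `"showXYZColors"` yield three entries.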
    
  • 2021-01-11 21:03

    You can try this:

    Regex.Replace(input, "((?<!^)([A-Z][a-z]|(?<=[a-z])[A-Z]))", " $1").Trim();
    

    Example:

    Regex.Replace("TheCapitalOfTheUAEIsAbuDhabi", "((?<!^)([A-Z][a-z]|(?<=[a-z])[A-Z]))", " $1").Trim();
    

    Output: The Capital Of The UAE Is Abu Dhabi

  • 2021-01-11 21:14

    Unicode-aware

    (?=\p{Lu}\p{Ll})|(?<=\p{Ll})(?=\p{Lu})
    

    Breakdown:

    (?=               # look-ahead: a position followed by...
      \p{Lu}\p{Ll}    #   an uppercase and a lowercase
    )                 #
    |                 # or
    (?<=              # look-behind: a position after...
      \p{Ll}          #   a lowercase
    )                 #
    (?=               # look-ahead: a position followed by...
      \p{Lu}          #   an uppercase
    )                 #
    

    Use it with your regex engine's split function.


    EDIT: Of course you can replace \p{Lu} with [A-Z] and \p{Ll} with [a-z] if that's what you need or your regex engine does not understand Unicode categories.
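
As a rough illustration in C#, the pattern can be passed to `Regex.Split` in commented form by using `RegexOptions.IgnorePatternWhitespace`. The names `UnicodeCamelCase` and `ToWords` and the sample input are assumptions made for this sketch.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper showing the pattern from this answer with Regex.Split.
public static class UnicodeCamelCase
{
    // With RegexOptions.IgnorePatternWhitespace, .NET ignores unescaped whitespace
    // in the pattern and treats "#" as the start of a comment.
    private const string Pattern = @"
        (?=\p{Lu}\p{Ll})   # before an uppercase letter followed by a lowercase letter
        |                  # or
        (?<=\p{Ll})        # after a lowercase letter...
        (?=\p{Lu})         # ...and before an uppercase letter
    ";

    private static readonly Regex Boundary =
        new Regex(Pattern, RegexOptions.IgnorePatternWhitespace);

    public static string ToWords(string input)
    {
        // Split at every boundary, then join with spaces. Trim removes the
        // leading space left by the empty first entry on PascalCase input.
        return string.Join(" ", Boundary.Split(input)).Trim();
    }
}
```

For example, `UnicodeCamelCase.ToWords("\u00C9coleXYZ\u00DCber")` (the string "ÉcoleXYZÜber") returns "École XYZ Über", because `\p{Lu}` and `\p{Ll}` also cover non-ASCII letters.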
