Splitting CamelCase with regex

时光总嘲笑我的痴心妄想 submitted on 2019-12-09 02:51:02

Question


I have this code to split CamelCase by regular expression:

Regex.Replace(input, "(?<=[a-z])([A-Z])", " $1", RegexOptions.Compiled).Trim();

However, it doesn't split this correctly: ShowXYZColours

It produces Show XYZColours instead of Show XYZ Colours

How do I get the desired result?


Answer 1:


Unicode-aware

(?=\p{Lu}\p{Ll})|(?<=\p{Ll})(?=\p{Lu})

Breakdown:

(?=               # look-ahead: a position followed by...
  \p{Lu}\p{Ll}    #   an uppercase letter, then a lowercase letter
)                 #
|                 # or
(?<=              # look-behind: a position after...
  \p{Ll}          #   a lowercase letter
)                 #
(?=               # look-ahead: a position followed by...
  \p{Lu}          #   an uppercase letter
)                 #

Use this pattern with your regex engine's split function. Both alternatives are zero-width, so the split consumes no characters.


EDIT: Of course you can replace \p{Lu} with [A-Z] and \p{Ll} with [a-z] if that's what you need or your regex engine does not understand Unicode categories.
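To illustrate the difference the EDIT note describes, here is a minimal C# sketch that runs both variants on the same input. The input string and the class name are made up for illustration; `\u00C9` is "É", an uppercase letter that `\p{Lu}` matches but `[A-Z]` does not. The input deliberately starts with a lowercase letter.

```csharp
using System;
using System.Text.RegularExpressions;

class UnicodeVersusAscii
{
    // Pattern from the answer above, using Unicode categories.
    const string UnicodePattern = @"(?=\p{Lu}\p{Ll})|(?<=\p{Ll})(?=\p{Lu})";

    // Same pattern with ASCII classes substituted, as the EDIT suggests.
    const string AsciiPattern = @"(?=[A-Z][a-z])|(?<=[a-z])(?=[A-Z])";

    static void Main()
    {
        // Hypothetical input; \u00C9 is an uppercase letter outside A-Z.
        string input = "show\u00C9coleName";

        // Splits before both uppercase letters: three parts.
        Console.WriteLine(string.Join("|", Regex.Split(input, UnicodePattern)));

        // Splits only before the ASCII "N": two parts.
        Console.WriteLine(string.Join("|", Regex.Split(input, AsciiPattern)));
    }
}
```

With the Unicode pattern the input splits into three parts; with the ASCII pattern the boundary before "É" is missed and only two parts come back.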




Answer 2:



You can use something like this:

(?<=[a-z])([A-Z])|(?<=[A-Z])([A-Z][a-z])

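Code:

// Requires: using System.Text.RegularExpressions;
// Group 1: an uppercase letter preceded by a lowercase letter.
// Group 2: an uppercase + lowercase pair preceded by an uppercase letter.
string strRegex = @"(?<=[a-z])([A-Z])|(?<=[A-Z])([A-Z][a-z])";
Regex myRegex = new Regex(strRegex, RegexOptions.None);
string strTargetString = "ShowXYZColours";
// Only one of the two groups participates in each match; the other expands to "".
string strReplace = " $1$2";

return myRegex.Replace(strTargetString, strReplace);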

Output:

Show XYZ Colours





Answer 3:


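Using Tomalak's regex (the Unicode-aware pattern above) with .NET System.Text.RegularExpressions creates an empty entry at position 0 of the resulting array, because the first alternative also matches at the very start of a string that begins with an uppercase letter followed by a lowercase one: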

Regex.Split("ShowXYZColors", @"(?=\p{Lu}\p{Ll})|(?<=\p{Ll})(?=\p{Lu})")

{string[4]}
    [0]: ""
    [1]: "Show"
    [2]: "XYZ"
    [3]: "Colors"

It works for camelCase though (as opposed to PascalCase), since the input then starts with a lowercase letter:

Regex.Split("showXYZColors", @"(?=\p{Lu}\p{Ll})|(?<=\p{Ll})(?=\p{Lu})")

{string[3]}
    [0]: "show"
    [1]: "XYZ"
    [2]: "Colors"
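Two possible ways to avoid the empty leading entry are sketched below; neither comes from the original answers. The first adds a `(?<!^)` guard to the first alternative so it cannot match at the start of the input. The second keeps the pattern unchanged and filters out empty strings afterwards. The class and constant names are illustrative only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class SplitWithoutEmptyEntry
{
    // Original pattern, as used in the examples above.
    const string OriginalPattern = @"(?=\p{Lu}\p{Ll})|(?<=\p{Ll})(?=\p{Lu})";

    // Assumption: same pattern with a (?<!^) guard on the first alternative,
    // so that it does not match at position 0.
    const string GuardedPattern = @"(?<!^)(?=\p{Lu}\p{Ll})|(?<=\p{Ll})(?=\p{Lu})";

    static void Main()
    {
        // Option 1: guard the pattern itself.
        string[] guarded = Regex.Split("ShowXYZColors", GuardedPattern);
        Console.WriteLine(string.Join(" ", guarded));   // Show XYZ Colors

        // Option 2: keep the pattern and drop empty entries afterwards.
        string[] filtered = Regex.Split("ShowXYZColors", OriginalPattern)
            .Where(part => part.Length > 0)
            .ToArray();
        Console.WriteLine(string.Join(" ", filtered));  // Show XYZ Colors
    }
}
```

The guard keeps the result a plain `Regex.Split` call, while the filter leaves the pattern untouched and also copes with any other empty entries.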



Answer 4:


You can try this:

Regex.Replace(input, "((?<!^)([A-Z][a-z]|(?<=[a-z])[A-Z]))", " $1").Trim();

Example:

Regex.Replace("TheCapitalOfTheUAEIsAbuDhabi", "((?<!^)([A-Z][a-z]|(?<=[a-z])[A-Z]))", " $1").Trim();

Output: The Capital Of The UAE Is Abu Dhabi



Source: https://stackoverflow.com/questions/21326963/splitting-camelcase-with-regex
