Regular expression for checking if capital letters are found consecutively in a string?

Submitted by 南笙酒味 on 2019-12-17 15:13:14

Question


I want to know the regexp for the following case:

The string should contain only alphabetic letters. It must start with a capital letter followed by a lowercase letter. After that, it can contain lowercase or capital letters.

^[A-Z][a-z][A-Za-z]*$

But the string must also not contain any consecutive capital letters. How do I add that logic to the regexp?

That is, HttpHandler is correct, but HTTPHandler is wrong.


Answer 1:


Edit: 2015-10-26: thanks for the upvotes - but take a look at tchrist's answer, especially if you develop for the web or something more "international".

Oren Trutner's answer isn't quite right (see the sample input "RightHerE", which must be matched but isn't).

Here is the correct solution:

(?!^.*[A-Z]{2,}.*$)^[A-Za-z]*$

edit:

(?!^.*[A-Z]{2,}.*$)  // don't match the whole expression if there are two or more consecutive uppercase letters
^[A-Za-z]*$          // match uppercase and lowercase letters

/edit

The key to the solution is a negative lookahead; see: http://www.regular-expressions.info/lookaround.html
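To try the pattern above, here is a minimal sketch in Python (the language is an assumption; the answer names none). The names `NO_CONSECUTIVE_CAPS`, `STRICT` and `has_no_consecutive_caps` are hypothetical. `STRICT` is a variant that is not part of the answer: it additionally enforces the question's rule that the string starts with a capital letter followed by a lowercase letter.

```python
import re

# The answer's pattern: the negative lookahead rejects any input that
# contains two or more consecutive uppercase ASCII letters.
NO_CONSECUTIVE_CAPS = re.compile(r"(?!^.*[A-Z]{2,}.*$)^[A-Za-z]*$")

# Hypothetical variant (not from the answer): same lookahead idea, plus the
# question's "capital letter, then lowercase letter" start.
STRICT = re.compile(r"^(?!.*[A-Z]{2})[A-Z][a-z][A-Za-z]*$")


def has_no_consecutive_caps(s):
    """Return True if s is ASCII letters only with no two capitals in a row."""
    return NO_CONSECUTIVE_CAPS.fullmatch(s) is not None


if __name__ == "__main__":
    for word in ["HttpHandler", "HTTPHandler", "RightHerE", "httpHandler"]:
        print(word, has_no_consecutive_caps(word), STRICT.fullmatch(word) is not None)
```

Note that the answer's pattern on its own also accepts the empty string and strings that start with a lowercase letter, which is why the stricter variant may be closer to what the question asks for.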




Answer 2:


Whenever one writes [A-Z] or [a-z], one commits to processing nothing but 7-bit ASCII data. If that's really ok, then fine. But if it's not, the Unicode properties exist to help with this.

There are three cases in Unicode, not two. Furthermore, you also have noncased letters. Letters in general are specified by the \pL property, and each of these also belongs to exactly one of five subcategories:

  1. uppercase letters, specified with \p{Lu}; eg: AÇǱÞΣSSὩΙST
  2. titlecase letters, specified with \p{Lt}; eg: ǈǲSsᾨSt (actually Ss and St are an upper- and then a lowercase letter, but they are what you get if you ask for the titlecase of ß and ﬅ, respectively)
  3. lowercase letters, specified with \p{Ll}; eg: aαçǳςσþßᾡﬅ
  4. modifier letters, specified with \p{Lm}; eg: ʰʲᴴᴭʺˈˠᵠꜞ
  5. other letters, specified with \p{Lo}; eg: ƻאᎯᚦ京

You can take the complement of any of these, but be careful, because something like \P{Lu} does not mean a letter that isn't uppercase. It means any character that isn't an uppercase letter.
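The difference between "not an uppercase letter" and "a letter that is not uppercase" can be illustrated with a short Python sketch. Python is an assumption here, and its built-in `re` module does not support `\p{...}`, so the sketch queries the general category directly through the standard `unicodedata` module. The helper names are hypothetical.

```python
import unicodedata


def in_complement_of_lu(ch):
    """Rough equivalent of \\P{Lu}: any character that is not an uppercase letter."""
    return unicodedata.category(ch) != "Lu"


def is_non_uppercase_letter(ch):
    """A letter (category L*) whose subcategory is anything other than Lu."""
    cat = unicodedata.category(ch)
    return cat.startswith("L") and cat != "Lu"


if __name__ == "__main__":
    # One sample per subcategory, plus a digit and a space for contrast.
    for ch in ["A", "\u01C5", "a", "\u02B0", "\u4EAC", "7", " "]:
        print(repr(ch), unicodedata.category(ch),
              in_complement_of_lu(ch), is_non_uppercase_letter(ch))
```

A digit or a space falls in the complement of `Lu` even though neither is a letter, which is the trap the paragraph above warns about.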

For a letter that's either uppercase or titlecase, use [\p{Lu}\p{Lt}]. So you could use for your pattern:

      ^([\p{Lu}\p{Lt}]\p{Ll}+)+$

If you don't mean to limit the letters following the first to the cased letters alone, then you might prefer:

     ^([\p{Lu}\p{Lt}][\p{Ll}\p{Lm}\p{Lo}]+)+$
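The two patterns above need a regex engine that understands Unicode properties. As a hedged illustration in Python, whose built-in `re` module does not support `\p{...}`, the same logic can be emulated with `unicodedata.category`. The function `matches_cased_words` and the constants are hypothetical names. The greedy scan is only equivalent to the regex because the "first" and "rest" category sets are disjoint.

```python
import unicodedata

FIRST = ("Lu", "Lt")             # [\p{Lu}\p{Lt}]
STRICT_REST = ("Ll",)            # \p{Ll}
LOOSE_REST = ("Ll", "Lm", "Lo")  # [\p{Ll}\p{Lm}\p{Lo}]


def matches_cased_words(s, first=FIRST, rest=STRICT_REST):
    """Emulate ^([first][rest]+)+$ for disjoint category sets."""
    i, n = 0, len(s)
    if n == 0:
        return False  # the outer (...)+ needs at least one group
    while i < n:
        # Each group must open with exactly one "first" character.
        if unicodedata.category(s[i]) not in first:
            return False
        i += 1
        start = i
        # Then consume one or more "rest" characters.
        while i < n and unicodedata.category(s[i]) in rest:
            i += 1
        if i == start:
            return False
    return True


if __name__ == "__main__":
    print(matches_cased_words("HttpHandler"))
    print(matches_cased_words("HTTPHandler"))
    print(matches_cased_words("A\u4EAC", rest=LOOSE_REST))
```

With an engine that supports Unicode properties natively, the regex itself is the simpler choice; this sketch only shows what the pattern is checking.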

If you're trying to match so-called "CamelCase" identifiers, then the actual rules depend on the programming language, but usually include the underscore character and the decimal numbers (\p{Nd}), and may include a literal dollar sign. If this is so, you may wish to add some of these to one or the other of the two character classes above. For example, you may wish to add underscore to both but digits only to the second, leaving you with:

     ^([_\p{Lu}\p{Lt}][_\p{Nd}\p{Ll}\p{Lm}\p{Lo}]+)+$

If, though, you are dealing with certain words from various RFCs and ISO standards, these are often specified as containing ASCII only. If so, you can get by with the literal [A-Z] idea. It's just not kind to impose that restriction if it doesn't actually exist.




Answer 3:


^([A-Z][a-z]+)+$

This looks for sequences of an uppercase letter followed by one or more lowercase letters. Consecutive uppercase letters will not match, as only one is allowed at a time, and it must be followed by a lowercase one.
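A quick way to see what this pattern accepts is a small Python check (Python is an assumption; `WORDS` and `is_word_sequence` are hypothetical names). It also shows the "RightHerE" case raised in another answer: because every capital must be followed by at least one lowercase letter, a trailing capital is rejected.

```python
import re

# One or more "words", each a capital followed by one or more lowercase letters.
WORDS = re.compile(r"^([A-Z][a-z]+)+$")


def is_word_sequence(s):
    """Return True if s consists solely of Capitalized ASCII words."""
    return WORDS.fullmatch(s) is not None


if __name__ == "__main__":
    for word in ["HttpHandler", "HTTPHandler", "RightHerE"]:
        print(word, is_word_sequence(word))
```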




Answer 4:


Aside from tchrist's excellent post concerning Unicode, I think you don't need the complex solution with a negative lookahead. Your definition requires an uppercase letter followed by at least one group of (a lowercase letter optionally followed by an uppercase letter):

^
[A-Z]    // Start with an uppercase Letter
(        // A Group of:
  [a-z]  // mandatory lowercase letter
  [A-Z]? // an optional Uppercase Letter at the end
         // or in between lowercase letters
)+       // This group at least one time
$

Just a bit more compact and easier to read I think...
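The commented layout above is explanatory; most engines would not accept the `//` comments as written. As a sketch under that assumption, Python's `re.VERBOSE` flag allows much the same layout with `#` comments. The name `CAMEL` is hypothetical, and the group is written as non-capturing, which does not change what matches.

```python
import re

CAMEL = re.compile(
    r"""
    ^
    [A-Z]        # start with an uppercase letter
    (?:          # a group of:
      [a-z]      #   a mandatory lowercase letter
      [A-Z]?     #   an optional uppercase letter at the end
                 #   or in between lowercase letters
    )+           # this group at least one time
    $
    """,
    re.VERBOSE,
)

if __name__ == "__main__":
    for word in ["HttpHandler", "HTTPHandler", "RightHerE", "H"]:
        print(word, CAMEL.fullmatch(word) is not None)
```

Unlike the pattern in answer 3, this one accepts a trailing capital such as the one in "RightHerE".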




Answer 5:


If you want to get all employee names in MySQL that have at least one uppercase letter, apply this query:

SELECT * FROM registration WHERE `name` REGEXP BINARY '[A-Z]';


Source: https://stackoverflow.com/questions/4050381/regular-expression-for-checking-if-capital-letters-are-found-consecutively-in-a
