I am trying to find out whether there is a better way to check if a string has special characters. In my case, anything other than alphanumeric characters and '_' is considered a special charac
I think I'd do the job just a bit differently, treating the std::string as a collection, and using an algorithm. Using a C++0x lambda, it would look something like this:
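    #include <algorithm>
    #include <cctype>
    #include <string>

    bool has_special_char(std::string const &str) {
        // unsigned char: passing a negative char to isalnum is undefined behavior
        return std::find_if(str.begin(), str.end(),
            [](unsigned char ch) { return !(std::isalnum(ch) || ch == '_'); }) != str.end();
    }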
At least when you're dealing with char (not wchar_t), isalnum will typically use a table lookup, so it'll usually be (quite a bit) faster than anything based on find_first_of (which will normally use a linear search instead). In other words, this is O(N) (N=str.size()), where something based on find_first_of will be O(N*M) (N=str.size(), M=pattern.size()).
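For comparison, a search-based version might look like the sketch below. The names has_special_char_search and kAllowed are made up for illustration. It uses std::string::find_first_not_of (a close relative of find_first_of), which typically compares each character of the string against the list of allowed characters, which is where the extra factor of M comes from.

```cpp
#include <string>

// Hypothetical helper, for comparison only: every character of str is
// looked up in the allowed list, so the work typically grows with both
// str.size() and kAllowed.size().
bool has_special_char_search(std::string const &str) {
    static const std::string kAllowed =
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789_";
    return str.find_first_not_of(kAllowed) != std::string::npos;
}
```

Both versions should agree on plain ASCII input in the default "C" locale; the difference is only in how much work is done per character.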
If you want to do the job with pure C, you can use sscanf with a scanset conversion that's theoretically non-portable, but supported by essentially all recent/popular compilers:
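    char junk;
    unsigned char first = (unsigned char)str[0];
    /* A scanset must match at least one character, so the first character is
       tested directly (isalnum is from <ctype.h>). Compare with 1 because
       sscanf returns EOF, which is nonzero, for an empty string. */
    if ((first != '\0' && !isalnum(first) && first != '_') ||
        sscanf(str, "%*[A-Za-z0-9_]%c", &junk) == 1)
        /* it has at least one "special" character */;
    else
        /* no special characters */;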
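The basic idea here is pretty simple: the scanset skips across all consecutive non-special characters (but doesn't assign the result to anything, because of the *), then we try to read one more character. If that succeeds, it means there was at least one character that was not skipped, so we must have at least one special character. If it fails, it means the scanset conversion matched the whole string, so all the characters were "non-special". There are two catches, though. A scanset has to match at least one character, so if the very first character is special the conversion fails immediately and sscanf returns 0 without ever reaching the %c. And for an empty string sscanf returns EOF, which is nonzero, so the result should be compared against 1 rather than tested for truth.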
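One way to sidestep the return-value details is to ask sscanf how far the scanset got, using %n, and then look at the character it stopped on. This is only a sketch: the function name has_special_char_scan is hypothetical, and it still relies on the same A-Z style ranges in the scanset.

```cpp
#include <cstdio>

// Hypothetical helper: returns true if str contains a character
// outside [A-Za-z0-9_].
bool has_special_char_scan(const char *str) {
    int consumed = 0;  // stays 0 if the scanset matches nothing at all
    // The return value is deliberately ignored: neither the suppressed
    // scanset nor %n counts as an assigned item.
    std::sscanf(str, "%*[A-Za-z0-9_]%n", &consumed);
    // If scanning stopped before the terminating '\0', the character
    // it stopped on is a special character.
    return str[consumed] != '\0';
}
```

Because consumed starts at 0, an empty string and a string that begins with a special character are both handled without a separate check.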
Officially, the C standard says that trying to put a range in a scanset conversion like this isn't portable (a '-' anywhere but the beginning or end of the scanset gives implementation-defined behavior). There have even been a few compilers (from Borland) that would fail for this -- they would treat A-Z as matching exactly three possible characters, 'A', '-' and 'Z'. Most current compilers (or, more accurately, standard library implementations) take the approach this assumes: "A-Z" matches any upper-case character.
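If the range syntax is a concern, one alternative (a different technique, not the scanset approach above) is to spell out the allowed characters and let strspn measure the leading run of them. The function name has_special_char_span is hypothetical; strspn itself is standard C, available in C++ through <cstring>.

```cpp
#include <cstring>

// Hypothetical helper: no '-' ranges involved, so nothing here depends
// on implementation-defined scanset behavior.
bool has_special_char_span(const char *str) {
    static const char allowed[] =
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789_";
    // strspn returns the length of the initial run of allowed characters;
    // if that run ends before the '\0', a special character follows it.
    return str[std::strspn(str, allowed)] != '\0';
}
```

Like find_first_of, strspn may search the allowed list for each character, so this trades some speed for portability.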