Question
I'm using the following code to detect email addresses in a string. It works fine except for addresses whose local part is purely numeric, such as "536264846@gmail.com". Is there a way to work around this Apple bug? Any help will be appreciated!
NSString *string = @"536264846@gmail.com";
NSError *error = nil;
NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink error:&error];
NSArray *matches = [detector matchesInString:string
                                     options:0
                                       range:NSMakeRange(0, [string length])];
for (NSTextCheckingResult *match in matches) {
    if ([match.URL.scheme isEqualToString:@"mailto"]) {
        NSString *email = [match.URL.absoluteString substringFromIndex:match.URL.scheme.length + 1];
        NSLog(@"email :%@", email);
    } else {
        NSLog(@"[match URL] :%@", [match URL]);
    }
}
Edit: log result is: [match URL] :http://gmail.com
Answer 1:
What I did in the past:
- Tokenize the input, e.g., split it on spaces (most other common separators may be valid within an email address). This may be unnecessary if the regular expression is not anchored, but I'm not sure how well it would work without the "^" and "$" anchors (which I added to what was shown on the web site).
- Keep in mind that addresses may take the form '"string" <address>' as well as just the address.
- In each token, look for '@', as it's probably the best indicator you have that it's an email address.
- Run the token through the regular expression shown on this Email Detector comparison site (I found in testing that the one marked #1 as of 3/21/2013 worked best).
What I did was put the regular expression in a text file, so I didn't need to escape it:
^(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){255,})(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){65,}@)(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22))(?:\.(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-[a-z0-9]+)*)|(?:\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\]))$
Defined an ivar:
NSRegularExpression *reg;
Created the regular expression:
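NSString *fullPath = [[NSBundle mainBundle] pathForResource:@"EMailRegExp" ofType:@"txt"];
NSString *pattern = [NSString stringWithContentsOfFile:fullPath encoding:NSUTF8StringEncoding error:NULL];
// Drop any trailing newline saved with the file; it would otherwise become part of the pattern after "$".
pattern = [pattern stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
NSError *error = nil;
reg = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionCaseInsensitive error:&error];
// Check the return value; error is only meaningful when creation fails.
assert(reg != nil);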
Then wrote a method to do the comparison:
- (BOOL)isValidEmail:(NSString *)string
{
    NSTextCheckingResult *match = [reg firstMatchInString:string options:0 range:NSMakeRange(0, [string length])];
    return match != nil;
}
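The steps listed at the top of this answer (tokenize, look for '@', then validate each token) could be sketched roughly as follows. This is only an illustration under stated assumptions: EmailsInString is a hypothetical helper name, not part of the original project, and it takes any anchored NSRegularExpression such as the reg created above. Stripping "<" and ">" is one simple way to handle the '"string" <address>' form; it does not try to cope with quoted strings that contain spaces.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original project): returns the tokens in
// `text` that match `reg`, which is assumed to be an anchored email pattern.
static NSArray *EmailsInString(NSString *text, NSRegularExpression *reg)
{
    NSMutableArray *emails = [NSMutableArray array];
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    // Angle brackets are trimmed so that '"string" <address>' yields the bare address.
    NSCharacterSet *brackets = [NSCharacterSet characterSetWithCharactersInString:@"<>"];

    for (NSString *rawToken in [text componentsSeparatedByCharactersInSet:whitespace]) {
        NSString *token = [rawToken stringByTrimmingCharactersInSet:brackets];
        // Cheap pre-filter: a token without '@' cannot be an address.
        if ([token rangeOfString:@"@"].location == NSNotFound) {
            continue;
        }
        // Validate the whole token against the anchored pattern.
        NSRange whole = NSMakeRange(0, [token length]);
        if ([reg firstMatchInString:token options:0 range:whole] != nil) {
            [emails addObject:token];
        }
    }
    return emails;
}
```

Called with the string from the question, @"536264846@gmail.com", the helper should return that address as long as the pattern accepts a numeric local part, since the token is handed to the regular expression whole instead of going through NSDataDetector.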
EDIT: I've turned the above into a project on GitHub.
EDIT2: for an alternative that is less rigorous but faster, see the comment section of this question.
Source: https://stackoverflow.com/questions/15525117/how-to-detect-email-addresses-within-arbitrary-strings