How to detect email addresses within arbitrary strings

本秂侑毒 提交于 2019-12-30 12:51:07

问题


I'm using the following code to detect an email in the string. It works fine except dealing with email having pure number prefix, such as "536264846@gmail.com". Is it possible to overcome this bug of apple? Any help will be appreciated!

NSString *string = @"536264846@gmail.com";
NSError *error = NULL;
NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink error:&error];
NSArray *matches = [detector matchesInString:string
                                     options:0
                                       range:NSMakeRange(0, [string length])];    
for (NSTextCheckingResult *match in matches) {
    if ([match.URL.scheme isEqualToString:@"mailto"]) {
        NSString *email = [match.URL.absoluteString substringFromIndex:match.URL.scheme.length + 1];
        NSLog(@"email :%@",email);

    }else{
        NSLog(@"[match URL] :%@",[match URL]);
    }

}

Edit: log result is: [match URL] :http://gmail.com


回答1:


What I did in the past:

  • tokenize the input, e.g., separate tokens using spaces (since most other common separators may be valid within an email). However, this may not be necessary if the regular expression is not anchored - but not sure how it would work without the "^" and "$" anchors (which I added to what was shown on the web site).

  • keep in mind that addresses may take the form '"string"' as well as just address

  • in each token, look for '@', as it's probably the best indicator you have that its an email address

  • run the token through the regular expression shown on this Email Detector comparison site (I found in testing that the one marked #1 as of 3/21/2013 worked best)

What I did was put the regular expression in a text file, so I didn't need to escape it:

^(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){255,})(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){65,}@)(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))\x22))(?:.(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))\x22)))@(?:(?:(?!.[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+).){1,126}){1,}(?:(?:[a-z][a-z0-9])|(?:(?:xn--)[a-z0-9]+))(?:-[a-z0-9]+))|(?:[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.[a-f0-9][:]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))]))$

Defined an ivar:

NSRegularExpression *reg

Created the regular expression:

NSString *fullPath = [[NSBundle mainBundle] pathForResource:@"EMailRegExp" ofType:@"txt"];
NSString *pattern = [NSString stringWithContentsOfFile:fullPath encoding:NSUTF8StringEncoding error:NULL];
NSError *error = nil;
reg = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionCaseInsensitive error:&error];
assert(reg && !error);

Then wrote a method to do the comparison:

- (BOOL)isValidEmail:(NSString *)string
{
    NSTextCheckingResult *match = [reg firstMatchInString:string options:0 range:NSMakeRange(0, [string length])];
    return match ? YES : NO;
}

EDIT: I've turned the above into a project on github

EDIT2: for an alterate, less rigorous but faster, see the comment section of this question



来源:https://stackoverflow.com/questions/15525117/how-to-detect-email-addresses-within-arbitrary-strings

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!