Regex - Return First and Last Name

一整个雨季 2021-01-15 08:51

I'm looking for the most reliable way to return the first and last name of a person given the full name. So far, the best I could think of is the following:

7 Answers
  • 2021-01-15 09:08

    As is, you're requiring a last name -- which, of course, your first example doesn't have.

    Use clustered grouping, (?:...), and 0-or-1 count, ?, for the middle and last names as a whole to allow them to be optional:

    '~\b(\p{L}+)\b (?: .+\b(\p{L}+)\b )?~iux'  # x for spacing, u so \p{L} sees UTF-8 letters
    

    This should allow the first name to be captured whether middle/last names are given or not.

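    // u: treat pattern and subject as UTF-8; rtrim() drops the trailing space left when $2 is empty
    $name = rtrim(preg_replace('~\b(\p{L}+)\b(?:.+\b(\p{L}+)\b)?~iu', '$1 $2', $name));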
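
    If the two names are needed as separate values rather than as a rewritten string, the same pattern can be handed to preg_match(). This is only a sketch: the function name extractFirstLast is made up for illustration, and the u modifier is an addition that assumes UTF-8 input.

    ```php
    <?php
    // Hypothetical helper: returns [first, last]; last is '' when only one name is given.
    function extractFirstLast(string $name): array
    {
        // Same pattern as above, plus "u" (assumption: the input is UTF-8).
        $pattern = '~\b(\p{L}+)\b(?:.+\b(\p{L}+)\b)?~iu';

        if (!preg_match($pattern, $name, $m)) {
            return ['', ''];   // no letters found at all
        }

        // Group 2 is missing from $m when the optional part did not match.
        return [$m[1], $m[2] ?? ''];
    }

    var_dump(extractFirstLast('William'));
    var_dump(extractFirstLast('William Henry Gates'));
    ```

    Because the middle part is matched by a greedy .+, the second group always ends up being the final run of letters in the string.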
    
  • 2021-01-15 09:09

    This might not be what you want to hear, but I don't think this problem is suited to a regular expression since names are not regular. I don't think they are even context-sensitive or context-free. If anything, they are unrestricted (I would have to sit down and think that through more than I did before I say that for sure, though) and no regular expression engine can parse an unrestricted grammar.

  • 2021-01-15 09:17

    Depending on how clean your data is, I think you are going to have a tough time finding a single regex that does what you want. What different formats do you expect the names to be in? I've had to write similar code and there can be a lot of variations:

    - first last
    - last, first
    - first middle last
    - last, first middle

    And then you have things like suffixes (Junior, Senior, III, etc.), prefixes (Mr., Mrs., etc.), and combined names (e.g. John and Mary Smith). As some others have already mentioned, you also have to deal with multi-part last names (e.g. Victor de la Hoya).

    I found I had to deal with all of those possibilities before I could reliably pull out the first and last names.
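
    As a rough illustration of how a few of these cases might be handled, here is a sketch for PHP 7.4 or later. The function name parseName and the prefix and suffix lists are assumptions made for the example and are far from complete. It does not attempt combined names or multi-part last names, so "Victor de la Hoya" would still come back with "Hoya" as the last name.

    ```php
    <?php
    // Hypothetical helper; the prefix and suffix lists are illustrative only.
    function parseName(string $fullName): array
    {
        $prefixes = ['mr', 'mrs', 'ms', 'dr'];
        $suffixes = ['jr', 'sr', 'junior', 'senior', 'ii', 'iii', 'iv'];

        // Compare tokens case-insensitively and without trailing "." or ",".
        $normalize = fn(string $t): string => strtolower(rtrim($t, '.,'));

        // Collapse runs of whitespace and trim both ends.
        $name = trim(preg_replace('/\s+/', ' ', $fullName));

        if (strpos($name, ',') !== false) {
            [$before, $after] = array_map('trim', explode(',', $name, 2));
            $name = in_array($normalize($after), $suffixes, true)
                ? $before                        // "John Smith, Jr." -> "John Smith"
                : trim($after . ' ' . $before);  // "Smith, John"     -> "John Smith"
        }

        $tokens = explode(' ', $name);

        // Drop leading prefixes and trailing suffixes, but never the only token.
        while (count($tokens) > 1 && in_array($normalize($tokens[0]), $prefixes, true)) {
            array_shift($tokens);
        }
        while (count($tokens) > 1 && in_array($normalize(end($tokens)), $suffixes, true)) {
            array_pop($tokens);
        }

        return [
            'first' => $tokens[0],
            'last'  => count($tokens) > 1 ? $tokens[count($tokens) - 1] : '',
        ];
    }

    print_r(parseName('Mr. John  Q. Smith Jr.'));
    print_r(parseName('Smith, John'));
    ```

    Each rule here is a guess about the data, which is why checking the real input formats first matters more than the code itself.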

  • 2021-01-15 09:17

    I think your best option is to simply treat everything after the first name as the surname, i.e.:

    William Henry Gates
    Forename: William
    Surname: Henry Gates

    It's the safest mechanism, as not everyone will enter their middle name anyway. You can't simply extract William, ignore Henry, and extract Gates because, for all you know, Henry is part of the surname.
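
    A minimal sketch of that rule, assuming the parts are separated by single spaces; the function name splitForenameSurname is made up for illustration:

    ```php
    <?php
    // Hypothetical helper: everything after the first space is treated as the surname.
    function splitForenameSurname(string $fullName): array
    {
        // A limit of 2 keeps "Henry Gates" together in the second element.
        $parts = explode(' ', trim($fullName), 2);

        return [
            'forename' => $parts[0],
            'surname'  => $parts[1] ?? '',   // empty when only one name was entered
        ];
    }

    print_r(splitForenameSurname('William Henry Gates'));
    ```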

  • 2021-01-15 09:20

    Instead of a regex you might find it easier to do something like:

    $parts = explode(" ", $name);
    $first = $parts[0];
    $last = "";
    if (count($parts) > 1) {
        $last = $parts[count($parts) - 1];
    }
    

    You might want to replace multiple consecutive bits of whitespace with a single space first, so you don't get empty bits, and get rid of trailing/leading whitespace:

    // ereg_replace() was removed in PHP 7; preg_replace() does the same job here
    $name = preg_replace('/[ \t\r\n]+/', ' ', trim($name));
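
    The cleanup and the split can also be done in one step with preg_split(). This is a sketch, not a drop-in replacement; the function name firstAndLast is an assumption for the example.

    ```php
    <?php
    // Hypothetical helper combining whitespace cleanup and splitting.
    function firstAndLast(string $name): array
    {
        // PREG_SPLIT_NO_EMPTY drops the empty pieces that leading, trailing
        // or repeated whitespace would otherwise produce.
        $parts = preg_split('/\s+/', $name, -1, PREG_SPLIT_NO_EMPTY);

        $first = $parts[0] ?? '';
        $last  = count($parts) > 1 ? $parts[count($parts) - 1] : '';

        return [$first, $last];
    }

    var_dump(firstAndLast("  William \t Henry\nGates "));
    ```

    Returning an empty last name for single-word input mirrors the behaviour of the explode() version above.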
    
  • 2021-01-15 09:21

    Here is a simple non-regex way:

    $name=explode(" ",$name);
    $first_name=reset($name);
    $last_name=end($name);
    $result=$first_name.' '.$last_name;
    