Human Name parsing

醉话见心 2021-01-05 13:32

I have a bunch of human names. They are all "Western" names and I only need American conventions/abbreviations (e.g., Mr. instead of Sr. for señor). Unfortunately, the pe

5 Answers
  •  悲哀的现实
    2021-01-05 14:15

    Have you tried the Ruby gem Namae?

    It should handle most Western names well and comes with a couple of configuration options for tricky scenarios (multiple last names, commas used both to separate names in a list and to separate name parts). That said, it's a deterministic parser (using this grammar), so there are some cases it won't cover.

    Here is your example:

    require('namae')
    
    Namae.parse 'John Smith and John Smith, Jr. and John Smith Jr and John Smith XIV'
    #=> [
    #     #<Name ...>,
    #     #<Name ...>,
    #     #<Name ...>,
    #     #<Name ...>
    #   ]
    

    It struggles with the doctor's title, but that's something we might be able to fix.
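
To make "deterministic" a bit more concrete, here is a minimal hand-rolled sketch of the same idea in plain Ruby. It is not Namae's grammar or API: `SimpleName`, `SUFFIX`, and `parse_names` are hypothetical names invented for this illustration, and the suffix list is an assumption. It only shows how fixed rules can split a list on "and" and peel off a trailing suffix, with or without a comma.

```ruby
# Hypothetical helper types for illustration only; none of this is part of Namae.
SimpleName = Struct.new(:given, :family, :suffix)

# Generational suffixes this sketch recognises (an assumed, incomplete list).
SUFFIX = /\A(?:Jr\.?|Sr\.?|[IVX]+)\z/

def parse_names(text)
  # Rule 1: the word "and" separates the names in the list.
  text.split(/\s+and\s+/).map do |raw|
    # Rule 2: a comma before the suffix is optional, so drop commas entirely.
    tokens = raw.delete(',').split
    # Rule 3: a trailing token that looks like a suffix is one, provided
    # at least a given and a family name remain in front of it.
    suffix = tokens.pop if tokens.size > 2 && tokens.last.match?(SUFFIX)
    # Rule 4: the last remaining token is the family name, the rest is given.
    family = tokens.pop
    SimpleName.new(tokens.join(' '), family, suffix)
  end
end

names = parse_names('John Smith and John Smith, Jr. and John Smith Jr and John Smith XIV')
names.each { |n| puts [n.given, n.family, n.suffix].compact.join(' | ') }
```

Because the rules are fixed, anything they don't anticipate goes wrong silently: in this sketch a title such as "Dr." would simply be folded into the given name. That is the kind of gap a grammar-based parser has to close case by case.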
