Is anyone aware of some code/rules on how to capitalize the names of people correctly?
- John Smith
- Johan van Rensburg
- Derrick von Gogh
- Ruby de La Fuente
- Peter Maclaurin
- Garry McDonald
(these may not be correct, just some sample names and how the capitalization could be/work)
This seems like a losing battle...
If anyone has some code or rules on when and how to capitalize names, let me know :)
Cheers, Albert
The only sensible way to handle it, in my opinion, is to let the users tell you how their name should be capitalized. Any automatic scheme is going to annoy someone.
Just tell them you're OLD SCHOOL. That makes it simple and 100% correct:
- JOHN SMITH
- JOHAN VAN RENSBURG
- DERRICK VON GOGH
- RUBY DE LA FUENTE
- PETER MACLAURIN
- GARRY MCDONALD
The same logic also helps with many i18n problems.
Wikipedia seems to have decent coverage of this:
I'm not sure if Ruby is of use but you might want to take a look at NameCase. Even if you're not working with Ruby, you might be able to port this (it's open source) to your language.
There's also this implementation in Python which is based on this algorithm. The basic idea is to convert the name to title case, then check it against a giant look-up table of exceptions.
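To make the "title case, then exceptions" idea concrete, here is a minimal sketch. It is not the code from NameCase or from the Python implementation mentioned above; the `EXCEPTIONS` table and the `capitalize_name` function are hypothetical names made up for illustration, and the table is deliberately tiny.

```python
# Hypothetical, deliberately tiny exception table keyed by lowercase word.
# A real table would be far larger and would still miss some names.
EXCEPTIONS = {
    "mcdonald": "McDonald",
    "van": "van",
    "von": "von",
    "de": "de",
}


def capitalize_name(raw: str) -> str:
    """Title-case each word, then let the exception table override it."""
    result = []
    for word in raw.split():
        key = word.lower()
        # Use the exception if there is one, otherwise plain capitalization.
        result.append(EXCEPTIONS.get(key, key.capitalize()))
    return " ".join(result)


print(capitalize_name("GARRY MCDONALD"))
print(capitalize_name("johan van rensburg"))
```

This only looks at whitespace-separated words, so hyphenated or apostrophized names would need extra handling, and every entry in the table is a guess about what the person actually prefers.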
But really, what Jonathan Leffler said is spot on: unless you are required to convert pre-existing unformatted data, automated capitalization is going to get something wrong. Capitalization rules vary across languages and cultures, names change as a result of emigration, and some people simply prefer to capitalize their name in a particular way.
I kept a lookup of names that needed special handling. When a case-insensitive match was found, I used the lookup value. This did not help with people whose preferred capitalization differed from the "accepted" one. It did allow me or the user to add names as needed. I can't find my code, but I got the surnames from http://www.census.gov/.
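Since the original code is lost, here is a rough sketch of what such a lookup could look like. The `NameLookup` class and its method names are hypothetical, and leaving unmatched names exactly as typed is my assumption; the answer does not say what happened to names that were not in the lookup.

```python
class NameLookup:
    """Case-insensitive lookup of names that need special capitalization."""

    def __init__(self, names=()):
        self._by_key = {}
        for name in names:
            self.add(name)

    def add(self, name: str) -> None:
        # Store the preferred spelling under a case-insensitive key.
        self._by_key[name.casefold()] = name

    def resolve(self, raw: str) -> str:
        # Assumption: names without a match are returned unchanged.
        return self._by_key.get(raw.casefold(), raw)


lookup = NameLookup(["McDonald", "Maclaurin"])
lookup.add("van Rensburg")  # names can be added as they come up
print(lookup.resolve("MCDONALD"))
print(lookup.resolve("VAN RENSBURG"))
```

As noted above, this still cannot help someone whose own spelling differs from the stored one (say, "MacLaurin" versus "Maclaurin"), because both collapse to the same key.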
Source: https://stackoverflow.com/questions/2466706/capitalization-of-person-names-in-programming