Identifying a person's name vs. a dictionary word

再見小時候 2021-02-08 02:39

Is there some way to tell whether a word is likely, or not likely, to be a person's name?

So if I see the word "understanding" I would get a probability of 0.01.

3 answers

谎友^ (original poster)

2021-02-08 02:58

My quick hack would be this:

Get the list of names from the Census Bureau, ordered by popularity; it's freely available. Give each name a normalized popularity score (1.0 = most popular, 0.0 = least popular).

Then get an open-source dictionary and do some research to pull together a frequency score for every word. You can find one here, at Wiktionary. Assign every word a popularity score from 1.0 to 0.0, the same way as for names. The convenient thing is that if you can't find a word on the frequency list, you get to assume it's a pretty uncommon word.
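Both lists can be scored the same way. Here is a rough sketch, assuming each list is already loaded as a Python list sorted most-popular-first. The function name `normalized_scores` and the rank-based scaling are my own choices for illustration; you could just as well scale by raw counts if your data has them.

```python
def normalized_scores(ranked_items):
    """Map items sorted most-popular-first to scores between 1.0 and 0.0.

    Assumption: scores are spaced evenly by rank, so the first item gets 1.0
    and the last gets 0.0. Keys are lowercased so lookups ignore case.
    """
    n = len(ranked_items)
    if n == 1:
        # A single item is trivially the most popular one.
        return {ranked_items[0].lower(): 1.0}
    return {
        item.lower(): 1.0 - index / (n - 1)
        for index, item in enumerate(ranked_items)
    }


# Hypothetical three-entry list, just to show the shape of the result.
name_scores = normalized_scores(["James", "Mary", "Hunter"])
print(name_scores)
```

Anything missing from the resulting dictionary is simply treated as "not on the list" later on.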

Look the word up on both lists. If it's on just one or the other, you're done. If it's on both, use a formula to compute a weighted probability, something like (Name Popularity) / (Name Popularity + Other Popularity). If it's not on either list, it's probably a name.
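The lookup step might look something like this. It is only a sketch: the two dictionaries map lowercase words to popularity scores, `name_probability` is a made-up function name, and the 1.0, 0.0 and 0.9 returned for the "only on one list" and "on neither list" cases are my assumptions, since the rule above only says which way to lean.

```python
# Assumed value for "probably a name" when the word is on neither list.
UNKNOWN_NAME_PROBABILITY = 0.9


def name_probability(word, name_scores, word_scores):
    """Estimate how likely `word` is to be a person's name (0.0 to 1.0)."""
    key = word.lower()
    in_names = key in name_scores
    in_words = key in word_scores

    if in_names and in_words:
        name_pop = name_scores[key]
        other_pop = word_scores[key]
        total = name_pop + other_pop
        if total == 0:
            # Least popular on both lists: no evidence either way.
            return 0.5
        # (Name Popularity) / (Name Popularity + Other Popularity)
        return name_pop / total
    if in_names:
        return 1.0  # only on the name list (assumed value)
    if in_words:
        return 0.0  # only on the dictionary list (assumed value)
    return UNKNOWN_NAME_PROBABILITY


# Hypothetical scores, not real census or Wiktionary data.
name_scores = {"james": 1.0, "hunter": 0.75}
word_scores = {"hunter": 0.25, "understanding": 0.5}

print(name_probability("Hunter", name_scores, word_scores))         # 0.75
print(name_probability("understanding", name_scores, word_scores))  # 0.0
```

If you want a small nonzero value like 0.01 for plain dictionary words instead of a hard 0.0, swap in a floor for the "only on the dictionary list" case.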
