Normalize data according to business entity (Legal name, class of business, DNS domain, company type) [closed]

Submitted by 左心房为你撑大大i on 2019-12-01 01:56:34

I'm not sure this is the best place to ask your question. Maybe your local librarian could help. Anyway, I'm answering because I've done a lot of work along these lines in the past, and because I've found that programmers and database designers often know where to find data--especially authoritative and standard data.

At the local level (in the USA), we accepted whatever the local Chamber of Commerce gave us. At the national level, we bought lists from InfoUSA. Chamber of Commerce data can be pretty flaky; InfoUSA data is very clean.

Dun & Bradstreet is the closest I know of to a one-stop global business registry. They're not cheap.

RBA, a company in the UK, seems to have a really useful introduction with a global perspective. See Official Company Registers. Much of the data there is free.

I have been doing some research in this area and found a recent paper (NEMO) that describes an approach to extracting organization names, discovering variants of the same name via clustering, and normalizing them with an enhanced edit-distance calculation.
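
To give a rough feel for edit-distance-based normalization of organization names, here is a minimal Python sketch. It is not the method from the paper: it uses plain Levenshtein distance and a greedy single-pass clustering instead of the paper's enhanced calculation. The names `LEGAL_SUFFIXES`, `clean`, `similarity` and `cluster_names`, the suffix list, and the 0.85 threshold are all my own assumptions for illustration, so tune them against your own data.

```python
import re

# Hypothetical list of legal-form tokens to ignore; extend it for your data.
LEGAL_SUFFIXES = {
    "inc", "incorporated", "corp", "corporation", "co", "company",
    "ltd", "limited", "llc", "plc", "gmbh",
}


def clean(name):
    """Lowercase, spell out '&', drop punctuation, strip trailing legal forms."""
    text = name.lower().replace("&", " and ")
    tokens = re.sub(r"[^a-z0-9 ]+", " ", text).split()
    # Keep at least one token so a name like "Limited" does not become empty.
    while len(tokens) > 1 and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Edit distance scaled to the range 0..1, where 1 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def cluster_names(names, threshold=0.85):
    """Greedy clustering: each name joins the first cluster whose
    representative (the cleaned form of its first member) is similar enough."""
    clusters = []  # list of (representative, [original names])
    for name in names:
        key = clean(name)
        for rep, members in clusters:
            if similarity(key, rep) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((key, [name]))
    return clusters


if __name__ == "__main__":
    sample = ["Dun & Bradstreet, Inc.", "Dun and Bradstreet",
              "Dunn & Bradstreet Corp", "InfoUSA"]
    for rep, members in cluster_names(sample):
        print(rep, "<-", members)
```

The greedy pass depends on input order and compares every name against every cluster, so it only suits small lists. For anything large you would want blocking (for example, grouping by the first token) before comparing, and you should still check the resulting clusters against an authoritative registry.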
