Freebase WEX dumps contain a wpid column, corresponding to the page_id from the source MediaWiki database, in the freebase_wpid table. This table provides a mapping between Wikipedia numeric article/redirect IDs and Freebase GUIDs (Global Unique IDs).
The use of guid as a foreign key is deprecated in favor of mid for lots of good reasons, but that doesn't change the fact that guids are still used at the system level, so I'm going to call mid an accessor from here on.
The mid accessor is flexible in MQL. One can query using "mid": null or "mid": [], depending on whether one needs the current mid or every mid.
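To make the two forms concrete, here is a minimal Python sketch that only builds the two query shapes as JSON. The helper name mid_query and the topic id /en/example are hypothetical placeholders, and nothing is sent to any service.

```python
import json

def mid_query(topic_id, all_mids=False):
    """Build an MQL-style query asking for one mid (null) or every mid ([]).

    topic_id is a hypothetical placeholder id used only for illustration.
    """
    return {"id": topic_id, "mid": [] if all_mids else None}

# Python's None serializes to JSON null; an empty list serializes to [].
current_only = json.dumps(mid_query("/en/example"))
every_mid = json.dumps(mid_query("/en/example", all_mids=True))
print(current_only)
print(every_mid)
```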
Finding a list of wpid values per mid is straightforward in MQL:
[{
"mid": null,
"key": [{"namespace":"/wikipedia/en_id", "value":null}]
}]
But if all is well in the universe, each current mid should have only one current wpid, so is there a way to do something like "wpid": null, as one can with the mid accessor?
If you only want one wpid value per mid you could do something like this:
[{
"mid": null,
"key": {
"namespace": "/wikipedia/en_id",
"value": null,
"limit": 1
}
}]
Bear in mind that it is entirely possible for a Freebase topic to have more than one wpid. This happens whenever we need to merge duplicate topics that we've imported from Wikipedia, or if we import them before they get merged in Wikipedia.
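If you want to see which topics are affected, one option is to run the list form of the query (with "key" as a list and no limit) and check the results on the client side. The following is a rough Python sketch under that assumption. The function name topics_with_multiple_wpids is hypothetical, and the sample mids and wpid values are made up for illustration and are not real Freebase data.

```python
def topics_with_multiple_wpids(results):
    """Return {mid: [wpid, ...]} for topics carrying more than one wpid.

    `results` is assumed to look like the output of a query whose "key"
    clause is a list: [{"mid": ..., "key": [{"namespace": ..., "value": ...}]}].
    """
    multiples = {}
    for topic in results:
        wpids = [
            key["value"]
            for key in topic.get("key", [])
            if key.get("namespace") == "/wikipedia/en_id"
        ]
        if len(wpids) > 1:
            multiples[topic["mid"]] = wpids
    return multiples

# Made-up sample data, shaped like the assumed query result.
sample = [
    {"mid": "/m/0aaa", "key": [{"namespace": "/wikipedia/en_id", "value": "111"}]},
    {"mid": "/m/0bbb", "key": [
        {"namespace": "/wikipedia/en_id", "value": "222"},
        {"namespace": "/wikipedia/en_id", "value": "333"},
    ]},
]
merged = topics_with_multiple_wpids(sample)
print(merged)
```

With "limit": 1 in the query you would only ever see one of those values, so this check only makes sense against the unlimited form.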
If you're looking for links to Wikipedia pages, you might also be interested in the /wikipedia/en_title namespace:
[{
"mid": null,
"key": {
"namespace": "/wikipedia/en_title",
"value": null,
"limit": 1
}
}]
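The values in that namespace are keys, not ready-made URLs. As a rough sketch, assuming the usual MQL key escaping in which a special character is written as $ followed by four hex digits, a key could be turned into a Wikipedia link in Python like this. The function names and the sample key are illustrative, not part of any Freebase library.

```python
import re
from urllib.parse import quote

def unescape_mql_key(key):
    """Decode $XXXX sequences (assumed MQL key escaping) into characters."""
    return re.sub(
        r"\$([0-9A-Fa-f]{4})",
        lambda match: chr(int(match.group(1), 16)),
        key,
    )

def wikipedia_url(key):
    """Build an English Wikipedia URL from an en_title key value."""
    title = unescape_mql_key(key)
    # quote() percent-encodes anything that is not URL-safe, e.g. parentheses.
    return "https://en.wikipedia.org/wiki/" + quote(title)

# Illustrative key value; $0028 and $0029 are "(" and ")".
example_key = "Mercury_$0028planet$0029"
print(unescape_mql_key(example_key))
print(wikipedia_url(example_key))
```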
Source: https://stackoverflow.com/questions/8002937/getting-wikipedia-ids-in-mql