Duplicate triple in RDF, authoritative view?

此生再无相见时, submitted on 2019-12-12 09:49:43

Question


If a triple store contains the same triple twice, what is the authoritative position (if one exists) on this redundancy?

Additionally, should a triple store be allowed to store the same triple twice within the same context?

I ask because rdflib apparently lets you store the same triple twice (or more). This is the reader:

import rdflib
from rdflib import store

s = rdflib.plugin.get('MySQL', store.Store)('rdfstore')

config_string = "host=localhost,password=foo,user=foo,db=foo"
rt = s.open(config_string,create=False)
if rt != store.VALID_STORE:
    s.open(config_string,create=True)

graph = rdflib.ConjunctiveGraph(s, identifier = rdflib.URIRef("urn:uuid:a19f9b78-cc43-4866-b9a1-4b009fe91f52"))
rows = graph.query("SELECT ?id ?value { ?id <http://localhost#ha> ?value . }")
for r in rows:
    print r[0], r[1]

And this is the writer:

import rdflib
from rdflib import store

s = rdflib.plugin.get('MySQL', store.Store)('rdfstore')

config_string = "host=localhost,password=foo,user=foo,db=foo"
rt = s.open(config_string,create=False)
if rt != store.VALID_STORE:
    s.open(config_string,create=True)

graph = rdflib.ConjunctiveGraph(s, identifier = rdflib.URIRef("urn:uuid:a19f9b78-cc43-4866-b9a1-4b009fe91f52"))
graph.add( ( rdflib.URIRef("http://localhost/1000"), rdflib.URIRef("http://localhost#ha"), rdflib.Literal("18")) )
graph.commit()

This is what I obtain:

sbo@dhcp-045:~/tmp/gd $ python ./reader2.py 
table kb_7b066eca61_relations Doesn't exist
table kb_7b066eca61_relations Doesn't exist
sbo@dhcp-045:~/tmp/gd $ python ./reader2.py 
sbo@dhcp-045:~/tmp/gd $ python ./reader2.py 
sbo@dhcp-045:~/tmp/gd $ python ./writer2.py 
sbo@dhcp-045:~/tmp/gd $ python ./reader2.py 
http://localhost/1000 18
sbo@dhcp-045:~/tmp/gd $ python ./writer2.py 
sbo@dhcp-045:~/tmp/gd $ python ./reader2.py 
http://localhost/1000 18
http://localhost/1000 18

To me this looks like a bug. A modified version shows that there are indeed two triples, and that both belong to the same context:

len : 2
http://localhost/1000 18
http://localhost/1000 18
(rdflib.URIRef('http://localhost/1000'), rdflib.URIRef('http://localhost#ha'), rdflib.Literal(u'18'), <Graph identifier=urn:uuid:a19f9b78-cc43-4866-b9a1-4b009fe91f52 (<class 'rdflib.Graph.Graph'>)>)
(rdflib.URIRef('http://localhost/1000'), rdflib.URIRef('http://localhost#ha'), rdflib.Literal(u'18'), <Graph identifier=urn:uuid:a19f9b78-cc43-4866-b9a1-4b009fe91f52 (<class 'rdflib.Graph.Graph'>)>)

Answer 1:


An RDF triple store holds a set of triples, so by definition the same triple cannot be present twice. However, most RDF stores are actually quad stores (sets of RDF graphs, also known as datasets), and in that case the same triple may appear multiple times, once per graph. The graph name is sometimes called the context, depending on the store (e.g. mine, Redland). As for authority, it is really up to the user to define what meaning a particular graph name or context name has.
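
The difference between set semantics for triples and for quads can be sketched in plain Python. This is a toy model, not rdflib's or Redland's API: triples are modelled as 3-tuples of strings, quads as 4-tuples whose last element is the graph name, and the `urn:example:...` graph names are made up for illustration.

```python
# Toy model (an assumption for illustration, not a real store's API):
# a triple store is a set of 3-tuples, a quad store a set of 4-tuples.
triple = ("http://localhost/1000", "http://localhost#ha", "18")

triple_store = set()
triple_store.add(triple)
triple_store.add(triple)  # adding the same triple again changes nothing

quad_store = set()
quad_store.add(triple + ("urn:example:graph-a",))
quad_store.add(triple + ("urn:example:graph-b",))  # same triple, different graph
quad_store.add(triple + ("urn:example:graph-a",))  # exact duplicate quad, ignored

print(len(triple_store))  # 1
print(len(quad_store))    # 2
```

Under this model, the output in the question (the same triple twice with the same graph identifier) would not be possible, which is consistent with reading it as a bug in that particular store.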




Answer 2:


Keep in mind that any particular triple may carry different metadata from other, otherwise identical, triples: for example the original source of the triple, or information about the strength of the connection. It may also be feasible simply to count the copies of a triple in order to judge the relative strength of a connection against other, possibly contradictory, connections. So, as always, it all depends on what you intend to do with your data.
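
Counting copies as a rough measure of support might look like the sketch below. The source names and the `livesIn` claims are hypothetical data invented for the example, and counting one vote per source is only one possible weighting.

```python
from collections import Counter

# Hypothetical data: each source contributes a list of
# (subject, predicate, object) claims.
claims_by_source = {
    "urn:example:source-a": [("Alice", "livesIn", "Paris")],
    "urn:example:source-b": [("Alice", "livesIn", "Paris")],
    "urn:example:source-c": [("Alice", "livesIn", "Berlin")],
}

# Count how many sources assert each triple. Wrapping each source's
# claims in set() means a repeat within one source counts only once.
support = Counter(
    triple
    for claims in claims_by_source.values()
    for triple in set(claims)
)

# The triple asserted by the most sources, and how many assert it.
best_triple, best_count = support.most_common(1)[0]
print(best_triple, best_count)
```

Here the two contradictory `livesIn` claims can be compared by how many independent sources back each one, which only works if the source of every copy is kept alongside the triple.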




Answer 3:


RDF is a language for expressing factual claims, organized and grouped into graphs. If a graph contains "Alice is a Person" twice, that is just redundant. So within a graph, triples are normalised; there is no point in repeating them. However, applications, stores and SPARQL-queryable systems will often collect factual claims from different sources. The SPARQL language has the GRAPH keyword for when you want to take a multi-graph perspective and look for the same triple in different sources.



Source: https://stackoverflow.com/questions/1864941/duplicate-triple-in-rdf-authoritative-view
