How does one compare two RDF graphs with SPARQL? If I have graphs :a and :b, I want to find all the times :a appears in :b. I can query for all of :a's subjects, predicates,
I set up a Jena Fuseki endpoint with two named graphs, http://a and http://b, which we'll call A and B. A contains one triple, and B contains two. Viewed as a set of triples, A is a subset of B, as the following query shows:
select * where {
  graph ?g { ?s ?p ?o }
}
-----------------------------------------------------------
| s            | p            | o            | g          |
===========================================================
| <urn:uuid:b> | <urn:uuid:p> | <urn:uuid:b> | <http://b> |
| <urn:uuid:a> | <urn:uuid:p> | <urn:uuid:b> | <http://b> |
| <urn:uuid:a> | <urn:uuid:p> | <urn:uuid:b> | <http://a> |
-----------------------------------------------------------
Now we can ask for triples that appear in one graph and not in the other. To ask for triples in A that are not in B, we can use this query. Since A is a subset of B, we expect no results:
select * where {
  graph <http://a> { ?s ?p ?o }
  FILTER NOT EXISTS { graph <http://b> { ?s ?p ?o } }
}
-------------
| s | p | o |
=============
-------------
We can also ask for triples that appear in B, but not in A. We expect and receive one triple.
select * where {
  graph <http://b> { ?s ?p ?o }
  FILTER NOT EXISTS { graph <http://a> { ?s ?p ?o } }
}
----------------------------------------------
| s            | p            | o            |
==============================================
| <urn:uuid:b> | <urn:uuid:p> | <urn:uuid:b> |
----------------------------------------------
In general, if X contains no triples that are not also in Y, then X is a subset of Y. Using queries like the ones above, we can find the triples that are in one graph and not in the other.
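The two difference queries can also be folded into one. The following is a minimal sketch, assuming the same named graphs http://a and http://b; the ?onlyIn variable is a name introduced here for illustration. Each branch of the union finds the triples unique to one graph and tags them with that graph's IRI.

```sparql
select ?s ?p ?o ?onlyIn where {
  {
    # triples in A that are missing from B
    graph <http://a> { ?s ?p ?o }
    FILTER NOT EXISTS { graph <http://b> { ?s ?p ?o } }
    bind(<http://a> as ?onlyIn)
  }
  union
  {
    # triples in B that are missing from A
    graph <http://b> { ?s ?p ?o }
    FILTER NOT EXISTS { graph <http://a> { ?s ?p ?o } }
    bind(<http://b> as ?onlyIn)
  }
}
```

With the data shown earlier, this should return only the triple <urn:uuid:b> <urn:uuid:p> <urn:uuid:b>, with ?onlyIn bound to <http://b>.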
If we don't care about the particular triples, we can use an ASK query to check whether any exist, without finding out what they are. For instance,
ask {
  graph <http://a> { ?s ?p ?o }
  FILTER NOT EXISTS { graph <http://b> { ?s ?p ?o } }
}
no
because there are no such triples. However, we're trying to ask whether A is a subgraph of B, which is indicated by there being no such triples, so we need to invert the truth value here. So we use:
ask {
  FILTER NOT EXISTS {
    graph <http://a> { ?s ?p ?o }
    FILTER NOT EXISTS { graph <http://b> { ?s ?p ?o } }
  }
}
yes
Similarly, if we ask whether B is a subgraph of A, we get no:
ask {
  FILTER NOT EXISTS {
    graph <http://b> { ?s ?p ?o }
    FILTER NOT EXISTS { graph <http://a> { ?s ?p ?o } }
  }
}
no
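Putting both directions into one query gives a check for whether A and B contain exactly the same triples. This is a sketch under the same assumptions as above (named graphs http://a and http://b). It compares triples term by term, so it is only reliable for graphs without blank nodes; graphs that differ only in blank node labels would need an isomorphism check instead.

```sparql
ask {
  # no triple of A is missing from B
  FILTER NOT EXISTS {
    graph <http://a> { ?s ?p ?o }
    FILTER NOT EXISTS { graph <http://b> { ?s ?p ?o } }
  }
  # no triple of B is missing from A
  FILTER NOT EXISTS {
    graph <http://b> { ?s ?p ?o }
    FILTER NOT EXISTS { graph <http://a> { ?s ?p ?o } }
  }
}
```

For the data above this should answer no, since B has a triple that A lacks.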