Question
To store some data in Apache Jena from Python, I'd like to have a generic conversion from a list of dicts to RDF, and possibly back again on query.
For the list-of-dicts-to-RDF part I implemented "insertListOfDicts" (see below) and tested it with "testListOfDictInsert" (see below). The generated update is shown below; it leads to a 400: Bad Request when sent to an Apache Jena Fuseki server.
What needs to be fixed for simple string types, and maybe for other primitive Python types, to get this working?
Please also find the source code at:
- https://github.com/WolfgangFahl/DgraphAndWeaviateTest/blob/master/dg/jena.py
- https://github.com/WolfgangFahl/DgraphAndWeaviateTest/blob/master/tests/testJena.py
@prefix foaf: <http://xmlns.com/foaf/0.1/>
INSERT DATA {
foaf:Person/Elizabeth+Alexandra+Mary+Windsor foaf:Person#name "Elizabeth Alexandra Mary Windsor".
foaf:Person/Elizabeth+Alexandra+Mary+Windsor foaf:Person#born "1926-04-21".
foaf:Person/Elizabeth+Alexandra+Mary+Windsor foaf:Person#wikidataurl "https://www.wikidata.org/wiki/Q9682".
foaf:Person/George+of+Cambridge foaf:Person#name "George of Cambridge".
foaf:Person/George+of+Cambridge foaf:Person#born "2013-07-22".
foaf:Person/George+of+Cambridge foaf:Person#wikidataurl "https://www.wikidata.org/wiki/Q1359041".
foaf:Person/Harry+Duke+of+Sussex foaf:Person#name "Harry Duke of Sussex".
foaf:Person/Harry+Duke+of+Sussex foaf:Person#born "1984-09-15".
foaf:Person/Harry+Duke+of+Sussex foaf:Person#wikidataurl "https://www.wikidata.org/wiki/Q152316".
}
testListOfDictInsert
def testListOfDictInsert(self):
    '''
    test inserting a list of Dicts using FOAF example
    https://en.wikipedia.org/wiki/FOAF_(ontology)
    '''
    listofDicts=[
        {'name': 'Elizabeth Alexandra Mary Windsor', 'born': '1926-04-21', 'age': 94, 'ofAge': True , 'wikidataurl': 'https://www.wikidata.org/wiki/Q9682' },
        {'name': 'George of Cambridge', 'born': '2013-07-22', 'age': 7, 'ofAge': False, 'wikidataurl': 'https://www.wikidata.org/wiki/Q1359041'},
        {'name': 'Harry Duke of Sussex', 'born': '1984-09-15', 'age': 36, 'ofAge': True , 'wikidataurl': 'https://www.wikidata.org/wiki/Q152316'}
    ]
    jena=self.getJena(mode='update',debug=True)
    jena.insertListOfDicts(listofDicts,'foaf:Person','name','@prefix foaf: <http://xmlns.com/foaf/0.1/>')
insertListOfDicts
def insertListOfDicts(self,listOfDicts,entityType,primaryKey,prefixes):
    '''
    insert the given list of dicts mapping datatypes according to
    https://www.w3.org/TR/xmlschema-2/#built-in-datatypes
    mapped from
    https://docs.python.org/3/library/stdtypes.html
    compare to
    https://www.w3.org/2001/sw/rdb2rdf/directGraph/
    http://www.bobdc.com/blog/json2rdf/
    https://www.w3.org/TR/json-ld11-api/#data-round-tripping
    https://stackoverflow.com/questions/29030231/json-to-rdf-xml-file-in-python
    '''
    errors=[]
    insertCommand='%s\nINSERT DATA {\n' % prefixes
    for index,record in enumerate(listOfDicts):
        if not primaryKey in record:
            # append() is a method call; the message needs both format arguments
            errors.append("missing primary key %s in record %d" % (primaryKey,index))
        else:
            primaryValue=record[primaryKey]
            encodedPrimaryValue=urllib.parse.quote_plus(primaryValue)
            tSubject="%s/%s" %(entityType,encodedPrimaryValue)
            for keyValue in record.items():
                key,value=keyValue
                valueType=type(value)
                if self.debug:
                    print("%s(%s)=%s" % (key,valueType,value))
                tPredicate="%s#%s" % (entityType,key)
                tObject=value
                if valueType == str:
                    insertCommand+='  %s %s "%s".\n' % (tSubject,tPredicate,tObject)
    insertCommand+="\n}"
    if self.debug:
        print (insertCommand)
    self.insert(insertCommand)
    return errors
Answer 1:
+ is the special character in HTTP form encoding for a space, but it should only be used in application/x-www-form-urlencoded.
For URIs, use %20, or decide on a replacement character such as _ for space because it looks a bit like a space.
In all these cases, there is not a space character in the URI - there is a +, %20 (three characters) or _. It is encoding, not an escape mechanism.
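The three options can be compared in a minimal Python sketch. It uses only the standard urllib.parse module; the sample name is taken from the question and the variable names are illustrative:

```python
from urllib.parse import quote, quote_plus

name = "George of Cambridge"

# Form encoding: space becomes '+'. Only meaningful in
# application/x-www-form-urlencoded data.
form_encoded = quote_plus(name)

# Percent-encoding for use inside a URI: space becomes '%20'.
uri_encoded = quote(name)

# Replacement character: a plain string substitution, no encoding involved.
underscored = name.replace(" ", "_")

print(form_encoded)  # George+of+Cambridge
print(uri_encoded)   # George%20of%20Cambridge
print(underscored)   # George_of_Cambridge
```

The first form is what quote_plus produced for the subjects in the question.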
Answer 2:
The following code at least works and has correct "round-trip" behavior: the data inserted from a list of dicts can be retrieved with a corresponding query. Please comment with further improvements or add a better answer.
If you'd always like to get typed literals, you can now specify this in the constructor of the Jena wrapper class.
In typed literal mode the types
- integer
- decimal
are used for numeric literals for proper "round-trip" behavior. The unit test insert is then:
PREFIX foafo: <http://foafo.bitplan.com/foafo/0.1/>
INSERT DATA {
foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_name "Elizabeth Alexandra Mary Windsor".
foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_born "1926-04-21"^^<http://www.w3.org/2001/XMLSchema#date>.
foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_numberInLine "0"^^<http://www.w3.org/2001/XMLSchema#integer>.
foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q9682".
foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_age "94.32637220476806"^^<http://www.w3.org/2001/XMLSchema#decimal>.
foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_ofAge True.
foafo:Person_CharlesPrinceofWales foafo:Person_name "Charles, Prince of Wales".
foafo:Person_CharlesPrinceofWales foafo:Person_born "1948-11-14"^^<http://www.w3.org/2001/XMLSchema#date>.
foafo:Person_CharlesPrinceofWales foafo:Person_numberInLine "1"^^<http://www.w3.org/2001/XMLSchema#integer>.
foafo:Person_CharlesPrinceofWales foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q43274".
foafo:Person_CharlesPrinceofWales foafo:Person_age "71.7578047461618"^^<http://www.w3.org/2001/XMLSchema#decimal>.
foafo:Person_CharlesPrinceofWales foafo:Person_ofAge True.
foafo:Person_GeorgeofCambridge foafo:Person_name "George of Cambridge".
foafo:Person_GeorgeofCambridge foafo:Person_born "2013-07-22"^^<http://www.w3.org/2001/XMLSchema#date>.
foafo:Person_GeorgeofCambridge foafo:Person_numberInLine "3"^^<http://www.w3.org/2001/XMLSchema#integer>.
foafo:Person_GeorgeofCambridge foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q1359041".
foafo:Person_GeorgeofCambridge foafo:Person_age "7.072013799051315"^^<http://www.w3.org/2001/XMLSchema#decimal>.
foafo:Person_GeorgeofCambridge foafo:Person_ofAge False.
foafo:Person_HarryDukeofSussex foafo:Person_name "Harry Duke of Sussex".
foafo:Person_HarryDukeofSussex foafo:Person_born "1984-09-15"^^<http://www.w3.org/2001/XMLSchema#date>.
foafo:Person_HarryDukeofSussex foafo:Person_numberInLine "5"^^<http://www.w3.org/2001/XMLSchema#integer>.
foafo:Person_HarryDukeofSussex foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q152316".
foafo:Person_HarryDukeofSussex foafo:Person_age "35.92133993168922"^^<http://www.w3.org/2001/XMLSchema#decimal>.
foafo:Person_HarryDukeofSussex foafo:Person_ofAge True.
}
When typed literal mode is off, typed literals are only used for dates:
PREFIX foafo: <http://foafo.bitplan.com/foafo/0.1/>
INSERT DATA {
foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_name "Elizabeth Alexandra Mary Windsor".
foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_born "1926-04-21"^^<http://www.w3.org/2001/XMLSchema#date>.
foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_numberInLine 0.
foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q9682".
foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_age 94.32637220476806.
foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_ofAge True.
foafo:Person_CharlesPrinceofWales foafo:Person_name "Charles, Prince of Wales".
foafo:Person_CharlesPrinceofWales foafo:Person_born "1948-11-14"^^<http://www.w3.org/2001/XMLSchema#date>.
foafo:Person_CharlesPrinceofWales foafo:Person_numberInLine 1.
foafo:Person_CharlesPrinceofWales foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q43274".
foafo:Person_CharlesPrinceofWales foafo:Person_age 71.7578047461618.
foafo:Person_CharlesPrinceofWales foafo:Person_ofAge True.
foafo:Person_GeorgeofCambridge foafo:Person_name "George of Cambridge".
foafo:Person_GeorgeofCambridge foafo:Person_born "2013-07-22"^^<http://www.w3.org/2001/XMLSchema#date>.
foafo:Person_GeorgeofCambridge foafo:Person_numberInLine 3.
foafo:Person_GeorgeofCambridge foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q1359041".
foafo:Person_GeorgeofCambridge foafo:Person_age 7.072013799051315.
foafo:Person_GeorgeofCambridge foafo:Person_ofAge False.
foafo:Person_HarryDukeofSussex foafo:Person_name "Harry Duke of Sussex".
foafo:Person_HarryDukeofSussex foafo:Person_born "1984-09-15"^^<http://www.w3.org/2001/XMLSchema#date>.
foafo:Person_HarryDukeofSussex foafo:Person_numberInLine 5.
foafo:Person_HarryDukeofSussex foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q152316".
foafo:Person_HarryDukeofSussex foafo:Person_age 35.92133993168922.
foafo:Person_HarryDukeofSussex foafo:Person_ofAge True.
}
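The subjects above are built with a getLocalName helper that is not shown in this answer. Judging only from the output (spaces and the comma in "Charles, Prince of Wales" are dropped), one possible sketch is to keep alphanumeric characters only. The function name local_name and its behavior are assumptions for illustration, not the wrapper's actual code:

```python
def local_name(text):
    """Hypothetical stand-in for getLocalName: keep alphanumeric characters only."""
    return "".join(ch for ch in text if ch.isalnum())


# Build a subject the same way the insert code does: entityType + '_' + local name
subject = "%s_%s" % ("foafo:Person", local_name("Charles, Prince of Wales"))
print(subject)  # foafo:Person_CharlesPrinceofWales
```

A mapping like this is lossy: two different names that differ only in punctuation or spacing end up with the same subject, so the primary key values need to stay distinct after stripping.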
testListOfDictInsert
def testListOfDictInsert(self):
    '''
    test inserting a list of Dicts and retrieving the values again
    using a person based example
    instead of
    https://en.wikipedia.org/wiki/FOAF_(ontology)
    we use an object oriented derivative of FOAF with a focus on datatypes
    '''
    listofDicts=[
        {'name': 'Elizabeth Alexandra Mary Windsor', 'born': self.dob('1926-04-21'), 'numberInLine': 0, 'wikidataurl': 'https://www.wikidata.org/wiki/Q9682' },
        {'name': 'Charles, Prince of Wales', 'born': self.dob('1948-11-14'), 'numberInLine': 1, 'wikidataurl': 'https://www.wikidata.org/wiki/Q43274' },
        {'name': 'George of Cambridge', 'born': self.dob('2013-07-22'), 'numberInLine': 3, 'wikidataurl': 'https://www.wikidata.org/wiki/Q1359041'},
        {'name': 'Harry Duke of Sussex', 'born': self.dob('1984-09-15'), 'numberInLine': 5, 'wikidataurl': 'https://www.wikidata.org/wiki/Q152316'}
    ]
    # derive the float and bool test values from the date of birth
    today=date.today()
    for person in listofDicts:
        born=person['born']
        age=(today - born).days / 365.2425
        person['age']=age
        person['ofAge']=age>=18
    typedLiteralModes=[True,False]
    entityType='foafo:Person'
    primaryKey='name'
    prefixes='PREFIX foafo: <http://foafo.bitplan.com/foafo/0.1/>'
    # insert once with typed literals and once without
    for typedLiteralMode in typedLiteralModes:
        jena=self.getJena(mode='update',typedLiterals=typedLiteralMode,debug=True)
        errors=jena.insertListOfDicts(listofDicts,entityType,primaryKey,prefixes)
        self.checkErrors(errors)
    jena=self.getJena(mode="query")
    queryString = """
    PREFIX foafo: <http://foafo.bitplan.com/foafo/0.1/>
    SELECT ?name ?born ?numberInLine ?wikidataurl ?ofAge ?age WHERE {
        ?person foafo:Person_name ?name.
        ?person foafo:Person_born ?born.
        ?person foafo:Person_numberInLine ?numberInLine.
        ?person foafo:Person_wikidataurl ?wikidataurl.
        ?person foafo:Person_ofAge ?ofAge.
        ?person foafo:Person_age ?age.
    }"""
    personResults=jena.query(queryString)
    self.assertEqual(len(listofDicts),len(personResults))
    personList=jena.asListOfDicts(personResults)
    for index,person in enumerate(personList):
        print("%d: %s" %(index,person))
    # check the correct round-trip behavior
    self.assertEqual(listofDicts,personList)
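The test relies on asListOfDicts, which is not shown here. As a rough sketch of the "back on query" direction, the following assumes the results arrive in the standard SPARQL JSON results layout (results / bindings, each term with "value" and an optional "datatype"). The names bindings_to_dicts and CONVERTERS are hypothetical and this is not the wrapper's actual implementation; mapping xsd:decimal to float mirrors the test data above and loses precision in general:

```python
from datetime import date

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical converters keyed by XSD datatype IRI; anything else stays a str
CONVERTERS = {
    XSD + "integer": int,
    XSD + "decimal": float,
    XSD + "double": float,
    XSD + "boolean": lambda text: text in ("true", "1"),
    XSD + "date": date.fromisoformat,
}


def bindings_to_dicts(results):
    """Convert a SPARQL JSON result document into a list of plain dicts."""
    records = []
    for binding in results["results"]["bindings"]:
        record = {}
        for name, term in binding.items():
            convert = CONVERTERS.get(term.get("datatype"), str)
            record[name] = convert(term["value"])
        records.append(record)
    return records


# Hand-written sample in the SPARQL JSON results layout (not real server output)
sample = {
    "head": {"vars": ["name", "born", "numberInLine", "ofAge"]},
    "results": {"bindings": [{
        "name": {"type": "literal", "value": "George of Cambridge"},
        "born": {"type": "literal", "datatype": XSD + "date", "value": "2013-07-22"},
        "numberInLine": {"type": "literal", "datatype": XSD + "integer", "value": "3"},
        "ofAge": {"type": "literal", "datatype": XSD + "boolean", "value": "false"},
    }]},
}
print(bindings_to_dicts(sample))
```

Keying the conversion on the datatype IRI is what makes the round trip work: a value only comes back as int, float, bool or date if the stored literal carries the matching type.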
insertListOfDicts
def insertListOfDicts(self,listOfDicts,entityType,primaryKey,prefixes):
    '''
    insert the given list of dicts mapping datatypes according to
    https://www.w3.org/TR/xmlschema-2/#built-in-datatypes
    mapped from
    https://docs.python.org/3/library/stdtypes.html
    compare to
    https://www.w3.org/2001/sw/rdb2rdf/directGraph/
    http://www.bobdc.com/blog/json2rdf/
    https://www.w3.org/TR/json-ld11-api/#data-round-tripping
    https://stackoverflow.com/questions/29030231/json-to-rdf-xml-file-in-python
    '''
    errors=[]
    insertCommand='%s\nINSERT DATA {\n' % prefixes
    for index,record in enumerate(listOfDicts):
        if not primaryKey in record:
            # append() is a method call; the message needs both format arguments
            errors.append("missing primary key %s in record %d" % (primaryKey,index))
        else:
            primaryValue=record[primaryKey]
            encodedPrimaryValue=self.getLocalName(primaryValue)
            tSubject="%s_%s" %(entityType,encodedPrimaryValue)
            for keyValue in record.items():
                key,value=keyValue
                valueType=type(value)
                if self.debug:
                    print("%s(%s)=%s" % (key,valueType,value))
                tPredicate="%s_%s" % (entityType,key)
                tObject=value
                if valueType == str:
                    tObject='"%s"' % value
                elif valueType==int:
                    if self.typedLiterals:
                        tObject='"%d"^^<http://www.w3.org/2001/XMLSchema#integer>' %value
                    pass
                elif valueType==float:
                    if self.typedLiterals:
                        tObject='"%s"^^<http://www.w3.org/2001/XMLSchema#decimal>' %value
                    pass
                elif valueType==bool:
                    pass
                elif valueType==datetime.date:
                    # dates are always written as typed literals
                    #if self.typedLiterals:
                    tObject='"%s"^^<http://www.w3.org/2001/XMLSchema#date>' %value
                    pass
                else:
                    errors.append("can't handle type %s in record %d" % (valueType,index))
                    tObject=None
                if tObject is not None:
                    insertCommand+='  %s %s %s.\n' % (tSubject,tPredicate,tObject)
    insertCommand+="\n}"
    if self.debug:
        print (insertCommand)
    self.insert(insertCommand)
    return errors
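The type mapping in the loop above could also be factored into a standalone helper. The following is a sketch under assumptions, not part of the wrapper: the name to_sparql_literal is hypothetical, it additionally escapes backslashes, quotes and line breaks in strings (which the code above does not do), and it writes booleans as lowercase true/false, the spelling used in the SPARQL grammar:

```python
import datetime

XSD = "http://www.w3.org/2001/XMLSchema#"


def to_sparql_literal(value, typed_literals=False):
    """Return a SPARQL literal for a primitive Python value, or None if unsupported."""
    # bool is tested first because bool is a subclass of int
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, str):
        escaped = (value.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("\n", "\\n")
                        .replace("\r", "\\r"))
        return '"%s"' % escaped
    if isinstance(value, int):
        return '"%d"^^<%sinteger>' % (value, XSD) if typed_literals else str(value)
    if isinstance(value, float):
        # Caveat: very small or large floats print in exponent notation,
        # which is not a valid xsd:decimal form; not handled in this sketch.
        return '"%s"^^<%sdecimal>' % (value, XSD) if typed_literals else str(value)
    # datetime.datetime is a subclass of date but is not an xsd:date, so exclude it
    if isinstance(value, datetime.date) and not isinstance(value, datetime.datetime):
        return '"%s"^^<%sdate>' % (value.isoformat(), XSD)
    return None


print(to_sparql_literal('He said "hi"'))
print(to_sparql_literal(3, typed_literals=True))
print(to_sparql_literal(datetime.date(2013, 7, 22)))
```

Returning None for unsupported types keeps the same contract as the loop above, where the caller records an error and skips the triple.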
Source: https://stackoverflow.com/questions/63435157/listofdict-to-rdf-conversion-in-python-targeting-apache-jena-fuseki