RDF4J Workbench: Why is One SPIN Constructor Very Slow?

女生的网名这么多〃 提交于 2019-12-24 08:00:36

问题


I apologize for the length of this posting. I am trying to make this slow rule issue reproducible.

I'm using TopBraid Composer FE to create an RDF file with ontology and SPIN constructors. He purpose of the SPIN constructors is to check for compliance in the instantiations of individuals of classes defined in the ontology. I find that the execution of the SPIN constructor is very slow, and I'd like to know why.

Ontology Including SPIN Constructors SXXIComplianceCheck18.rdf

I Modify/Clear my repository (an In Memory Store with RDFS+SPIN support) and load this ontology into RDF4J workbench:

Next, I use two SPARQL Update queries in sequence to created individuals of classes defined in the ontology (RDF file above) and thus stimulate running SPIN constructors.

First SPARQL Update query (instantiates individual data items and invokes parsing constructors as needed... runs quickly):

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sxxicc: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheck#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX sp: <http://spinrdf.org/sp#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX smf: <http://topbraid.org/sparqlmotionfunctions#>
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
PREFIX spl: <http://spinrdf.org/spl#>
PREFIX spin: <http://spinrdf.org/spin#>
PREFIX arg: <http://spinrdf.org/arg#>
PREFIX SXXIComplianceCheckIndividuals: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheckIndividuals#>
PREFIX sxxicci: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheckIndividuals#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>


INSERT DATA
{
   sxxicci:testPub7Proposal_DataItem110 a sxxicc:Pub7DataItem110 ;
           sxxicc:pub7DataItemHasRawStringValue "(C) M221.5"^^xsd:string .  

   sxxicci:testPub7Proposal_DataItem500 a sxxicc:Pub7DataItem500 ;
           sxxicc:pub7DataItemHasRawStringValue "S181"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem300 a sxxicc:Pub7DataItem300 ;
           sxxicc:pub7DataItemHasRawStringValue "DC"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem113 a sxxicc:Pub7DataItem113 ;
           sxxicc:pub7DataItemHasRawStringValue "FX"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem511 a sxxicc:Pub7DataItem511 ;
           sxxicc:pub7DataItemHasRawStringValue "SIGHT SEEING"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem501 a sxxicc:Pub7DataItem501 ;
#            sxxicc:pub7DataItemHasRawStringValue "M002"^^xsd:string .
           sxxicc:pub7DataItemHasRawStringValue "M018,     160727"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem503 a sxxicc:Pub7DataItem503 ;
           sxxicc:pub7DataItemHasRawStringValue "CUBESAT, TOM"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem504a a sxxicc:Pub7DataItem504 ;
           sxxicc:pub7DataItemHasRawStringValue "FAS Agenda line 1"^^xsd:string ;
           sxxicc:pub7DataItemHasOrdinalNumber "1"^^xsd:integer .

   sxxicci:testPub7Proposal_DataItem504b a sxxicc:Pub7DataItem504 ;
           sxxicc:pub7DataItemHasRawStringValue "FAS Agenda line 2"^^xsd:string ;
           sxxicc:pub7DataItemHasOrdinalNumber "2"^^xsd:integer .           

   sxxicci:testPub7Proposal_DataItem504c a sxxicc:Pub7DataItem504 ;
           sxxicc:pub7DataItemHasRawStringValue "FAS Agenda line 3"^^xsd:string ;
           sxxicc:pub7DataItemHasOrdinalNumber "3"^^xsd:integer .

   sxxicci:testPub7Proposal_DataItem504d a sxxicc:Pub7DataItem504 ;
           sxxicc:pub7DataItemHasRawStringValue "FAS Agenda line 4"^^xsd:string ;
           sxxicc:pub7DataItemHasOrdinalNumber "4"^^xsd:integer .         

   sxxicci:testPub7Proposal_DataItem504e a sxxicc:Pub7DataItem504 ;
           sxxicc:pub7DataItemHasRawStringValue "FAS Agenda line 5"^^xsd:string ;
           sxxicc:pub7DataItemHasOrdinalNumber "5"^^xsd:integer .

   sxxicci:testPub7Proposal_DataItem504f a sxxicc:Pub7DataItem504 ;
           sxxicc:pub7DataItemHasRawStringValue "FAS Agenda line 6"^^xsd:string ;
           sxxicc:pub7DataItemHasOrdinalNumber "6"^^xsd:integer .

   sxxicci:testPub7Proposal_DataItem144 a sxxicc:Pub7DataItem144 ;
           sxxicc:pub7DataItemHasRawStringValue "Y"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem005 a sxxicc:Pub7DataItem005 ;
           sxxicc:pub7DataItemHasRawStringValue "SF"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem102 a sxxicc:Pub7DataItem102 ;
           sxxicc:pub7DataItemHasRawStringValue "S   881234"^^xsd:string .

   sxxicci:testPub7Proposal_DataItem017 a sxxicc:Pub7DataItem017 ;
           sxxicc:pub7DataItemHasRawStringValue "C"^^xsd:string .
}

Second SPARQL Update query (instantiates the proposal which ties together the data items instantiated by the first query and runs the compliance check constructor... runs very slowly, around 20 seconds on my computer):

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sxxicc: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheck#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX sp: <http://spinrdf.org/sp#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX smf: <http://topbraid.org/sparqlmotionfunctions#>
PREFIX fn: <http://www.w3.org/2005/xpath-functions#>
PREFIX spl: <http://spinrdf.org/spl#>
PREFIX spin: <http://spinrdf.org/spin#>
PREFIX arg: <http://spinrdf.org/arg#>
PREFIX SXXIComplianceCheckIndividuals: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheckIndividuals#>
PREFIX sxxicci: <http://www.disa.mil/dso/a2i/ontologies/PBSM/Interface/SXXIComplianceCheckIndividuals#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

INSERT DATA
{

   sxxicci:TestPub7Proposal a sxxicc:Pub7Proposal ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem005 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem017 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem102 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem110 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem113 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem144 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem300 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem500 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem501 ; 
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem503 ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem504a ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem504b ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem504c ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem504d ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem504e ;
#           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem504f ;
           sxxicc:pub7ProposalHasDataItem sxxicci:testPub7Proposal_DataItem511 .


}

This second query takes a long time to execute, around 20 seconds. This is not consistent with the other compliance checks (not included in this RDF). I've isolated this one rule from among 13 other similar rules (string parsing and comparisons mostly) because it dominates time consumption.

The (correct but delayed) results:

The SPIN constructor in question (for sxxicc:Pub7Proposal class):

# NEED MINUTE <M> NOTE M002 IN CIRCUIT REMARKS LISTING IRAC DOCUMENT GIVING APPROVAL FOR THIS ASSIGNMENT. (501 08)
CONSTRUCT {
    ?this sxxicc:pub7ProposalHasComplianceMessage "NEED MINUTE <M> NOTE M002 IN CIRCUIT REMARKS LISTING IRAC DOCUMENT GIVING APPROVAL FOR THIS ASSIGNMENT. (501 08)"^^xsd:string .
}
WHERE {
    ?this a sxxicc:Pub7Proposal .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem102 .
    ?dataItem102 a sxxicc:Pub7DataItem102 .
    ?dataItem102 sxxicc:pub7DataItemHasStringValue ?serialNumString .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem500 .
    ?dataItem500 a sxxicc:Pub7DataItem500 .
    ?dataItem500 sxxicc:pub7DataItemHasStringValue ?iracNotesString .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem501 .
    ?dataItem501 a sxxicc:Pub7DataItem501 .
    ?dataItem501 sxxicc:pub7DataItemHasStringValue ?notesFreeTextCommentsString .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem113 .
    ?dataItem113 a sxxicc:Pub7DataItem113 .
    ?dataItem113 sxxicc:pub7DataItemHasStringValue ?stationClassString .
    ?this sxxicc:pub7ProposalHasDataItem ?dataItem300 .
    ?dataItem300 a sxxicc:Pub7DataItem300 .
    ?dataItem300 sxxicc:pub7DataItemHasStringValue ?stateCountryForTransmittingStation .
    BIND (SUBSTR(?serialNumString, 1, 4) AS ?orgString) .
    BIND (SUBSTR(?stationClassString, 1, 2) AS ?stationClassCode) .
    FILTER (((((?orgString = "S   "^^xsd:string) && (?iracNotesString = "S181"^^xsd:string)) && (?notesFreeTextCommentsString != "M002"^^xsd:string)) && (?stationClassCode = "FX"^^xsd:string)) && (?stateCountryForTransmittingStation = "DC"^^xsd:string)) .
}

Why does this constructor run so slowly on a modern PC (AMD quad core 2.3 GHz running Windows 8 with 16 GB physical memory and no significant additional application loading)? Other constructors run quickly on the same machine and using the same facts doing apparently similar things.

Here's a Jave VisualVM Sampler output from executing this example:

The RDF4J org.eclipse.rdf4j.common.concurrent.locks.LockManager$1.release() and org.eclipse.rdf4j.common.concurrent.locks.LockManager.createLock() dominate the Self Time. Why?? Is there something I can do to rewrite my rules to avoid this time consumption?

NOTES:

  1. The first triple of the WHERE clause is not necessary in a SPIN constructor because ?this is set automatically. However, I include it in order to ease debugging by copying this constructor into a SPARQL query (Explore/Query) in the workbench. I also find it handy to replace the CONSTRUCT clause by "SELECT DISTINCT *" while leaving the WHERE clause in tact for constructor debugging.
  2. The sole purpose of the WHERE clause in this constructor is to provide the graph pattern match that shows the error condition for the fixed error message in the CONSTRUCT clause exists. No bindings carry over from the WHERE clause into the CONSTRUCT clause, but the WHERE clause still gates assertion of the triple in the CONSTRUCT clause.

UPDATE

I have modified the constructor by removing one FILTER and the associated triples from the constructor:

FILTER (?iracNotesString = "S181"^^xsd:string) .

?this sxxicc:pub7ProposalHasDataItem ?dataItem500 .
?dataItem500 a sxxicc:Pub7DataItem500 .
?dataItem500 sxxicc:pub7DataItemHasStringValue ?iracNotesString .

This results in the constructor shown below in TBC FE:

Running the same test with the same 2 SPARQL update queries, the second query has a very non-linear reduction in execution time from over 20 seconds to under 2 seconds. Again, this does not seem right.

来源:https://stackoverflow.com/questions/39212922/rdf4j-workbench-why-is-one-spin-constructor-very-slow

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!