
Version 8 (modified by detwiler, 9 years ago)


Expressivity (or other) Issues in (V)SparQL

The intent of this page is to catalogue some of the possible expressivity issues of SparQL and/or VSparQL.

Querying OWL as plain RDF

Issues arising from treating OWL ontologies as simple RDF graphs.

Lack of inferred facts:

SparQL (and VSparQL) are RDF query languages. OWL graphs frequently contain logical implications that are not explicitly represented in the serialized RDF graph. For example, the graph may contain no explicit "fma:Heart hasMass True" statement, even though that fact can be inferred because hasMass is a property of a superclass. The statements available to a (V)SparQL query engine must therefore either be explicit in the original graph or be produced by an inference process that derives the additional facts implied by the explicit statements. This inference could be performed by a description logic (DL) reasoner (or a similar mechanism), with the implicit facts added to the queryable data source before the query is run, or the query itself could be written to do its own reasoning. Naturally, each approach has pros and cons:

  • Using a DL reasoner: The main pro of using a DL reasoner is that it was researched, built, and refined for exactly this operation (inferring logical entailments). All of the rules for such logical inference are already encoded and efficiently implemented, so a reasoner is likely to be both fast and correct at this task. The main con is that the reasoner computes all logical entailments. Even though this step can be done off-line, while the query service is being initialized, it can still be a resource-intensive operation, and many of the entailments may be irrelevant to a particular query.
  • In-query reasoning: If a DL reasoner uses only facts available in the original graph to perform its inference, then these same facts should be available to a query. (This may be a big "if". There may be things that can be known about an OWL graph, because it is an instance of the OWL model, that are used by a DL reasoner but needn't be recorded in an OWL file.) The logical rules for processing entailments could potentially be expressed in the query itself. NOTE: It remains to be determined whether all such rules can be expressed in VSparQL; clearly they cannot all be expressed in SparQL alone. The main gain of this approach is more targeted reasoning. If the query needs to calculate the transitive closure of a transitive property, but does not need to compute subsumption, then it doesn't have to. The con is that such queries could be very verbose, complex, and error-prone.
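The trade-off between the two approaches can be sketched in plain Python over a toy set of triples. This is only an illustration under stated assumptions: the superclass names (fma:Organ, fma:Material_anatomical_entity) are hypothetical, and the "a class inherits the property values of its superclasses" rule is a deliberate simplification of what a real DL reasoner does. materialize() mimics the reasoner-style approach by adding every inherited fact up front, while has_property() mimics in-query reasoning by walking only the part of the hierarchy that one question needs.

```python
# Toy graph as a set of (subject, predicate, object) triples.
# Superclass names are hypothetical; only fma:Heart and hasMass come from the text.
SUBCLASS = "rdfs:subClassOf"

graph = {
    ("fma:Heart", SUBCLASS, "fma:Organ"),
    ("fma:Organ", SUBCLASS, "fma:Material_anatomical_entity"),
    ("fma:Material_anatomical_entity", "hasMass", "True"),
}


def superclasses(graph, cls):
    """Return every class reachable from cls via rdfs:subClassOf."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def materialize(graph):
    """Reasoner-style approach: add every inherited fact before querying."""
    inferred = set(graph)
    classes = {s for s, p, _ in graph if p == SUBCLASS}
    for cls in classes:
        for sup in superclasses(graph, cls):
            for s, p, o in graph:
                if s == sup and p != SUBCLASS:
                    inferred.add((cls, p, o))
    return inferred


def has_property(graph, cls, prop, value):
    """In-query approach: inspect only the ancestors of the class being asked about."""
    candidates = {cls} | superclasses(graph, cls)
    return any((c, prop, value) in graph for c in candidates)


# The fact is absent from the explicit graph but present after materialization.
print(("fma:Heart", "hasMass", "True") in graph)
print(("fma:Heart", "hasMass", "True") in materialize(graph))
print(has_property(graph, "fma:Heart", "hasMass", "True"))
```

In this sketch, materialize() pays for every class in the graph whether or not a query ever touches it, whereas has_property() does less work per question but pushes the inference rule into each query that needs it.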

Problems with determining an object's type:

This is another potential problem arising from treating an OWL ontology as a simple RDF graph. Suppose we want to know whether the object of a triple is an instance (a simple instance which is not itself a class) or a class. This may be difficult to express in (V)SparQL. Again, this is something that could be ascertained by computing inferred facts (as above), but it is otherwise not easily determined from the graph. For example, a resource may be known to be a class because somewhere it stands in the object position of a property whose range is owl:Class. But this may not happen. Instead, it may be a subClassOf (transitively) owl:Class. One would have to query for all information that could be used to determine whether an object is a class before its object type could be determined.
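To make the "query for all the evidence" point concrete, here is a minimal Python sketch that checks several independent patterns, any one of which suggests a resource is a class. The ex: names and the list of patterns are assumptions chosen for illustration; the list is not claimed to be complete, and an empty result does not prove the resource is a plain instance.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
RANGE = "rdfs:range"
OWL_CLASS = "owl:Class"

# Hypothetical example graph of (subject, predicate, object) triples.
graph = {
    ("ex:Heart", SUBCLASS, "ex:Organ"),
    ("ex:myHeart", RDF_TYPE, "ex:Heart"),
    ("ex:describes", RANGE, OWL_CLASS),
    ("ex:doc1", "ex:describes", "ex:Lung"),
}


def class_evidence(graph, resource):
    """Collect the reasons, if any, to believe resource is a class."""
    reasons = set()
    for s, p, o in graph:
        if s == resource and p == RDF_TYPE and o == OWL_CLASS:
            reasons.add("explicitly typed as owl:Class")
        elif p == SUBCLASS and resource in (s, o):
            reasons.add("appears in an rdfs:subClassOf triple")
        elif p == RDF_TYPE and o == resource:
            reasons.add("used as the type of another resource")
        elif o == resource and (p, RANGE, OWL_CLASS) in graph:
            reasons.add("object of a property whose range is owl:Class")
    return sorted(reasons)


for name in ("ex:Heart", "ex:Lung", "ex:myHeart"):
    print(name, class_evidence(graph, name))
```

Each branch corresponds to a separate pattern that a (V)SparQL query would have to include, which is why the equivalent query tends to become long.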

SparQL's one-binding-at-a-time evaluation

Issues arising from SparQL's evaluation approach.

Determining if ANY bindings meet a specified criterion:

Suppose we have several resources that would all bind to the variable ?x given the current patterns in our query's WHERE clause. Now suppose we wish to further stipulate that valid ?x bindings must also satisfy a condition like "for all triples where a valid binding for ?x occurs in the subject, the object must be a valid binding for some ?y". To say it another way, all properties of a valid ?x must lead to valid ?y; there can be no properties of ?x which lead to something that is not a valid ?y. Because a SparQL engine determines the validity of one complete substitution at a time, you cannot ask whether all substitutions for a particular variable satisfy a criterion. You can only ask whether this particular substitution works. So, you can get all triples ?x ?property ?y, but you cannot determine whether ?x has any other properties which connect to a non-?y. The crux of this problem is that you can only look at the bindings to ?x (or ?y) one at a time. SparQL doesn't allow you to ask questions about the entire set of bindings.
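The difference between the two questions can be illustrated with a small Python sketch over hypothetical triples (all ex: names and the valid_y set are assumptions made for the example). per_binding_matches() judges each (?x, ?y) solution on its own, which is the behaviour described above, while subjects_with_only_valid_objects() answers the set-level question by looking at all objects of each subject together.

```python
# Hypothetical triples: ex:a points only at valid ?y values, ex:b does not.
graph = {
    ("ex:a", "ex:p", "ex:y1"),
    ("ex:a", "ex:q", "ex:y2"),
    ("ex:b", "ex:p", "ex:y1"),
    ("ex:b", "ex:q", "ex:other"),
}
valid_y = {"ex:y1", "ex:y2"}


def per_binding_matches(graph, valid_y):
    """Judge each (?x, ?y) solution alone, as a simple triple pattern would."""
    return {(s, o) for s, _, o in graph if o in valid_y}


def subjects_with_only_valid_objects(graph, valid_y):
    """Set-level question: keep ?x only if ALL of its objects are valid ?y."""
    objects_by_subject = {}
    for s, _, o in graph:
        objects_by_subject.setdefault(s, set()).add(o)
    return {s for s, objs in objects_by_subject.items() if objs <= valid_y}


# ex:b still shows up when solutions are checked one at a time...
print(sorted({x for x, _ in per_binding_matches(graph, valid_y)}))
# ...but is excluded once all of its objects are considered together.
print(sorted(subjects_with_only_valid_objects(graph, valid_y)))
```

The second function needs the whole group of objects for a subject before it can decide anything about that subject, which is exactly the information a single substitution does not carry.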


return to View Definition Language page