SPARQL Queries for Competency Questions
Introduction
Competency questions (CQs) drive ontology development and test whether the ontology can solve concrete data challenges. To answer a CQ such as "Which substances have the common active moiety <active moiety x>?", we need to create a SPARQL query that can be evaluated on an IDMP Knowledge Graph (IDMP Ontology + IDMP-O aligned data).
TODO: Diagram
SPARQL Query Guidelines
To develop SPARQL queries consistently, apply the following guidelines.
Use standard prefix notations
Use CamelCase for the variables
Nodes and Individuals starting with upper case
Properties starting with lower case
Write all SPARQL keywords in capital letters ("SELECT", not "select" or "Select")
Use BIND to assign all relevant variables at the top of the WHERE clause
Use the "$" prefix for variables that correspond to CQ parameters
Provide at least one example value in a comment for each variable BINDing
Do not require unique labels of resources (as the queries should be able to run on many different datasets)
Use inverse property pairs so that queries are agnostic to which direction a dataset asserts, e.g., ^idmp-sub:isActiveMoietyOf|idmp-sub:hasActiveMoiety
Use of subproperties?
Use of full vs. shortcut pattern: substance/ingredient role pattern vs. direct "includedIn"
Comment on Entailment Regime
If CQs require any inference then this should be explicitly noted.
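The inverse property pair guideline can be illustrated with a minimal sketch. In SPARQL 1.1 the path alternative ^idmp-sub:isActiveMoietyOf|idmp-sub:hasActiveMoiety is shorthand for a UNION over both directions, so the query matches whichever of the two properties a dataset asserts. The variable names and the bound example IRI (taken from the GSRS Amlodipine example used on this page) are illustrative only.

```sparql
PREFIX idmp-sub: <https://spec.pistoiaalliance.org/idmp/ontology/ISO/ISO11238-Substances/>

# Sketch only: long form of the path
# ^idmp-sub:isActiveMoietyOf|idmp-sub:hasActiveMoiety
SELECT DISTINCT ?Substance
WHERE {
  # Example value for the parameter: GSRS Amlodipine
  BIND(<https://gsrs.ncats.nih.gov/api/v1/substances/1J444QC288> AS $M)
  {
    # Direction 1: the dataset states the relation on the moiety
    $M idmp-sub:isActiveMoietyOf ?Substance .
  }
  UNION
  {
    # Direction 2: the dataset states the relation on the substance
    ?Substance idmp-sub:hasActiveMoiety $M .
  }
}
```

The path form is more compact and keeps the guideline easy to apply; the UNION form is mainly useful for explaining or debugging what the path matches.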
Example Query
# UC1-CQ 1: What substances have a common active moiety <M>?
PREFIX idmp-sub: <https://spec.pistoiaalliance.org/idmp/ontology/ISO/ISO11238-Substances/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?Substance (SAMPLE(?SubstanceLabel) AS ?SubstanceName)
WHERE {
  # Bind Variable ActiveMoiety <M>
  # Example: <https://gsrs.ncats.nih.gov/api/v1/substances/1J444QC288> for GSRS Amlodipine
  # uc1_cq1_parameter_1 is a placeholder for the CQ parameter; substitute an IRI such as the example above before execution
  BIND(uc1_cq1_parameter_1 AS $M)
  # Get the entities that have the defined active moiety (either property direction)
  ?Substance ^idmp-sub:isActiveMoietyOf|idmp-sub:hasActiveMoiety $M .
  # Make sure that we only return actual substances (instances of idmp-sub:Substance or any of its subclasses)
  ?Substance a/rdfs:subClassOf* idmp-sub:Substance .
  # Optionally, get the name of the substance; SAMPLE picks one label if several exist
  OPTIONAL { ?Substance rdfs:label ?SubstanceLabel }
}
GROUP BY ?Substance
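As a usage sketch, the template above can be instantiated by substituting the parameter placeholder uc1_cq1_parameter_1 with the example IRI given in its comments. The entailment note in the header is a hypothetical annotation style, shown only to suggest how the inference guideline might be recorded. It reflects that this query walks the class hierarchy explicitly via rdfs:subClassOf*, so the type check itself does not rely on RDFS subclass inference.

```sparql
# UC1-CQ 1, instantiated for the GSRS Amlodipine example
# Entailment: none assumed (hypothetical annotation, not a project convention)
PREFIX idmp-sub: <https://spec.pistoiaalliance.org/idmp/ontology/ISO/ISO11238-Substances/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?Substance (SAMPLE(?SubstanceLabel) AS ?SubstanceName)
WHERE {
  # Parameter <M> replaced by the example IRI from the template comments
  BIND(<https://gsrs.ncats.nih.gov/api/v1/substances/1J444QC288> AS $M)
  # Match either direction of the active moiety relation
  ?Substance ^idmp-sub:isActiveMoietyOf|idmp-sub:hasActiveMoiety $M .
  # Restrict to instances of idmp-sub:Substance or its subclasses
  ?Substance a/rdfs:subClassOf* idmp-sub:Substance .
  # Labels are optional and need not be unique
  OPTIONAL { ?Substance rdfs:label ?SubstanceLabel }
}
GROUP BY ?Substance
```

Which rows come back depends entirely on the dataset loaded into the knowledge graph; a substance without any rdfs:label is still returned, with ?SubstanceName left unbound.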