Competency questions (CQs) drive the ontology development and test whether the ontology can solve concrete data challenges. To answer a CQ such as "Which substances have the common active moiety <active moiety x>?", we need to create a SPARQL query that can be evaluated on an IDMP Knowledge Graph (IDMP Ontology + IDMP-O aligned data).
TODO: Diagram
SPARQL Query Guidelines
In order to develop SPARQL queries in a consistent manner, the following guidelines should be applied.
- Use standard prefix notations
- Use CamelCase for the variables
- Nodes and Individuals starting with upper case
- Properties starting with lower case
- Write all SPARQL key words in capital letters ("SELECT" not "select" or "Select")
- Use BIND to assign all relevant variables at the top of the WHERE clause
- Use the "$" prefix for the variable that corresponds to a CQ parameter
- Provide at least one example in the comments for the variable BINDing
- Do not require unique labels of resources (as the queries should be able to run on many different datasets)
- Use inverse property pairs to be agnostic to their use e.g., ^idmp-sub:isActiveMoietyOf|idmp-sub:hasActiveMoiety
- Use of subproperties?
- Use of full vs. shortcut pattern: substance/ingredient role pattern vs. direct "includedIn"
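The keyword-casing guideline can be checked mechanically. The following is a minimal sketch, not part of the IDMP tooling: the function name `lowercase_keywords` and the keyword subset are assumptions chosen for illustration, and the check ignores string literals, so it may report false positives there.

```python
import re

# Illustrative subset of SPARQL keywords (an assumption); extend as needed.
KEYWORDS = ("PREFIX", "SELECT", "DISTINCT", "WHERE", "BIND", "AS", "OPTIONAL",
            "FILTER", "UNION", "GROUP", "ORDER", "BY", "LIMIT", "SAMPLE")

_IRI = re.compile(r"<[^<>\s]*>")
_COMMENT = re.compile(r"#.*")
# The lookarounds skip variables (?where, $by) and prefixed names (ex:as).
_KEYWORD = re.compile(
    r"(?<![\w:?$-])(" + "|".join(KEYWORDS) + r")(?![\w:-])", re.IGNORECASE)


def lowercase_keywords(query: str) -> list[tuple[int, str]]:
    """Return (line number, keyword) pairs for keywords not written in capitals."""
    findings = []
    for number, line in enumerate(query.splitlines(), start=1):
        # Drop IRIs first (they may contain '#'), then the trailing comment.
        code = _COMMENT.sub("", _IRI.sub("", line))
        for match in _KEYWORD.finditer(code):
            if match.group(1) != match.group(1).upper():
                findings.append((number, match.group(1)))
    return findings


sample = """\
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?Substance WHERE {
  # select is fine inside a comment
  Optional { ?Substance rdfs:label ?SubstanceLabel }
}
"""
findings = lowercase_keywords(sample)
print(findings)
```

A check like this could run before queries are committed, so that casing issues are caught without manual review.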
Comment on Entailment Regime
If CQs require any inference then this should be explicitly noted.
Example Query
    # UC1-CQ 1: What substances have a common active moiety <M>?
    PREFIX idmp-sub: <https://spec.pistoiaalliance.org/idmp/ontology/ISO/ISO11238-Substances/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT DISTINCT ?Substance (SAMPLE(?SubstanceLabel) AS ?SubstanceName)
    WHERE {
      # Bind Variable ActiveMoiety <M>
      # Example: <https://gsrs.ncats.nih.gov/api/v1/substances/1J444QC288> for GSRS Amlodipine
      BIND($uc1_cq1_parameter_1 AS ?M)

      # Get the Entities that have the defined active moiety
      ?Substance ^idmp-sub:isActiveMoietyOf|idmp-sub:hasActiveMoiety ?M .

      # Make sure that we only return actual substances
      ?Substance a/rdfs:subClassOf* idmp-sub:Substance .

      # Optionally, get the name of the substance
      OPTIONAL { ?Substance rdfs:label ?SubstanceLabel }
    }
    GROUP BY ?Substance
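Before such a query can be evaluated, the CQ parameter has to be replaced by a concrete IRI. The following is a minimal sketch of one way to do this with Python's standard `string.Template`, which happens to use the same "$" placeholder syntax. The helper name `bind_iri_parameter` and the shortened query template are assumptions for illustration, not part of the IDMP tooling.

```python
from string import Template

# Hypothetical, shortened query template; the parameter name follows the
# "$" convention from the guidelines.
QUERY_TEMPLATE = Template("""\
PREFIX idmp-sub: <https://spec.pistoiaalliance.org/idmp/ontology/ISO/ISO11238-Substances/>
SELECT DISTINCT ?Substance
WHERE {
  BIND($uc1_cq1_parameter_1 AS ?M)
  ?Substance ^idmp-sub:isActiveMoietyOf|idmp-sub:hasActiveMoiety ?M .
}
""")


def bind_iri_parameter(template: Template, name: str, iri: str) -> str:
    """Return the query text with the $name placeholder replaced by <iri>."""
    # Reject characters that SPARQL does not allow inside <...>, so the
    # value cannot break out of the angle brackets.
    if any(ch in '<>"{}|^`\\' or ch <= " " for ch in iri):
        raise ValueError(f"not a valid IRI: {iri!r}")
    return template.substitute({name: f"<{iri}>"})


query = bind_iri_parameter(
    QUERY_TEMPLATE,
    "uc1_cq1_parameter_1",
    "https://gsrs.ncats.nih.gov/api/v1/substances/1J444QC288",
)
print(query)
```

The resulting text can then be sent to any SPARQL endpoint or library. Validating the IRI before substitution keeps a malformed parameter value from changing the structure of the query.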