SPARQL Queries for Competency Questions

Introduction

Competency questions (CQs) drive ontology development and test the ontology's ability to solve concrete data challenges. To answer a CQ such as "Which substances have the common active moiety <active moiety x>?", we need to create a SPARQL query that can be evaluated on an IDMP Knowledge Graph (IDMP Ontology + IDMP-O aligned data).

TODO: Diagram

SPARQL Query Guidelines

In order to develop SPARQL queries in a consistent manner, the following guidelines should be applied.

  1. Use standard prefix notations
  2. Use CamelCase for the variables
    1. Nodes and Individuals starting with upper case
    2. Properties starting with lower case
  3. Write all SPARQL keywords in capital letters ("SELECT", not "select" or "Select")
  4. Use BIND to assign all relevant variables at the top of the WHERE clause
  5. Use the "$" prefix for the variables that correspond to CQ parameters
  6. Provide at least one example in the comments for the variable BINDing
  7. Do not require unique labels of resources (as the queries should be able to run on many different datasets)
  8. Use inverse property pairs so that the query is agnostic to which of the two properties the data uses, e.g., ^idmp-sub:isActiveMoietyOf|idmp-sub:hasActiveMoiety
  9. Use of subproperties?
  10. Use of full vs. shortcut pattern: substance/ingredient role pattern vs. direct "includedIn"
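
As an illustration of guidelines 1 to 7, a minimal skeleton could look like the following. The CQ, the parameter name $E and the example.org IRI are hypothetical and are not part of the IDMP ontology; only the rdfs prefix is standard.

```sparql
# Hypothetical CQ: Which resources are linked to the entity <E>, and by which property?
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?Resource ?linkingProperty (SAMPLE(?ResourceLabel) AS ?ResourceName)
WHERE {
    # Bind Variable Entity <E> (guidelines 4 to 6)
    # Example: <http://example.org/entity/E1> (placeholder IRI, an assumption)
    BIND(<http://example.org/entity/E1> AS $E)

    # Node variables start with upper case, property variables with lower case (guideline 2)
    ?Resource ?linkingProperty $E .

    # Labels are optional and aggregated with SAMPLE, so the query
    # does not depend on labels being unique or present (guideline 7)
    OPTIONAL { ?Resource rdfs:label ?ResourceLabel }
}
GROUP BY ?Resource ?linkingProperty
```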

Comment on Entailment Regime

If a CQ requires any inference to be answered, this should be noted explicitly.
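
One possible way to record this is a comment line at the top of the query. The "Entailment:" comment convention below is an assumption, not an agreed project standard. The sketch also shows how a property path can avoid the need for inference when checking class membership.

```sparql
# Entailment: none required (the subclass hierarchy is traversed explicitly)
PREFIX idmp-sub: <https://spec.pistoiaalliance.org/idmp/ontology/ISO/ISO11238-Substances/>
PREFIX rdfs:     <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?Substance
WHERE {
    # Matches direct instances of idmp-sub:Substance and instances of any of its
    # subclasses, provided the rdfs:subClassOf triples are present in the graph
    ?Substance a/rdfs:subClassOf* idmp-sub:Substance .
}

# Variant that relies on inference instead (would be noted as "Entailment: RDFS"):
#     ?Substance a idmp-sub:Substance .
# This only returns instances of subclasses if the endpoint applies RDFS
# entailment or the inferred triples have been materialized.
```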

Example Query


# UC1-CQ 1: What substances have a common active moiety <M>?
PREFIX idmp-sub:    <https://spec.pistoiaalliance.org/idmp/ontology/ISO/ISO11238-Substances/>
PREFIX rdf:         <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:        <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?Substance (SAMPLE(?SubstanceLabel) AS ?SubstanceName)
WHERE {
    # Bind Variable ActiveMoiety <M>
    # Example: <https://gsrs.ncats.nih.gov/api/v1/substances/1J444QC288> for GSRS Amlodipine
    BIND(uc1_cq1_parameter_1 AS $M)

    # Get the entities that have the defined active moiety
    ?Substance ^idmp-sub:isActiveMoietyOf|idmp-sub:hasActiveMoiety $M .

    # Make sure that we only return actual substances
    ?Substance a/rdfs:subClassOf* idmp-sub:Substance .

    # Optionally, get the name of the substance
    OPTIONAL { ?Substance rdfs:label ?SubstanceLabel }
}
GROUP BY ?Substance
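
For reference, the same query with the parameter instantiated might look as follows. This sketch assumes that the placeholder uc1_cq1_parameter_1 is replaced textually with the example IRI from the comment before the query is sent to the endpoint; the substitution mechanism itself is not specified on this page.

```sparql
# UC1-CQ 1, instantiated with the example parameter (GSRS Amlodipine)
PREFIX idmp-sub: <https://spec.pistoiaalliance.org/idmp/ontology/ISO/ISO11238-Substances/>
PREFIX rdfs:     <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?Substance (SAMPLE(?SubstanceLabel) AS ?SubstanceName)
WHERE {
    # Placeholder replaced by the example IRI (assumed textual substitution)
    BIND(<https://gsrs.ncats.nih.gov/api/v1/substances/1J444QC288> AS $M)

    # Either direction of the inverse property pair may be asserted in the data
    ?Substance ^idmp-sub:isActiveMoietyOf|idmp-sub:hasActiveMoiety $M .

    # Restrict the results to instances of idmp-sub:Substance or its subclasses
    ?Substance a/rdfs:subClassOf* idmp-sub:Substance .

    # The label is optional; SAMPLE picks one if several exist
    OPTIONAL { ?Substance rdfs:label ?SubstanceLabel }
}
GROUP BY ?Substance
```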