Automation of SHACL generation - current estimate
Implementing one part of the design described in "Automation of SHACL generation - initial design" revealed the following problem.
Problem
It is a known feature of the IDMP Ontology that some OWL restrictions are conceptual in nature and therefore do not lend themselves to being transformed into SHACL shapes.
Consider the following example: https://spec.pistoiaalliance.org/idmp/ontology/ISO/ISO11238-Substances/Molecule
<owl:Class rdf:about="https://spec.pistoiaalliance.org/idmp/ontology/ISO/ISO11238-Substances/Molecule">
    ...
    <skos:definition>electrically neutral entity consisting of more than one atom (n>1)</skos:definition>
    <rdfs:subClassOf rdf:resource="https://spec.pistoiaalliance.org/idmp/ontology/ISO/ISO11238-Substances/MolecularEntity"/>
    ...
    <rdfs:subClassOf>
        <owl:Restriction>
            <owl:onProperty rdf:resource="https://www.omg.org/spec/Commons/Collections/comprises"/>
            <owl:onClass rdf:resource="https://spec.pistoiaalliance.org/idmp/ontology/ISO/ISO11238-Substances/Atom"/>
            <owl:minQualifiedCardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">2</owl:minQualifiedCardinality>
        </owl:Restriction>
    </rdfs:subClassOf>
    ...
    <rdfs:label>molecule</rdfs:label>
</owl:Class>
The restriction in question should not be transformed into a SHACL shape, because one cannot expect a dataset about molecules to contain information about the atoms that make up those molecules.
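To make the problem concrete, the following sketch shows what a literal, one-to-one translation of the restriction above might produce. It is an illustration only, not the project's generator: the function name `naive_shapes` is hypothetical, the mapping of `owl:minQualifiedCardinality` to `sh:qualifiedValueShape` with `sh:qualifiedMinCount` is an assumption about how such a generator could work, and the standard-library XML parser stands in for a proper RDF library. The emitted Turtle fragment leaves the `sh:` prefix undeclared.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
SUBSTANCES = "https://spec.pistoiaalliance.org/idmp/ontology/ISO/ISO11238-Substances/"

# Abridged copy of the Molecule class from the example above.
MOLECULE_RDFXML = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:about="{SUBSTANCES}Molecule">
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="https://www.omg.org/spec/Commons/Collections/comprises"/>
        <owl:onClass rdf:resource="{SUBSTANCES}Atom"/>
        <owl:minQualifiedCardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">2</owl:minQualifiedCardinality>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>
"""


def naive_shapes(rdfxml):
    """Return one Turtle node shape per owl:minQualifiedCardinality restriction."""
    root = ET.fromstring(rdfxml)
    shapes = []
    for cls in root.findall(f"{{{OWL}}}Class"):
        target = cls.get(f"{{{RDF}}}about")
        for restriction in cls.findall(f"{{{RDFS}}}subClassOf/{{{OWL}}}Restriction"):
            min_count = restriction.find(f"{{{OWL}}}minQualifiedCardinality")
            if min_count is None:
                continue  # other restriction types are out of scope for this sketch
            path = restriction.find(f"{{{OWL}}}onProperty").get(f"{{{RDF}}}resource")
            on_class = restriction.find(f"{{{OWL}}}onClass").get(f"{{{RDF}}}resource")
            shapes.append(
                f"[] a sh:NodeShape ;\n"
                f"    sh:targetClass <{target}> ;\n"
                f"    sh:property [\n"
                f"        sh:path <{path}> ;\n"
                f"        sh:qualifiedValueShape [ sh:class <{on_class}> ] ;\n"
                f"        sh:qualifiedMinCount {min_count.text.strip()} ;\n"
                f"    ] ."
            )
    return shapes


for shape in naive_shapes(MOLECULE_RDFXML):
    print(shape)
```

A dataset validated against such a shape would receive a violation for every molecule that is not linked to at least two atoms, which is exactly the outcome the paragraph above argues against.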
Initially, @Pawel Garbacz believed that we could mark those conceptual restrictions so that the automation process would know what to ignore.
However, an analysis by @Thomas Weber of a small part of IDMPO (https://spec.pistoiaalliance.org/idmp/ontology/ISO/ISO11615-MedicinalProducts/) showed that the proportion of such conceptual restrictions may be as high as 70%.
Possible solutions
Semi-automation
We start with a set of manually crafted SHACL shapes (or with a manually truncated set of automatically generated shapes). The automation infrastructure would support the review process: during ontology development, the SHACL engineer is notified which shapes need to be reviewed to stay aligned with recent ontology changes.
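One way the review notification could work is sketched below, under several assumptions: restrictions have already been extracted from two ontology versions into plain tuples, each shape records the class it targets, and a shape needs review whenever the restrictions on its target class change. The function names, the tuple layout, and the `https://example.org/` IRIs are all hypothetical placeholders, not part of IDMPO or of the existing tooling.

```python
import hashlib


def fingerprint(restrictions):
    """Hash a class's restrictions so that their order does not matter."""
    canonical = "\n".join(sorted("|".join(r) for r in restrictions))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def shapes_to_review(old, new, shape_targets):
    """Return the names of shapes whose target class changed between versions.

    old, new:      dict mapping class IRI -> list of
                   (property, filler, restriction kind, value) string tuples
    shape_targets: dict mapping shape name -> target class IRI
    """
    flagged = []
    for shape, target in shape_targets.items():
        before = fingerprint(old.get(target, []))
        after = fingerprint(new.get(target, []))
        if before != after:
            flagged.append(shape)
    return sorted(flagged)


# Hypothetical example data: only the cardinality on Product changes.
EX = "https://example.org/"
old_version = {
    EX + "Product": [(EX + "hasName", EX + "Name", "minQualifiedCardinality", "1")],
    EX + "Package": [(EX + "contains", EX + "Item", "minQualifiedCardinality", "1")],
}
new_version = {
    EX + "Product": [(EX + "hasName", EX + "Name", "minQualifiedCardinality", "2")],
    EX + "Package": [(EX + "contains", EX + "Item", "minQualifiedCardinality", "1")],
}
shape_targets = {"ProductShape": EX + "Product", "PackageShape": EX + "Package"}

flagged = shapes_to_review(old_version, new_version, shape_targets)
print(flagged)
```

Comparing fingerprints per class keeps the check cheap and only tells the engineer where to look; deciding whether the shape actually has to change remains a manual step.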
Partial automation
We can automate the shaclisation of only some manually selected classes (e.g., https://spec.pistoiaalliance.org/idmp/ontology/ISO/ISO11615-MedicinalProducts/MedicinalProduct) that do not involve any conceptual restrictions.
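The manual selection could be represented as a simple allow-list of class IRIs that the generator consults before producing any shape. The sketch below assumes the ontology is available as RDF/XML and uses only the standard-library XML parser; `select_classes` and `ALLOW_LIST` are hypothetical names, and the two-class input is an abridged stand-in for the real ontology files.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
IDMP = "https://spec.pistoiaalliance.org/idmp/ontology/ISO/"

# Manually maintained list of classes considered safe to shaclise.
ALLOW_LIST = [IDMP + "ISO11615-MedicinalProducts/MedicinalProduct"]

# Abridged stand-in for the ontology: class declarations only.
ONTOLOGY_RDFXML = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}">
  <owl:Class rdf:about="{IDMP}ISO11615-MedicinalProducts/MedicinalProduct"/>
  <owl:Class rdf:about="{IDMP}ISO11238-Substances/Molecule"/>
</rdf:RDF>
"""


def select_classes(rdfxml, allow_list):
    """Split the allow-list into classes found in the ontology and classes missing from it."""
    root = ET.fromstring(rdfxml)
    declared = set()
    for cls in root.findall(f"{{{OWL}}}Class"):
        iri = cls.get(f"{{{RDF}}}about")
        if iri:  # skip anonymous class expressions
            declared.add(iri)
    selected = sorted(declared & set(allow_list))
    missing = sorted(set(allow_list) - declared)
    return selected, missing


selected, missing = select_classes(ONTOLOGY_RDFXML, ALLOW_LIST)
print("generate shapes for:", selected)
print("allow-listed but not in ontology:", missing)
```

Reporting allow-listed classes that no longer exist in the ontology gives an early signal that the list has drifted after a rename or removal.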