Pattern: Metadata and Annotations - APPROVED

The following paragraphs and corresponding tables provide guidelines on metadata and related annotations to be used for the IDMP Project.

Dependencies on ontologies that are external to the Pistoia Alliance / IDMP project are given below.

Canonical Prefix | Ontology IRI | Description
dct | http://purl.org/dc/terms/ | Dublin Core Metadata Initiative's (DCMI) Metadata Terms - see https://www.dublincore.org/specifications/dublin-core/dcmi-terms/
skos | http://www.w3.org/2004/02/skos/core# | World Wide Web Consortium (W3C) Simple Knowledge Organization System (SKOS) Recommendation - see https://www.w3.org/2004/02/skos/
cmns-av | https://www.omg.org/spec/Commons/AnnotationVocabulary/ | Object Management Group (OMG)'s Commons Library Annotation Vocabulary - part of an emerging standard whose RFP is available at https://www.omg.org/techprocess/meetings/schedule/MVF_RFP.html, scheduled for adoption in June 2022
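
The prefixes above are used throughout this page in compact form (for example, skos:definition). As a minimal sketch of how such a compact name maps to a full IRI, the Python snippet below expands a prefixed name using the three namespaces from the table. The function name `expand` and the dictionary `PREFIXES` are illustrative names, not part of any IDMP tooling, and the SKOS namespace is written with its trailing "#" as published by the W3C.

```python
# Namespaces taken from the prefix table; names here are illustrative only.
PREFIXES = {
    "dct": "http://purl.org/dc/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "cmns-av": "https://www.omg.org/spec/Commons/AnnotationVocabulary/",
}


def expand(curie):
    """Expand a prefixed name such as 'skos:definition' into a full IRI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return PREFIXES[prefix] + local


if __name__ == "__main__":
    print(expand("skos:definition"))
    print(expand("cmns-av:synonym"))
```

Prefixes such as rdfs and owl, which also appear on this page, are standard W3C namespaces and would need to be added to the dictionary before they could be expanded.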

Ontology Resource (Model Element) Labeling

The annotations for labels given below are primarily used for the content of an ontology, although a label at the ontology level is also required.  See below for ontology-level metadata requirements.

Concept | Annotation Property | Definition | Source | Notes
label | rdfs:label | instance of rdf:Property that may be used to provide a human-readable version of a resource's name | https://www.w3.org/TR/rdf-schema/#ch_label | Every class, property, and named individual in an IDMP ontology must have an rdfs:label. If only the default language (American English) is intended, there must be exactly one label, which may optionally have a language tag; if more than one instance of rdfs:label occurs, each must have a unique language tag, including the English label.
preferred label | skos:prefLabel | preferred lexical label for a resource, in a given language | https://www.w3.org/TR/skos-reference/#labels | At most one skos:prefLabel may occur, in addition to (and possibly the same as) the base rdfs:label, in order to distinguish it from other labels given to a particular concept.
alternative label | skos:altLabel | an alternative lexical label for a resource | https://www.w3.org/TR/skos-reference/#labels | skos:altLabel may be used as needed in cases where one cannot be more specific, for example by stating that the label is a synonym for the term.  skos:altLabel should not be used for abbreviations or symbols, where a more specific property is available.
synonym | cmns-av:synonym | designation that can be substituted for the primary representation of something | ISO 1087 Terminology work and terminology science - Vocabulary, Second edition, 2019-09 | cmns-av:synonym should be used, rather than skos:altLabel (its parent property), as appropriate.  Most alternative labels are considered synonyms for a concept, aside from abbreviations or symbols.
abbreviation | cmns-av:abbreviation | designation formed by omitting parts from the full form of a term that denotes the same concept | ISO 1087 Terminology work and terminology science - Vocabulary, Second edition, 2019-09; ISO 31-0 Quantities and units - General principles | For IDMP, although we have the option of finer granularity with respect to acronyms and symbols, it may be sufficient to simply use abbreviation for the MVP.  Symbol is the proper terminology for chemical symbols per ISO 31, even those that "look like" simpler abbreviations.  The EDMC infrastructure can be tuned to allow special characters in symbols that would be disallowed in general abbreviations, so this is something we should discuss.
symbol | cmns-av:symbol | abbreviation that is a design or mark, or other non-alpha-numeric character(s) conventionally used to represent something, such as a currency or mathematical sign or operator | Object Management Group, Analysis & Design Task Force recommendation; also used in the Industrial Ontology Foundry (IOF) | The use of this annotation property is optional. The OMG distinguishes non-natural language symbols from abbreviations that are expressed in natural language.  This may cause issues for having a common representation of chemical symbols and units of measure, but is available for use if it makes sense for IDMP.
acronym | cmns-av:acronym | abbreviation that is made up of the initial letters of the components of the full form of a term or proper name or from syllables of the full form | ISO 1087 Terminology work and terminology science - Vocabulary, Second edition, 2019-09 | The use of this annotation property is optional.  It has been successfully used at the OMG and for enterprise glossary applications as the basis for generating acronym lists for specifications and other documents. It may be overkill for the IDMP, and is lower priority.
title | dct:title | formal name given to the resource (artifact, such as a controlled vocabulary or ontology) | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/title | The use of this annotation property is optional. It is typically used to describe an artifact such as a controlled vocabulary, document, ontology, or other similar resource.
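
The rdfs:label rule (at least one label; if there are several, each with a unique language tag) lends itself to an automated check. The following Python sketch shows one possible way to express it; the function name `check_labels` and the representation of labels as (text, language tag) pairs are assumptions made for illustration, not part of the IDMP or EDMC infrastructure.

```python
from collections import Counter


def check_labels(labels):
    """Return a list of problems for one resource's rdfs:label values.

    `labels` is a list of (text, language_tag) pairs; language_tag is None
    when the label carries no language tag.
    """
    if not labels:
        return ["missing rdfs:label"]
    if len(labels) == 1:
        return []  # a single label may be tagged or untagged

    problems = []
    # With more than one label, every label needs a language tag ...
    if any(tag is None for _, tag in labels):
        problems.append("untagged label among multiple labels")
    # ... and no tag may be used twice (compared case-insensitively).
    counts = Counter(tag.lower() for _, tag in labels if tag is not None)
    for tag, n in sorted(counts.items()):
        if n > 1:
            problems.append(f"duplicate language tag: {tag}")
    return problems


if __name__ == "__main__":
    print(check_labels([("substance", "en"), ("Substanz", "de")]))
    print(check_labels([("substance", None), ("Substanz", "de")]))
```

The same shape of check would apply to skos:definition, which carries the same "unique language tag" requirement.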

Ontology Resource (Model Element) Definitions

Definitions are required for every class, property and nominal (individual that is key to understanding the ontology, rather than generated reference data or example data) in an IDMP ontology.  In cases where a definition does not make sense, such as for generated code lists, a description, using dct:description, is often helpful to users. 

Definitions for the IDMP project should generally follow the guidance provided in

  • ISO 704: Terminology work — Principles and methods for terminology development,
  • ISO 1087: Terminology work and terminology science — Vocabulary.


In other words, the following minimal metadata is required for each class, property, and nominal:

  • a label,
  • a definition,
  • explanatory and other notes, such as examples, as well as source information when applicable.

Definitions MUST follow ISO 704 recommendations for establishing good definitions.

We anticipate that many commercial and government organizations will use the IDMP ontologies as a reference for enterprise glossary development, interoperability, natural language processing, machine learning, and other applications. These applications need both the mathematics and logic incorporated in the ontology and the terminology, which is essential for enterprise glossary work and natural language processing. We use ISO 704 for IDMP primarily because the principles it defines help ensure the consistency and quality of our definitions:

  • Every class, property, and individual (nominal) MUST have a skos:definition (exactly 1 per natural language, with the default being American English).
  • Definitions MUST be ISO 704 conformant, meaning that they must be expressed as partial sentences that can be used to replace the ontology element (concept, relationship, attribute, nominal) in a sentence.
  • Any additional clarification, scope notes, explanatory notes, or other comments on the use of a given concept should be incorporated in other annotations.
  • Definitions MUST NOT be circular, i.e., the class, property, or individual name must not be used in the definition itself.
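
Two of the rules above, non-circularity and the partial-sentence form, can be approximated mechanically. The Python sketch below is one hedged way to do so. The function name `definition_problems` is hypothetical, and the two partial-sentence heuristics (no leading article, no terminal period) are assumptions about common ISO 704 style rather than rules stated on this page; a human reviewer is still needed to judge substitutability.

```python
import re


def definition_problems(label, definition):
    """Return a list of likely issues with a skos:definition for `label`."""
    text = definition.strip()
    if not text:
        return ["empty definition"]

    problems = []
    # Circularity: the element's own label must not appear in its definition.
    pattern = r"\b" + re.escape(label.lower()) + r"\b"
    if re.search(pattern, text.lower()):
        problems.append("circular: definition uses the label")
    # Heuristics (assumed) for the partial-sentence form.
    if re.match(r"(a|an|the)\s", text, flags=re.IGNORECASE):
        problems.append("starts with an article")
    if text.endswith("."):
        problems.append("ends with a period")
    return problems


if __name__ == "__main__":
    print(definition_problems(
        "legal entity",
        "legal person that is a partnership, corporation, or other organization",
    ))
    print(definition_problems(
        "molecular graph",
        "The Molecular Graph is a compound concept.",
    ))
```

The check matches the whole label only, so a definition may still reuse part of the label, as when "debt instrument" is defined in terms of its genus "financial instrument".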

Additional requirements with respect to definition development and supporting annotations include:

  • Every first-class element (class, property, and defining individual (nominal)) must have a definition, expressed using the skos:definition annotation property rather than rdfs:comment.
  • ISO 704 suggests a "genus / differentia" structure for definitions; that is, it recommends identifying one or more ancestral concepts as well as relationships and characteristics that differentiate the concept in question from sibling concepts.
    • E.g. A legal entity is a
      • (GENUS) legal person
      • (DIFFERENTIA SPECIFICA) that is a partnership, corporation, or other organization having the capacity to negotiate contracts, assume financial obligations, and pay off debts, organized under the laws of some jurisdiction
    • E.g. A debt instrument is a
      • (GENUS) financial instrument
      • (DIFFERENTIA SPECIFICA) that enables the issuing party to raise funds by accepting the obligation to repay a lender by a particular time in accordance with the terms of a contract
    • For classes (nouns), most definitions should be phrased "<parent class> that …", naming the parent(s) and including text that relates that class to others through relationships (object properties) and characteristics (attributes / data properties).
    • For properties, most definitions should be phrased "<parent property> that …", in a form similar to the definitions for classes; all property definitions must begin with a verb.
  • Definitions should not include content that is or can be modeled via restrictions unless that content is inherently required to define the concept. For example, a contract is defined as an agreement between competent parties (and other things).  Because at least two parties are required to define a contract, that requirement is included in the intensional definition even though FIBO also has a restriction on having at least two parties.
  • Definitions ideally should be sourced from sanctioned references, such as government glossaries, ISO standards, etc. and such sources should be noted in annotations that specify their source (see below for annotations for references).
  • Additional information, such as examples, scope details, explanations, and so forth, should be captured using the appropriate annotation from SKOS, Dublin Core, or the CMNS Annotation Vocabulary, where possible. If an additional annotation is required and not present in one of these vocabularies, it should be presented to the governance team for inclusion.
  • Reference data individuals may use dct:description to provide something similar to a definition, if a formal definition is infeasible, but any defining individual (nominal) MUST use skos:definition as the annotation property linking that individual to its definition.
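
The genus / differentia structure described in the list above can be made explicit when drafting definitions. The Python sketch below is illustrative only: the `Definition` class and its field names are hypothetical, and it simply assembles the "<genus> that <differentia>" phrasing used in the class examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    genus: str        # parent concept(s), e.g. "financial instrument"
    differentia: str  # what distinguishes the concept from its siblings

    def text(self) -> str:
        """Assemble the definition in '<genus> that <differentia>' form."""
        diff = self.differentia.strip()
        # Accept differentia written with or without the leading "that".
        if not diff.lower().startswith("that "):
            diff = "that " + diff
        return f"{self.genus.strip()} {diff}"


if __name__ == "__main__":
    debt_instrument = Definition(
        genus="financial instrument",
        differentia="that enables the issuing party to raise funds by "
                    "accepting the obligation to repay a lender",
    )
    print(debt_instrument.text())
```

Keeping the two parts separate makes it easier to review whether the genus names an actual parent class and whether the differentia restates something already captured by a restriction.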
Concept | Annotation Property | Definition | Source | Notes
definition | skos:definition | formal statement of the meaning of a resource | https://www.w3.org/TR/skos-reference/#notes | Every element in an IDMP ontology must have a skos:definition. If only the default language (American English) is intended, there must be exactly one definition, which may optionally have a language tag; if more than one instance of skos:definition occurs, each must have a unique language tag, including the English definition.
logical definition | cmns-av:logicalDefinition | definition in the form of a formal expression, such as the mathematical or logic representation, for the resource | Object Management Group, Analysis & Design Task Force recommendation; also used in the Industrial Ontology Foundry (IOF) | The use of this annotation property is optional.  There may be cases where the defining characteristics for a class, e.g., necessary and/or sufficient conditions for membership, are made clearer through a description logic or first-order representation; in such cases this annotation may be used in addition to the skos:definition (but not instead of it).

Example:  There is no definition of molecular graph in the ISO standards.  We need to be able to incorporate that to link to Chemantics data, for example.  A rough definition for molecular graph is: "The Molecular Graph is a compound concept, which represents a single chemical structure, which is normalized to conform, as much as possible, with the IUPAC specification for drawing chemical structures."  The resulting skos:definition, in ISO 704 format, is: "single chemical structure that is an unambiguous representation of the arrangement of atoms, normalized to conform with the International Union of Pure and Applied Chemistry (IUPAC) specification for drawing chemical structures to the degree possible".

Example: Properties relating a substance to other substances, and to its structure, do not have explicit definitions in the ISO standards, though there are associations present in the relevant diagrams.  The skos:definition, in ISO 704 format, for isRelatedSubstanceTo is "specifies a target substance with which the source has some relationship", and for hasStructure is "indicates any arrangement and/or organization of interrelated elements in a substance" at the highest level, so that it can be used generally to describe the structure of single or more complex substances.

Citations and References

Several annotation properties are useful for referring to the source for terminology, definitions, additional details or other information about a resource.  For the purposes of IDMP, the following annotations may be used, as appropriate. We may investigate using a more complete ontology for bibliographic references if that becomes necessary over the course of the project.

Concept | Annotation Property | Definition | Source | Notes
references | dct:references | indicates a related resource that is referenced, cited, or otherwise pointed to by the described resource | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/references | Use of references is optional, and may be used at the ontology or element (resource) level as appropriate.
source | dct:source | related resource from which the described resource is derived | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/source | Use of source is optional, and may be used at the ontology or element (resource) level as appropriate. It is useful for citing the source for a definition or term, but we recommend using one of its subproperties (direct source or adapted from) whenever possible.
direct source | cmns-av:directSource | quoted reference for the subject resource; the range for this annotation can be a string, URI, or bibliographic citation | Object Management Group, Analysis & Design Task Force recommendation; also used in the Financial Industry Business Ontology (FIBO) and in the Industrial Ontology Foundry (IOF) |
adapted from | cmns-av:adaptedFrom | document or other source from which a given term (or its definition) was adapted (i.e., is compatible with but not quoted); the range for this annotation can be a string, URI, or citation | Object Management Group, Analysis & Design Task Force recommendation; also used in the Financial Industry Business Ontology (FIBO) and in the Industrial Ontology Foundry (IOF) | This annotation should be used to indicate that a reference was used, for example, as input to the development of a definition or term but would not be considered infringing on a copyright.
see also | rdfs:seeAlso | indicates a resource that might provide additional information about the subject resource | https://www.w3.org/TR/rdf-schema/#ch_seealso |

Notes

A number of other annotations are useful for explaining the classes and properties in the IDMP ontology. We use a combination of Dublin Core Metadata Terms, Simple Knowledge Organization System (SKOS) annotations, and additional annotations defined in the Commons Annotation Vocabulary, as needed.  These include the following:

Concept | Annotation Property | Definition | Source | Notes
note | skos:note | general remark, for any purpose | https://www.w3.org/TR/skos-reference/#note |
explanatory note | cmns-av:explanatoryNote | note that provides additional explanatory material for a resource | Object Management Group, Analysis & Design Task Force recommendation; also used in the Financial Industry Business Ontology (FIBO) and in the Industrial Ontology Foundry (IOF) |
description | dct:description | account of the resource | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/description | This annotation is typically used to describe individuals in reference data, where skos:definition may not work as well.
abstract | dct:abstract | summary of the resource | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/abstract | This annotation is typically used to describe an artifact such as a controlled vocabulary, ontology, or other similar resource.  Every IDMP ontology must have an abstract (rather than a skos:definition).  See ontology-level annotations.
change note | skos:changeNote | note describing a modification to a resource | https://www.w3.org/TR/skos-reference/#changeNote | See change management and versioning, below.
editorial note | skos:editorialNote | note for an editor, translator, or maintainer of the controlled vocabulary or ontology | https://www.w3.org/TR/skos-reference/#editorialNote | Use of skos:editorialNote is reserved for editors' notes in provisional ontologies; such notes are intended to be deleted prior to release.
example | skos:example | illustration of the use of some resource | https://www.w3.org/TR/skos-reference/#example |
scope note | skos:scopeNote | note that helps to clarify the meaning of something within the context of the intended use of the resource | https://www.w3.org/TR/skos-reference/#scopeNote |
usage note | cmns-av:usageNote | note that provides information about how a given resource is used or may be extended | Object Management Group, Analysis & Design Task Force recommendation; also used in the Financial Industry Business Ontology (FIBO) and in the Industrial Ontology Foundry (IOF) | Usage note overlaps to some degree with editorialNote, but is retained on release.  The intent is to provide notes for any user of the ontology that may be needed to help explain how to use the ontology element for extension purposes or in specific patterns.  It is not used frequently in FIBO or IOF, but has been found to be quite useful in cases where the notes need to be persistent.

Licensing and Copyright Information

Every IDMP ontology MUST include exactly one (1) license statement and at least one copyright statement.  The license will be an open-source license as determined prior to release, likely either the MIT license or CC BY 4.0 from Creative Commons.  FIBO and other OMG ontologies, as well as the IOF ontologies, use the MIT license based on feedback from corporate attorneys for the participating organizations, who preferred it over CC BY 4.0.  Thus, the MIT license is the default for the IDMP ontologies unless and until some other decision is reached by the board.

Concept | Annotation Property | Definition | Source | Notes
license | dct:license | legal document giving official permission to do something with the resource | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/license | This annotation is required and applies to each ontology or controlled vocabulary. The value for the license annotation MUST be https://opensource.org/licenses/MIT unless another choice is made by the IDMP team.
copyright | cmns-av:copyright | exclusive legal right, given to an originator or an assignee to print, publish, perform, film, or record literary, artistic, or musical material, and to authorize others to do the same | Object Management Group, Analysis & Design Task Force recommendation; also used in the Financial Industry Business Ontology (FIBO) and in the Industrial Ontology Foundry (IOF) | This annotation is required (at least one, as follows) and applies to each ontology or controlled vocabulary.

        <cmns-av:copyright>Copyright (c) 2022 Pistoia Alliance, Inc.</cmns-av:copyright>
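
The licensing requirements (exactly one dct:license, defaulting to the MIT license IRI, and at least one cmns-av:copyright) are simple enough to verify automatically. The Python sketch below shows one possible check. It assumes the ontology-level annotations have already been extracted into a dictionary mapping property names to lists of values; that representation and the name `licensing_problems` are illustrative, not part of any existing release tooling.

```python
MIT_LICENSE = "https://opensource.org/licenses/MIT"


def licensing_problems(annotations):
    """Check ontology-level licensing annotations.

    `annotations` maps an annotation property name (e.g. "dct:license")
    to the list of values asserted for one ontology.
    """
    problems = []
    licenses = annotations.get("dct:license", [])
    copyrights = annotations.get("cmns-av:copyright", [])

    if len(licenses) != 1:
        problems.append(f"expected exactly 1 dct:license, found {len(licenses)}")
    elif licenses[0] != MIT_LICENSE:
        problems.append("dct:license is not the default MIT license IRI")
    if not copyrights:
        problems.append("missing cmns-av:copyright")
    return problems


if __name__ == "__main__":
    print(licensing_problems({
        "dct:license": [MIT_LICENSE],
        "cmns-av:copyright": ["Copyright (c) 2022 Pistoia Alliance, Inc."],
    }))
```

If the IDMP team selects a different license, only the `MIT_LICENSE` constant would need to change.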

Change Management and Versioning

Every IDMP ontology MUST include both a non-versioned and a versioned ontology IRI, as specified in Modelling Policy and Pattern: Internationalized Resource Identifier (IRI) Structure, Format, and Ontology Naming Conventions for IDMP.  For released ontologies, the ontology version IRI will include either a version of the form <YYYYMMDD> or possibly follow the mechanism used in FIBO for quarterly releases, which is <QxYYYY>; the choice will be made as we approach a final release for the MVP.  Released ontologies managed in GitHub will use the <YYYYMMDD> convention, however. Pre-release ontologies will include the owl:versionIRI annotation, but the IRI will be the same as the non-versioned IRI until the ontology is released. The following additional annotations are required or optional as described below.

Concept | Annotation Property | Definition | Source | Notes
change note | skos:changeNote | note describing a modification to a resource | https://www.w3.org/TR/skos-reference/#changeNote | The recommendation is one change note per ontology for each release/version, though we could use finer granularity and require a change note per pull request - requires discussion.
deprecation | owl:deprecated | | |
date issued | dct:issued | date of formal issuance of the resource | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/issued | This annotation could be either (1) the first time an ontology is released, or (2) for ontologies that reflect reference data, we could use it to state the date of issuance of the information used to generate the content (which is what we do in FIBO for reference data, such as currency codes, market codes, FpML interest rates, etc.).  Format should follow ISO 8601.
date modified | dct:modified | date on which the resource was most recently revised | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/modified | Format should follow ISO 8601.
prior version | owl:priorVersion | specifies the IRI of a prior version of the containing ontology | https://www.w3.org/TR/2012/REC-owl2-syntax-20121211/#a_priorVersion | This annotation is one that we are discussing as potentially required for released ontologies, and could be generated.  Currently the infrastructure does not support adding this as part of the release process but could be modified to do so.
is defined by | rdfs:isDefinedBy | indicates a resource defining the subject resource | https://www.w3.org/TR/rdf-schema/#ch_isdefinedby | This annotation is used to state which RDF vocabulary / OWL ontology a resource is explicitly defined in; the values for this can be automatically generated on release by the infrastructure.
backwards compatible with | owl:backwardCompatibleWith | specifies the IRI of a prior version of the containing ontology that is compatible with the current version of the containing ontology | https://www.w3.org/TR/2012/REC-owl2-syntax-20121211/#a_backwardCompatibleWith | This annotation is optional, but if used, the containing ontology must be a monotonic extension, with no deletions, of the prior version.
incompatible with | owl:incompatibleWith | specifies the IRI of a prior version of the containing ontology that is incompatible with the current version of the containing ontology | https://www.w3.org/TR/2012/REC-owl2-syntax-20121211/#a_incompatibleWith | This annotation is optional.
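
The version segment conventions (<YYYYMMDD> or <QxYYYY>) and the ISO 8601 format for dct:issued and dct:modified can be produced from a release date with the Python standard library. The sketch below is a literal reading of those patterns; the function names are hypothetical, and how the resulting token is placed within the version IRI is governed by the separate IRI policy and is not shown here.

```python
from datetime import date


def version_token(release, quarterly=False):
    """Format the version segment for a release date.

    Default is the <YYYYMMDD> form; with quarterly=True the <QxYYYY> form
    is produced, read literally as 'Q', the quarter number, then the year.
    """
    if quarterly:
        quarter = (release.month - 1) // 3 + 1
        return f"Q{quarter}{release.year}"
    return release.strftime("%Y%m%d")


def iso_date(d):
    """Format a date for dct:issued / dct:modified per ISO 8601."""
    return d.isoformat()


if __name__ == "__main__":
    released = date(2022, 6, 15)
    print(version_token(released))
    print(version_token(released, quarterly=True))
    print(iso_date(released))
```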

Contributors

A number of organizations are participating in the IDMP effort.  We could include the names of the people who were creators of or contributors to a given ontology as desired, though each company should have a copyright statement rather than duplicating company names as contributors.  The following annotations may be used for this purpose.

Concept | Annotation Property | Definition | Source | Notes
contributor | dct:contributor | party that has made contributions to the resource | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/contributor |
creator | dct:creator | party that originated the resource | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/creator |
publisher | dct:publisher | primary party responsible for making the resource available | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/publisher | If we use this for the IDMP ontologies, we will need to agree on what this refers to and then it could be added automatically by the infrastructure on release.