Pattern: Metadata and Annotations - APPROVED
The following paragraphs and corresponding tables provide guidelines on metadata and related annotations to be used for the IDMP Project.
Dependencies on ontologies that are external to the Pistoia Alliance / IDMP project are given below.
Canonical Prefix | Ontology IRI | Description |
---|---|---|
dct | http://purl.org/dc/terms/ | Dublin Core Metadata Initiative's (DCMI) Metadata Terms - see https://www.dublincore.org/specifications/dublin-core/dcmi-terms/ |
skos | http://www.w3.org/2004/02/skos/core | World Wide Web Consortium (W3C) Simple Knowledge Organization System (SKOS) Recommendation - see https://www.w3.org/2004/02/skos/ |
cmns-av | https://www.omg.org/spec/Commons/AnnotationVocabulary/ | Object Management Group (OMG)'s Commons Library Annotation Vocabulary - part of an emerging standard whose RFP is available at https://www.omg.org/techprocess/meetings/schedule/MVF_RFP.html, scheduled for adoption in June 2022 |
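As an illustration of how the canonical prefixes in the table above resolve to full IRIs, the following is a minimal sketch rather than project tooling. The names `PREFIXES` and `expand_curie` are hypothetical, and the sketch assumes that SKOS term IRIs are built on the namespace ending in `#`, whereas the table lists the ontology IRI without it.

```python
# Canonical prefixes from the table above, mapped to namespace IRIs.
# Assumption: SKOS terms live in the namespace ending in '#'; the table
# gives the ontology IRI, which omits the fragment marker.
PREFIXES = {
    "dct": "http://purl.org/dc/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "cmns-av": "https://www.omg.org/spec/Commons/AnnotationVocabulary/",
}


def expand_curie(curie: str) -> str:
    """Expand a prefixed name such as 'skos:definition' to a full IRI."""
    prefix, sep, local_name = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return PREFIXES[prefix] + local_name
```

For example, `expand_curie("dct:title")` yields `http://purl.org/dc/terms/title`, which matches the fragment of the Dublin Core URL cited for that term later on this page.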
Ontology Resource (Model Element) Labeling
The annotations for labels given below are primarily used for the content of an ontology, although a label at the ontology level is also required. See below for ontology-level metadata requirements.
Concept | Annotation Property | Definition | Source | Notes |
---|---|---|---|---|
label | rdfs:label | instance of rdf:Property that may be used to provide a human-readable version of a resource's name | https://www.w3.org/TR/rdf-schema/#ch_label | Every class, property, and named individual in an IDMP ontology must have an rdfs:label. If only the default language of American English is intended, there must be exactly one label, which may optionally have a language tag; if more than one instance of rdfs:label occurs, each must have a unique language tag, including the English label. |
preferred label | skos:prefLabel | preferred lexical label for a resource, in a given language | https://www.w3.org/TR/skos-reference/#labels | At most one skos:prefLabel may occur, in addition to (and possibly identical to) the base rdfs:label, in order to distinguish the preferred label from other labels given to a particular concept. |
alternative label | skos:altLabel | an alternative lexical label for a resource | https://www.w3.org/TR/skos-reference/#labels | skos:altLabel may be used as needed in cases where one cannot be more specific, such as that the label is a synonym for the term. skos:altLabel should not be used for abbreviations or symbols, where a more specific property is available. |
synonym | cmns-av:synonym | designation that can be substituted for the primary representation of something | ISO 1087 Terminology work and terminology science - Vocabulary, Second edition, 2019-09 | cmns-av:synonym should be used, rather than skos:altLabel (its parent property), as appropriate. Most alternative labels are considered synonyms for a concept, aside from abbreviations or symbols. |
abbreviation | cmns-av:abbreviation | designation formed by omitting parts from the full form of a term that denotes the same concept | ISO 1087 Terminology work and terminology science - Vocabulary, Second edition, 2019-09; ISO 31-0 Quantities and units - General principles | For IDMP, although we have the option of finer granularity with respect to acronyms and symbols, it may be sufficient to simply use abbreviation for the MVP. Symbol is the proper terminology for chemical symbols per ISO 31, even those that "look like" simpler abbreviations. The EDMC infrastructure can be tuned to allow special characters in symbols that would be disallowed in general abbreviations, so this is something we should discuss. |
symbol | cmns-av:symbol | abbreviation that is a design or mark, or other non-alpha-numeric character(s) conventionally used to represent something, such as a currency or mathematical sign or operator | Object Management Group, Analysis & Design Task Force recommendation; also used in the Industrial Ontology Foundry (IOF) | The use of this annotation property is optional. The OMG distinguishes non-natural language symbols from abbreviations that are expressed in natural language. This may cause issues for having a common representation of chemical symbols and units of measure, but is available for use if it makes sense for IDMP. |
acronym | cmns-av:acronym | abbreviation that is made up of the initial letters of the components of the full form of a term or proper name or from syllables of the full form | ISO 1087 Terminology work and terminology science - Vocabulary, Second edition, 2019-09 | The use of this annotation property is optional. It has been successfully used at the OMG and for enterprise glossary applications as the basis for generating acronym lists for specifications and other documents. It may be overkill for the IDMP, and is lower priority. |
title | dct:title | formal name given to the resource (artifact, such as a controlled vocabulary or ontology) | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/title | The use of this annotation property is optional. It is typically used to describe an artifact such as a controlled vocabulary, document, ontology, or other similar resource. |
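The labeling rule for rdfs:label (at least one label, and a unique language tag on each label when several occur) could be checked along the following lines. This is a sketch under stated assumptions: the function name `check_labels` and the representation of labels as `(text, language_tag)` pairs are hypothetical, and language tags are compared case-insensitively.

```python
from collections import Counter


def check_labels(labels):
    """Return a list of problems for one element's rdfs:label values.

    `labels` is a list of (text, language_tag_or_None) pairs (hypothetical
    representation; a real check would read these from the ontology).
    """
    if not labels:
        return ["missing rdfs:label"]
    problems = []
    if len(labels) > 1:
        # With several labels, every one needs its own explicit tag.
        if any(lang is None for _, lang in labels):
            problems.append("untagged label among multiple labels")
        counts = Counter(lang.lower() for _, lang in labels if lang is not None)
        for lang, n in sorted(counts.items()):
            if n > 1:
                problems.append(f"duplicate language tag: {lang}")
    return problems
```

A single untagged label passes, since it is taken to be in the default language; two labels both tagged `en` are reported as a duplicate.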
Ontology Resource (Model Element) Definitions
Definitions are required for every class, property and nominal (individual that is key to understanding the ontology, rather than generated reference data or example data) in an IDMP ontology. In cases where a definition does not make sense, such as for generated code lists, a description, using dct:description, is often helpful to users.
Definitions for the IDMP project should generally follow the guidance provided in
- ISO 704: Terminology work — Principles and methods for terminology development,
- ISO 1087: Terminology work and terminology science — Vocabulary.
In other words, the following minimal metadata is required for each class, property, and nominal:
- a label,
- a definition,
- explanatory and other notes, such as examples, as well as source information when applicable.
Definitions MUST follow ISO 704 recommendations for establishing good definitions.
We anticipate that many commercial and government organizations will use the IDMP ontologies as a reference for enterprise glossary development, interoperability, natural language processing, machine learning, and other applications that need both the mathematics and logic incorporated in the ontology as well as the terminology, which is essential for enterprise glossary work and natural language processing applications. We recommend and are using ISO 704 for IDMP primarily because the principles it defines help ensure the consistency and quality of our definitions:
- Every class, property, and individual (nominal) MUST have a skos:definition (exactly 1 per natural language, with the default being American English).
- Definitions MUST be ISO 704 conformant, meaning they must be expressed as partial sentences that can be used to replace the ontology element (concept, relationship, attribute, nominal) in a sentence.
- Any additional clarification, scope notes, explanatory notes, or other comments on the use of a given concept should be incorporated in other annotations.
- Definitions MUST NOT be circular, i.e., the class, property, or individual name must not be used in the definition itself.
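The circularity rule in the list above lends itself to a simple automated check. The sketch below is deliberately naive: `is_circular` is a hypothetical helper that only looks for the element's label as a whole phrase, ignoring case, and it would not catch plurals or other inflected forms.

```python
import re


def is_circular(label: str, definition: str) -> bool:
    """True if the element's label appears as a whole phrase in its definition."""
    pattern = r"\b" + re.escape(label.strip()) + r"\b"
    return re.search(pattern, definition, flags=re.IGNORECASE) is not None
```

For instance, a definition of "debt instrument" that begins "financial instrument that ..." passes, while one that begins "debt instrument that ..." is flagged.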
Additional requirements with respect to definition development and supporting annotations include:
- Every first-class element (class, property, and defining individual (nominal)) must have a definition, expressed using the skos:definition annotation property rather than rdfs:comment.
- ISO 704 suggests a "genus / differentia" structure for definitions, meaning, it recommends identifying one or more ancestral concepts as well as relationships and characteristics that differentiate the concept in question from sibling concepts.
  - E.g. A legal entity is a
    - (GENUS) legal person
    - (DIFFERENTIA SPECIFICA) that is a partnership, corporation, or other organization having the capacity to negotiate contracts, assume financial obligations, and pay off debts, organized under the laws of some jurisdiction
  - E.g. A debt instrument is a
    - (GENUS) financial instrument
    - (DIFFERENTIA SPECIFICA) that enables the issuing party to raise funds by accepting the obligation to repay a lender by a particular time in accordance with the terms of a contract
- For classes (nouns), most definitions should be phrased "<parent class> that …", naming the parent(s) and including text that relates that class to others through relationships (object properties) and characteristics (attributes / data properties).
- For properties, most definitions should be phrased "<parent property> that …", in a form similar to the definitions for classes; all property definitions must begin with a verb.
- Definitions should not include content that is or can be modeled via restrictions unless that content is inherently required to define the concept. For example, a contract is defined as an agreement between competent parties (and other things). Because at least two parties are required to define a contract, that fact is included in the intensional definition even though we also have a restriction on having at least two parties in FIBO.
- Definitions ideally should be sourced from sanctioned references, such as government glossaries, ISO standards, etc. and such sources should be noted in annotations that specify their source (see below for annotations for references).
- Additional information, such as examples, scope details, explanations, and so forth, should be captured using the appropriate annotation from SKOS, Dublin Core, or the CMNS Annotation Vocabulary, where possible. If an additional annotation is required and not present in one of these vocabularies, it should be presented to the governance team for inclusion.
- Reference data individuals may use dct:description to provide something similar to a definition, if a formal definition is infeasible, but any defining individual (nominal) MUST use skos:definition as the annotation property linking that individual to its definition.
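The genus / differentia phrasing recommended above can be illustrated with a small helper that assembles a class definition of the form "<parent class> that …". The name `compose_definition` is hypothetical, and the sketch assumes the differentia is supplied without a leading "that".

```python
def compose_definition(genus: str, differentia: str) -> str:
    """Join a genus and its differentia into '<genus> that <differentia>'."""
    return f"{genus.strip()} that {differentia.strip()}"


# The debt instrument example from the list above.
debt_instrument = compose_definition(
    "financial instrument",
    "enables the issuing party to raise funds by accepting the obligation "
    "to repay a lender by a particular time in accordance with the terms "
    "of a contract",
)
```

The result is a partial sentence that starts with the parent concept, so it can stand in for the term "debt instrument" wherever the term is used.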
Concept | Annotation Property | Definition | Source | Notes |
---|---|---|---|---|
definition | skos:definition | formal statement of the meaning of a resource | https://www.w3.org/TR/skos-reference/#notes | Every element in an IDMP ontology must have a skos:definition. If only the default language of American English is intended, there must be exactly one definition, which may optionally have a language tag; if more than one instance of skos:definition occurs, each must have a unique language tag, including the English definition. |
logical definition | cmns-av:logicalDefinition | definition in the form of a formal expression, such as the mathematical or logic representation, for the resource | Object Management Group, Analysis & Design Task Force recommendation; also used in the Industrial Ontology Foundry (IOF) | The use of this annotation property is optional. There may be cases where the defining characteristics of a class, e.g., necessary and/or sufficient conditions for membership, are made clearer through a description logic or first-order representation, in which case this annotation may be used in addition to the skos:definition (but not instead of it). |
Example: There is no definition of molecular graph in the ISO standards. We need to be able to incorporate that to link to Chemantics data, for example. A rough definition for molecular graph is: "The Molecular Graph is a compound concept, which represents a single chemical structure, which is normalized to conform, as much as possible, with the IUPAC specification for drawing chemical structures." The resulting skos:definition, in ISO 704 format, is: "single chemical structure that is an unambiguous representation of the arrangement of atoms, normalized to conform with the International Union of Pure and Applied Chemistry (IUPAC) specification for drawing chemical structures to the degree possible".
Example: Properties relating a substance to other substances, and to its structure, do not have explicit definitions in the ISO standards, though there are associations present in the relevant diagrams. The skos:definition, in ISO 704 format, for isRelatedSubstanceTo is "specifies a target substance with which the source has some relationship", and for hasStructure is "indicates any arrangement and/or organization of interrelated elements in a substance" at the highest level, so that it can be used generally to describe the structure of single or more complex substances.
Citations and References
Several annotation properties are useful for referring to the source for terminology, definitions, additional details or other information about a resource. For the purposes of IDMP, the following annotations may be used, as appropriate. We may investigate using a more complete ontology for bibliographic references if that becomes necessary over the course of the project.
Concept | Annotation Property | Definition | Source | Notes |
---|---|---|---|---|
references | dct:references | indicates a related resource that is referenced, cited, or otherwise pointed to by the described resource | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/references | Use of references is optional, and may be used at the ontology or element (resource) level as appropriate. |
source | dct:source | related resource from which the described resource is derived | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/source | Use of source is optional, and may be used at the ontology or element (resource) level as appropriate. It is useful for citing the source for a definition or term, but we recommend using one of its subproperties (direct source or adapted from) whenever possible. |
direct source | cmns-av:directSource | quoted reference for the subject resource; the range for this annotation can be a string, URI, or bibliographic citation | Object Management Group, Analysis & Design Task Force recommendation; also used in the Financial Industry Business Ontology (FIBO) and in the Industrial Ontology Foundry (IOF) | |
adapted from | cmns-av:adaptedFrom | document or other source from which a given term (or its definition) was adapted (i.e., is compatible with but not quoted); the range for this annotation can be a string, URI, or citation | Object Management Group, Analysis & Design Task Force recommendation; also used in the Financial Industry Business Ontology (FIBO) and in the Industrial Ontology Foundry (IOF) | This annotation should be used to indicate that a reference was used, for example, as input to the development of a definition or term but would not be considered infringing on a copyright. |
see also | rdfs:seeAlso | indicates a resource that might provide additional information about the subject resource | https://www.w3.org/TR/rdf-schema/#ch_seealso | |
Notes
A number of other annotations are useful for explaining the classes and properties in the IDMP ontology. We use a combination of Dublin Core Metadata Terms, Simple Knowledge Organization System (SKOS) annotations, and additional annotations defined in the Commons Annotation Vocabulary, as needed. These include the following:
Concept | Annotation Property | Definition | Source | Notes |
---|---|---|---|---|
note | skos:note | general remark, for any purpose | https://www.w3.org/TR/skos-reference/#note | |
explanatory note | cmns-av:explanatoryNote | note that provides additional explanatory material for a resource | Object Management Group, Analysis & Design Task Force recommendation; also used in the Financial Industry Business Ontology (FIBO) and in the Industrial Ontology Foundry (IOF) | |
description | dct:description | account of the resource | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/description | This annotation is typically used to describe individuals in reference data, where skos:definition may not work as well. |
abstract | dct:abstract | summary of the resource | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/abstract | This annotation is typically used to describe an artifact such as a controlled vocabulary, ontology, or other similar resource. Every IDMP ontology must have an abstract (rather than a skos:definition). See ontology-level annotations. |
change note | skos:changeNote | note describing a modification to a resource | https://www.w3.org/TR/skos-reference/#changeNote | See change management and versioning, below. |
editorial note | skos:editorialNote | note for an editor, translator, or maintainer of the controlled vocabulary or ontology | https://www.w3.org/TR/skos-reference/#editorialNote | Use of skos:editorialNote is reserved for editors' notes in provisional ontologies that are intended to be deleted prior to release. |
example | skos:example | illustration of the use of some resource | https://www.w3.org/TR/skos-reference/#example | |
scope note | skos:scopeNote | note that helps to clarify the meaning of something within the context of the intended use of the resource | https://www.w3.org/TR/skos-reference/#scopeNote | |
usage note | cmns-av:usageNote | note that provides information about how a given resource is used or may be extended | Object Management Group, Analysis & Design Task Force recommendation; also used in the Financial Industry Business Ontology (FIBO) and in the Industrial Ontology Foundry (IOF) | Usage note overlaps to some degree with editorialNote, but is retained on release. The intent is to provide notes for any user of the ontology that may be needed to help explain how to use the ontology element for extension purposes or in specific patterns. It is not used frequently in FIBO or IOF, but has been found to be quite useful in cases where the notes need to be persistent. |
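Because skos:editorialNote is meant to be deleted prior to release while cmns-av:usageNote is retained, a release step might filter annotations roughly as follows. This is only a sketch: `strip_editorial_notes` and the `(property, value)` pair representation are hypothetical and do not describe the actual release infrastructure.

```python
EDITORIAL_NOTE = "skos:editorialNote"


def strip_editorial_notes(annotations):
    """Return the annotations to keep on release, dropping editorial notes.

    `annotations` is a list of (property, value) pairs for one element.
    """
    return [(prop, value) for prop, value in annotations if prop != EDITORIAL_NOTE]
```

All other notes, including usage notes and scope notes, pass through unchanged and in their original order.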
Licensing and Copyright Information
Every IDMP ontology MUST include exactly one (1) license statement and at least one copyright statement. The license will be an open-source license as determined prior to release, likely either the MIT license or CC BY 4.0 from Creative Commons. FIBO and other OMG ontologies, as well as the IOF ontologies, use the MIT license based on feedback from corporate attorneys for the participating organizations, who preferred it over CC BY 4.0. Thus, the MIT license is the default for the IDMP ontologies unless and until some other decision is reached by the board.
Concept | Annotation Property | Definition | Source | Notes |
---|---|---|---|---|
license | dct:license | legal document giving official permission to do something with the resource | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/license | This annotation is required and applies to each ontology or controlled vocabulary. The value for the license annotation MUST be https://opensource.org/licenses/MIT unless another choice is made by the IDMP team. |
copyright | cmns-av:copyright | exclusive legal right, given to an originator or an assignee to print, publish, perform, film, or record literary, artistic, or musical material, and to authorize others to do the same | Object Management Group, Analysis & Design Task Force recommendation; also used in the Financial Industry Business Ontology (FIBO) and in the Industrial Ontology Foundry (IOF) | This annotation is required (at least one, as follows) and applies to each ontology or controlled vocabulary. <cmns-av:copyright>Copyright (c) 2022 Pistoia Alliance, Inc.</cmns-av:copyright> |
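The licensing requirements in this section (exactly one license, at least one copyright statement, MIT as the default license value) could be verified per ontology with a check like the one below. The function `check_licensing` and the `(property, value)` representation of the ontology header are hypothetical; treating any non-MIT value as a problem assumes the IDMP team has not chosen a different license.

```python
MIT_LICENSE = "https://opensource.org/licenses/MIT"


def check_licensing(annotations):
    """Return problems with the license and copyright annotations of one ontology.

    `annotations` is a list of (property, value) pairs from the ontology header.
    """
    licenses = [v for p, v in annotations if p == "dct:license"]
    copyrights = [v for p, v in annotations if p == "cmns-av:copyright"]
    problems = []
    if len(licenses) != 1:
        problems.append(f"expected exactly one dct:license, found {len(licenses)}")
    elif licenses[0] != MIT_LICENSE:
        problems.append(f"license is not the default MIT license: {licenses[0]}")
    if not copyrights:
        problems.append("missing cmns-av:copyright")
    return problems
```

An ontology header with the MIT license IRI and the Pistoia Alliance copyright statement shown in the table returns an empty list.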
Change Management and Versioning
Every IDMP ontology MUST include both a non-versioned and a versioned ontology IRI, as specified in Modelling Policy and Pattern: Internationalized Resource Identifier (IRI) Structure, Format, and Ontology Naming Conventions for IDMP. For released ontologies, the ontology version IRI will either include a version of the form <YYYYMMDD> or follow the mechanism used in FIBO for quarterly releases, which is <QxYYYY>; the choice will be made as we approach a final release for the MVP. Released ontologies managed in GitHub, however, will use the <YYYYMMDD> convention. Pre-release ontologies will include the owl:versionIRI annotation, but the IRI will be the same as the non-versioned IRI until such time as the ontology is released. The following additional annotations are required or optional as described below.
Concept | Annotation Property | Definition | Source | Notes |
---|---|---|---|---|
change note | skos:changeNote | note describing a modification to a resource | https://www.w3.org/TR/skos-reference/#changeNote | The recommendation is one change note per ontology for each release/version, though we could use finer granularity and require a change note per pull request - requires discussion. |
deprecation | owl:deprecated | | | |
date issued | dct:issued | date of formal issuance of the resource | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/issued | This annotation could be either (1) the first time an ontology is released, or (2) for ontologies that reflect reference data, we could use it to state the date of issuance of the information used to generate the content (which is what we do in FIBO for reference data, such as currency codes, market codes, FpML interest rates, etc.). Format should follow ISO 8601. |
date modified | dct:modified | date on which the resource was most recently revised | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/modified | Format should follow ISO 8601. |
prior version | owl:priorVersion | specifies the IRI of a prior version of the containing ontology | https://www.w3.org/TR/2012/REC-owl2-syntax-20121211/#a_priorVersion | This annotation is one that we are discussing as potentially required for released ontologies, and could be generated. Currently the infrastructure does not support adding this as part of the release process but could be modified to do so. |
is defined by | rdfs:isDefinedBy | indicates a resource defining the subject resource | https://www.w3.org/TR/rdf-schema/#ch_isdefinedby | This annotation is used to state which RDF vocabulary / OWL ontology a resource is explicitly defined in and the values for this can be automatically generated on release by the infrastructure. |
backward compatible with | owl:backwardCompatibleWith | specifies the IRI of a prior version of the containing ontology that is compatible with the current version of the containing ontology | https://www.w3.org/TR/2012/REC-owl2-syntax-20121211/#a_backwardCompatibleWith | This annotation is optional, but if used, the containing ontology must be a monotonic extension, with no deletions, of the prior version. |
incompatible with | owl:incompatibleWith | specifies the IRI of a prior version of the containing ontology that is incompatible with the current version of the containing ontology | https://www.w3.org/TR/2012/REC-owl2-syntax-20121211/#a_incompatibleWith | This annotation is optional. |
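The two version conventions mentioned in this section, <YYYYMMDD> and <QxYYYY>, and the ISO 8601 format required for dct:issued and dct:modified, can be derived from a release date as sketched below. The function names are hypothetical, and the sketch assumes that "x" in <QxYYYY> is the calendar quarter (1 to 4); how the resulting token is placed within the version IRI is governed by the IRI policy and is not shown here.

```python
from datetime import date


def version_token_daily(release: date) -> str:
    """Format a release date as YYYYMMDD."""
    return release.strftime("%Y%m%d")


def version_token_quarterly(release: date) -> str:
    """Format a release date as QxYYYY, where x is the calendar quarter."""
    quarter = (release.month - 1) // 3 + 1
    return f"Q{quarter}{release.year}"


def iso_date(d: date) -> str:
    """ISO 8601 calendar date, e.g. for dct:issued or dct:modified."""
    return d.isoformat()
```

For a release on 30 June 2022, these produce `20220630`, `Q22022`, and `2022-06-30` respectively.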
Contributors
A number of organizations are participating in the IDMP effort. We could include the names of the people who were creators of or contributors to a given ontology as desired, though each company should have a copyright statement rather than duplicating company names as contributors. The following annotations may be used for this purpose.
Concept | Annotation Property | Definition | Source | Notes |
---|---|---|---|---|
contributor | dct:contributor | party that has made contributions to the resource | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/contributor | |
creator | dct:creator | party that originated the resource | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/creator | |
publisher | dct:publisher | primary party responsible for making the resource available | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/publisher | If we use this for the IDMP ontologies, we will need to agree on what this refers to and then it could be added automatically by the infrastructure on release. |