Modelling Policy and Pattern: Naming Conventions for Elements within the Body of an Ontology

General Naming and Labeling Conventions

For the IDMP project, a number of considerations must be taken into account with respect to naming and labeling of ontology resources. We anticipate that the number of individuals, including reference data for substances and medicinal products, will be quite large, and that these individuals will be mapped fairly broadly across multiple classification schemes. As a consequence, generation of unique IRIs for reference data is anticipated. An essential component of the IDMP project will be to develop strategies for generation and registration of such IRIs to ensure uniqueness as well as ease of use and mapping to a variety of resources. That said, the ontologies, excluding reference data, are intended to be used by people and machines, with clear definitions, explanatory notes, and other annotations to facilitate understanding and limit, if not eliminate, the confusion that can arise from specifications without the kinds of clarifying axioms that an ontology can provide.

The guidelines outlined herein are based on experience with ontology standardization and with the use of ontologies to support a wide variety of use cases, ranging from enterprise glossaries, to data interoperability and management, to natural language applications, to machine learning. Human-readable names are preferred for classes and properties so that documentation produced with the diagramming capabilities of UML and other tools that can import ontologies makes sense, and so that other visualization tools can help explain the ontologies and application results to a broad user community. Many such tools display model elements by name rather than by label, which makes ontologies with generated identifiers for classes and properties difficult, if not impossible, to use.

In addition, although the Web Ontology Language (OWL) provides no support for unique naming, all resources in the IDMP ontologies MUST have unique names and unique labels for standardization and governance purposes. There may be content in provisional (draft, in-work) ontologies that does not adhere to these policies, but for any released ontology, we require unique naming and unique labeling. The following additional guidelines apply to naming and labeling resources:

  • There should be no special characters and no abbreviations in names. There are occasions when we will deviate from this policy, such as for code lists that are automatically generated, but otherwise, abbreviations should not be used despite resulting in lengthy names in some cases.

  • Every class, property, and individual MUST have a label, at a minimum, with additional annotations as described in our metadata policy available at Pattern: Metadata and Annotations. Labels MUST be expressed in natural language, space-separated and appropriately language-tagged. American English is required; other languages are optional, as appropriate for the user community. If labels and other annotations are not language-tagged, the default language tag is 'en'. American English spelling MUST be used for all English labels and other annotations.

  • Labels MUST be expressed in lower case, with proper spacing as if they were written as text. The only exception to the lower case rule in labels is for proper names, which may be capitalized, as appropriate.

  • Where alternate language equivalents are available, additional language-tagged labels MAY be used, as mentioned above. In cases where languages other than English are used, proper diacritical marks expressed in UTF-8 format should be included in the label (but not in the camel case name).

  • Names for resources in ontologies MUST NOT be duplicated. That is:

    • having a class named "Lifecycle" in two different ontologies, regardless of the namespace, is strictly prohibited. There may be cases where a class introduced in some subject area or at a sub-topic level is needed in another such area, and thus the class should be promoted to a higher, common level in the ontology architecture. In such situations, the lower level resource should be deprecated in released ontologies and simply deleted if the lower-level ontology has not yet been released.

    • having a property named "hasJurisdiction" in two different ontologies, regardless of the namespace, is strictly prohibited. There will likely be cases where legacy naming, or moving a property from a lower-level sub-topic to a higher-level subject area, results in temporary duplication. In such situations, the lower-level property should be deprecated. Properties in OWL can be specified so that they are reusable in property restrictions and other axioms on many classes, which limits, if not eliminates, the need to duplicate names. If a property is needed at a higher level in the ontology architecture, or with a different or less constrained domain than the one used in its initial definition, one should raise an issue.
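
The uniqueness rule above lends itself to an automated check. The following is a minimal sketch in Python, assuming that the IRIs of all resources in the released ontologies have already been collected into a list; the example.org IRIs and the function names are hypothetical and used only for illustration. It compares local names (the part of the IRI after the last "#" or "/") and ignores the namespace, as the policy requires.

```python
from collections import defaultdict


def local_name(iri: str) -> str:
    """Return the part of an IRI after the last '#' or '/'."""
    cut = max(iri.rfind("#"), iri.rfind("/"))
    return iri[cut + 1:]


def find_duplicate_names(iris):
    """Map each local name used by more than one IRI to the IRIs that use it."""
    by_name = defaultdict(set)
    for iri in iris:
        by_name[local_name(iri)].add(iri)
    return {name: sorted(group) for name, group in by_name.items() if len(group) > 1}


# Hypothetical IRIs, for illustration only.
iris = [
    "https://example.org/ontology/A/Lifecycle",
    "https://example.org/ontology/B/Lifecycle",
    "https://example.org/ontology/A/hasJurisdiction",
]
duplicates = find_duplicate_names(iris)
for name, group in duplicates.items():
    print(name, "is defined in:", ", ".join(group))
```

A check of this kind only reports duplicates. Deciding which resource to promote, deprecate, or delete remains a governance decision.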

Naming Conventions for Classes

  • Class names (URI parts) MUST be expressed in upper camel case, each word capitalized, and no separation or punctuation between words, i.e., without intervening underscores, dashes, or other special characters. This guidance corresponds to widely used conventions for vocabularies and ontologies in the Semantic Web community, including the Financial Industry Business Ontology (FIBO), Industrial Ontology Foundry (IOF) effort, and other ontology standards-oriented communities, despite the fact that we often see underscores and dashes in names in the data models, data dictionaries, and other resources used as input.

  • Class names must be singular unless they represent mass nouns or the only form for the concept is plural.
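
The syntactic part of the class naming rule can be sketched as a simple pattern check. The Python below is illustrative only: the function name and the sample class names are assumptions, and the pattern verifies just the leading capital and the absence of separators or special characters. It cannot tell whether a name is singular or free of abbreviations, which still needs human review.

```python
import re

# Starts with an upper-case letter, followed by letters and digits only.
UPPER_CAMEL = re.compile(r"[A-Z][A-Za-z0-9]*")


def is_upper_camel(name: str) -> bool:
    """Return True if the name has the surface form of an upper camel case name."""
    return UPPER_CAMEL.fullmatch(name) is not None


# Hypothetical class names, for illustration only.
for candidate in ["MedicinalProduct", "medicinalProduct", "Medicinal_Product", "Medicinal-Product"]:
    print(candidate, is_upper_camel(candidate))
```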

Naming Conventions for Properties

  • Property names (URI parts) for all object and data properties MUST be expressed in lower camel case, with the first word (or only word) not capitalized, and each subsequent word capitalized, again with no separation or punctuation between words, i.e., without intervening underscores, dashes, or other special characters. 

  • Object and data properties must be named using verbs, without including the names of their domain or range classes; a few exceptions are allowed for properties of the form "has x", for readability purposes.

  • Property names for annotations MAY be nouns or noun phrases, and should conform to the set of annotation properties specified in TBD for use on the IDMP project. If an additional annotation property is needed, its name, definition, and usage guidelines MUST be discussed by the ontology governance board. Approved annotations will be added to the IDMP annotation vocabulary and documented at TBD.
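
As a rough illustration of the rules for object and data properties, the following Python sketch checks the lower camel case form and lists any domain or range class names that appear inside a property name. The function names and the sample names are hypothetical. The substring match is deliberately naive, so a hit is only a prompt for review, since properties of the form "has x" are permitted exceptions.

```python
import re

# Starts with a lower-case letter, followed by letters and digits only.
LOWER_CAMEL = re.compile(r"[a-z][A-Za-z0-9]*")


def is_lower_camel(name: str) -> bool:
    """Return True if the name has the surface form of a lower camel case name."""
    return LOWER_CAMEL.fullmatch(name) is not None


def embedded_class_names(property_name: str, class_names: list[str]) -> list[str]:
    """Return the given class names that occur inside the property name."""
    lowered = property_name.lower()
    return [c for c in class_names if c.lower() in lowered]


# Hypothetical property with hypothetical domain and range classes.
name = "isManufacturedByOrganization"
print(is_lower_camel(name))
print(embedded_class_names(name, ["Product", "Organization"]))
```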

Naming Conventions for Individuals

  • Individual names (URI parts) for nominals MUST be expressed in upper camel case.

  • ISO 704 promotes the use of lower case for individuals unless they incorporate proper names. IDMP follows this convention for the labels of nominals, but not at the URI level.
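
The relationship between a lower case label and an upper camel case name for a nominal can be illustrated as follows. This is only a sketch: the policy does not state that names are generated from labels, and the function names and sample labels are hypothetical. Diacritical marks are removed from the name, in line with the guidance that they belong in labels but not in camel case names.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def upper_camel_name(label: str) -> str:
    """Build an upper camel case name from a space-separated label."""
    words = strip_diacritics(label).split()
    return "".join(word[:1].upper() + word[1:] for word in words)


# Hypothetical labels, for illustration only.
print(upper_camel_name("marketing authorization"))
print(upper_camel_name("agence européenne"))
```

The label itself is left untouched, so it can stay in lower case (apart from proper names) while the name at the URI level follows the upper camel case rule.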

Naming and labeling conventions for generated reference data are TBD.

External Links

https://en.wikipedia.org/wiki/Camel_case