When to use Classes vs. Classifiers

This was spawned by JIRA issue: Use of SKOS Concepts in FIBO 

There are two separate but related issues:

  1. How and when should skos:Concept be used in FIBO?
  2. How should concepts that are used to classify things be represented when you do not want to use owl:Class, and what criteria can be used to decide which approach to use?

This started when I thought that one way to answer the first part of question 2 was to use skos:Concept. It was decided not to do that and instead to use:

  • fibo-fnd-arr-cls:Classifier
  • fibo-fnd-rel:isClassifiedBy

There are various ways to put things into buckets. The table below gives examples and is drawn from a blog article summarizing this: Buckets, Buckets Everywhere, Who Knows What to Think?

 

| Kind of Bucket | Example | Representing the Bucket | Putting something in the Bucket |
| --- | --- | --- | --- |
| Individual of a Kind | John Doe is a Person | Instance of owl:Class | rdf:type |
| A bucket with functionally connected things inside | Sheila Woods is a member of OJ’s Jury | Instance of a subclass of fibo-fnd-arr-arr:Collection | fibo-fnd-rel:isGenericMemberOf |
| A term for categorizing something | The book “Winter of our Discontent” has Winter as one of its tags | Instance of a subclass of fibo-fnd-arr-cls:Classifier | fibo-fnd-rel:isClassifiedBy |

As an aside: FIBO has a property called fibo-fnd-rel:isMemberOf, which is too limited for the second case because its intended use is narrower than what is required here. That is why I propose a new property called isGenericMemberOf, which is a superproperty of isMemberOf.

This note is about the third case. You could always create a class, e.g. the class of all books with the tag "winter". The question is when you should create a class, and when you should instead have an individual that denotes the category and put things in the category using isClassifiedBy.

Other examples of the third case are lists of options, such as the reason for a loan: education, car, house. One could create classes CarLoan, HouseLoan and EducationLoan. Or one could create a subclass of fibo-fnd-arr-cls:Classifier called LoanReason with instances (using an agreed naming convention) "_LoanReason_Car", "_LoanReason_House", and "_LoanReason_Education". There are other ways to classify loans, for example: ConformingLoan, NonConformingLoan, USDA_Loan, VA_Loan and FHA_Loan. We could create classes for these too. But then should we also create a class for every valid combination? The number of combinations multiplies with every additional way of classifying, so this can result in an exponential number of classes, which should generally be avoided.
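To make the growth concrete, here is a minimal Python sketch that counts both approaches. The grouping of the loan kinds into two facets, the facet name "LoanProgram", and the assumption that every combination is valid are all illustrative and not taken from FIBO.

```python
from itertools import product

# Hypothetical facets. "LoanProgram" and this grouping are assumptions made
# for illustration only; they are not FIBO names.
facets = {
    "LoanReason": ["Car", "House", "Education"],
    "LoanProgram": ["Conforming", "NonConforming", "USDA", "VA", "FHA"],
}

# Class-per-combination approach: one class for every combination of values,
# assuming (for the sake of counting) that every combination is valid.
combined_classes = [
    "".join(combo) + "Loan" for combo in product(*facets.values())
]

# Classifier approach: one individual per value, whatever the number of facets.
classifier_individuals = [
    f"_{facet}_{value}" for facet, values in facets.items() for value in values
]

print(len(combined_classes))        # product of the facet sizes
print(len(classifier_individuals))  # sum of the facet sizes
```

With these two facets the class-per-combination approach needs 15 classes, while the Classifier approach needs 8 individuals. Each further facet multiplies the first number but only adds to the second.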


The question is, when should you represent the type of loan as a Class and when is it better to use isClassifiedBy and Classifier? The more general question is: when you can think of a concept as a 'bucket' and you know what things go in the bucket vs. stay out, should that bucket be represented as a Class or as a Classifier?

Note that it is possible and sometimes necessary to have it both ways, unless you want to go into OWL Full.

In deciding, consider the following questions: 

  1. Is it natural to think of the bucket as a kind or type of thing or is it more a descriptive attribute? 
    1. Is USDA_Loan a kind of loan, or is just a descriptive attribute of a Loan?
    2. Is "winter" a kind of book or does it help describe the book?
  2. Do things that go into the bucket have different properties?
    1. Does a USDA Loan have different properties than a non USDA Loan?
    2. Does a CarLoan have different properties than a HouseLoan?
    3. Does a book about winter have different properties than other books?
  3. Will the set of buckets be governed by the same people that are governing the ontology?
    1. Does FIBO decide and keep track of what kinds of loans there are, or is that done by other people?
  4. Will you ever need to use the bucket as the subject or object of a triple?
    1. Will you ever have to use "USDA_Loan" or "CarLoan" as the subject or object of a Triple?
    2. Will you ever have to use "winter" as the subject or object of a triple?
  5. Will there be a large number (i.e. more than a handful, maybe dozens or hundreds) of possible buckets for a given Classifier?
    1. Is there a large number of reasons for getting a loan? (no, just a few)
    2. Is there a large number of tags? (yes, maybe thousands)

 

A Class is indicated under the following circumstances:

  1. Answer to Q1 is: kind or type
  2. Answer to Q2 is: yes, there are different properties that you care about for the purpose of the ontology
  3. Answer to Q3 is: yes, same people governing
  4. Answer to Q4 is: no, the bucket will not be subject or object in a triple
  5. Answer to Q5 is: no, just a few

A Classifier is indicated under the following circumstances:

  1. Answer to Q1 is descriptive attribute
  2. Answer to Q2 is no, properties are mostly the same
  3. Answer to Q3 is no, different people will be governing
  4. Answer to Q4 is yes, the bucket will be subject or object in a triple
  5. Answer to Q5 is: yes, lots of Buckets in a single Classifier.

 

Take an example in healthcare, the concept of a disease:

  1. you can think of it as a kind or type of condition
  2. there probably are different properties for different diseases, but they will not be needed for healthcare delivery; they would matter more in a scientific context studying diseases.
  3. the people building a healthcare delivery ontology will not be governing the set of diseases out there.
  4. a disease will probably be used as a code in a diagnosis field.
  5. there are (tens of?) thousands of diseases out there.

Four of the five criteria point to a Classifier here.

Of course there are gray areas, and some criteria are more important than others.

I recommend avoiding classes unless criteria 1 & 2 are favorable. It should be rare to have this and also a large number of classes to model out (Q5). 

If you are unsure, then start with Classifiers, since it keeps the class hierarchy tidy, and you can always go back and make a class if you need to.
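The five questions and the recommendation above can be sketched as a small checklist in Python. This is only an illustration: the names `BucketAnswers`, `criteria_favoring_class` and `recommend` are hypothetical, and treating criteria 1 and 2 as a hard gate is a simplification of the advice above, not a rule stated by FIBO.

```python
from dataclasses import dataclass


@dataclass
class BucketAnswers:
    """Answers to Q1..Q5 for one candidate bucket (hypothetical helper type)."""
    is_kind_or_type: bool           # Q1: kind/type (True) vs. descriptive attribute
    has_different_properties: bool  # Q2: different properties that matter here
    same_governance: bool           # Q3: governed by the ontology's own maintainers
    used_in_triples: bool           # Q4: bucket appears as subject/object of a triple
    many_buckets: bool              # Q5: more than a handful of buckets


def criteria_favoring_class(a: BucketAnswers) -> list:
    """Return the questions whose answers point to a Class."""
    votes = {
        "Q1": a.is_kind_or_type,
        "Q2": a.has_different_properties,
        "Q3": a.same_governance,
        "Q4": not a.used_in_triples,
        "Q5": not a.many_buckets,
    }
    return [question for question, favors_class in votes.items() if favors_class]


def recommend(a: BucketAnswers) -> str:
    """Simplified rule (an assumption): a Class only if Q1 and Q2 both favor it,
    otherwise start with a Classifier."""
    favoring = criteria_favoring_class(a)
    if "Q1" in favoring and "Q2" in favoring:
        return "Class"
    return "Classifier"


# The disease example: a kind of condition, but criteria 2 to 5 point the other way.
disease = BucketAnswers(
    is_kind_or_type=True,
    has_different_properties=False,
    same_governance=False,
    used_in_triples=True,
    many_buckets=True,
)
print(criteria_favoring_class(disease), recommend(disease))
```

Even when the result is "Class", the remaining criteria returned by `criteria_favoring_class` still deserve a human look, since gray areas are common.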

A Tidy Class Hierarchy & Maintaining Facets 

One disadvantage of having a lot of classes, where there are sets of classes, each corresponding to a different facet, is that the information in the facets is lost and the class hierarchy gets large. Compare:

On the left, you have a flat list of subclasses with no information about underlying facets. On the right, I manually added the colors to show which classes belong together. Using Classifiers, the hierarchy is much simpler, and each facet is very explicit and easy to see, understand and evolve.

The restrictions on LoanContract tell you what the facets are.

 

Then you can see the values of each facet:

 

All of this valuable domain information is hidden in the first approach. If there were just lots of classes, it would be harder to keep track of the facets. This makes the ontology harder to understand and thus to evolve. There is no place to go to add a new loan reason, say BoatLoanContract.   Perhaps there is a way to add it?
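As a rough illustration of how the Classifier approach keeps facets visible, the Python sketch below stores classifier individuals as plain triples and groups them by facet. The `loan:` prefix follows the usage elsewhere in this note, while `loan:LoanProgram`, its individuals and `loan:_LoanReason_Boat` are hypothetical names used only for this example.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Each classifier individual is typed by its facet, a subclass of Classifier.
# "loan:LoanProgram" and its individuals are hypothetical.
triples = [
    ("loan:_LoanReason_Car", RDF_TYPE, "loan:LoanReason"),
    ("loan:_LoanReason_House", RDF_TYPE, "loan:LoanReason"),
    ("loan:_LoanReason_Education", RDF_TYPE, "loan:LoanReason"),
    ("loan:_LoanProgram_USDA", RDF_TYPE, "loan:LoanProgram"),
    ("loan:_LoanProgram_VA", RDF_TYPE, "loan:LoanProgram"),
]


def facet_values(all_triples):
    """Group classifier individuals by the facet they are an instance of."""
    facets = defaultdict(list)
    for subject, predicate, obj in all_triples:
        if predicate == RDF_TYPE:
            facets[obj].append(subject)
    return dict(facets)


# Adding a new loan reason is one new individual in an obvious place.
triples.append(("loan:_LoanReason_Boat", RDF_TYPE, "loan:LoanReason"))

for facet, values in facet_values(triples).items():
    print(facet, values)
```

The facet a value belongs to is recorded in the data itself, so there is no need to infer it from naming patterns in a flat list of subclasses.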

How to do it both ways

Let's say we decided to model LoanReason as a Classifier, and later we realize we really want it to be a class.  There will be a subclass of Classifier called LoanReason, and instances as we said above. Each instance corresponds to a type of loan, but now we will also create a class to represent the same thing.

  • "_LoanReason_Car": new class called CarLoan
  • "_LoanReason_House": new class called HouseLoan
  • "_LoanReason_Education": new class called EducationLoan

We link the two in this way.  We make CarLoan equivalent to the restriction: [isClassifiedBy value loan:_LoanReason_Car] in Manchester Syntax. Of course, this is not ideal for two reasons.

  1. we are representing the same information in two different ways (mostly to avoid OWL Full)
  2. hasValue restrictions cause inference delays during ontology development
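The effect of the equivalence can be sketched in plain Python. This is a hand-rolled illustration of what a reasoner would conclude from an equivalentClass axiom with a hasValue restriction, not a real OWL reasoner. The `loan:` class names and the loan individuals `loan:_Loan_1` and `loan:_Loan_2` are hypothetical.

```python
RDF_TYPE = "rdf:type"
IS_CLASSIFIED_BY = "fibo-fnd-rel:isClassifiedBy"

# Class -> classifier individual, mirroring axioms of the form
# CarLoan EquivalentTo (isClassifiedBy value loan:_LoanReason_Car).
EQUIVALENCES = {
    "loan:CarLoan": "loan:_LoanReason_Car",
    "loan:HouseLoan": "loan:_LoanReason_House",
    "loan:EducationLoan": "loan:_LoanReason_Education",
}


def infer(triples):
    """Return the input triples plus those implied by the equivalences.

    Equivalence works in both directions: being classified by the individual
    implies membership in the class, and membership implies the classification.
    """
    inferred = set(triples)
    for subject, predicate, obj in triples:
        for cls, individual in EQUIVALENCES.items():
            if predicate == IS_CLASSIFIED_BY and obj == individual:
                inferred.add((subject, RDF_TYPE, cls))
            if predicate == RDF_TYPE and obj == cls:
                inferred.add((subject, IS_CLASSIFIED_BY, individual))
    return inferred


facts = {
    ("loan:_Loan_1", IS_CLASSIFIED_BY, "loan:_LoanReason_Car"),
    ("loan:_Loan_2", RDF_TYPE, "loan:HouseLoan"),
}
closure = infer(facts)
for triple in sorted(closure):
    print(triple)
```

Whichever way a loan was originally described, both representations end up available, which is exactly the duplication listed as the first drawback above.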


Ideally, you don't have to do it both ways, and you can deprecate the URIs used for the other way.