There are various ways to put things into buckets. The table below gives examples and is drawn from a blog article summarizing them: Buckets, Buckets Everywhere, Who Knows What to Think?
Kind of Bucket | Example | Representing the Bucket | Putting something in the Bucket |
Individual of a Kind | John Doe is a Person | Instance of owl:Class | rdf:type |
A bucket with functionally connected things inside | Sheila Woods is a member of OJ’s Jury | Instance of a subclass of fibo-fnd-arr-arr:Collection | fibo-fnd-rel:isGenericMemberOf |
A term for categorizing something | The book “Winter of our Discontent” has Winter as one of its tags | Instance of a subclass of fibo-fnd-arr-cls:Classifier | fibo-fnd-rel:isClassifiedBy |
As an aside: FIBO has a property called fibo-fnd-rel:isMemberOf, but it is too limited for the second case because its intended use is narrower than what is required here. That is why I propose a new property called isGenericMemberOf, which is a superproperty of isMemberOf.
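As a rough illustration, the three rows of the table can be written out as plain (subject, predicate, object) triples. This is only a sketch in Python using tuples of prefixed names: every `ex:` name is a hypothetical placeholder, `contents` is a helper invented for this example, and isGenericMemberOf is the property proposed in this note rather than an existing FIBO property.

```python
# Each statement is a (subject, predicate, object) triple of prefixed names.
# All "ex:" names are hypothetical; the predicates are the ones in the table.
triples = [
    # 1. Individual of a kind: the bucket is an owl:Class.
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:JohnDoe", "rdf:type", "ex:Person"),
    # 2. Functionally connected things: the bucket is an instance of a
    #    subclass of Collection (isGenericMemberOf is the proposed property).
    ("ex:Jury", "rdfs:subClassOf", "fibo-fnd-arr-arr:Collection"),
    ("ex:_Jury_OJ", "rdf:type", "ex:Jury"),
    ("ex:SheilaWoods", "fibo-fnd-rel:isGenericMemberOf", "ex:_Jury_OJ"),
    # 3. A categorizing term: the bucket is an instance of a subclass of Classifier.
    ("ex:Tag", "rdfs:subClassOf", "fibo-fnd-arr-cls:Classifier"),
    ("ex:_Tag_Winter", "rdf:type", "ex:Tag"),
    ("ex:WinterOfOurDiscontent", "fibo-fnd-rel:isClassifiedBy", "ex:_Tag_Winter"),
]


def contents(bucket, membership_predicate):
    """Return the subjects placed in `bucket` via `membership_predicate`."""
    return [s for s, p, o in triples if p == membership_predicate and o == bucket]


print(contents("ex:Person", "rdf:type"))
print(contents("ex:_Jury_OJ", "fibo-fnd-rel:isGenericMemberOf"))
print(contents("ex:_Tag_Winter", "fibo-fnd-rel:isClassifiedBy"))
```

Note that only in the first case is the bucket itself a class; in the other two the bucket is an individual, and a different predicate does the work that rdf:type does in the first.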
This note is about the third case. You could always create a class, e.g. the class of all books with the tag "winter". The question is when you should create a class, and when you should instead have an individual that denotes the category and put things in the category using isClassifiedBy.
Other examples of the third case are lists of options, such as the reason for a loan: education, car, house. One could create classes CarLoan, HouseLoan and EducationLoan. Or one could create a subclass of fibo-fnd-arr-cls:Classifier called LoanReason with instances (using an agreed naming convention) "_LoanReason_Car", "_LoanReason_House" and "_LoanReason_Education". There are other ways to classify loans, for example: ConformingLoan, NonConformingLoan, USDA_Loan, VA_Loan and FHA_Loan. We could create classes for these too. But then should we also create a class for every valid combination? This can result in an exponential number of classes, which should generally be avoided.
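To make the growth concrete, here is a small Python sketch that counts both options. It assumes, purely for illustration, that the second list of loan types forms a single independent dimension (called `LoanCategory` here, a hypothetical name) and that every combination is valid, so the class count is an upper bound. The combined class names are invented.

```python
from itertools import product
from math import prod

# Hypothetical classification dimensions built from the examples in the text.
# Treating the second list as one dimension named "LoanCategory" is an assumption.
dimensions = {
    "LoanReason": ["Car", "House", "Education"],
    "LoanCategory": ["Conforming", "NonConforming", "USDA", "VA", "FHA"],
}


def counts(dims):
    """Return (classes needed for every combination, classifier instances needed)."""
    n_classes = prod(len(values) for values in dims.values())
    n_instances = sum(len(values) for values in dims.values())
    return n_classes, n_instances


# Option A: one class per combination (upper bound: all combinations valid).
combination_classes = [
    "".join(combo) + "Loan" for combo in product(*dimensions.values())
]

# Option B: one Classifier instance per value, named by the agreed convention.
classifier_instances = [
    f"_{dim}_{value}" for dim, values in dimensions.items() for value in values
]

print(counts(dimensions))
print(combination_classes[:3])
print(classifier_instances[:3])
```

The class count is a product over the dimensions while the instance count is a sum, so each additional dimension multiplies the former and only adds to the latter.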
The question is, when should you represent the type of loan as a Class, and when is it better to use isClassifiedBy and a Classifier? More generally: when you can think of a concept as a 'bucket' and you know which things go in the bucket and which stay out, should that bucket be represented as a Class or as a Classifier?
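Note that it is possible, and sometimes necessary, to have it both ways. Doing so means representing the bucket twice, once as a Class and once as a Classifier instance, which is the price of staying out of OWL Full.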
In deciding, consider the following questions:
1. Is it natural to think of the bucket as a kind or type of thing, or is it more of a descriptive attribute?
   - Is USDA_Loan a kind of loan, or is it just a descriptive attribute of a Loan?
   - Is "winter" a kind of book, or does it help describe the book?
2. Do things that go into the bucket have different properties?
   - Does a USDA Loan have different properties than a non-USDA Loan?
   - Does a CarLoan have different properties than a HouseLoan?
   - Does a book about winter have different properties than other books?
3. Will the set of buckets be governed by the same people who are governing the ontology?
   - Does FIBO decide and keep track of what kinds of loans there are, or is that done by other people?
4. Will you ever need to use the bucket as the subject or object of a triple?
   - Will you ever have to use "USDA_Loan" or "CarLoan" as the subject or object of a triple?
   - Will you ever have to use "winter" as the subject or object of a triple?
5. Will there be a large number (more than a handful, maybe dozens or hundreds) of possible buckets for a given Classifier?
   - Is there a large number of reasons for getting a loan? No, just a few.
   - Is there a large number of tags? Yes, maybe thousands.
A Class is indicated under the following circumstances:
- Answer to Q1 is kind or type
- Answer to Q2 is yes, there are different properties that you care about for the purpose of the ontology
- Answer to Q3 is yes, same people governing
- Answer to Q4 is no, the bucket will not be subject or object in a triple
- Answer to Q5 is: no, just a few
A Classifier is indicated under the following circumstances:
- Answer to Q1 is descriptive attribute
- Answer to Q2 is no, properties are mostly the same
- Answer to Q3 is no, different people will be governing
- Answer to Q4 is yes, the bucket will be subject or object in a triple
- Answer to Q5 is: yes, lots of Buckets in a single Classifier.
Take an example in healthcare: should a disease be modeled as a Class or as a Classifier?
- you can think of a disease as a kind or type of condition
- there probably are different properties for different diseases, but that will not be needed for healthcare delivery, it would matter more in a scientific context studying diseases.
- the people building a healthcare delivery ontology will not be governing the set of diseases out there.
- a disease will probably be used as a code in a diagnosis field.
- there are (tens of?) thousands of diseases out there.
Only the first answer points to a Class; the other four point to a Classifier.
Of course there are gray areas, and some criteria are more important than others.
I recommend avoiding classes unless criteria 1 & 2 are favorable. It should be rare for both to be favorable and still have a large number of classes to model out (Q5).
If you are unsure, then start with Classifiers, since they keep the class hierarchy tidy, and you can always go back and make a class if you need to.
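One possible reading of this guidance can be sketched as a small Python helper. This is an illustration, not a definitive rule: the names `BucketAnswers`, `class_indications` and `recommend` are hypothetical, the tally weights all five questions equally even though some criteria matter more than others, and the gate on criteria 1 and 2 simply encodes the recommendation above.

```python
from dataclasses import dataclass


@dataclass
class BucketAnswers:
    """Answers to the five questions for one candidate bucket (hypothetical type)."""
    is_kind_of_thing: bool         # Q1: kind or type (True) vs. descriptive attribute (False)
    has_distinct_properties: bool  # Q2: different properties that the ontology cares about
    same_governance: bool          # Q3: buckets governed by the ontology's own governors
    used_in_triples: bool          # Q4: bucket used as subject or object of a triple
    many_buckets: bool             # Q5: many buckets in a single Classifier


def class_indications(a):
    """Count how many of the five answers point toward a Class."""
    return sum([
        a.is_kind_of_thing,
        a.has_distinct_properties,
        a.same_governance,
        not a.used_in_triples,
        not a.many_buckets,
    ])


def recommend(a):
    """Gate on criteria 1 and 2; otherwise default to a Classifier."""
    if a.is_kind_of_thing and a.has_distinct_properties:
        return "Class"
    return "Classifier"


# The healthcare example: a disease in a healthcare delivery ontology.
disease = BucketAnswers(
    is_kind_of_thing=True,
    has_distinct_properties=False,  # not needed for healthcare delivery
    same_governance=False,
    used_in_triples=True,           # used as a code in a diagnosis field
    many_buckets=True,
)

print(class_indications(disease), recommend(disease))
```

The tally is there only to show how lopsided a case is; the recommendation itself depends on the first two answers, with Classifier as the default when in doubt.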
How to do it both ways
Let's say we decided to model LoanReason as a Classifier, and later we realize we really want it to be a class. There will be a subclass of Classifier called LoanReason, and instances as we said above. Each instance corresponds to a type of loan, but now we will also create a class to represent the same thing.
- "_LoanReason_Car": new class called CarLoan
- "_LoanReason_House": new class called HouseLoan
- "_LoanReason_Education": new class called EducationLoan
We link the two as follows: we make CarLoan equivalent to the restriction [isClassifiedBy value loan:_LoanReason_Car] (in Manchester Syntax), and do the same for the other two classes. Of course, this is not ideal, for two reasons.
- we are representing the same information in two different ways (mostly to avoid OWL Full)
- hasValue restrictions cause inference delays during ontology development
Ideally, you don't have to do it both ways, and you can deprecate the URIs used for the other way.
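To show what the equivalence buys you, here is a minimal Python sketch of the inferences a reasoner would draw from it, using plain tuples instead of an OWL reasoner. The `loan:` prefix on the class names, the individuals `loan:loan1` and `loan:loan2`, and the `infer` helper are all assumptions made for this example.

```python
IS_CLASSIFIED_BY = "fibo-fnd-rel:isClassifiedBy"
RDF_TYPE = "rdf:type"

# Class -> classifier individual, mirroring axioms of the form
# CarLoan EquivalentTo [isClassifiedBy value loan:_LoanReason_Car].
equivalences = {
    "loan:CarLoan": "loan:_LoanReason_Car",
    "loan:HouseLoan": "loan:_LoanReason_House",
    "loan:EducationLoan": "loan:_LoanReason_Education",
}


def infer(triples):
    """Return the input triples plus those implied by the equivalences.

    One pass is enough here: a derived triple can only lead back to the
    triple it was derived from.
    """
    inferred = set(triples)
    for cls, classifier in equivalences.items():
        for s, p, o in triples:
            if p == IS_CLASSIFIED_BY and o == classifier:
                inferred.add((s, RDF_TYPE, cls))                # classifier => class
            if p == RDF_TYPE and o == cls:
                inferred.add((s, IS_CLASSIFIED_BY, classifier))  # class => classifier
    return inferred


# Hypothetical data: one loan asserted each way.
asserted = [
    ("loan:loan1", IS_CLASSIFIED_BY, "loan:_LoanReason_Car"),
    ("loan:loan2", RDF_TYPE, "loan:HouseLoan"),
]

for triple in sorted(infer(asserted)):
    print(triple)
```

Whichever way a loan is asserted, the other representation follows, which is exactly the duplication described in the first drawback.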