At the Connected Data London (CDL) conference I attended
last week, ontologies were humorously referred to as the “O” word. The thought
was that, until recently, experts preferred not to mention “ontology,” lest
they alienate their audience, customers, or stakeholders. The word comes across
as too technical. It is a term from philosophy, after all, and it does not help
that it sounds very similar to “oncology” (as “taxonomy” has been confused with
“taxidermy”). The term “knowledge graph,” on the other hand, is more user-friendly,
and even if it is not perfectly understood, its general meaning can be guessed.
Thus, people would refer to knowledge graphs regardless of whether they meant a
knowledge graph or an ontology.
At the conference, however, a recurring theme was the
growing acceptance of the word “ontology,” not just among experts but also
among the varied stakeholders who need to implement ontologies. This was noted by several
conference speakers, especially in the wrap-up panel session for the Data
Modeling track, which was titled “The ‘O’ Word: How Ontologies Drive Interoperable
Data and Business Innovation.” The panel moderator, Katariina Kari, attributed
this recent shift to LLMs, explaining: “We need a
reliable natural language repository. LLMs work on a network of mimicking
language; LLMs are primed for language.” So now, she observed, use of the word “ontology” can
even help a startup get funding from venture capitalists.
However, there remains some confusion over what an ontology is.
At one end there is the difference between ontologies and taxonomies, and at
the other end the difference between ontologies and knowledge graphs. I clarified
the distinction between taxonomies and ontologies in a prior blog post, “Taxonomies vs. Ontologies” (January 2023). While knowledge graphs are a relatively
new concept, and ontologies have existed for much longer, it is the varied understanding
of ontologies that has given rise to confusion.
An ontology is defined as a model of a domain of knowledge,
which comprises classes (sets of things), attributes (types of characteristics
of things) and relationships between classes. According to this definition, an ontology
is a somewhat generic model of a domain, and it does not include all of the individual
members or instances of each class (such as the names of individual companies
in the class called Company) nor the specific attributes of each attribute type
(such as the address of each specific company for the attribute type called
Address).
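To make this definition concrete, here is a minimal Python sketch of an ontology as a pure model: classes, attribute types, and relationships, with no individuals. The `Ontology` and `Relationship` names and the Company/Person example are illustrative assumptions, not part of OWL or any tool's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class Relationship:
    """A relationship type between two classes (hypothetical structure)."""
    name: str
    domain: str  # class the relationship starts from
    range: str   # class the relationship points to


@dataclass
class Ontology:
    """The generic model of a domain: no individuals, no attribute values."""
    classes: Set[str] = field(default_factory=set)
    # Maps each class name to the attribute types defined for it.
    attribute_types: Dict[str, Set[str]] = field(default_factory=dict)
    relationships: List[Relationship] = field(default_factory=list)


ontology = Ontology(
    classes={"Company", "Person"},
    attribute_types={"Company": {"Address"}, "Person": {"Name"}},
    relationships=[Relationship("employs", "Company", "Person")],
)

# The model names the attribute type "Address" for the class "Company",
# but it holds no specific company and no specific address.
print(sorted(ontology.classes))
```

Note that nothing in this structure can hold the name of an actual company; that is deliberate, and it mirrors the definition above.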
However, the W3C recommendation for ontologies, OWL (Web
Ontology Language), includes the designation “individuals,” and ontology software
tools, such as Protégé, support the inclusion of individuals and their specific
attributes. Thus, it is easy to think that an ontology, by definition, includes
all specific individuals. But the fact that OWL specifies
how to include instances of a class, and that software supports their inclusion,
does not necessarily mean that the instances or individuals are
actually a component of an ontology. The ontology experts on this CDL
conference panel confirmed that an ontology is the upper-level semantic model.
Then, what do we call an ontology plus all of the individual
members (instances) of classes and their specific attributes? That is essentially
what a knowledge graph is. This is especially true when individuals are
specific to an organization or enterprise, such as names of individual
customers, products, employees, etc., in which case we call it an “enterprise knowledge
graph.”
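The idea that a knowledge graph is an ontology plus its individuals can be sketched in a few lines of Python. This is an illustration only, under assumed names: the dictionary layout, the `build_knowledge_graph` helper, and the sample company and customer are hypothetical and do not reflect any particular graph database or OWL tooling.

```python
# Ontology: the generic model (classes, attribute types, relationships).
ONTOLOGY = {
    "classes": {"Company", "Customer"},
    "attribute_types": {"Company": {"address"}, "Customer": {"name"}},
    "relationships": {("Company", "serves", "Customer")},
}

# Individuals: made-up instances of classes with specific attribute values.
INDIVIDUALS = {
    "acme": {"class": "Company", "attributes": {"address": "1 Example Street"}},
    "jane": {"class": "Customer", "attributes": {"name": "Jane Doe"}},
}

# Specific links between individuals, each an instance of a relationship.
LINKS = [("acme", "serves", "jane")]


def build_knowledge_graph(ontology, individuals, links):
    """Combine the model with its instances, checking that they fit the model."""
    for ident, individual in individuals.items():
        cls = individual["class"]
        if cls not in ontology["classes"]:
            raise ValueError(f"{ident}: unknown class {cls!r}")
        allowed = ontology["attribute_types"].get(cls, set())
        unknown = set(individual["attributes"]) - allowed
        if unknown:
            raise ValueError(f"{ident}: unknown attribute types {sorted(unknown)}")
    for subject, relation, obj in links:
        signature = (individuals[subject]["class"], relation, individuals[obj]["class"])
        if signature not in ontology["relationships"]:
            raise ValueError(f"link {subject}-{relation}-{obj} is not in the model")
    return {"ontology": ontology, "individuals": individuals, "links": links}


knowledge_graph = build_knowledge_graph(ONTOLOGY, INDIVIDUALS, LINKS)
print(len(knowledge_graph["individuals"]), "individuals conform to the ontology")
```

The ontology stays unchanged however many individuals are added; only the instance data grows, which is why enterprise knowledge graphs can become very large while their ontologies remain compact.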
The first applications of ontologies in information/data
science were in biomedicine, in which individuals included such things as names of
organisms (including bacteria and viruses), chemicals, etc. Thus, the notion
of an individual in science is not quite the same as in business, which has
also been a source of confusion over what an individual is and the inclusion of
individuals in an ontology. In enterprise knowledge graphs, the instances can
be very numerous and specific, including individual “events,” such as
interactions or transactions.
In conclusion, an ontology is a defining feature and
component of a knowledge graph, but it is not all of what goes into a knowledge
graph. A knowledge graph also includes individuals, which may be named-entity
instances or specific taxonomy concepts (abstract things that are not unique
named entities, such as “Data ethics” or “Performance measurement”), along with the specific
attributes of those individuals. It may be said that a knowledge graph is the instantiation
of an ontology, and an ontology is the knowledge model. Despite the growing acceptance of the word “ontology,”
the use of “knowledge graph” to refer to what are essentially
ontologies continues, as I found in another session at this conference.