The question often comes up: how are
taxonomies and ontologies different? While there are some short simple answers
(such as: taxonomies are hierarchies, and ontologies are semantic networks), it
is understandable that the distinction is not that clear. There is considerable
overlap. Ontologies may contain taxonomies, and taxonomies can be semantically
enriched to become ontology-like. The same software tools, for example PoolParty, support the creation
of both.
One of the trends in
data/information/knowledge management is the convergence of systems, methods,
and technologies, including the convergence of taxonomies and ontologies. It’s
gotten to the point that some people will refer to taxonomies and ontologies
almost interchangeably, as if they are essentially the same thing. They are
not, although they are increasingly combined. It’s interesting that one of the most active
discussion channels within the Taxonomy Talk community on Discord is on
ontologies.
Taxonomy vs. Ontology (https://graphviews.poolparty.biz/GraphViews)
Uses
Although both taxonomies and
ontologies are kinds of knowledge organization systems, which support access to
information, their specific uses tend to differ. The primary use of information
taxonomies is for consistent tagging and accurate and comprehensive retrieval
of content items. These could be documents, components (sections) of
documents, web or intranet pages, or digital assets (image, audio, video files,
etc.). Ontologies, which include or link to instances/individuals and
their various attributes, are more focused on the specifics of data:
data retrieval, data comparison, and data analysis.
Taxonomies are primarily for what a content item is about (although
content/document types may also be part of a taxonomy), as in “get me all the
information resources about…,” or “get me a list of products with…” and specifying
a set of features and price range as filters. Ontologies, on the other hand, can
support more complex, multistep queries, such as “get me a list of products
with…” a set of features and price range, whose vendors are located in Canada and
have a minimum annual revenue of CAD $50 million.
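The difference between the two kinds of query can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the product names, vendors, revenue figures, and the relation names soldBy and locatedIn are all invented, and plain dictionaries and tuples stand in for a real tagging system and graph database.

```python
# Hypothetical sample data; every name and figure here is invented.

# Taxonomy-style data: content items tagged with taxonomy concepts.
tagged_products = {
    "Trail Tent 2": {"tags": {"Tents", "Waterproof"}, "price": 180},
    "Summit Tent 4": {"tags": {"Tents", "Waterproof"}, "price": 420},
    "Camp Stove": {"tags": {"Stoves"}, "price": 60},
}

# Ontology-style data: relations between individuals of different classes.
triples = [
    ("Trail Tent 2", "soldBy", "Northwind Gear"),
    ("Summit Tent 4", "soldBy", "Alpine Supply"),
    ("Camp Stove", "soldBy", "Northwind Gear"),
    ("Northwind Gear", "locatedIn", "Canada"),
    ("Alpine Supply", "locatedIn", "Norway"),
]
vendor_revenue_cad = {"Northwind Gear": 75_000_000, "Alpine Supply": 90_000_000}


def filter_by_tags(items, required_tags, max_price):
    """Taxonomy-style filter: items with all required tags within the price limit."""
    return sorted(
        name
        for name, item in items.items()
        if required_tags <= item["tags"] and item["price"] <= max_price
    )


def objects(subject, predicate):
    """Return everything the subject points to through the given relation."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def multistep_query(items, required_tags, max_price, country, min_revenue):
    """Ontology-style query: apply the filter, then follow product -> vendor -> country."""
    results = set()
    for name in filter_by_tags(items, required_tags, max_price):
        for vendor in objects(name, "soldBy"):
            in_country = country in objects(vendor, "locatedIn")
            big_enough = vendor_revenue_cad.get(vendor, 0) >= min_revenue
            if in_country and big_enough:
                results.add(name)
    return sorted(results)
```

Calling filter_by_tags with the tags Tents and Waterproof and a price limit of 500 returns both tents. Calling multistep_query with the additional conditions of Canada and CAD $50 million narrows the result to the one tent whose vendor meets both.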
In comparing retrieval of content and data,
for example, taxonomies can retrieve a spreadsheet file, whereas ontologies can
retrieve data from individual cells in the spreadsheet. Ontologies can traverse
data in a database. While this could be a relational database, increasingly ontologies
are used with graph databases, since ontologies are also structured as graphs.
Origins
Another major difference between taxonomies
and ontologies is their origins. Information taxonomies (not biological
taxonomies) originated in the discipline of library science. Specifically, I would
say that taxonomies have evolved as a kind of flexible hybrid of classification
systems and thesauri. Ontologies, on the other hand, (when not in philosophy)
tend to be taught and researched as a part of computer science. Again, there
has also been convergence of library science and computer science in the field
of information science. Nevertheless, library/information science and
computer/information science are different approaches.
Taxonomies have also become an area of
interest in information architecture, user experience design, content
management, and digital asset management. Taxonomies are also related to
terminology management and information search and retrieval. Ontologies, on the
other hand, have become an area of interest in data science, data engineering,
and graph data management. Ontologies also borrow concepts from set theory in
mathematics and logic from philosophy.
Taxonomies and ontologies follow different
standards, but the standards have also converged in a way. Taxonomies have no
standard of their own but follow the thesaurus standards (ANSI/NISO
Z39.19 and ISO 25964) for recommended best practices. Ontologies are based on
the W3C standards RDF, RDF Schema (RDF-S), and the formal language OWL (Web Ontology Language). The W3C then
published a recommendation for taxonomies, thesauri, and other knowledge
organization systems called SKOS (Simple Knowledge Organization System) in 2009,
and since then it has become widely adopted. SKOS is based on RDF, as is the
ontology standard RDF-S. As a result, SKOS and RDF-S statements or namespaces can be combined
in the same knowledge organization system, and taxonomies and ontologies can
thus be combined.
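As a rough illustration of how SKOS and RDF-S statements can sit side by side, the following Python sketch represents a small graph as plain tuples. The RDF, RDF-S, and SKOS namespace URIs are the ones published by the W3C. The example.org namespace, the resources in it, and the helper function predicate_vocabularies are hypothetical. A real project would more likely use an RDF library or a taxonomy/ontology tool than hand-built tuples.

```python
# Namespace URIs published by the W3C.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
# Hypothetical namespace for the example data.
EX = "http://example.org/"

# Each statement is a (subject, predicate, object) triple.
graph = [
    # Taxonomy side: a SKOS concept with a label and a broader concept.
    (EX + "Tents", RDF + "type", SKOS + "Concept"),
    (EX + "Tents", SKOS + "prefLabel", "Tents"),
    (EX + "Tents", SKOS + "broader", EX + "CampingEquipment"),
    # Ontology side: an RDF-S class declared as a subclass of skos:Concept.
    (EX + "Product", RDF + "type", RDFS + "Class"),
    (EX + "Product", RDFS + "subClassOf", SKOS + "Concept"),
    (EX + "Product", RDFS + "label", "Product"),
    # One resource described with both vocabularies.
    (EX + "TrailTent2", RDF + "type", EX + "Product"),
    (EX + "TrailTent2", SKOS + "prefLabel", "Trail Tent 2"),
    (EX + "TrailTent2", SKOS + "broader", EX + "Tents"),
]

PREFIXES = {"rdf": RDF, "rdfs": RDFS, "skos": SKOS, "ex": EX}


def predicate_vocabularies(triples):
    """Return the prefixes of the vocabularies that supply the predicates."""
    found = set()
    for _, predicate, _ in triples:
        for prefix, namespace in PREFIXES.items():
            if predicate.startswith(namespace):
                found.add(prefix)
    return found
```

Running predicate_vocabularies(graph) shows that the predicates of this one graph come from RDF, RDF-S, and SKOS together, which is possible because all three share the same underlying triple model.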
Features
Both taxonomies and ontologies aim to
describe a knowledge domain with collections of entities structured into groups
or types, with relationships between them. Ontologies go further in describing
the relationships in more detail. Attributes are also more extensive in
ontologies. Both support the options for notes or definitions.
Concepts or Entities
Taxonomies consist of concepts (sometimes
called terms), which are things. Concepts can be generic or specific and may even
include named entities (unique proper nouns). Taxonomies do not differentiate
between generic concepts and named entities, which correspond to “individuals”
in an ontology. Ontologies, on the other hand, distinguish between two types of
entities: classes and individuals. Classes can be broad or specific, but, as
the name implies, they are intended to contain something, either subclasses or
individuals. By contrast, leaf nodes (the narrowest concepts in a hierarchy) in
a taxonomy could actually be quite broad in meaning.
Individuals, as defined by an ontology, tend
to be named entities (proper nouns), and they should be uniquely individual.
This may not be obvious. A brand name product is a proper noun, but technically
it is not an individual, because there are numerous specific instances of the
product owned by different people. There may be some differences of opinion on
how to define individuals.
Relationships
Taxonomies follow thesaurus standards for relationships.
Thesaurus hierarchical relationships comprise three types: generic-specific or “is
a” kind of relationship, generic-instance (where the instance is a named entity
or proper noun), and whole-part. Ontologies have only generic-specific “is a”
hierarchical relationships, which are between classes and subclasses. The
relationship between an individual and a class is not considered hierarchical
in an ontology but rather a class-member relationship. Also, the
whole-part relationship is not considered hierarchical in ontologies (but could be created as a semantic relationship).
While generic-instance is a permitted
hierarchical relationship type in a taxonomy, named entity concepts (proper
nouns) are not so often narrower to a corresponding generic concept, but rather
tend to be grouped in their own separate concept scheme to serve as a separate search
facet or filter.
A generic associative (“related”)
relationship may exist in taxonomies, although it is more of a feature of
thesauri. It is bidirectional and reciprocal, and it tends to be used between
concepts within the same concept scheme, which often corresponds to a class in
an ontology. Ontologies do not have a generic associative relationship. Instead,
ontologies have semantic relations which are designated by the ontology creator,
just as the classes are designated, and they are not used within classes but
across a specified pair of classes. Suggestions of what might be of related
interest to the end user are not within the scope of an ontology’s purpose, which
is more structured and based on rules. Ontologies may have other bidirectional
reciprocal relationships, such as “goes with,” “has sibling,” “accompanies,”
etc.
Equivalency and alternative labels
In a taxonomy, each concept has a single
preferred label in each language for display and any number of alternative labels
and hidden labels per language to help match on searching or tagging. In the
traditional thesaurus model, “nonpreferred” terms redirect to “preferred”
terms. The alternative labels are sufficiently equivalent in the context of the
taxonomy and content to be used for a given concept, and thus might not be
exact synonyms. Alternative labels include synonyms, near synonyms, and possibly
even narrower terms not deemed needed as concepts with preferred labels.
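A minimal Python sketch of how alternative and hidden labels support matching might look like the following. The concept identifiers, the labels, and both function names are made up for illustration, and the sketch assumes a single language and case-insensitive matching, which a given taxonomy tool may or may not do.

```python
# Hypothetical concepts: one preferred label each, plus alternative and hidden labels.
concepts = {
    "c1": {"pref": "Automobiles", "alt": ["Cars", "Motor cars"], "hidden": ["Automobles"]},
    "c2": {"pref": "Bicycles", "alt": ["Bikes"], "hidden": []},
}


def build_label_index(concepts):
    """Map every label, in lowercase form, to the ID of its concept."""
    index = {}
    for concept_id, concept in concepts.items():
        for label in [concept["pref"], *concept["alt"], *concept["hidden"]]:
            index[label.casefold()] = concept_id
    return index


def preferred_label_for(text, concepts, index):
    """Return the preferred label of the concept matching the text, or None."""
    concept_id = index.get(text.strip().casefold())
    if concept_id is None:
        return None
    return concepts[concept_id]["pref"]
```

With this index, a search on "cars" or on the misspelling "Automobles" both resolve to the concept displayed as "Automobiles", while a term that is not in the taxonomy returns nothing.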
In ontologies, the OWL element sameAs is intended for
equivalency of individuals, and equivalentClass is for the equivalency of
classes, and they mean exact equivalence. But there is no designation of one name
being preferred and the other alternative. They all are preferred. The OWL elements
sameAs and equivalentClass are not intended for use within a single ontology,
but rather across different ontologies. So, those OWL elements are similar to the
SKOS exactMatch relationship, which is used across concept schemes or taxonomies.
They do not support search within the same data set as alternative labels do.
Enforcement of rules
SKOS is a data model for taxonomies and
thesauri, but it does not specify any rules for usage. Rather, the taxonomy
creator should attempt to follow the guidelines, not exactly rules, in the thesaurus
standards (ANSI/NISO Z39.19 and ISO 25964-1). The quality standards include disjoint
labels (a label can be used only once for a concept, preferred or alternative,
and for only one concept), single relationships (a pair of concepts may have hierarchical
or associative relationships between them, but not both), and no hierarchical
cycles. The standard for ontologies, on the other hand, OWL, has many rules
built into it. This makes OWL ontologies more powerful by supporting inferencing and
reasoning.
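Because these quality guidelines are not enforced by the SKOS data model itself, a taxonomy tool or script has to check them. The three checks named above could be sketched in Python as follows. The function names and data shapes are assumptions made for this illustration and are not taken from any standard or product.

```python
from collections import defaultdict


def find_label_clashes(labels_by_concept):
    """Return labels (compared case-insensitively) that are used more than once."""
    seen = defaultdict(list)
    for concept_id, labels in labels_by_concept.items():
        for label in labels:
            seen[label.casefold()].append(concept_id)
    return {label: ids for label, ids in seen.items() if len(ids) > 1}


def find_double_relations(broader, related):
    """Return concept pairs that are linked both hierarchically and associatively.

    broader is a set of (narrower, broader) tuples; related is a set of frozensets.
    """
    hierarchical_pairs = {frozenset(pair) for pair in broader}
    return hierarchical_pairs & related


def has_hierarchical_cycle(broader):
    """Return True if following broader links ever leads back to the start."""
    parents = defaultdict(set)
    for narrower, wider in broader:
        parents[narrower].add(wider)
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a concept that is still on the current path
        visiting.add(node)
        found = any(visit(parent) for parent in parents.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(parents))
```

An OWL reasoner works differently from these after-the-fact checks, since the rules are part of the language and are used to infer new statements as well as to detect inconsistencies.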
Conclusions
Taxonomies and ontologies share some features,
but each has its own additional features. Thus, a combination of a SKOS
taxonomy with an OWL ontology combines the features of both. Furthermore, the
combination of a taxonomy with an ontology also enables a combination of uses,
namely the search and retrieval of both content and data together. Rather than
converging, taxonomies and ontologies are carefully and deliberately
combined to maximize their benefits.