A number of years ago I worked on a project to clean up a
large taxonomy of occupations and job titles. My client contact sometimes
confused terms to be used as synonyms/variants of a preferred term with
terms to be used as narrower terms of a preferred term. This initially surprised me,
because the difference seemed so obvious. A more recent project raised the
issue again, and I now realize that the distinction poses real challenges.
The word “term” can be confusing, considering the different
types of terms that exist. Both variant terms (also called synonyms,
nonpreferred terms, or entry terms) and narrower terms are kinds of terms. By
contrast, if you focus on concepts that may have various labels, the
distinction between a concept’s narrower concepts and its alternative labels
is quite clear. The widely adopted SKOS (Simple Knowledge Organization System)
data model standard follows the concept-based approach. SKOS is now followed by
all dedicated taxonomy management software systems.
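The concept-based view can be sketched in a few lines of Python. This is only an illustration, not SKOS itself: the class name `Concept`, its fields, and the occupations example are assumptions, with the fields loosely named after the SKOS properties skos:prefLabel, skos:altLabel, and skos:narrower.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Illustrative stand-in for a SKOS-style concept (not a real SKOS API)."""
    pref_label: str                                          # like skos:prefLabel
    alt_labels: list[str] = field(default_factory=list)      # like skos:altLabel
    narrower: list["Concept"] = field(default_factory=list)  # like skos:narrower


# Invented occupations example: "Doctor" is another name for the same
# concept, while "Cardiologist" is a separate, more specific concept.
cardiologist = Concept("Cardiologist")
physician = Concept("Physician", alt_labels=["Doctor"], narrower=[cardiologist])

print(physician.alt_labels)                        # labels of the concept itself
print([c.pref_label for c in physician.narrower])  # its narrower concepts
```

Because labels and narrower concepts live in different fields, there is no way to mistake one for the other, which is the clarity the concept-based approach offers.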
Many taxonomies, however, are not yet managed in dedicated
taxonomy management systems but rather in spreadsheets or internally developed
tools, neither of which follows SKOS. This was the case in both of the projects in
question. Each “term” in the spreadsheet-based tool had its own row, which resulted
in multiple rows for the same concept. Broader categories were in another column
to the right. This format was potentially confusing because the variants
appeared in one column and the hierarchical levels in another, and you had to remember
which column was which.
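To make the row-per-term problem concrete, here is a minimal sketch that folds such rows back into one record per concept. The column names (term, type, preferred, broader) and the sample rows are assumptions made up for illustration, not the layout of the actual spreadsheets.

```python
import csv
import io

# Hypothetical export: one row per term, so a concept with variants spans
# several rows. Column names and values are invented for illustration.
SHEET = """term,type,preferred,broader
Physician,preferred,Physician,Health occupations
Doctor,variant,Physician,Health occupations
Medical doctor,variant,Physician,Health occupations
Cardiologist,preferred,Cardiologist,Physician
"""


def rows_to_concepts(text):
    """Group term rows into {preferred label: {"alt_labels", "broader"}}."""
    concepts = {}
    for row in csv.DictReader(io.StringIO(text)):
        # The first row seen for a preferred label creates the concept record.
        concept = concepts.setdefault(
            row["preferred"], {"alt_labels": [], "broader": row["broader"]}
        )
        # Variant rows add a label to that concept instead of a new entry.
        if row["type"] == "variant":
            concept["alt_labels"].append(row["term"])
    return concepts


concepts = rows_to_concepts(SHEET)
for label, data in concepts.items():
    print(label, data)
```

Four rows collapse into two concepts here, with the variants attached as labels and the broader category kept as a separate relationship.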
Regardless of the tool used, what makes it even more
confusing is that a term with a narrower meaning could be either a variant term or a hierarchically narrower term. What may variously be called synonyms, variants,
nonpreferred terms, entry terms, or alternative labels are not merely literal synonyms;
they can be any terms or labels that may be used in tagging to
trigger the use of the concept or preferred term. This includes terms whose
meaning is narrower or more specific than the term/concept in question, since
the latter includes more specific terms within its scope. So, tagging
the occurrence of a specific concept with a “broader” concept is acceptable.
For example, in a medical taxonomy a concept can be
Radiation therapy. Radiotherapy is an alternative label. But then there are
specific types of radiation therapy, such as Brachytherapy, Radioimmunotherapy,
and Radionuclide therapy. These could be added to the taxonomy either as narrower
concepts or as alternative labels to Radiation therapy, depending on how
specific the taxonomy should be.
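The two modeling options can be compared with a small tagging sketch. The dictionary layout and the `tag` function are hypothetical, and the matching is a naive case-insensitive substring search, far simpler than what a real auto-tagging tool would do.

```python
# Option A: the specific therapies are concepts of their own (in the
# hierarchy they would sit under Radiation therapy as narrower concepts).
OPTION_A = {
    "Radiation therapy": ["Radiotherapy"],
    "Brachytherapy": [],
    "Radioimmunotherapy": [],
    "Radionuclide therapy": [],
}

# Option B: the specific therapies are only alternative labels.
OPTION_B = {
    "Radiation therapy": [
        "Radiotherapy",
        "Brachytherapy",
        "Radioimmunotherapy",
        "Radionuclide therapy",
    ],
}


def tag(text, taxonomy):
    """Return preferred labels of concepts with any label found in text."""
    lowered = text.lower()
    return sorted(
        pref
        for pref, alts in taxonomy.items()
        if any(label.lower() in lowered for label in [pref, *alts])
    )


sentence = "The patient received brachytherapy last year."
print(tag(sentence, OPTION_A))  # ['Brachytherapy']
print(tag(sentence, OPTION_B))  # ['Radiation therapy']
```

The same sentence is tagged with the specific concept under option A and with the broader concept under option B, which is the practical consequence of choosing one status or the other.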
When creating or editing a taxonomy, it is often difficult
to decide how specific the taxonomy should be in certain places. Terms that are
too specific to warrant use as concepts should then be relegated to the status of
variants/alternative labels. Deciding
what is too specific depends on the concept’s relative specificity within
the entire taxonomy as well as on the potential usage of the specific concept.
In sum, if you are not ready to adopt SKOS-based taxonomy
management software, at the very least you should adopt a SKOS-based approach
in conceptualizing and labeling your taxonomy. Call things “concepts” and
“labels”, not “terms.” Concepts are in hierarchical relationships to each
other. Labels are the names for concepts. The “preferred label” is the
displayed form of the name (such as in facets in the front-end application), and
“alternative labels” are variant labels to match against strings of text that may
be used for the concept and trigger tagging with the concept. Furthermore, alternative labels could be
displayed differently from preferred labels, such as in italics and/or in a
differently colored shaded cell.