The
word “taxonomy” was coined in 1813 by the Swiss botanist A. P. de Candolle, who
developed a new method of classifying plants. The word is derived from the
combination of Greek words τάξις (taxis), meaning “order” or
“arrangement,” and νόμος (nomos), meaning “method” or “law.” The designation
of taxonomy was then applied after the fact to Carl Linnaeus’s binomial nomenclature
system, which had initially been published under the title Systema
Naturae in 1735.
Today’s information taxonomies have
their origins in a combination of classification systems, library subject
heading schemes, and literature retrieval thesauri, and thus have
features that combine all of these. Despite their name, information taxonomies
are closer to subject heading schemes and thesauri than they are
to classification systems.
Classification systems
Classification systems have a
multi-level hierarchy of classes, where a subclass is fully contained in its
parent class, and consequently members of a subclass are also members of the
parent class. Members (things) can belong to only one class, though. Historic
examples include:
Linnaean classification of organisms (1735-1758)
Paris Bookseller's classification (1842)
International Classification of Diseases (originally Bertillon Classification of Causes of Death, 1860)
Dewey Decimal Classification (1876) and other library classifications
Industry classification systems:
    Standard Industrial Classification System (U.S.) (1937)
    International Standard Industrial Classification (U.N.) (1948)
The requirement that a thing (an organism,
book, document, medical diagnosis, economic establishment) can go into only one
class supports various purposes other than information retrieval:
Understanding an organism’s evolutionary background; identifying potential medicinal herbs
Locating and reshelving a book on its shelf
Performing health data analysis from hospital records; billing health insurance companies appropriately
Doing economic analysis of industries using aggregate establishment data
When it comes to information
resources, classification systems may be used to determine in which (virtual)
file folder a document belongs or to support machine-learning-based
auto-classification.
Classification systems are also useful
for data analysis, since content or records are assigned to only one
classification, and this prevents any double counting. Large, data-heavy organizations
might have developed their own internal classification systems for data
tracking purposes. Such classifications do not serve the same purpose as a
tagging/information retrieval taxonomy and should not substitute for a taxonomy
but rather exist alongside it for separate purposes.
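To illustrate the point about double counting, here is a minimal Python sketch. The class codes, document IDs, and function names are all made up for illustration and do not come from any real classification system. It assumes a simple hierarchy in which each item is assigned to exactly one class, and shows that rolling counts up to the parent classes then gives totals that add up to the number of items.

```python
from collections import Counter

# Hypothetical hierarchy: each class code maps to its parent (None = top level).
PARENT = {
    "A": None,
    "A1": "A",
    "A2": "A",
    "B": None,
    "B1": "B",
}

# Each item is assigned to exactly one class, as a classification system requires.
ASSIGNMENT = {
    "doc-1": "A1",
    "doc-2": "A1",
    "doc-3": "A2",
    "doc-4": "B1",
}


def ancestors_and_self(code):
    """Yield the class itself and every class above it in the hierarchy."""
    while code is not None:
        yield code
        code = PARENT[code]


def rollup_counts(assignment):
    """Count items per class, including members inherited from subclasses."""
    counts = Counter()
    for code in assignment.values():
        for c in ancestors_and_self(code):
            counts[c] += 1
    return counts


counts = rollup_counts(ASSIGNMENT)

# Because every item sits in a single class, the top-level totals
# add up to the number of items, with nothing counted twice.
top_level_total = sum(n for code, n in counts.items() if PARENT[code] is None)
print(dict(counts), top_level_total)
```

If items could carry several tags, as in a tagging taxonomy, the same roll-up could count one item under more than one top-level class, which is why a classification system is the better fit for this kind of analysis.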
Subject heading schemes
Subject heading schemes were developed
to help people find books and later also articles on various subjects with more
detail and flexibility for growth than classification systems. Subject headings
are used for cataloguing and indexing, not for classification. Whereas
an item (book, article, other media) has only one classification
(for shelf location), it can have multiple
subjects.
Features of subject heading schemes:
Alphabetical arrangement of a very large number of subjects and/or named entities (proper nouns)
Cross-references of See (Use) and See also (Related)
Headings with large numbers of citations broken down to group the citations by a sub-heading or subdivision, in what is also called pre-coordination. For example, China – Foreign relations.
Back-of-the-book indexes, whose format
evolved over the first half of the 20th century, follow a similar style.
Examples of early subject heading
schemes:
Library of Congress Subject Headings (1898) and other national library systems
U.S. National Library of Medicine’s Medical Subject Headings (1954)
Library subject headings were adopted
for periodical article indexes early on. The Readers’ Guide to Periodical
Literature, published by the H.W. Wilson Company, had been using subject
headings, including subdivisions and cross-references, since shortly after its
introduction in 1901 (as can be seen in the 1900-1905 cumulative index excerpted in the
screenshot below). (The two-digit years are from the prior century.)
Eventually, subject heading schemes adopted thesaurus
features of Broader term, Narrower term, and Related term relationships, as was
the case for Library of Congress Subject Headings, starting in 1985. Thus,
subject heading schemes and thesauri have become very similar. The name
“heading” in subject headings implies, though, that there also exist some
sub-headings/subdivisions, a feature which is not typical of thesauri.
Thesauri
Information thesauri (in contrast to a
dictionary thesaurus, like Roget’s) emerged in the mid-20th century outside of
libraries for the more specialized subject needs of the federal government,
scientific publishers, and technology companies. The word “thesaurus” was first
used in the 1950s to refer to a controlled vocabulary for information retrieval,
meaning a set of words/terms rather than classification codes.
Early thesauri include:
E. I. Dupont de Nemours Company’s thesaurus (1959)
Thesaurus of Armed Services Technical Information Agency (ASTIA) Descriptors, U.S. Department of Defense (1960)
Chemical Engineering Thesaurus, published by the American Institute of Chemical Engineers (1961)
Additional professional organization
publishers of scientific journals created their own thesauri in the 1960s. Dialog,
the first online information service for article citations, which also utilized
thesauri of information publishers, was launched in 1966.
Soon thereafter, standards for
thesauri were developed and published:
UNESCO Guidelines for the establishment and development of monolingual thesauri (1970)
DIN 1463 (Deutsches Institut für Normung) Guidelines for the establishment and development of monolingual thesauri (1972)
ISO 2788 Guidelines for the establishment and development of monolingual thesauri (1974) (superseded by ISO 25964-1:2011)
ANSI American National Standard for Thesaurus Structure, Construction, and Use (1974) (superseded by ANSI/NISO Z39.19-1993)
Modern information taxonomies
The word “taxonomy” for a hierarchical
structure (like a classification scheme) of terms for tagging and retrieval
(like a thesaurus) gradually became popular in the 1990s. These new
taxonomy-like thesauri caught on largely because advances in software
and website user interfaces enabled interactive displays of hierarchies.
Taxonomies had the same primary purpose as thesauri, which is information
findability and retrieval, but taxonomy implementations introduced new designs
for browsing and expanding hierarchies. The word “taxonomy” also tended
to resonate with business audiences better than “thesaurus.” A market for
business and commercial taxonomies started to be recognized by software vendors
and by consultants by the end of the 1990s.
Combining an interactive user
interface with a database enabled the introduction of dynamic filters or
refinements of searches by selected taxonomy terms based on different aspects,
and thus faceted taxonomies emerged and have since become a popular, if not dominant,
implementation of taxonomies for many different use cases. Faceted taxonomies,
by combining search terms for refinement, do not need to be as large and
detailed as thesauri.
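A minimal Python sketch of faceted refinement may help make this concrete. The items, facet names, terms, and the refine function are all hypothetical and are not taken from any particular product. The sketch assumes the common convention that selections are combined with AND across facets and OR within a single facet.

```python
# Hypothetical content items, each tagged with terms from several small facets.
ITEMS = [
    {"title": "Q3 sales report",
     "facets": {"type": {"Report"}, "region": {"Europe"}, "topic": {"Sales"}}},
    {"title": "Hiring policy",
     "facets": {"type": {"Policy"}, "region": {"Europe", "Asia"}, "topic": {"Human resources"}}},
    {"title": "Asia sales forecast",
     "facets": {"type": {"Report"}, "region": {"Asia"}, "topic": {"Sales"}}},
]


def refine(items, selections):
    """Keep items that match every selected facet.

    An item matches a facet if it carries at least one of the selected terms
    (OR within a facet); it must match all selected facets (AND across facets).
    """
    results = []
    for item in items:
        if all(item["facets"].get(facet, set()) & terms
               for facet, terms in selections.items()):
            results.append(item)
    return results


def titles(items):
    """Return just the titles, for easy display."""
    return [item["title"] for item in items]


# The user first filters by content type, then narrows further by region.
reports = refine(ITEMS, {"type": {"Report"}})
asia_reports = refine(ITEMS, {"type": {"Report"}, "region": {"Asia"}})
print(titles(reports))
print(titles(asia_reports))
```

Each facet here holds only a handful of terms, yet combining them narrows the results quickly, which is the reason a faceted taxonomy can stay smaller than a detailed thesaurus.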
The next chapter in the history
of taxonomies involves a convergence with ontologies. You
can read more about that in my past blog article “Taxonomies vs. Ontologies.”