Classification Process and Taxonomy

Jun 24, 2018

Originally published on February 1, 2017

20th in a series of 50 Knowledge Management Components (Slide 28 in KM 102)

Classification: creating and maintaining a taxonomy that can be used to organize information so that it can be readily found through navigation, search, and links between related content

To ensure that information can be readily found, it is important to use a standard classification terminology when storing it. By defining a taxonomy before beginning to store information in repositories, and then applying the taxonomy through the use of standard metadata and tagging, future problems with inconsistent categories, conflicting and redundant metadata, and difficulty in finding content can be prevented.

According to Bob Bater, “an ontology identifies and distinguishes concepts and their relationships; it describes content and relationships. A taxonomy formalizes the hierarchical relationships among concepts and specifies the term to be used to refer to each; it prescribes structure and terminology. A thesaurus provides an initial entry-point, in the user’s terms, to the structured language of the taxonomy used to index documents. A classification is a taxonomy where a numerical or alphanumerical identifier has been assigned to each node to provide a means of ordering items.”
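Bater's distinction between a taxonomy and a classification can be made concrete with a short sketch. The Python below is only an illustration: the node terms, the `Node` class, and the dotted numbering scheme are assumptions chosen for the example, not part of any standard. It builds a small hierarchy of preferred terms (a taxonomy) and then assigns an identifier to each node (which, in Bater's wording, makes it a classification).

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    term: str                                   # preferred term for the concept
    children: list[Node] = field(default_factory=list)
    code: str = ""                              # identifier, assigned below


def assign_codes(node: Node, prefix: str = "") -> None:
    """Number every descendant so that items can be ordered by code."""
    for i, child in enumerate(node.children, start=1):
        child.code = f"{prefix}{i}"
        assign_codes(child, prefix=f"{child.code}.")


def listing(node: Node) -> list[tuple[str, str]]:
    """Return (code, term) pairs in hierarchical order."""
    rows = []
    for child in node.children:
        rows.append((child.code, child.term))
        rows.extend(listing(child))
    return rows


# Hypothetical terms, for illustration only.
root = Node("All content", [
    Node("Projects", [Node("Proposals"), Node("Lessons learned")]),
    Node("Policies"),
])
assign_codes(root)
for code, term in listing(root):
    print(code, term)
```

The hierarchy and the term names carry the taxonomy; the codes add nothing semantic and serve only to order and reference the nodes.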

Examples of taxonomies include the Dewey Decimal System for books and the organization of living things (Kingdom, Phylum, Class, Order, Family, Genus, Species).

A folksonomy “is the exact opposite of a taxonomy in that it is flat (that is, it has no hierarchy, no parent-child relationships) and is completely uncontrolled (part of making a taxonomy is deciding what the names of your entities are, but in a folksonomy, there can be a thousand different words for the same thing). Any relationships you see in a folksonomy have to be derived mathematically (statistical clustering). However, a folksonomy is like a taxonomy in that they share the same purpose: classification.” Flickr is an example of a site that uses a folksonomy. The problem with a folksonomy as opposed to a taxonomy is that there are no imposed standards, and thus inconsistent tags will likely exist for information that should be tagged uniformly.
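To show what "derived mathematically" can mean in practice, here is a minimal sketch that counts how often pairs of tags appear on the same item. Simple co-occurrence counting stands in for real statistical clustering, and the tags and items are invented for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical user-applied tags on four photos.
tagged_items = [
    {"nyc", "skyline", "night"},
    {"newyork", "skyline"},
    {"nyc", "skyline"},
    {"nyc", "food"},
]


def cooccurrence(items):
    """Count how often each pair of tags appears on the same item."""
    pairs = Counter()
    for tags in items:
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(tags), 2):
            pairs[(a, b)] += 1
    return pairs


pairs = cooccurrence(tagged_items)
print(pairs[("nyc", "skyline")])      # the strongest relationship in this data
print(pairs[("newyork", "nyc")])      # the two spellings never appear together
```

The counts suggest that "nyc" and "skyline" are related, but nothing in the data reveals that "nyc" and "newyork" mean the same thing, which is the inconsistency a controlled taxonomy is meant to prevent.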

A tension exists between the common wish of users to search for information using simple text search and the need of content managers to tag and organize content so that it can be located by browsing and searching. Classification of content enables it to be found, read, and understood in the appropriate context. You will need to educate users as to why classification is important to them, how to add metadata when contributing content, and how to use faceted searches (those which use metadata) to more effectively locate information.
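A faceted search can be sketched in a few lines. The example below is an assumption-laden illustration, not a description of any particular search product: the documents, the facet names (`type`, `region`), and the two helper functions are all hypothetical.

```python
from collections import Counter

# Hypothetical documents with standard metadata fields (facets).
docs = [
    {"title": "Bid template", "type": "Template", "region": "EMEA"},
    {"title": "Win review", "type": "Lessons learned", "region": "EMEA"},
    {"title": "Pricing guide", "type": "Template", "region": "APAC"},
]


def faceted_search(documents, **filters):
    """Keep documents whose metadata matches every selected facet value."""
    return [d for d in documents
            if all(d.get(facet) == value for facet, value in filters.items())]


def facet_counts(documents, facet):
    """Count the documents under each value of one facet."""
    return Counter(d[facet] for d in documents if facet in d)


hits = faceted_search(docs, type="Template", region="EMEA")
print([d["title"] for d in hits])
print(facet_counts(docs, "type"))
```

Unlike free text search, this only works when contributors have filled in the metadata consistently, which is why user education matters.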

Efforts to define an exhaustive taxonomy for an organization can easily become so large and complex that they fail to be completed, implemented, or adopted. A bounded taxonomy for a group within an organization will have a better chance of success. Limit the scope to the key terms which will be used as standard metadata for content classification, and then use this taxonomy for navigation menus, browsing filters, and structured search engines. Give users the option of using free text search, metadata search, navigation, browsing, or a thesaurus.

You may wish to provide users the ability to tag content themselves to create folksonomies, which can be used to complement formal taxonomies. When submitting content to repositories, make it mandatory but easy to add the required metadata, and keep this to the absolute minimum to avoid frustrating users.
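One way to make metadata mandatory without making it burdensome is to check a very short list of required fields at submission time. The sketch below assumes three required fields and a small set of allowed topic values; both are hypothetical and would come from your own classification standard.

```python
REQUIRED_FIELDS = ("title", "content_type", "topic")    # hypothetical minimum set
ALLOWED_TOPICS = {"Projects", "Policies", "Training"}   # hypothetical taxonomy terms


def validate_submission(metadata: dict) -> list[str]:
    """Return a list of problems; an empty list means the item can be stored."""
    problems = [f"missing required field: {name}"
                for name in REQUIRED_FIELDS if not metadata.get(name)]
    topic = metadata.get("topic")
    if topic and topic not in ALLOWED_TOPICS:
        problems.append(f"topic not in taxonomy: {topic}")
    return problems


print(validate_submission({"title": "Q3 plan", "topic": "Roadmaps"}))
print(validate_submission(
    {"title": "Q3 plan", "content_type": "Plan", "topic": "Projects"}))
```

Returning every problem at once, rather than stopping at the first, lets a submission form show the user everything that needs fixing in a single pass.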

5 Steps to Implement Classification

1. Establish a team that will define and maintain the taxonomy for the organization. It should include the key content owners and the KM team members who manage the repositories.
2. Define a vocabulary for the knowledge used in your organization. Establish a classification standard that defines the organization’s taxonomy and how it is to be deployed.
3. Use the taxonomy for metadata, navigation, and searching.
4. Specify the metadata that will be required for each submitted file. Decide on a structure: hierarchical folders, different list views, faceted taxonomy navigation, or metadata-based search.
5. Offer faceted navigation, browsing, and searching to guide users based on the standard taxonomy. See Customizing Taxonomy Facets by Heather Hedden.

Insights

1. Lee Romero: Enterprise taxonomy: Six components of a vision

The taxonomy will:

Be adopted for use in all systems that manage content or documents for those facets that are defined within the taxonomy
Be used to tag content within those systems in order to ensure consistent language to describe our content
Enhance the information experience for users through that tagging
Be managed as its own asset, including defining the facets and the values used within those facets
Use appropriate systems of record when possible to define the set of values used for a particular facet
Enable monitoring of changes to the taxonomy values by content managers

2. Patrick Lambe

a. Defining “Taxonomy”

There are three basic characteristics of a taxonomy for knowledge management, and to be any good at its job, it needs to fulfill all three functions:

A taxonomy is a form of classification scheme
Taxonomies are semantic
A taxonomy is a kind of knowledge map

b. The Kingdom of Taxonomy on Video

A video presentation of “the Kingdom of Taxonomy” in two parts, looking at the roles that lists, trees, matrices, facets and folksonomies play in taxonomy design.

Part One: Lists & Trees
Part Two: Matrices, Facets And Folksonomies

3. Seth Earley: Taxonomy, Metadata, and Search Optimization

a. Taxonomy is a foundation

It is a system for classification
It allows for a means to organize documents and web content
Helps us fine tune search tools and mechanisms
Creates a common language for sharing concepts
Allows for a coherent approach to integrate information sources
It is a common language for business processes

b. Goals of a taxonomy

Improve search results and applicability (both precision and recall)
Allow for knowledge discovery
Improve usability of applications as well as learnability of applications
Reduce the cost of delivering services, developing products and conducting operations
Improve operational efficiencies by allowing for reuse of information rather than recreation
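The first goal mentions precision and recall. As a reminder of the standard definitions, precision is the share of retrieved items that are relevant, and recall is the share of relevant items that were retrieved. The sketch below computes both for a made-up query; the document identifiers are hypothetical.

```python
def precision_recall(retrieved: set, relevant: set) -> tuple[float, float]:
    """Compute precision and recall for one query."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four documents returned, three actually relevant.
retrieved = {"doc1", "doc2", "doc3", "doc4"}
relevant = {"doc2", "doc3", "doc5"}

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the four results are relevant (precision 0.50) and two of the three relevant documents were found (recall about 0.67). Consistent tagging can raise both: it filters out off-topic matches and surfaces documents that use different wording for the same concept.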

4. Heather Hedden: Two main approaches to taxonomies

a. A hierarchy of terms/concepts/topics/categories arranged with narrower topics/subcategories displayed under their broader/parent categories.

To guide users to find the desired topic (and its linked content of pages or documents)
Similar to navigation and site maps, but more topical and not just based on page titles

b. A controlled vocabulary of metadata tags/labels to apply to pages, posts, or documents, so that they can be more precisely and comprehensively retrieved (than by search algorithms alone on keywords in text)
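The second approach can be illustrated with a small sketch that maps the tags people actually type to the preferred terms of a controlled vocabulary. The variant list and the `normalize_tags` helper are hypothetical; a real vocabulary would be maintained by the taxonomy team.

```python
# Hypothetical controlled vocabulary: variant spelling -> preferred term.
VARIANTS = {
    "nyc": "New York City",
    "newyork": "New York City",
    "new york": "New York City",
    "lessons learnt": "Lessons learned",
}
# Preferred terms are also accepted as typed, ignoring case.
LOOKUP = {term.lower(): term for term in VARIANTS.values()}
LOOKUP.update(VARIANTS)


def normalize_tags(raw_tags):
    """Map variants to preferred terms; collect tags the vocabulary lacks."""
    accepted, unknown = set(), set()
    for tag in raw_tags:
        key = tag.strip().lower()
        if key in LOOKUP:
            accepted.add(LOOKUP[key])
        else:
            unknown.add(tag)
    return accepted, unknown


accepted, unknown = normalize_tags(
    ["NYC", "newyork", "New York City", "skyscraper"])
print(accepted)
print(unknown)
```

Three differently typed tags collapse into one preferred term, so a search on that term retrieves all of the tagged content. The unknown tags can be reviewed periodically as candidates for new vocabulary entries.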