Classification Process and Taxonomy
Originally published on February 1, 2017
20th in a series of 50 Knowledge Management Components (Slide 28 in KM 102)
Classification: creating and maintaining a taxonomy that can be used to organize information so that it can be readily found through navigation, search, and links between related content
To ensure that information can be readily found, it is important to use a standard classification terminology when storing it. Defining a taxonomy before you begin storing information in repositories, and then applying it through standard metadata and tagging, prevents later problems: inconsistent categories, conflicting and redundant metadata, and content that is hard to find.
According to Bob Bater, “an ontology identifies and distinguishes concepts and their relationships; it describes content and relationships. A taxonomy formalizes the hierarchical relationships among concepts and specifies the term to be used to refer to each; it prescribes structure and terminology. A thesaurus provides an initial entry-point, in the user’s terms, to the structured language of the taxonomy used to index documents. A classification is a taxonomy where a numerical or alphanumerical identifier has been assigned to each node to provide a means of ordering items.”
Examples of taxonomies include the Dewey Decimal System for books and the organization of living things (Kingdom, Phylum, Class, Order, Family, Genus, Species).
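To make the distinctions above concrete, here is a minimal Python sketch of a taxonomy (a hierarchy of preferred terms), a classification derived from it (each node given an identifier), and a thesaurus (users' own words mapped to preferred terms). The terms, the `Node` class, and the `assign_codes` helper are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One taxonomy node: a preferred term plus its narrower (child) terms."""
    term: str
    children: list["Node"] = field(default_factory=list)


def assign_codes(node: Node, prefix: str = "") -> dict[str, str]:
    """Turn a taxonomy into a classification by giving each node an identifier.

    Children are numbered in order, so a code such as "1.2" records the
    node's position in the hierarchy.
    """
    codes = {}
    for i, child in enumerate(node.children, start=1):
        code = f"{prefix}.{i}" if prefix else str(i)
        codes[code] = child.term
        codes.update(assign_codes(child, code))
    return codes


# Hypothetical two-level taxonomy for a small KM team.
taxonomy = Node("Knowledge", [
    Node("Projects", [Node("Proposals"), Node("Lessons Learned")]),
    Node("People", [Node("Communities"), Node("Experts")]),
])

# Hypothetical thesaurus: users' own words mapped to preferred terms.
thesaurus = {"bids": "Proposals", "post-mortems": "Lessons Learned"}

codes = assign_codes(taxonomy)
for code, term in codes.items():
    print(code, term)
```

A user who searches for "bids" can be routed through the thesaurus to the preferred term "Proposals", which sits at code 1.1 in this sketch.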
A folksonomy “is the exact opposite of a taxonomy in that it is flat (that is, it has no hierarchy, no parent-child relationships) and is completely uncontrolled (part of making a taxonomy is deciding what the names of your entities are, but in a folksonomy, there can be a thousand different words for the same thing). Any relationships you see in a folksonomy have to be derived mathematically (statistical clustering). However, a folksonomy is like a taxonomy in that they share the same purpose: classification.” Flickr is an example of a site that uses a folksonomy. The problem with a folksonomy, as opposed to a taxonomy, is that no standards are imposed, so information that should be tagged uniformly is likely to carry inconsistent tags.
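As a rough illustration of deriving relationships statistically, the sketch below counts how often pairs of free-form tags appear on the same item. Pair counting is only one simple stand-in for statistical clustering, and the tags and the `co_occurrence` helper are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical free-form tags that users applied to four photos.
tagged_items = [
    {"nyc", "skyline", "night"},
    {"new york", "skyline"},
    {"nyc", "skyline"},
    {"cat", "night"},
]


def co_occurrence(items):
    """Count how often each pair of tags appears on the same item."""
    pairs = Counter()
    for tags in items:
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(tags), 2):
            pairs[(a, b)] += 1
    return pairs


pairs = co_occurrence(tagged_items)
print(pairs.most_common(1))  # [(('nyc', 'skyline'), 2)]
```

Note that "nyc" and "new york" never appear together even though they mean the same thing, which is the inconsistency a controlled vocabulary is meant to prevent.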
A tension exists between the common wish of users to search for information using simple text search and the need of content managers to tag and organize content so that it can be located by browsing and searching. Classification of content enables it to be found, read, and understood in the appropriate context. You will need to educate users as to why classification is important to them, how to add metadata when contributing content, and how to use faceted searches (those which use metadata) to more effectively locate information.
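The difference between simple text search and faceted search can be sketched in a few lines of Python. The documents, the facet names (`type`, `region`), and the `faceted_search` function are hypothetical; a real search engine would do far more than substring matching.

```python
# Hypothetical documents with metadata drawn from a standard taxonomy.
documents = [
    {"title": "Q3 bid for Acme", "type": "Proposal", "region": "EMEA"},
    {"title": "Acme project review", "type": "Lessons Learned", "region": "EMEA"},
    {"title": "Pricing playbook", "type": "Proposal", "region": "Americas"},
]


def faceted_search(docs, text="", **facets):
    """Return docs whose title contains `text` and whose metadata matches every facet."""
    text = text.lower()
    return [
        d for d in docs
        if text in d["title"].lower()
        and all(d.get(name) == value for name, value in facets.items())
    ]


text_only = faceted_search(documents, text="acme")                 # two matches
narrowed = faceted_search(documents, text="acme", type="Proposal")  # one match
print([d["title"] for d in narrowed])  # ['Q3 bid for Acme']
```

The text query alone returns both Acme documents; adding one metadata facet narrows the result to the proposal the user actually wanted.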
Efforts to define an exhaustive taxonomy for an organization can easily become so large and complex that they fail to be completed, implemented, or adopted. A bounded taxonomy for a group within an organization will have a better chance of success. Limit the scope to the key terms which will be used as standard metadata for content classification, and then use this taxonomy for navigation menus, browsing filters, and structured search engines. Give users the option of using free text search, metadata search, navigation, browsing, or a thesaurus.
You may wish to provide users the ability to tag content themselves to create folksonomies, which can be used to complement formal taxonomies. When submitting content to repositories, make it mandatory but easy to add the required metadata, and keep this to the absolute minimum to avoid frustrating users.
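One way to make metadata mandatory but minimal is to check each submission against a short list of required fields and their allowed values. The following is a sketch under assumed names: the two required fields, their values, and `validate_submission` are hypothetical, not taken from any particular repository product.

```python
# Hypothetical required fields and their allowed (controlled) values.
REQUIRED_METADATA = {
    "type": {"Proposal", "Lessons Learned", "White Paper"},
    "region": {"Americas", "EMEA", "APJ"},
}


def validate_submission(metadata):
    """Return a list of problems; an empty list means the submission is acceptable."""
    problems = []
    for name, allowed in REQUIRED_METADATA.items():
        value = metadata.get(name)
        if value is None:
            problems.append(f"missing required field: {name}")
        elif value not in allowed:
            problems.append(f"{name}={value!r} is not in the taxonomy")
    return problems


# Free-form user tags (a folksonomy) ride along without being checked.
ok = validate_submission({"type": "Proposal", "region": "EMEA", "tags": ["acme"]})
bad = validate_submission({"type": "Bid"})
print(ok)   # []
print(bad)  # ["type='Bid' is not in the taxonomy", 'missing required field: region']
```

Keeping the required set to two fields mirrors the advice above: the check is strict about the taxonomy but asks very little of the contributor.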
5 Steps to Implement Classification
- Establish a team that will define and maintain the taxonomy for the organization. It should include the key content owners and the KM team members who manage the repositories.
- Define a vocabulary for the knowledge used in your organization. Establish a classification standard that defines the organization’s taxonomy and how it is to be deployed.
- Use the taxonomy for metadata, navigation, and searching.
- Specify the metadata that will be required for each submitted file. Decide on a structure: hierarchical folders, different list views, faceted taxonomy navigation, or metadata-based search.
- Offer faceted navigation, browsing, and searching to guide users based on the standard taxonomy. See Customizing Taxonomy Facets by Heather Hedden.
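As an illustration of the last step, faceted navigation usually shows users how many items sit under each taxonomy value so they can decide where to browse. The sketch below computes such counts; the documents, facet names, and `facet_counts` helper are hypothetical.

```python
from collections import Counter

# Hypothetical repository contents tagged with the standard taxonomy.
repository = [
    {"title": "Q3 bid for Acme", "type": "Proposal", "region": "EMEA"},
    {"title": "Acme project review", "type": "Lessons Learned", "region": "EMEA"},
    {"title": "Pricing playbook", "type": "Proposal", "region": "Americas"},
]


def facet_counts(docs, facet):
    """Count documents per value of one facet, skipping docs that lack it."""
    return Counter(d[facet] for d in docs if facet in d)


for facet in ("type", "region"):
    print(facet, dict(facet_counts(repository, facet)))
# type {'Proposal': 2, 'Lessons Learned': 1}
# region {'EMEA': 2, 'Americas': 1}
```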
Insights
1. Lee Romero: Enterprise taxonomy: Six components of a vision
The taxonomy will:
- Be adopted for use in all systems that manage content or documents for those facets that are defined within the taxonomy
- Be used to tag content within those systems in order to ensure consistent language to describe our content
- Enhance the information experience for users through that tagging
- Be managed as its own asset, including defining the facets and the values used within those facets
- Use appropriate systems of record when possible to define the set of values used for a particular facet
- Enable monitoring of changes to the taxonomy values by content managers
2. Patrick Lambe
a. Three basic characteristics of a taxonomy
A taxonomy for knowledge management has three basic characteristics, and to be any good at its job, it needs to fulfill all three:
- A taxonomy is a form of classification scheme
- Taxonomies are semantic
- A taxonomy is a kind of knowledge map
b. The Kingdom of Taxonomy on Video
A video presentation of “the Kingdom of Taxonomy” in two parts, looking at the roles that lists, trees, matrices, facets and folksonomies play in taxonomy design.
- Part One: Lists & Trees
- Part Two: Matrices, Facets And Folksonomies
3. Seth Earley: Taxonomy, Metadata, and Search Optimization
a. Taxonomy is a foundation
- It is a system for classification
- It provides a means to organize documents and web content
- It helps us fine-tune search tools and mechanisms
- It creates a common language for sharing concepts
- It allows for a coherent approach to integrating information sources
- It is a common language for business processes
b. Goals of a taxonomy
- Improve search results and applicability (both precision and recall)
- Allow for knowledge discovery
- Improve usability of applications as well as learnability of applications
- Reduce the cost of delivering services, developing products and conducting operations
- Improve operational efficiencies by allowing for reuse of information rather than recreation
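Precision and recall, mentioned in the first goal above, have standard definitions: precision is the share of retrieved items that are relevant, and recall is the share of relevant items that were retrieved. The sketch below computes both for two result lists; the document IDs and the numbers are made up purely to show the arithmetic and are not measurements.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query with four truly relevant documents.
relevant = {"doc1", "doc2", "doc3", "doc4"}
keyword_results = ["doc1", "doc5", "doc6", "doc7"]  # plain text search
tagged_results = ["doc1", "doc2", "doc3", "doc5"]   # search over taxonomy tags

print(precision_recall(keyword_results, relevant))  # (0.25, 0.25)
print(precision_recall(tagged_results, relevant))   # (0.75, 0.75)
```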
4. Heather Hedden: Two main approaches to taxonomies
a. A hierarchy of terms/concepts/topics/categories arranged with narrower topics/subcategories displayed under their broader/parent categories.
- To guide users to find the desired topic (and its linked content of pages or documents)
- Similar to navigation and site maps, but more topical and not just based on page titles
b. A controlled vocabulary of metadata tags/labels to apply to pages, posts, or documents, so that they can be more precisely and comprehensively retrieved (than by search algorithms alone on keywords in text)
- Implemented as search suggestion terms, search refinement filters, or related topics and searches
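The second approach can be sketched as a small controlled vocabulary that drives search suggestions: whatever word the user starts typing, the suggestion shown is the preferred tag. The vocabulary entries and the `suggest` function below are hypothetical.

```python
# Hypothetical controlled vocabulary: preferred tag -> alternative labels.
VOCABULARY = {
    "Lessons Learned": ["post-mortem", "retrospective"],
    "Proposals": ["bids", "pitches"],
    "Pricing": [],
}


def suggest(prefix):
    """Suggest preferred tags whose own label or any alternative label starts with `prefix`."""
    prefix = prefix.lower()
    matches = []
    for preferred, alternatives in VOCABULARY.items():
        labels = [preferred, *alternatives]
        if any(label.lower().startswith(prefix) for label in labels):
            matches.append(preferred)
    return sorted(matches)


print(suggest("ret"))  # ['Lessons Learned']
print(suggest("bi"))   # ['Proposals']
```

Because every alternative label resolves to one preferred tag, content tagged "Lessons Learned" is retrieved even when the user thinks of it as a retrospective.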
Examples
1. HP
2. Deloitte
Resources
- Taxonomies and Controlled Vocabularies: Self-Paced Online Course
- Taxonomy Boot Camp
- TaxoTips
- APQC
- SIKM Leaders Community Posts
- Taxonomy 101: The Basics and Getting Started with Taxonomies by Betsy Walli
- Taxonomy 101: Definition, Best Practices, and How It Complements Other IA Work by Page Laubheimer
- What is the Difference between Taxonomy and Ontology? It is a Matter of Complexity by Chantal Schweizer
- Taxonomies by Anthony Hunter
- Typology or Taxonomy? by Dave Snowden
- The Name Game — Where Folksonomy Meets Taxonomy by Luis Suarez
- Taxonomies and Tags: From Trees to Piles of Leaves by David Weinberger
- Hedden Information Management by Heather Hedden
- On Content, Collaboration and Findability by Lee Romero
- Green Chameleon by Patrick Lambe
- Earley Information Science by Seth Earley
- Wordmap® Taxonomy Modeling & Management Solutions
- Stories are a form of taskonomy by Shawn Callahan
- Step Two Designs by James Robertson
- Innotecture by Matt Moore
- Grace Lau — Old Blog — Old Writing
Consultants
- Lou Rosenfeld’s 2005 list
- TaxoTips list
- Dovecot Studio — Stephanie Lemieux
- Earley Information Science
- Enterprise Knowledge LLC
- Hedden Information Management
- Heyman Information Services LLC
- Iknow LLC
- KAPS Group — Tom Reamy
- Semantic Studios — Peter Morville
- Straits Knowledge — Patrick Lambe
- Taxonomy Strategies LLC
People
- Seth Earley
- Heather Hedden
- Marti Heyman
- Patrick Lambe
- Stephanie Lemieux
- Helen Lippell
- Matt Moore
- Peter Morville
- Wendi Pohs
- Tom Reamy
- Lee Romero
- David Weinberger
- Zach Wahl
Books
- Typologies and Taxonomies: An Introduction to Classification Techniques by Kenneth Bailey
- Sorting Things Out: Classification and Its Consequences by Geoffrey Bowker and Susan Leigh Star
- Knowledge Representation: Logical, Philosophical, and Computational Foundations by John Sowa
- The Intellectual Foundation of Information Organization by Elaine Svenonius
- The Organization of Information, 2nd Edition by Arlene G. Taylor
- Wynar’s Introduction to Cataloging and Classification, 9th Edition by Arlene G. Taylor
- The Accidental Taxonomist by Heather Hedden
- Structures for Organizing Knowledge: Exploring Taxonomies, Ontologies, and Other Schema by June Abbas
- Building Enterprise Taxonomies by Darin L. Stewart
- Organising Knowledge: Taxonomies, Knowledge and Organisational Effectiveness by Patrick Lambe
- Information Architecture: For the Web and Beyond by Louis Rosenfeld and Peter Morville
- Ambient Findability: What We Find Changes Who We Become by Peter Morville