Archiving, Document Management, and Records Management

Jul 15, 2018

Originally published on June 15, 2017

38th in a series of 50 Knowledge Management Components (Slides 49, 102, and 103 in KM 102)

Archiving: offline file storage for legal, audit, or historical purposes, using tapes, CDs, or other long-term media. Archiving is the process of moving files that are no longer actively used to a separate storage device for long-term retention. Archived files are still important to the organization and may be needed for future reference or must be retained for regulatory compliance. Archives should be indexed and searchable so that files can be easily located and retrieved.

As part of the lifecycle of information, archiving is an important final stage. Keeping too much old information available online consumes valuable storage which could be better used for newer information, increases the number of irrelevant search results returned, and adds to the effort required to maintain, migrate, and reclassify content.

There are good reasons to archive content rather than simply delete it. Laws may require content to be kept for specific periods. Internal and external audits may require document retention. Preserving information as a part of history is another worthy goal. And there are times when information which had been thought to be no longer useful later turns out to be needed. Archiving information addresses each of these requirements.

Document management: tracking and storing electronic documents and/or images of paper documents, keeping track of the different versions modified by different users, and archiving as needed. A document management system (DMS) is technology that provides a comprehensive solution for managing the creation, capture, indexing, storage, retrieval, and disposition of the records and information assets of an organization.

Records management: maintaining the records of an organization from the time they are created up to their eventual disposal; this may include classifying, storing, securing, archiving, and destroying records. Records management is knowing what you have, where you have it, how long you have to keep it, and how secure it is.

Examples of Processes, Policies, and Procedures

1. Document Management Process: The ideal information lifecycle management process provides an easy method for content to be reviewed, with the reusable content preserved and the other content archived on suitable media. For example, at the end of a project, all documents in the project team space are listed, the user checks boxes for the reusable ones, and then clicks on an archive button. The result is that the reusable documents are extracted from the team space and stored in the appropriate repository using the associated metadata, and all other documents are archived to a CD which is then stored in the specified archive library.
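The end-of-project step described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular product's behavior: the `Document` class, its `reusable` flag (standing in for the checkbox), and the `close_out_team_space` function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A file in a project team space (hypothetical structure)."""
    name: str
    metadata: dict = field(default_factory=dict)
    reusable: bool = False  # True when the user checked the box for this file


def close_out_team_space(documents):
    """Split documents into those bound for the repository and those to archive."""
    to_repository = [d for d in documents if d.reusable]
    to_archive = [d for d in documents if not d.reusable]
    return to_repository, to_archive


docs = [
    Document("proposal-template.docx", {"type": "template"}, reusable=True),
    Document("meeting-notes-week-3.docx", {"type": "notes"}),
    Document("lessons-learned.pptx", {"type": "lessons learned"}, reusable=True),
]

keep, archive = close_out_team_space(docs)
print("To repository:", [d.name for d in keep])
print("To archive:", [d.name for d in archive])
```

In a real system, the repository step would use each document's metadata to choose the destination, and the archive step would write to whatever medium the archiving procedure specifies.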

2. Records Management Policy: defines the policy for how the organization’s business records are to be managed

3. Archiving Procedure: details the steps to follow in support of the records management policy’s archiving rules

a. Example: HP

b. Example: Deloitte

Insights

1. Information and Records Management Policy Development Guidelines by Patrick Lambe and Marita Keenan

The requirements for a knowledge sharing system, including taxonomy and metadata requirements, will need to be balanced with the need to manage records according to legislative and regulatory requirements. Knowledge, information and records form a continuum that needs to be managed holistically, and an integrated policy framework helps to support this.

2. Knowledge Management and document management/ECM by Nick Milton

Document management is a subset of information management. Document management covers the management of electronic documents, whether they are knowledge or not. Knowledge Management covers the management of knowledge, some of which may be codified within documents. There is an overlap between the two, as well as distinct separate areas.

4. Call me cynical but … by Euan Semple

Document management systems — where knowledge goes to die gracefully.

5. Complex Adaptive Processes on a Wiki? by Martin Cleaver

The problem with wikis has always been the collapse of two essentially different operations: Save and Publish. You edit and save a page. What happens? You publish to the world. No approval. No chance to ensure it marries up with process changes made elsewhere, to compare it with compliance or governance regulations, to get buy-in from stakeholders, or to get legal approval. As a rule, wikis omit Content Management and Workflow capabilities and, because they don’t deal with Approved Records, they also lack Records Management and Records Retention facilities.

6. Ensuring Future Access to History by David Weinberger

As early as the 1940s, archivists were talking about machine-readable records. The debates and experiments have been going on for many decades. One early approach was to declare that electronic records were not archives, because the archives couldn’t deal with them. (Archivists and records managers have always been at odds, he says, because RM is about retention schedules, i.e., deleting records.) Over time, archivists came up to speed. By 2000, some were dealing with electronic records. In 2010, many do, but many do not. There is a continuing debate.

There are three languages we need: Legal, Records Management, and IT. How do we make the old ways work in the new? We need both new filtering techniques and traditional notions of appraisal.

7. SIKM Leaders Community Discussions (Archiving and Content Management)

a. Archiving Content

Question: An IT colleague is proposing that we archive content more than 6 months old (still searchable) and that content more than 12 months be removed (owner will get an email alert). Should we do this?

Answers:

From Howie Cohen: I suggest that you work with your compliance team. There are legal considerations for records management and retention that vary based on company and industry. IT people may have ideas but more often than not we don’t know the laws or the rules. In my practice, I always work with legal, corp comm, HR and compliance.
From Chuck Georgo: Is the desire for archiving driven by storage space limitations or internal retention policies? The former can be resolved by helping to support the purchase of more storage; the latter will be tricky, as they may not want to retain certain content past a certain time period. You might also try filtering the content to retain only that which can be most useful.
From Paul McDowall: Archiving makes a lot of sense for a field that changes and grows, but arbitrary rules for archiving are good for records managers and IT managers and often not so good for practitioners. The rules have to make sense for the practitioners first and foremost, assuming of course that there are no explicit legal reasons for the archiving rules.
From Tammy Bearden: This may depend on the volume, value, and visibility of the content. Can you quantify its value and therefore its need for visibility? If you cannot get IT to budge and the retention/re-use value is still high, you may consider converting the “expiring” content into wiki entries or another type of record in a knowledge bank or in a work product collection.
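For illustration only, the age-based rule proposed in the question (archive after about 6 months, remove after 12) might look like the sketch below. The function name, the day counts, and the `on_legal_hold` flag are assumptions made for this example. The flag stands in for the point raised in the answers that legal and compliance requirements should override any purely age-based rule.

```python
from datetime import date


def retention_action(last_modified, today,
                     archive_after_days=183,   # roughly 6 months (assumed)
                     remove_after_days=365,    # roughly 12 months (assumed)
                     on_legal_hold=False):
    """Return 'keep', 'archive', or 'remove' for one piece of content."""
    if on_legal_hold:
        # Compliance requirements take priority over the age thresholds.
        return "keep"
    age_in_days = (today - last_modified).days
    if age_in_days > remove_after_days:
        return "remove"
    if age_in_days > archive_after_days:
        return "archive"
    return "keep"


today = date(2018, 7, 15)
for modified in (date(2018, 6, 1), date(2017, 12, 1), date(2017, 1, 1)):
    print(modified, "->", retention_action(modified, today))
```

Keeping the thresholds as parameters makes it easy to try different values, such as the 12 months and 6 years mentioned in the next question, once practitioners and the compliance team have agreed on them.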

b. CoP — Archiving and Retention Policy

Question: We are considering migrating content to a searchable archive after 12 months then purging content at 6 years. Is that reasonable? How are people doing this?

Answer from Lee Romero: Retiring A Community And Capturing Its Knowledge

c. Should a KM group be responsible for Records Management?

d. Answer from Murray Jennex: Email retention policies and email archive best practices

8. Don’t automatically archive content; improve search instead

Knowledge repositories often are configured to automatically archive documents after some predetermined period of time. The intent is that after content has been available for 90 days (or whatever duration is chosen) it is no longer current and thus should be removed from the repository. The assumption is that this old content should not appear in search results or in lists of available documents.

The reasons typically given for automatic archiving include:

Old documents are no longer relevant, accurate, or useful.
Searches yield too many results, so weeding out old documents will improve user satisfaction with search.
Content contributors should refresh documents periodically.

But contributed content does not automatically become obsolete after a fixed period of time. It may remain valuable indefinitely.

I offer the analogy that just because Peter Drucker died in 2005, we don’t remove his books from the library. His insights will continue to be useful for a very long time.

One firm where I worked had an automatic archiving process. As a result, I would often receive messages from frustrated users who were searching for content that they had previously found in the repository but could no longer find. I would have to restore this content from the archive to the active repository. This caused users to be annoyed with the KM program, resulted in a lot of wasted time and effort, and sometimes delayed the retrieval of important information needed for client work.

In Knowledge management and innovation, Steve Denning wrote: “The quality of knowledge does not depend on whether it is old or new but rather whether it is relevant, whether it still works. Whether it is old or new hardly matters. The question is: does it work? The dynamic of academia is different. Here the new is celebrated, whether it is useful or not. The old is looked down on, not because it isn’t useful, but because the raison d’etre of academia is to create the new, not the useful. Innovation in industry will often draw on lessons from the past, particularly those that have been forgotten, or those that can be put together in new combinations to achieve new results. The bottom line however is not whether the knowledge is new, but whether it works in practice.”

With the cost of mass storage steadily decreasing, there are few good reasons to remove content from knowledge repositories unless it is known to be outdated, incorrect, or useless. Instead, allow search engines to limit results based on dates and other metadata to help users more easily find the content they need.

Don’t automatically archive content in a knowledge repository, threaded discussion, or other collection of knowledge. Instead, ensure that the search engine can limit results by the date of the knowledge object. Defaults can be set to limit results to the last 90 days, one year, or whatever duration is desired. But it should be easy for users to change the date range to include older content in the search results.
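As a rough sketch of the idea above, a search layer can apply a default date window that the user is free to widen or remove. The names `DEFAULT_WINDOW_DAYS` and `filter_by_date`, and the dictionary shape of each result, are hypothetical and not taken from any specific search engine.

```python
from datetime import date, timedelta

DEFAULT_WINDOW_DAYS = 365  # assumed default; could just as well be 90


def filter_by_date(results, today, window_days=DEFAULT_WINDOW_DAYS):
    """Keep results dated within the window; window_days=None means no limit."""
    if window_days is None:
        return list(results)
    cutoff = today - timedelta(days=window_days)
    return [r for r in results if r["date"] >= cutoff]


results = [
    {"title": "Current pricing guide", "date": date(2018, 6, 1)},
    {"title": "Classic methodology paper", "date": date(2016, 3, 10)},
]
today = date(2018, 7, 15)

recent_only = filter_by_date(results, today)                 # default window
everything = filter_by_date(results, today, window_days=None)  # user removed the limit
print([r["title"] for r in recent_only])
print([r["title"] for r in everything])
```

The older content is never removed here. It is simply left out of the default view and comes back as soon as the user changes the date range.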

Also, allow content to be tagged with “recommended” or “good example” or “proven practice” by an authoritative source, and by users with “I found this useful” or with a “Like” button. Then allow searching by date, tag attribute, most-liked by users, etc. See Content rating is different behind the firewall for more on recommended tags.
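Filtering by tag and ordering by user likes can be sketched the same way. In this minimal example, the `rank_results` function and the `tags` and `likes` fields are assumptions for illustration; a real repository would store these attributes in its own metadata.

```python
from datetime import date


def rank_results(results, required_tag=None):
    """Optionally filter by tag, then sort by likes (newest first on ties)."""
    if required_tag is not None:
        results = [r for r in results if required_tag in r["tags"]]
    return sorted(results, key=lambda r: (r["likes"], r["date"]), reverse=True)


results = [
    {"title": "Old but proven", "date": date(2012, 5, 1),
     "tags": {"proven practice"}, "likes": 40},
    {"title": "Recent draft", "date": date(2018, 6, 1),
     "tags": set(), "likes": 2},
    {"title": "Good example deck", "date": date(2017, 1, 15),
     "tags": {"good example"}, "likes": 40},
]

print([r["title"] for r in rank_results(results)])
print([r["title"] for r in rank_results(results, required_tag="proven practice")])
```

Note that the oldest item still ranks near the top because users found it useful, which is the behavior that automatic archiving would have prevented.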

Resources