OAIS Reference Model Part I: Background and Influence

The OAIS model is an international standard that has been adopted for guiding the long-term preservation of digital data and documents.  In fact, the OAIS model is an ISO standard (ISO 14721:2003): it was developed by the Consultative Committee for Space Data Systems (CCSDS) in 2002, and was adopted as an ISO standard in 2003.  The document is freely available, even though most ISO documentation must be purchased.  It’s a hefty 148 pages, available in PDF form here.

Photo by OliBac licensed under Creative Commons

The OAIS model describes, in standardized terms, one way to run a digital repository intended for preservation.  Within this model, you will not find a standard for metadata.  It also does not endorse any particular repository platform, software, protocol or implementation procedure.  The OAIS model is simply a set of standardized guidelines intended to aid the people and systems behind a repository that is charged with maintaining documents for archival purposes over a long period of time.

OAIS stands for Open Archival Information System, the word “open” referring to the open and public process under which this model was developed.  Participation in its initial development was encouraged by the CCSDS, and as an ISO standard, it comes up for review every five years.

Because the OAIS model is a recognized standard, its users have formed a default sub-community within the digital preservation community.  But it has also been very beneficial to the digital preservation community at large and has helped promote progressive thinking and discussion.  Here are some key reasons why the OAIS model is so helpful to the digital preservation process and community:

  • It has standardized the terminology associated with digital preservation
  • It has outlined the duties and services of a preservation repository
  • It has outlined a way that information should be attributed and managed within a repository
  • It has mobilized community discussions about repository standards and certification
  • It has included preservation metadata as an important part of the preservation process
  • It focuses on long-term preservation, but lets “long-term” be defined by the repository managers
  • OAIS-type archives are committed to a set of defined responsibilities

As a final note, it is important to make clear that the OAIS model is by no means a requirement for a digital repository; while it is a recognized way of running a repository, it is not the only way.  It may not be a good fit for some repositories, depending on their intended size, resources, and designated communities.  But admittedly, when a repository chooses not to follow the OAIS recommendations, it cannot fall under the umbrella of the most widely used and understood digital archive standard.


Here are some resources that were incredibly useful for me while writing this post and the one to follow:

  • I really benefited from reading this post by John Mark Ockerbloom, the editor of the blog Everybody’s Libraries.  I almost considered forgoing my own entry and just sending readers straight to his!
  • And then I found this post and was blown away by how thorough it is.  It’s really well done and I’d encourage you to check it out.
  • This page is a brief run-down of OAIS from the JISC Standards Catalogue.

Continue on to Part II

ISO Standards

ISO is the commonly used name for the International Organization for Standardization. This is an international, non-governmental organization that creates standards based on a consensus of international committee members.

One ISO standard that is relevant to digital preservation practices is the OAIS model.

Additionally, there is a working group attempting to create an ISO standard for digital repository certification, which I think is an excellent idea. A wiki is maintained here with information related to their regular remote meetups and the documentation they are creating and collecting to assist in the process of writing a standard. A useful glossary of digital preservation terms can also be found on their wiki.

Original publication date: 7/20/09

Cloud Computing

Let’s talk about cloud computing.

At its simplest, things that are in the “cloud” are things that float around in a sort of digital airspace and don’t exist on your computer. They exist on remote servers which can be accessed from many computers.

Photo by mansikka under a Creative Commons license

For this reason, the cloud is a good metaphor for the Internet. For most of us, keeping things in the cloud is a convenient and logical way to make life simpler.

You can access things in the cloud from anywhere that is connected to the Internet…depending on the service and its security (private cloud or public cloud). It’s kind of like your email or Facebook account. You have lots of stuff stored in these accounts that is specific to you, but you can log in from anywhere. And it will always look the same and have all your stuff in it. Your stuff is always just…there.

Photo by AJC1 under a Creative Commons license

Getting a bit more technical, your stuff is actually physically stored somewhere as bits on servers that are run by whoever is providing the service. For example, some institutions have servers dedicated to an institutionally-based digital repository. These servers might live on the campus and will store everything that is added to the repository. But the whole repository will not exist on the specific machine that you might use to access documents stored there. Your computer will connect to the remote server to access the repository.
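To make that separation concrete, here is a tiny sketch in Python. It is only an illustration, not how any real repository works: the class names, the document ID and the in-memory dictionary standing in for the campus servers are all made up for this example, and a real setup would talk to the server over a network.

```python
class RemoteRepository:
    """Hypothetical stand-in for the server that physically stores the documents."""

    def __init__(self):
        self._documents = {}  # document id -> stored bytes

    def deposit(self, doc_id, content):
        self._documents[doc_id] = content

    def fetch(self, doc_id):
        return self._documents[doc_id]


class Workstation:
    """Hypothetical stand-in for any machine used to reach the repository.

    It keeps no copy of the documents; it only knows where the repository is.
    """

    def __init__(self, name, repository):
        self.name = name
        self.repository = repository

    def open_document(self, doc_id):
        # Every read goes to the remote store, never to local storage.
        return self.repository.fetch(doc_id)


repository = RemoteRepository()
repository.deposit("report-2009", b"annual report")

library_pc = Workstation("library-pc", repository)
home_laptop = Workstation("home-laptop", repository)

# Both machines see the same document because it lives in one place.
print(library_pc.open_document("report-2009") == home_laptop.open_document("report-2009"))
```

The point is simply that the two workstations hold nothing themselves: take either one away and the document is still sitting in the repository.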

What makes this fun for digital preservationists is that cloud computing can really increase the scale and sharing of preservation duties. Maureen Pennock, the Web archive preservation Project Manager at the British Library, recognizes this in her blog: “This minimises costs for all concerned, addresses the skills shortage, and produces a more efficient, sustainable and reliable preservation infrastructure.”

In the future – and as we are seeing with DuraCloud – all the tech work behind producing ways to store and retrieve data may be provided as part of a single repository product. (This type of service, by the way, is referred to as IaaS – Infrastructure as a Service.) This would be excellent news for the many institutions that don’t have the means or skills to set up a repository themselves.

Cloud computing offers huge potential as an off-site alternative to the out-of-the-box repository products that most institutions currently must use. Instead, external organizations will be able to do the tech work while the institutions will be able to focus on non-technical repository maintenance.

Images by Flickr users mansikka and AJC1

Original publication date: 7/15/09

DuraSpace and DuraCloud

Here’s some news: the projects behind the digital repository platforms DSpace and Fedora have joined efforts and will now live under a bigger umbrella called DuraSpace. This new organization will still enable full and independent functioning of Fedora and DSpace, but there will be new joint projects aimed at advancing digital preservation technology and addressing a larger group of stakeholders.

The May 9, 2009 press release that announced the DuraSpace partnership emphasizes the first new project of its portfolio: DuraCloud.  DuraCloud is currently in a year-long pilot phase, and has the advantage of being backed as an NDIIPP project.  What makes it special is that it seems to be the first repository project to use cloud technology to store data.  Institutions will be relieved of a huge economic and technological burden if they no longer have to store the data themselves.  The Library of Congress announcement states that “Duracloud will let an institution provide data storage and access without having to maintain its own dedicated technical infrastructure.”  In other words, the servers (and knowledgeable techies) are provided with the DuraCloud product.

This means that the duties of institutions whose repositories are supported by cloud storage technology will be narrowed to making the repository data standardized and accessible, which is probably a better use of their time and funding.  DuraSpace and DuraCloud will maintain the open source and non-profit legacies of DSpace and Fedora, which makes this new organization and its first project even more appealing to institutions on tightened budgets.

Original publication date: 7/14/09


Welcome to my new blog location!  I have moved to WordPress, and I’ll bring over some of my key posts from my previous blog host soon.

This blog is intended to provide basic information about concepts, organizations, people, methods, and breakthroughs in the field of digital preservation.  Closely tied to digital preservation efforts are digital repositories, so you will see posts about them as well.  Please use the search function or the Post Tags and Categories to look for something specific…or to just browse around.

Thanks for stopping by!