What's the difference between Backups and Archiving Electronic Data?

It's the end of the year, and perhaps by the time you read this you're thinking about packing things up and storing them for next year. Maybe you're also taking some time to make backups of your important files, and archiving your digital photos. In today's post, we're going to look at these two activities: Archiving and Backup.
Both of these activities are mandated by regulations and quality systems, often mentioned in the same paragraph. They look really similar on the surface. What exactly is the difference between archiving and backup when it comes to electronic records? Can't a backup of electronic data just be labelled as an archive? Aren't we just writing files to some hard-drive or storage media either way?

Why the Confusion?

Let's start with some definitions:
Backups are the regular activity of making copies of data and other components of computerized systems to ensure the timely availability of the data and systems in case of a technical error, malfunction or disaster.
Archiving is the long term storage of official records in a way that preserves their quality, readability and accuracy.
So while both are about 'storing' and 'retrieving' records, their intent is quite different and as such the requirements that go along with them are different, too. The tools (e.g. software, media, etc.) we use to store and retrieve records when preparing for archiving can look like the tools we use to back them up, but that's only part of the story.

Backing up is about Ensuring Current State

When backing up data, you make a complete and accurate copy of the current state of the chosen system or data in order to minimize data loss and provide for business continuity. You repeat this at regular intervals so that at any point in time you can get back to the latest possible 'good' state of the system - i.e. as it was at the time of the last backup.
So let's say IT backs up the documents in the electronic record management system every night. In a disaster where one or more data drives went down, you would lose at most a day's worth of data.

Roll Your Snapshots

Because backups are mainly for ensuring business continuity, there are other considerations that are regularly made when backing up systems: For example, rather than just keeping one snapshot of a recent state of the system, there may be a ‘rolling’ set of backups for the last day, week and month. That way if there were some kind of corruption of the data that went back in time, there is always a previous state that you can go back to.
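To make the rolling idea concrete, here is a minimal Python sketch of one possible retention rule. The function name `backups_to_keep` and the ages of 0, 7 and 30 days are assumptions chosen for illustration, not something taken from a regulation or a particular backup product. For each age, it keeps the newest backup that is at least that old.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, today, ages_in_days=(0, 7, 30)):
    """Return the set of backup dates to retain under a rolling scheme.

    For each age in ages_in_days (hypothetical defaults: now, a week ago,
    a month ago), keep the newest backup that is at least that many days old.
    """
    keep = set()
    for age in ages_in_days:
        cutoff = today - timedelta(days=age)
        candidates = [d for d in backup_dates if d <= cutoff]
        if candidates:
            keep.add(max(candidates))
    return keep


# Example: 40 nightly backups ending on 31 December 2022.
today = date(2022, 12, 31)
nightly = [today - timedelta(days=i) for i in range(40)]
for kept in sorted(backups_to_keep(nightly, today)):
    print(kept.isoformat())
```

Everything not returned by the function would be a candidate for deletion. A real scheme would likely keep more than one backup per tier, but the principle is the same.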

Bring in the Clones

Another consideration for backups is to have some method of cloning the hardware and software configuration of a critical system. This allows for the recreation of a particular computer or server when necessary. While this might involve saving software and configurations so that things can be reinstalled on a new computer when needed, that often takes much more time and effort than you'd want in a disaster recovery scenario.
Instead, special software is often used that can take snapshots of a server’s state. These ‘images’ of the system can be relatively easily and quickly reconstructed on new hardware if necessary. On the downside, they sometimes require the cooperation of the system's software - perhaps pausing it during the snapshot, or using some specific plug-in or setting.

Is the Backup You See the Same Backup I See?

Backups should be tested every once in a while to make sure they are working correctly, and that your system rebuild strategy actually works. There should also be backups in multiple locations to ensure business continuity in the case of a disaster that affects a whole location, such as a fire, flood or other natural disaster.
The key things that you are concerned with when planning backups are: what is to be backed up, the frequency of backups (i.e. how long is an acceptable period of data loss), the time and effort to recreate the data or the system, and the time and space it may take to make multiple backups.

Archiving is about Preserving Past State

Whereas backing up is about ensuring the continuity of your current system, archiving is about preserving the past - for example to be able to reconstruct a study, prove the safety of a batch or lot, or show that you didn't cheat on your taxes.
Think of an archive as akin to a library of official records that can be pulled when necessary for review by auditors and inspectors. You don’t archive the same documents over and over again - instead you archive the official version of a record once, and ensure that the archived version is kept in a state that meets the quality requirements of e.g. good documentation practices and ALCOA+.
When archiving, you are intentionally choosing the records that need to be archived. They must be the originals, or verified exact copies. You need to ensure the whole record is archived, including any metadata and audit trails that define the record. It's also important to be able to index that record so that it's easily and accurately retrievable when necessary.
Compared to Backups: what you archive is different, why you archive is different; and for many types of records, how you archive will be different.
Which records are archived and how long they are kept in the archives will often be mandated by a regulatory requirement. GLP has a particular emphasis on the archives, mandating a number of other things that you need to meet for compliance when maintaining an archive. These may include having trained archivists, archives security, access controls and logs, as well as regular integrity checks and disaster planning.
While not necessarily explicitly mandated by other quality systems and regulatory frameworks, these aspects should also be considered whenever you’re preserving records for a regulatory requirement: I promise that the Tax Office won't accept the excuse that your dog broke into the cardboard box and ate your receipts.

Can a Backup be an Archive?

Because of the nature of electronic data, you might be able to use the same technology to prepare the data for archiving as you do for a backup, for example writing data to a dedicated server or to some media.
But the similarities end there.

Backups are copies, while an archive should be the single, official record.

Ideally, you should be transferring the data to the archives rather than copying it. If you do copy data (for example to burn it to media), you need to verify that the copy is identical and complete, and all other copies should be either deleted or very clearly marked as not the official version. If the copies change for some reason, there should be no confusion whatsoever as to which is to be used in a regulatory filing, inspection or audit.
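One common way to check that a copy is identical and complete is to compare checksums of every file on both sides. The sketch below is only an illustration under that assumption: the helper names (`sha256_of`, `checksums`, `compare_copy`) are made up for this post, and a real archive procedure would also document who ran the check and when.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large records don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_copy(original_dir, copy_dir):
    """Report files that are missing from, unexpected in, or changed in the copy."""
    original = checksums(original_dir)
    copy = checksums(copy_dir)
    return {
        "missing": sorted(set(original) - set(copy)),
        "unexpected": sorted(set(copy) - set(original)),
        "changed": sorted(
            name for name in original.keys() & copy.keys()
            if original[name] != copy[name]
        ),
    }
```

If all three lists come back empty, the copy matches the original byte for byte. This only covers file contents; file-system metadata such as timestamps and permissions would need a separate check.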

Backups can be, and usually are, automated. Archiving requires humans.

You want backups to be frequent, regular, and consistent. Archiving on the other hand requires some human interaction including indexing, verification, approvals and chain of custody. While parts of these might be automated, you need to have some level of responsibility for the authenticity of the archive.

Backups are for the short term. Archives are for the long term.

Often businesses will want to delete backups as soon as possible after a pre-defined period, usually a year or less, for both cost and legal reasons. Since backups are continually generated, this can also be automated. Archives on the other hand will need to be pulled up at a later date, perhaps as far as 25 or 30 years down the road. Deleting, moving or changing the contents of an archive is an intentional activity that requires documentation and approval.
The long-term nature of an archive means that other considerations may be necessary such as:
Indexing and ease of retrieval of specific records;
Longevity of the archival media and the data format;
Periodic evaluation of media and transfer to other formats or remediation if required; and
Additional archiving of related items such as the software and/or hardware necessary to read the data in the future.
And since archives often consist of signed records, you may also need to consider the ability to verify any electronic signatures in the future.
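The periodic evaluation mentioned in the list above can be partly supported by a stored checksum manifest: record a checksum for each file when it enters the archive, then re-check on a schedule. The following is a rough sketch under that assumption. The function names and the JSON manifest format are hypothetical, and the manifest is assumed to be stored outside the archive folder itself.

```python
import hashlib
import json
from pathlib import Path


def digest(path):
    """SHA-256 of a file. Reads the whole file at once, which is fine for a sketch."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def record_manifest(archive_dir, manifest_file):
    """At archiving time, store a checksum for every file in the archive."""
    root = Path(archive_dir)
    manifest = {
        p.relative_to(root).as_posix(): digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
    Path(manifest_file).write_text(json.dumps(manifest, indent=2))
    return manifest


def check_integrity(archive_dir, manifest_file):
    """Later on, re-hash the archive and list records that are missing or altered."""
    root = Path(archive_dir)
    manifest = json.loads(Path(manifest_file).read_text())
    problems = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file():
            problems.append((name, "missing"))
        elif digest(path) != expected:
            problems.append((name, "altered"))
    return problems
```

An empty list from `check_integrity` means every recorded file is still present and unchanged. Note that this sketch does not flag files added after the manifest was written, and it says nothing about whether the file format can still be opened, which is a separate question.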

Backing up and Archiving digital data are both about 'storing' and 'retrieving' records; however, they are activities with different purposes. What, why and how you archive can be quite different from backing up, and archiving has some specific requirements to ensure the quality of the data is preserved for when it's needed. So while you could use the same process and technology to make an archive as you do a backup, in general it's not going to, by itself, meet any of the other requirements of a good quality archive.
If you want some guidance on establishing an archive for electronic records, the HSRAA Guide to Archiving Electronic Records is particularly good for covering various media types and scenarios in a framework-agnostic way.
Happy holidays, everyone, and thanks for reading. Until next time!
– Brendan

© 2022 Brendan Hyland. All rights reserved.