The EU-funded PERICLES project held its final conference in London from 30 November to 1 December. PERICLES (Promoting and Enhancing Reuse of Information throughout the Content Lifecycle taking account of Evolving Semantics, http://pericles-project.eu/) is a collaborative, interdisciplinary project that aims to address the challenge of ensuring that digital content remains accessible in an environment that is subject to continual change. I have been lucky enough to be tangentially involved in the project, injecting (as you would expect) continuum ideas.
At the conference, I was asked to speak on two panels: one on OAIS (Open Archival Information System), the influential standard defined by the space community and now due for systematic review, and one on Risk Assessment. In the panel on Risk Assessment I was asked to make some comments ‘representing’ the recordkeeping community in thinking about risk assessment for preservation in the active life of complex digital objects. The panel also included Pip Laurenson and Patricia Falcao (Tate Modern), Tomasz Miksa (senior researcher at SBA Research, working on preservation of open source systems and workflows) and Simon Waddington (from King’s College London, working on e-infrastructure and repository technologies).
My contribution (in summary form) was:
The issue of preserving complex digital objects is a recursive one, and it is necessary to define the environment or available locus of attention/action first. So these comments are directed at the risks and problems in the current workplace environment.
The current workplace has these problems with complex digital objects, too. It is a design issue. As a community we have known about these issues for over 20 years; they were perhaps most accessibly expressed by Terry Cook in ‘It’s 10 o’clock: do you know where your data are?’ in 1995 (available, but irritatingly behind a paywall!). Examples from the workplace include Microsoft Excel spreadsheets with embedded formulae, linked spreadsheets, and the encouraged practice of emailing links rather than documents. The issue here is, at least in part, the lack of stable document identifiers: the URLs used in their place are at high risk of damage or loss with system upgrades or reconfiguration. The lack of persistent relationships between individual ‘document like objects’ is also a problem.
This is a systems design issue in the workplace. When should records be left in their self-referential environment? This is the current strategy of ‘in-place’ records, or retaining records in the business system. When should they be moved out of these creating environments? There are problems with how complex objects survive the transition between these environments in everyday workplaces, not just in digital preservation environments. If the records don’t survive intact in the workplace, digital preservation after the event comes far too late.
Migration is a specific point of risk. Again, this observation is not new. It was made most coherently by David Bearman in 2006 (‘Moments of Risk: Identifying Threats to Electronic Records’ Archivaria, 62, 2006 http://archivaria.ca/index.php/archivaria/article/view/12912/14148). The two most prominent preservation techniques are migration and emulation. Emulation is in common use in the workplace: ‘readers’ that render older formats in readable form are often built into other software. Migration in the workplace is far more common than appreciated. There are different types of migration: software migration between versions of a product, software migration to new products, and migration of systems or records between organisations. The migration rule of thumb is that there will be a five-year interval before some type of migration is needed. If we think about how long some records need to be kept (the life of a person, perhaps), then the migration risks are huge. Moreover, the common migration techniques move the objects themselves, but pay significantly less attention to ensuring that the metadata, relationships and links are maintained.
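To make the scale of that risk concrete, here is a minimal sketch in Python. It is an illustration only: the `Record` structure, the function names, the sample records and the 80-year retention period (standing in for ‘the life of a person’) are all assumptions of mine, not part of any standard or of the PERICLES tooling. It counts how many migrations the five-year rule of thumb implies, and compares a set of records before and after a migration to report metadata and links that did not survive.

```python
from dataclasses import dataclass, field

MIGRATION_INTERVAL_YEARS = 5  # the rule of thumb quoted above


def expected_migrations(retention_years: int,
                        interval_years: int = MIGRATION_INTERVAL_YEARS) -> int:
    """Rough count of migrations a record faces during its retention period."""
    return retention_years // interval_years


@dataclass
class Record:
    """Hypothetical record: an identifier, its metadata, and links to other records."""
    identifier: str
    metadata: dict = field(default_factory=dict)
    links: set = field(default_factory=set)  # identifiers of related records


def migration_losses(before: dict, after: dict) -> dict:
    """Compare records keyed by identifier and list what the migration lost."""
    losses = {"missing_records": [], "lost_metadata": [], "lost_links": []}
    for identifier, old in sorted(before.items()):
        new = after.get(identifier)
        if new is None:
            losses["missing_records"].append(identifier)
            continue
        # Metadata elements present before the migration but absent afterwards.
        for key in sorted(old.metadata.keys() - new.metadata.keys()):
            losses["lost_metadata"].append(f"{identifier}:{key}")
        # Relationships present before the migration but absent afterwards.
        for target in sorted(old.links - new.links):
            losses["lost_links"].append(f"{identifier}->{target}")
    return losses


# Assumed example: the objects survive, but one date and one link do not.
before = {
    "budget": Record("budget", {"creator": "finance", "created": "2016-11-30"}, {"forecast"}),
    "forecast": Record("forecast", {"creator": "finance"}),
}
after = {
    "budget": Record("budget", {"creator": "finance"}),
    "forecast": Record("forecast", {"creator": "finance"}),
}

print(expected_migrations(80))          # 80 years is an assumed retention period
print(migration_losses(before, after))
```

In this made-up example both objects come through the migration, so a check that only counts objects would report success, while the comparison shows that a creation date and the link between the two records have gone. Under the assumed 80-year retention period, that check would need to pass 16 times in a row.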
The recordkeeping community has recognised this set of problems for a long time. Various initiatives are in place to address it. One of the most recent is to re-orient the old notion of appraisal: the same disciplinary expertise is now directed to determining what records need to be created and kept, and the analysis framework is applied multiple times, at specific points of risk. This is the thinking embedded in the revised ISO 15489-1:2016. The community has also done significant work, like many other communities, in identifying risk and defining approaches to risk assessment. In particular, ISO/TR 18128, Risk Assessment for Records, provides disciplinary guidance.
The problems aren’t so different for the current workplace. If we think about risk and preservation as recursive, occurring in different environments on an almost continuous basis, we can identify intervention points which have maximum impact.