Dec 20, 2016 | Blog
The EU-funded PERICLES project held its final project conference in London from 30 November to 1 December. PERICLES (Promoting and Enhancing Reuse of Information throughout the Content Lifecycle taking account of Evolving Semantics – http://pericles-project.eu/) is a collaborative, interdisciplinary project that aims to address the challenge of ensuring that digital content remains accessible in an environment that is subject to continual change. I have been lucky enough to be tangentially involved in the project, injecting (as you would expect) continuum ideas.
At the conference, I was asked to speak on two panels: one on OAIS (Open Archival Information System), the influential standard defined by the space community and now due for systematic review, and one on Risk Assessment. In the panel on Risk Assessment I was asked to make some comments ‘representing’ the recordkeeping community in thinking about risk assessment for preservation in the active life of complex digital objects. The other panellists were Pip Laurenson and Patricia Falcao (Tate Modern), Tomasz Miksa (senior researcher at SBA Research, working on preservation of open source systems and workflows) and Simon Waddington (from King’s College London, working on e-infrastructure and repository technologies).

My contribution (in summary form) was:
The issue of preserving complex digital objects is a recursive one, and it is necessary to define the environment or available locus of attention/action first. My comments here are therefore directed at the risks and problems in the current workplace environment.
The current workplace has these problems with complex digital objects, too. It is a design issue. As a community we have known about these issues for over 20 years – perhaps most accessibly expressed by Terry Cook in ‘It’s 10 o’clock: do you know where your data are?’ in 1995 (available but irritatingly behind a paywall!). Examples from the workplace include Microsoft Excel spreadsheets with embedded formulae, linked spreadsheets, and the encouraged practice of emailing links rather than documents. The issue here is at least in part the lack of stable document identifiers: links rely on URLs, which are at high risk of damage or loss with system upgrades or reconfiguration. The lack of persistent relationships between individual ‘document like objects’ is also a problem.
This is a systems design issue in the workplace. When should records be left in their self-referential environment? This is the current strategy of ‘in-place’ records, or retaining records in the business system. When should they be moved out of these creating environments? There are problems with how complex objects survive the transition between these environments in everyday workplaces – not just in digital preservation environments. If the records don’t survive intact in the workplace, digital preservation after the event comes far too late.
Migration is a specific point of risk. Again, this observation is not new. It was made most coherently by David Bearman in 2006 (‘Moments of Risk: Identifying Threats to Electronic Records’ Archivaria, 62, 2006 http://archivaria.ca/index.php/archivaria/article/view/12912/14148). The two most prominent preservation techniques are migration and emulation. Emulation is in common use in the workplace – the use of ‘readers’ to enable readable versions is often built into other software. Migration in the workplace is far more common than appreciated. There are different types of migration – software migration between versions of a product, software migration to new products, migration of systems or records between organisations. The migration rule of thumb is that there will be a five-year interval before some type of migration is needed. If we think about how long some records need to be kept – the life of a person, perhaps – then the migration risks are huge. And not only that: the common techniques for migration move the objects, but pay significantly less attention to ensuring that the metadata, relationships and links are maintained.
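To make the scale of that risk concrete, here is a rough back-of-the-envelope sketch in Python. The five-year interval is the rule of thumb mentioned above; the 80-year retention period and the 2% chance of a given link or metadata relationship being lost at each migration are purely illustrative assumptions, not measured figures, and the model assumes that each migration is an independent event.

```python
import math


def migrations_needed(retention_years: float, interval_years: float = 5.0) -> int:
    """Number of migrations a record faces over its retention period."""
    return math.floor(retention_years / interval_years)


def survival_probability(migrations: int, loss_per_migration: float) -> float:
    """Chance that one relationship survives every migration intact.

    Assumes each migration is an independent event with the same loss rate.
    """
    return (1.0 - loss_per_migration) ** migrations


if __name__ == "__main__":
    # Illustrative assumptions: 80-year retention, 2% loss per migration.
    n = migrations_needed(80)
    p = survival_probability(n, 0.02)
    print(f"{n} migrations; {p:.0%} chance a given link survives them all")
```

Even with a small per-migration loss rate, the risk compounds over a long retention period, which is why the points of migration are worth singling out for attention.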
The recordkeeping community has recognised this set of problems for a long time. Various initiatives are in place to address it. One of the most recent is to re-orient the old notion of appraisal, so that the same disciplinary expertise is directed to determining what records need to be created and kept, and the analysis framework is applied multiple times at specific points of risk. This is the thinking embedded in the revised ISO 15489-1:2016. The community has also done significant work, like many other communities, in identifying risk and defining approaches to risk assessment. In particular, ISO/TR 18128, Risk Assessment for Records, provides disciplinary guidance.
The problems aren’t so different for the current workplace. If we think about risk and preservation as recursive, occurring in different environments on an almost continuous basis, we can identify intervention points which have maximum impact.
Dec 20, 2016 | Blog
At the conference, I was asked to speak on two panels: one on OAIS (Open Archival Information System), the influential standard defined by the space community and now due for systematic review, and one on Risk Assessment. Inadvertently I was part of an all-women expert panel – how good is that! – along with Kara van Malssen (AV Preserve), Angela Dappert (British Library), Pip Laurenson (Tate) and Barbara Sierman (National Library of the Netherlands), all moderated by William Kilbride of the DPC. My comments (in abbreviated form) on OAIS are below:
OAIS

I present a view from the archives/records community. This isn’t my specific area of expertise, so I ventured forth to check the validity or otherwise of my views with better-informed digital archives colleagues, including Adrian Cunningham from QSA, Ross Spencer from Archives NZ, Richard Lehane from State Archives and Records NSW and Andrew Waugh from Public Record Office of Victoria. I don’t speak for them, but I did try to benchmark my remarks against their experience and opinions.
OAIS operates effectively as a ‘lingua franca’. The OAIS reference model is designed as a conceptual framework in which to discuss and compare archives. The high level reference model approach and terminology are (kind of) foreign to all disciplines, so we all have to translate our practice into that expression.
- This has been particularly useful when dealing with the genuinely different approaches to digital preservation across disciplinary boundaries, encouraging a common language for communication between cognate disciplines and dealing with the practical implications of institutional mergers etc. The model as lingua franca is useful because digital preservation MUST be inter-disciplinary.
- Beyond that, at the level of detail, there is mixed use and mixed response from the recordkeeping community.
- There is some wariness that the language is also something of a barrier to communication. Used wrongly, it can be a bit snake-oilish – our technology creates SIPs, AIPs and DIPs – digital preservation problem solved! Yes, well not quite. And clearly there are specific disciplinary issues in how each discipline defines the detail of the SIP, AIP and DIP.
OAIS has an inherently custodial world view (perhaps a bit like the term data curation which generally speaking doesn’t resonate well in the recordkeeping space).
- It presumes that the object of digital preservation is stored in a custodial repository – for recordkeeping, we are increasingly aware that we need to be non-custodial in a distributed environment.
- It has some preconceptions about what is being preserved – a fixed, static (end product) object view. This is problematic for managing digital objects. Records are objects plus their metadata. And what is a record is often a packaging of ‘cascading inscriptions’ (Latour) linked to their ever evolving contexts, at multiple and different levels of aggregation, relationship and complexity.
There are some issues, and different views, on what the boundaries of OAIS and digital preservation are or should be:
- OAIS assumes that things are already there and ready to go as SIPs. This is not usually the case for archives/records. Lots of intensive work is commonly required before a SIP-ready thing can be assumed. Is the answer to create a work-around, or to introduce a pre-ingest phase, as some have suggested? I think that the necessary inter-disciplinary nature of practice means that this will be problematic. Lowest common denominator, or approaches driven by one loud disciplinary voice, will not suit all. This might be a particular matter for individual disciplines to address and then seek to harmonise?
- Superficially the OAIS model applies to a single system – of course, it doesn’t have to be read that way, but it commonly is. Different functions can be (and in reality, are) handled by different systems – the business system of an archives supports much more than creating and managing AIPs or DIPs, but these can certainly be part of the larger system. The constructs of SIPs, AIPs and DIPs are useful for thinking about interfaces to other systems.
- Archives (generally) don’t perceive everything within the frame of a single OAIS but rather as modular components – the archival descriptive system is and will continue to be separate from the digital preservation component. The many ways in which existing systems need to be interfaced, and the way those interfaces change over time, need to be respected. Metadata needs to be moved into different systems, to serve different purposes.
- Not all metadata are simply preservation metadata – some are part of the creating environment, some are part of the archival descriptive system, and some are specific preservation metadata. There is endless discipline-specific detail on metadata requirements. The metadata for digital preservation are part of that discussion. In recordkeeping environments, we are managing metadata in independent systems. The OAIS metadata model is not prescriptive, but implicit. Even if we expand it to include PREMIS metadata, it is only part of the requirements for recordkeeping. We have complex requirements and ongoing issues about representations of relationships in recordkeeping that we have yet to find a definitive answer for (if one exists!).
- The function of access is common ground to all OAIS disciplines – but is this digital preservation? This is a well worn argument. In the archives/records discipline I would assert that they are intimately related, but not the same thing. It is a specialised component, and again, if the thinking is modular rather than monolithic, then the OAIS model requirements are valid.
- Some of the conceptualisations are irritating and wrong from a recordkeeping perspective. Of particular irritation value is the documentation of preservation actions. These are records – they are critical records for preservation, allowing us to assert the authenticity and integrity of the objects being managed. But in OAIS, these records are squished into ‘administrative functions’, which is really a substantial misconception. Such records are part of the systematic recording of the ‘recordness’ of an object. What of the functionality buried in the administrative or event components of OAIS needs to be managed as records? Appraisal thinking is conspicuously absent in the model as it currently stands.
Dec 20, 2016 | Blog

The IAPPANZ (International Association of Privacy Professionals, Australia New Zealand Chapter) held its annual summit in November, titled ‘Trust in Privacy’. I really like this conference, which gets the top privacy brains talking to the community. Amongst other things there were updates from Privacy Commissioners on their jurisdictions, which inter alia (have I been at a legally focussed conference?) allowed me to glean the following interesting issues:
- 83% of FOI requests at Commonwealth level are for personal information
- The Privacy Amendment (Re-identification Offence) Bill 2016 has been referred to committee which is anticipated to report in February 2017.
From NZ, some of the top issues reported by John Edwards, Privacy Commissioner, were:
- Discussion of mandatory data breach reporting, which surely is coming to Australia in one form or another pretty soon
- The introduction in Latin America of the concept of ‘Habeas data’ – a right to seek to know what information is held about a person in a data source (manual or automated), with remedies which vary between jurisdictions.
Beyond these very useful and interesting updates on specific jurisdictional issues I was really engaged by the presentation by Malcolm Crompton, himself a previous Privacy Commissioner. My musings are only part of his presentation, available here: http://iispartners.com/Publications/index.html (under Privacy regulation and reform). In particular I was struck by:
- A very interesting diagram of data types and individual awareness, categorising data as provided, observed, derived and inferred, from Abrams, ‘The Origins of Personal Data and its Implications for Governance’ (2014) http://informationaccountability.org/wp-content/uploads/Data-Origins-Abrams.pdf
- The observation that privacy (particularly managing personal information) is really a question of implementing an implicit social licence. If that social licence is not observed or deliberately broken, then there will be backlash. Observing a social licence isn’t the same as compliance with the law.
- A questioning of whether the privacy framework as currently conceived is really working all that well. It is operating at an incredibly granular level – individual consent to specific information; so much information; managing inferred data rather than data supplied directly by an individual. Is the current framework really feasible as a way forward, and if not, what alternatives might exist?
The discussion during the day reinforced the absolute synergy with recordkeeping issues, as you would expect. But interestingly there was NOT ONE reference to recordkeeping made!!! Nonetheless, information governance is certainly on privacy professionals’ agendas. And so, too, were the problems of getting appropriate attention from senior decision making levels of organisations. Much synergy but not much appreciation of the need to think outside disciplinary silos – always a work in progress.