The Records in Contexts conceptual model, developed by the ICA’s Expert Group on Archival Description, was issued for comment in September 2016. It unifies the previous archival descriptive standards issued by the International Council on Archives (ISAD-G, ISAAR-CPF, ISDF and one I think less than useful, ISDIAH) in a single conceptual model.
This is a really interesting development, much to be encouraged. But getting the conceptual model clear is important. I’ve published my comments here.
This might be a bit esoteric, but hopefully interesting enough for those deep down in the fascinating world of recordkeeping metadata.
Comments on EGAD – Expert Group on Archival Description, Records in Contexts – Conceptual Model http://www.ica.org/en/call-comments-release-records-contexts-egad
Thank you for the opportunity to comment on the evolving RIC (Records in Contexts). I commend the work of the EGAD group in this complex and demanding work. Exposure to the archival community for comment is appreciated. The introduction of a multi-entity relational model which enables recursive relationships within entities, and extensive relationships between entities is totally supported.
However, as indicated in the more detailed comments below, the definition of the entities is problematic, as is the management of relationships. Relationships and their management become critical in this type of model. Modelling relationships has always been difficult. Other disciplines do not seem to have the same requirements for persistence and management of relationships over time. Indeed, managing relationships over time can be said to be one of the key features of the Australian Series System, the intellectual basis from which my practice evolved. As a community, I would assert, we haven’t cracked the expression of relationships yet, and neither has the RIC: there are problems in the models proposed here.
It is with great pleasure that I read this document, and applaud its aspirational stance. The networked, flexible model for archival description at the basis of RiC-CM will serve the archives profession well into the digital future. The alignment with recordkeeping metadata approaches, which can be seen in the multi-entity and relationship definition, will serve the broader recordkeeping community well. Compatible models for records regardless of the domain they are managed in (current workplaces, archives or in the ‘wild’) will enable much greater interoperability, inheritance opportunities and enhancement rather than replacement of metadata sourced from various processes over time.
Qualification for comments
I have had some involvement in the definition of metadata for archives/records through work on the original SPIRT metadata project at Monash University, the development of the ISO Metadata for Records standard suite (ISO 23081) and the AS 5478 Australian Metadata Reference Set. I chaired the Australian Society of Archivists Archival Descriptive Committee for a time, and contributed to the codification of the Australian Series System. I have also had development and implementation experience with jurisdictional metadata standards such as those published by the National Archives of Australia, Archives NZ and the HK SAR’s Government Records Service.
Comments on Entities
Archives and records work deals fundamentally with three types of entities in relationship. This has been the bedrock of Australian archival practice for over 50 years. These entities are Records, Agents and Functions. These are the entities that we are professionally responsible for. Other entities introduced into the RIC may well be useful for description (eg Concept/Thing) in the world of semantic web construction. But they are generic and not our core business. While I am supportive of their inclusion and their potential to link to other professional and cultural domains, the RiC-CM should prioritise those entities which are our core business.
I commend the introduction of limited set aggregations. The inclusion of multiple recursive relationships for Records Set/Agents in particular opens up the data model to be responsive to multiple archival descriptive traditions (although the examples used in the property descriptions could more actively embrace this by using not only ISAD G representations, but also explicitly acknowledging that these can be inherited/used to support other descriptive traditions). The recursive nature also increases the relational power of the model, something endorsed totally.
In the articulation of the entities provided, there seems to be a confusion between conceptually defining our core professional entities (which doesn’t exclude inheriting the expression from others), and the practical demands of constructing system entities or more general descriptive entities. I cannot find the conceptual rationale or logic of separating out some of the entities, rather than making them properties. At minimum an articulation of the logic is required. But I would argue:
- Occupation and Position are attributes of Agents.
- Documentary Form is an attribute of Record Set/Record.
- Date and its expression is an attribute of every entity, at every layer of aggregation, and also an attribute of relationships which need to be timebound. An alternative construct is to move further towards making relationships central. If every action is expressed as a dated relationship (for example existence, extent, actions on a record) then all critical statements are made as relationship statements which are dated. This might mean that date is no longer needed as an entity, but becomes critical in expression of relationship – some of this can actually be seen in the graph diagram included in Appendix 1.
- Relationship must be an entity if we are to coherently express and manage relationships over time (see below).
- Place and Concept/Thing are subject-based attributes that can be added to anything. While they link us to the Linked Open Data community and are desirable, they are a set of add-ons: nice to have, not critical to our archival practice. In that spirit of great, but not exclusively recordkeeping, relationships, I would think that Events could also be added, while recognising that there is a potential overlap with what is currently inadequately expressed as ‘history’, and that such events need distinguishing from recordkeeping events (see further below).
- Place is confusing: it seems to be both location (has holding location etc) and physical positioning, such as geographical coordinates. I suspect these two quite different notions of Place have been combined to enable inclusion of ISDIAH, which itself was always out of step with the ICA descriptive entities. Archival repositories are a type of agent, as has been argued before.
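The alternative argued for in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every class and field name here (DateRange, DatedValue, positions_on and so on) is hypothetical and is not taken from RiC-CM or any other standard. It shows Occupation, Position and Documentary Form carried as attributes, with dates qualifying both entities and individual attribute values.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DateRange:
    """A time span that can qualify an entity, an attribute or a relationship."""
    start: Optional[date] = None  # None means unknown or open-ended
    end: Optional[date] = None

    def covers(self, day: date) -> bool:
        after_start = self.start is None or self.start <= day
        before_end = self.end is None or day <= self.end
        return after_start and before_end


@dataclass
class DatedValue:
    """An attribute value together with the period for which it held."""
    value: str
    dates: DateRange = field(default_factory=DateRange)


@dataclass
class Agent:
    name: str
    existence: DateRange = field(default_factory=DateRange)
    occupations: List[DatedValue] = field(default_factory=list)  # attribute, not entity
    positions: List[DatedValue] = field(default_factory=list)    # attribute, not entity

    def positions_on(self, day: date) -> List[str]:
        """Return the positions held on a given day."""
        return [p.value for p in self.positions if p.dates.covers(day)]


@dataclass
class Record:
    title: str
    documentary_form: str  # attribute of the record, not a separate entity
    dates: DateRange = field(default_factory=DateRange)


# Hypothetical example data.
clerk = Agent(
    name="Example Person",
    positions=[
        DatedValue("Registry Clerk", DateRange(date(1990, 1, 1), date(1994, 12, 31))),
        DatedValue("Records Manager", DateRange(date(1995, 1, 1), None)),
    ],
)
```

Asking `clerk.positions_on(date(1992, 6, 1))` returns only the position that held on that day, which is the kind of time-bound statement the list above argues every layer of description needs.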
The relationship based approach is central to our Australian descriptive practice and I support it wholeheartedly. However, it has also been notoriously difficult to achieve an adequate expression of relationships. In traditional Australian practice relationships have been a component of each entity’s description – so a record description would include its relational link to its creating agency, controlling agency, related records etc. Using these relationship traces makes the tracing of relationship networks flexible, time bound and complex.
However we recognise that this de-emphasises the nature of relationship. In other work, we have expressed relationship as an entity itself, allowing relationships to have identifiers, dates and to contain persistent links to the related entities. This has been OK but subject to problems even at a conceptual level in expressing reciprocal relationships and ensuring persistence of relationships. It is the best we have done to date.
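A relationship treated as an entity in its own right, with an identifier, dates and persistent links to the things it relates, might be sketched as follows. The field names and identifiers are hypothetical, and holding the forward and reciprocal readings in one object is only one possible way of handling reciprocity. It does not resolve the conceptual problems noted above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Relationship:
    rel_id: str        # the relationship has its own persistent identifier
    source_id: str     # persistent link to one related entity
    target_id: str     # persistent link to the other related entity
    name: str          # forward reading, e.g. "created"
    inverse_name: str  # reciprocal reading, e.g. "was created by"
    start: Optional[date] = None
    end: Optional[date] = None

    def read_from(self, entity_id: str) -> str:
        """Express the relationship from the point of view of either end."""
        if entity_id == self.source_id:
            return f"{self.source_id} {self.name} {self.target_id}"
        if entity_id == self.target_id:
            return f"{self.target_id} {self.inverse_name} {self.source_id}"
        raise ValueError(f"{entity_id} is not part of relationship {self.rel_id}")


# Hypothetical example data.
rel = Relationship(
    rel_id="rel-0001",
    source_id="agency-12",
    target_id="series-345",
    name="created",
    inverse_name="was created by",
    start=date(1950, 1, 1),
    end=date(1975, 12, 31),
)
```

Because both readings come from the same dated object, the description seen from the agency and the description seen from the series cannot drift apart.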
In RIC, the adoption of semantic data models, and graph technology construction is highly commendable and I endorse the notion that these are likely to be the technologies of the future. Relationships are central to these technologies and the approach is completely consistent with our Australian archival practice. However it is unclear to me how the technology or the data models document these relationships.
While I do not pretend to understand graph databases, the graph models such as the one attached here indicate that relationships can be managed as entities effectively in graph databases. Surely modelling relationships in this way is more sustainable for archival description. The relationship notion is central, not peripheral.
[diagram from slide 10 of Max de Marzi Graph Databases Use Cases http://www.slideshare.net/maxdemarzi/graph-database-use-cases]
Managing these critical relationships only as statements in RDF triples provides persistence to the nodes that are linked, rather than to the relationships themselves. While the EGAD committee acknowledges that the expression of relationships is still a work in progress, I would suggest that confirming the data model around relationships is essential.
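The difference can be illustrated with plain Python tuples standing in for triples. This is not a real RDF library or serialisation, and all identifiers and predicate names are hypothetical. The first form states the relationship as a bare triple. The second promotes the relationship to a node of its own, one common pattern, so that dates and a type can be attached to it.

```python
# A bare triple: (subject, predicate, object).
# Only the two linked nodes have identifiers; the link itself has none.
plain = [
    ("agency-12", "created", "series-345"),
]

# The same relationship promoted to a node with its own identifier,
# so that statements can be made about the relationship itself.
reified = [
    ("rel-0001", "type", "Provenance"),
    ("rel-0001", "source", "agency-12"),
    ("rel-0001", "target", "series-345"),
    ("rel-0001", "startDate", "1950-01-01"),
    ("rel-0001", "endDate", "1975-12-31"),
]


def describe(node_id, triples):
    """Collect every statement whose subject is node_id."""
    return {p: o for s, p, o in triples if s == node_id}


relationship_description = describe("rel-0001", reified)
```

In the bare form there is no subject to which the dates of this particular act of creation could be attached. In the second form, `relationship_description` carries them alongside the links to both ends.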
The extensive list of relationships provided is acknowledged as not comprehensive and in need of further work. Given that it is a central component of the RIC model, further development must take place before the model can be endorsed.
In Australian practice we have traditionally identified types of relationship. These are Provenance, Succession, Containment and Associative relationships. The recordkeeping metadata standard introduced a further relationship type, Events or Actions, which allows description of things done to or on records (see below). Using this type of characterisation of relationships might assist in creating more clarity about what types of relationships are appropriate to each entity. Chris Hurley has proposed a further categorisation of relationships. Developing this thinking would be beneficial in RiC-CM.
The introduction to the draft RIC states that ‘RiC-CM also does not yet offer a model of the role of the archivist and the activities he or she performs in the formulation and ongoing maintenance of description’. This points to a conceptual gap in the articulation at present. The property ‘Authenticity and Integrity’ in Records Set/Record (RIC P5 and P22) inadequately encompasses the requirements to enable assertions of authenticity and integrity. I would argue that the material not well encompassed in ‘History’ documents the events and actions that are taken on a record, and that this is the information needed to assert authenticity and integrity over time. For digital records, integrity also involves the digital checksum or hash of the specific record element. I find the conceptual thinking to be unclear about this.
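For the digital checksum mentioned above, a minimal sketch using Python's standard hashlib module is given here. The choice of SHA-256 and the function names are assumptions made for illustration. RiC-CM does not prescribe an algorithm, and the sample content is invented.

```python
import hashlib


def sha256_hex(content: bytes) -> str:
    """Compute the SHA-256 digest of a record element as a hex string."""
    return hashlib.sha256(content).hexdigest()


def fixity_ok(content: bytes, recorded_digest: str) -> bool:
    """Check the content against a digest recorded at an earlier event."""
    return sha256_hex(content) == recorded_digest.lower()


# Hypothetical example: the digest is recorded when the element is captured.
original = b"Minutes of meeting, 12 March"
recorded = sha256_hex(original)

unchanged = fixity_ok(original, recorded)       # content as captured
altered = fixity_ok(original + b" ", recorded)  # a single added byte
```

A matching digest only shows that the bits have not changed since the digest was recorded. Who recorded it, when, and what was done to the record in between still has to be documented as events.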
Within our Australian practice we have identified ‘recordkeeping events’ to document these actions. This may, or may not, be the answer, but it does propose a different way of thinking about these actions. Further, such recordkeeping events are expressed as relationships, themselves an expression of something done, by someone, on something, at a particular date. Some of the relationships identified in the listing are of this nature (eg was written by, was collected by). This could be extended and, depending on the nature of relationships as they evolve, might prove a mechanism to clarify notions of authenticity and actions.
When encouraging inheritance of metadata from current recordkeeping systems, attention to recordkeeping events, for digital records particularly, is essential.
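The shape of a recordkeeping event described above (something done, by someone, on something, at a particular date) can be sketched as follows. The class, the action labels and the identifiers are hypothetical and are not drawn from AS 5478 or RiC-CM.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class RecordkeepingEvent:
    action: str         # something done
    agent_id: str       # by someone
    record_id: str      # on something
    occurred: datetime  # at a particular date


def history_of(record_id: str, events: List[RecordkeepingEvent]) -> List[RecordkeepingEvent]:
    """Return the events affecting one record, in chronological order."""
    relevant = [e for e in events if e.record_id == record_id]
    return sorted(relevant, key=lambda e: e.occurred)


# Hypothetical example data, deliberately out of order.
events = [
    RecordkeepingEvent("migrated", "archives-1", "item-1", datetime(2015, 6, 1, 9, 0)),
    RecordkeepingEvent("written", "person-7", "item-1", datetime(1962, 3, 12, 0, 0)),
    RecordkeepingEvent("collected", "archives-1", "item-1", datetime(1988, 11, 2, 0, 0)),
    RecordkeepingEvent("written", "person-9", "item-2", datetime(1970, 1, 5, 0, 0)),
]

actions_on_item_1 = [e.action for e in history_of("item-1", events)]
```

Read in sequence, such events provide the account of what has happened to a record, which is the evidence on which an assertion of authenticity would rest.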
While not proposed as necessarily authoritative, the work done in AS 5478 Australian Metadata Reference Set for event relationships may be useful. This is attached to this comment for reference.
Granularity: One of the things we know about translating archival descriptive systems to digital records is that increased layers of granularity of description are required. Thus what may be expressed as a ‘Record’ in the paper world may in fact be composed of many digital components. A file, traditionally considered an item in archival descriptive systems (a ‘Record’ in RiC-CM) because it is a complete and ‘issuable’ thing, can disaggregate into many specific images of individual pages, each one of which can be considered an ‘issuable’ thing. Is it the intention to manage images of an individual page as a ‘Records component’? I suspect that this will not work particularly well, and certainly considerations of sequence must be addressed (noting that this too is explicitly noted as requiring further development, p39). Alternative renditions or formats (microfilm, pdf, jpeg etc) may also exist for each page. Perhaps a better expression is to allow ‘Record’ too to become recursive.
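A recursive ‘Record’ of the kind suggested above might look like the following sketch. The names are hypothetical. It assumes that sequence is carried by the order of the parts list and that renditions are held as simple format labels, and RiC-CM specifies neither of these.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Record:
    identifier: str
    label: str
    renditions: List[str] = field(default_factory=list)  # e.g. "jpeg", "pdf"
    parts: List["Record"] = field(default_factory=list)  # list order carries sequence

    def walk(self) -> Iterator["Record"]:
        """Yield this record, then every contained record, in sequence."""
        yield self
        for part in self.parts:
            yield from part.walk()


# Hypothetical example: a file whose pages are themselves issuable records.
file_item = Record(
    "item-1",
    "Correspondence file",
    parts=[
        Record("item-1/p1", "Page 1", renditions=["jpeg", "pdf"]),
        Record("item-1/p2", "Page 2", renditions=["jpeg"]),
    ],
)

sequence = [r.identifier for r in file_item.walk()]
```

The same structure serves the file and the page image, so no separate component entity is needed however deep the disaggregation goes.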
Parallel provenance or multiple simultaneous provenance: Increasingly archival practice is allowing alternative models of description to co-exist with ‘official’ interpretations. This allows alternative versions of context to be constructed, and to have equal validity with ‘official’ expressions. How will RiC-CM enable these alternative expressions?
User contributions: Linked to, but not the same as, the need to enable multiple provenance expressions is the increasing prevalence of user contribution to archival descriptive systems. This might be through alternative expressions in the metadata of an archival descriptive system, or alternatively, the contribution of other items to an archival system. These need to be managed and attributed appropriately to the contributor. At present such notions do not appear to be enabled in RiC-CM and they are already a requirement of practice.
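One way of letting alternative expressions of provenance co-exist, each attributed to whoever made it, is sketched below. All names and identifiers are hypothetical, and this is an illustration of the requirement rather than a proposal for how RiC-CM should meet it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ProvenanceAssertion:
    record_id: str
    creator_id: str   # the agent asserted to be the creator
    asserted_by: str  # who made this assertion


def provenance_views(record_id: str,
                     assertions: List[ProvenanceAssertion]) -> Dict[str, List[str]]:
    """Group the asserted creators of a record by who asserted them."""
    views: Dict[str, List[str]] = {}
    for a in assertions:
        if a.record_id == record_id:
            views.setdefault(a.asserted_by, []).append(a.creator_id)
    return views


# Hypothetical example data: two views of the same series.
assertions = [
    ProvenanceAssertion("series-345", "agency-12", "archives-staff"),
    ProvenanceAssertion("series-345", "community-3", "community-user-7"),
]

views = provenance_views("series-345", assertions)
```

Neither assertion overwrites the other, and each remains attributable to its contributor.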
Digital records: Presumably because of the requirement to encompass the existing ICA descriptive standards, the examples and property fields seem to reflect a paper-based paradigm. Better examples and further thinking on the characteristics of digital records would enhance the RiC-CM model.
I welcome the advent of RiC-CM and encourage the EGAD Group to continue development of this exciting initiative.
Director, Recordkeeping Innovation Pty Ltd