Open data. Data sharing. Research data. Big data. As a colleague said recently at a conference I attended in New Zealand, ‘data is the new black’. It’s everywhere! We seem to be in the midst of a data boom, a new gold (data) rush for the digital age, with citizens, business and government all hungry for data that will help them innovate, reduce costs, or simply improve their day-to-day lives.
But what exactly do we mean by open data? Is it one thing or many? And who’s doing it and why?
Open data is data that can be freely used, reused and redistributed by anyone – subject only, at most, to the requirement to attribute and share-alike. Good, usable open data is available not only for those who want to access and read it, but to software developers who want to build ways of using or displaying it, or put it together with other open datasets to make new products.
Some data may be shared but is not open under this definition. Examples include data that identifies individuals by name and must be managed in accordance with privacy laws, or data on which the responsible organisation has placed a monetary value. Such data might be placed in environments where it can be shared, but only among a limited number of authorised entities. The NSW Government and others are making such sharing easier by establishing secure data publishing and sharing platforms.
Open data could be available as published datasets in Excel or CSV formats, or directly from systems in real time via an API (Application Programming Interface) or other method. Open data can relate to anything: health information, spatial data, crime statistics, train timetables, whatever!
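To make the point about published CSV datasets concrete, here is a minimal sketch in Python of how a developer might read one and summarise it. The column names and the sample rows are invented purely for illustration and do not come from any real dataset; a real project would load the file from wherever the publishing agency makes it available.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows standing in for a published CSV dataset.
# The columns and values are hypothetical, for illustration only.
SAMPLE_CSV = """station,line,departures_per_day
Central,T1,120
Central,T2,95
Parramatta,T1,80
"""


def departures_by_station(csv_text):
    """Sum the departures_per_day column for each station."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["station"]] += int(row["departures_per_day"])
    return dict(totals)


print(departures_by_station(SAMPLE_CSV))
```

Because the data is in a plain, documented format, a few lines of code are enough to turn it into something new, which is the kind of reuse that open data is meant to enable.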
There are so many applications for open data:
>> Transport agencies share data enabling the development of timetable and trip time apps keeping commuters up to date
>> State Records NSW offers an API enabling developer access to its full archival catalogue, documenting over 120 years of history
>> The NSW Bureau of Health provides RSS news feeds reporting on a variety of topics including the quarterly performance of NSW public hospitals
>> Not for profits like Open Australia use public data to help citizens engage directly with government
The policy context
Government policy across Australia and internationally establishes a range of goals in data openness and sharing.
In NSW, the Open Data Policy and Open Data Implementation Plan set out a range of goals for more NSW Government open data to be available and used. The Government has established both data.nsw and the Information Asset Register as platforms to facilitate the publishing and sharing of government data. Case studies are published on data.nsw and an Open Government Community of Practice has been established. Other States and Territories all have open data policies and plans at differing stages of implementation.
At the Commonwealth level, the Department of Finance is leading the data.gov website and creating a culture of data sharing. It is also responsible for GovHack and offers a range of supporting advice and guidance to agencies on opening and sharing datasets, including a very useful implementation toolkit for agencies.
The Commonwealth Department of Education also recently released The Australian Research Data Infrastructure Strategy, which establishes a vision and set of recommendations to provide:
“a basis for policy makers, investors, developers, operators and users to build and sustain an effective and holistic Australian research data infrastructure system. It is a system that collects data systematically and intentionally, organises data to make it more valuable, and uses data insightfully many times over.”
Internationally, the Open Government Partnership and Open Data Charter set goals for member nations around transparency and data sharing. The Open Data Charter was approved by the G8 in 2013 and its principles and guidelines are to be implemented by 2015. The principles are:
- Open Data by Default
- Quality and Quantity
- Useable by All
- Releasing Data for Improved Governance
- Releasing Data for Innovation
Making the case
Despite all the excitement around open data, it can be difficult for people working with datasets that they know are suitable for sharing to make it happen. Reasons for this might include:
- concerns about managing privacy, and the potential for data matching as a result of data sharing;
- confusion about how open data fits into the bigger information management framework;
- not seeing any benefits;
- lack of familiarity with methods for data sharing and publishing;
- lack of technical skills in house; and
- lack of clarity on how to start such projects and the likely investment in terms of time and resources.
Perhaps foremost among these is the perceived lack of business drivers. It can be hard, at the individual agency level, to see how even a modest investment of time and effort in putting data online can bring benefits. This problem was examined recently by the McKinsey Global Institute, which identified three ‘value levers’ for creating value from open data:
- Decision making – open data provides a fact base to make more informed and objective choices using information that is often available in real time
- New offerings – open data enables organizations to better understand their customers and context and to design new products and services
- Accountability – open data reveals issues in behaviour, choices, and spending that citizens and leaders can act on to effect change
McKinsey estimates that more than $3 trillion in economic value globally could be generated each year through enhanced use of open data.
An holistic approach
In many ways, the principles that agencies are encouraged to follow in making open data a reality mirror those that can support good overall information management. After all, data is data. Some of it is specially collected and stored as part of a research process, some is generated by the transaction of business, some is collected from other parties. It will need to be retained for longer or shorter time periods to meet regulatory or business requirements, it will need to be assessed in terms of risk, and it will need to be subject to access decisions.
The world of information governance, the world of recordkeeping, comes equipped with the tools to understand the data you hold and make sound decisions about its management, including:
- metadata to capture data provenance and set permissions
- analysis of regulatory and other requirements to make decisions on access and disposal; and
- digital continuity planning to ensure continued usability.
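As a rough illustration of how metadata can support access decisions, the sketch below records a dataset's provenance and permissions and uses them to decide whether it could be published openly. The field names, the `DatasetRecord` class, the `can_publish_openly` rule and the list of licences are all assumptions made for this example, not part of any standard or government system; a real decision would also draw on the regulatory and risk analysis described above.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical set of licences treated as "open" for this sketch:
# attribution and share-alike conditions at most.
OPEN_LICENCES = {"CC0", "CC BY", "CC BY-SA"}


@dataclass
class DatasetRecord:
    """Illustrative metadata for one dataset (field names are assumptions)."""
    title: str
    source_agency: str            # provenance: who created or collected it
    collected_on: date            # provenance: when it was collected
    contains_personal_info: bool  # flags a privacy risk
    licence: str                  # permissions attached to reuse
    retain_until: date            # retention requirement


def can_publish_openly(record):
    """Return True if the record has no personal information and an open licence."""
    return (not record.contains_personal_info
            and record.licence in OPEN_LICENCES)


timetable = DatasetRecord(
    title="Example timetable extract",
    source_agency="Example Transport Agency",
    collected_on=date(2014, 1, 1),
    contains_personal_info=False,
    licence="CC BY",
    retain_until=date(2024, 1, 1),
)
print(can_publish_openly(timetable))
```

The same record that answers the open data question also carries the retention date, which is the point of treating data and records under one set of management rules.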
Data sharing may not yet be embedded in the standard practices and systems of government, but by adopting a more integrated approach to information management, a lot can be achieved within shrinking resources. Rather than tackling the needs of ‘records’, ‘data’, ‘published information’ and more separately, consider the purpose and uses of the information first, and establish management rules and requirements accordingly. The old labels are becoming less and less relevant. Data is not only ‘the new black’, it’s the fuel for our economy, for good governance and for innovation. Let’s manage it strategically.
About the author
Cassie Findlay is a Senior Consultant with Recordkeeping Innovation. In past roles, Cassie has worked strategically at the whole of public sector level on digital recordkeeping, training and open data / open government initiatives. In planning for and establishing the NSW Government’s first Digital State Archive, she gained practical experience in designing and implementing a large and complex technical and procedural infrastructure for keeping digital information. She was also responsible for a number of open data initiatives and the design and launch of OpenGov NSW, the NSW Government’s website for published information. Cassie has a Master’s degree in information management from the University of NSW and particular strengths in digital recordkeeping and information management strategy, digital preservation, training and communications design and delivery, systems design and implementation, and open data.