Sir Tim Berners-Lee, the creator of the World Wide Web, predicts that linked data will be the next Web (sometimes called Web 3.0). “Big data” has become a buzzword that everyone is talking about. Given this changing environment, one might think that Australia is now flooded with data. Researchers are in data heaven, right? Wrong! I have witnessed many cases showing that Australia is far from making Sir Tim Berners-Lee’s prediction a reality. I give two recent examples below.
Australia is launching a National Disability Insurance Scheme (NDIS) at various trial sites. The NDIS is the second biggest health care reform in Australia after Medicare. In Chapter 12 (Volume 2) of its 2011 Inquiry Report on Disability Support, the Productivity Commission recommended establishing a longitudinal data set on the NDIS and making access to the data transparent while satisfying confidentiality and privacy requirements. It also recommended conducting cost effectiveness analyses to ensure that the scheme operates in a cost-effective manner. I saw this as a window of opportunity to conduct a cost effectiveness analysis of the scheme. To my surprise, after months of chasing various sources, I was told that the data were not available for research. No hint was given as to when they would be available. The “window” of opportunity I saw has been firmly shut.
In an attempt to rescue the project, I contacted a disability service provider that also participates in the NDIS. Initially, the organisation was very supportive, as the research might provide insight into the effectiveness of its services. But when it heard that I planned to link participants' data with Medicare and PBS records to obtain cost data, things changed quickly. I emphasised that only de-identified data would be used. The organisation’s response was that no data would be available for a cost effectiveness analysis at this time. And there was no date for when such data would be available.
Recently I received an internal grant to investigate the operational efficiency and technological gaps between public and private hospitals in Australia. I had assumed that aggregate data at the hospital level would be readily available, as the collection of administrative data is routine. I was wrong again. I contacted the federal government agency responsible for hospital data. After it reviewed my data request (a panel data set of inputs, outputs and service quality for all public and private hospitals in Australia), I was informed that data extraction would be lengthy and costly. I was also told that no data were available for private hospitals. The humble internal grant could not cover those costs, so I decided to downscale and limit the analysis to publicly available data. Luckily, the grant was sufficient to hire a research assistant to extract data from reports in PDF format.
The reasons for these organisations' reluctance to release data remain unclear to me. Confidentiality and privacy are possible concerns. However, access to de-identified data is sufficient for most research requirements, my proposals included. Perhaps the agencies responsible for data dissemination are under-resourced. If so, this is unfortunate. I believe the benefits of research clearly outweigh the costs of data provision.
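To make the idea of linking de-identified data concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the procedure used by any agency: the secret key, the field names, the helper functions and the records are all hypothetical, and real data linkage involves far more safeguards than this. The point is simply that a data custodian can replace a personal identifier with a keyed hash, so that two data sets can be joined without the researcher ever seeing who the records belong to.

```python
import hashlib
import hmac

# Hypothetical secret held only by the data custodian (an assumption for this sketch).
LINKAGE_KEY = b"custodian-secret-key"


def pseudonym(person_id: str) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(LINKAGE_KEY, person_id.encode("utf-8"), hashlib.sha256).hexdigest()


def de_identify(records, id_field, keep_fields):
    """Keep only the listed non-identifying fields, keyed by pseudonym."""
    return {
        pseudonym(r[id_field]): {f: r[f] for f in keep_fields}
        for r in records
    }


def link(left, right):
    """Join two de-identified data sets on the pseudonyms they share."""
    return {k: {**left[k], **right[k]} for k in left.keys() & right.keys()}


# Made-up illustrative records, not real data.
service_records = [
    {"person_id": "A001", "name": "Alex", "support_hours": 12},
    {"person_id": "A002", "name": "Blake", "support_hours": 30},
]
claims_records = [
    {"person_id": "A001", "name": "Alex", "claims_cost": 450.0},
    {"person_id": "A003", "name": "Casey", "claims_cost": 120.0},
]

services = de_identify(service_records, "person_id", ["support_hours"])
claims = de_identify(claims_records, "person_id", ["claims_cost"])
linked = link(services, claims)
```

In this sketch, `linked` contains only the person who appears in both sources, with service use and claims cost side by side and no name or original identifier attached. The researcher would receive `linked`, while the key stays with the custodian.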
To provide a hint of what access to data might look like in the future, I quote the principles of the open data movement, which started in 2007. Government data shall be considered open and public if it complies with the eight principles below:
- Complete: All public data is made available. Public data is data that is not subject to valid privacy, security or privilege limitations.
- Primary: Data is as collected at the source, with the highest possible level of granularity, not in aggregate or modified forms.
- Timely: Data is made available as quickly as necessary to preserve the value of the data.
- Accessible: Data is available to the widest range of users for the widest range of purposes.
- Machine processable: Data is reasonably structured to allow automated processing.
- Non-discriminatory: Data is available to anyone, with no requirement of registration.
- Non-proprietary: Data is available in a format over which no entity has exclusive control.
- License-free: Data is not subject to any copyright, patent, trademark or trade secret regulation. Reasonable privacy, security and privilege restrictions may be allowed.
Although they are still far from complying with open data principles, countries such as the USA and Denmark have taken the initiative to expand access to administrative data for research. For example, the Centers for Medicare and Medicaid Services in the USA provide access to administrative micro-data. As a result, our recent systematic review reveals that most analyses of the cost effectiveness of Medicare and Medicaid were conducted in the USA, even though many developed countries have similar health care programs. Similarly, Statistics Denmark provides access to de-identified data through a secure server for approved research projects. If we want insight into how our health system functions, then I believe that it is time for Australia to follow this trend.
Dr Son Nghiem
QUT VC Research Fellow