Using COVID-19 Patient Data for Research: Sizing Up Risks
As Medical Data Collaborations Kick Off, Privacy and Security Concerns Emerge
In the effort to develop COVID-19 medical insights, some healthcare and technology firms are reportedly partnering to collect coronavirus patient information to assist government and academic researchers. But such efforts are raising significant security and privacy concerns.
Among such collaborations is an effort being spearheaded by a San Francisco-based data aggregator, Datavant, according to the Wall Street Journal, which reports that several other companies, including health IT providers Allscripts Healthcare Solutions and Change Healthcare, are part of that effort.
The effort aims to create a registry of COVID-19 patients by pooling data, such as medical records, insurance claim data and medication information, to help researchers and government agencies glean insights about the disease, the Journal reports.
The Datavant effort will involve patient identifiers, such as names and Social Security numbers, being "transformed through an irreversible process into encrypted keys" so that patient records used by researchers are tied to an anonymous patient ID, the Journal reports.
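The Journal's description does not say how the transformation works. One common way to turn identifiers into a stable, non-reversible key is a keyed hash (HMAC). The sketch below is only an illustration of that general technique, not Datavant's actual method; the function names, the normalization rules and the secret key are all assumptions made for the example.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Canonicalize an identifier so formatting differences hash alike.

    Assumed rule for this sketch: Unicode-normalize, casefold, and keep
    only letters and digits.
    """
    text = unicodedata.normalize("NFKC", value)
    return "".join(ch for ch in text.casefold() if ch.isalnum())


def pseudonymous_id(name: str, ssn: str, secret_key: bytes) -> str:
    """Derive an anonymous patient ID from identifiers with HMAC-SHA256.

    The same inputs and key always give the same ID, so records from
    different sources can be linked. Without the key, the ID cannot be
    recomputed from guessed names or Social Security numbers.
    """
    # "|" cannot appear in normalized fields, so it is a safe separator.
    message = "|".join([normalize(name), normalize(ssn)]).encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    key = b"example-key-held-by-a-trusted-party"  # hypothetical secret
    print(pseudonymous_id("Jane Doe", "123-45-6789", key))
```

The design choice that matters here is the secret key: an unkeyed hash of a nine-digit Social Security number could be reversed by simply hashing every possible number, so in a scheme like this the key would need to be held separately from the researchers who receive the data.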
Datavant, Allscripts and Change Healthcare did not immediately respond to Information Security Media Group's request for comment on the initiative.
The Journal reports that Datavant contacted the Food and Drug Administration about the effort.
While not commenting specifically about the Datavant effort, the FDA, in a statement provided to ISMG, says the agency "recognizes the potential for many different real-world data sources to complement traditional clinical studies and speed the process of evaluating the impact of potential COVID-19 therapies. To that end, the agency is advancing relationships with partners in the public and private sectors to rapidly collect and analyze information in areas such as illness patterns and treatment outcomes."
Collaborative efforts to gather patient data bring an array of privacy and security issues into the spotlight.
"There are many items to consider with regard to information security and cybersecurity with this effort to create a centralized or 'pooled' data repository in the cloud regarding COVID-19," says Stanley Mierzwa, director of the Center for Cybersecurity at Kean University in Union, New Jersey.
For instance, for any potential cloud provider selected to participate - to host or house the central data repository - "there needs to be a thorough review of current and prior assessments for security controls that map to regulations, as well as security and privacy related concerns," he contends.
"Consider this data extremely sensitive, and privacy is of the utmost critical need," he says. "There exist cloud provider registries available publicly that could be a potential start to review the cloud provider's indication of best practices and validation of their security posture of their cloud offers. ... Verifying the security and privacy controls of the cloud provider is a critical first step."
The collected data should be treated as high-velocity data - changing often and growing by the hour, he adds.
"Because of this, it will be critical to ensure the system developed to house and analyze the data will be able to scale and handle the fast growth," Mierzwa says. In testing the system, it's critical that test data mirror the volume, format and rate of creation expected in the centralized production system, he says.
But Mierzwa stresses that there "should be no way to go from reviewing the test data to the original data - to prevent damaging leaked exposure. Going further than simply anonymizing the data will be required to ensure during testing of the system being developed there is no potential to expose the actual confidential collected data from the facilities."
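One way to meet both requirements Mierzwa describes, realistic volume and format with no path back to real patients, is to generate test records from scratch rather than derive them from collected data. The following is a minimal sketch under that assumption; the record fields and value ranges are hypothetical and would need to match whatever schema the production system actually uses.

```python
import random
from datetime import date, timedelta

HEX_DIGITS = "0123456789abcdef"


def synthetic_record(rng: random.Random) -> dict:
    """Build one fake patient record with a production-like shape.

    Every value comes from the random generator, never from real data,
    so there is nothing to trace back to an actual patient.
    """
    admitted = date(2020, 3, 1) + timedelta(days=rng.randrange(60))
    return {
        "patient_id": "".join(rng.choices(HEX_DIGITS, k=16)),
        "age": rng.randint(0, 100),
        "admitted": admitted.isoformat(),
        "on_ventilator": rng.random() < 0.1,
    }


def synthetic_batch(count: int, seed: int) -> list:
    """Generate `count` records; a fixed seed makes test runs repeatable."""
    rng = random.Random(seed)
    return [synthetic_record(rng) for _ in range(count)]


if __name__ == "__main__":
    # Scale `count` up to approximate the expected production volume.
    for record in synthetic_batch(count=3, seed=42):
        print(record)
```

Calling `synthetic_batch` repeatedly with a growing `count` gives a rough way to exercise the system under the kind of fast growth described above, without any real records entering the test environment.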
In addition to the Datavant project, several other proposals are circulating to develop COVID-19 surveillance and analytics systems that combine public health data with information collected by private industry, notes privacy attorney David Holtzman of the security consulting firm CynergisTek.
"While some projects seek to collect only information that is aggregated or anonymized, other proposals argue that personally identifiable information is needed to ensure effective identification of individuals to stop the spread of the virus," he says.
"My concern is that the plans originating from the tech industry appear to create gobs of data profiles about individuals without minimization of collecting only the personally identifiable data needed to advance the work of medical and public health researchers," he says.
As companies, insurers and others race to pool COVID-19 patient records for research and other efforts, "the amount of data collected would be on a massive scale," Holtzman adds. "The absence of a national privacy law setting a floor for how personally identifiable information about individuals could be shared or consumers protected against misuse or unauthorized disclosures underscores the hurdles that such a plan would face."
These threats to consumer privacy could be avoided, he argues, "if commercial data that does not reveal identifiable information about individuals was combined with aggregated data from public health authorities or local governments. Whenever we see 'big tech' attempt to propose collecting massive amounts of personal information, I turn immediately to distrust based on their record of skimping on privacy and security."
To help ensure privacy is protected, Holtzman calls for those involved in the projects to "slow down this process to assess the risks and threats to the vast stores of personal data that would be created."