New federal guidance on how to de-identify patient data used for secondary purposes, including research, provides important insights on ensuring privacy, several legal experts say. But one privacy advocate says the guidance comes up short.
The guidance is "a good first step toward achieving a better-quality, less-expensive healthcare system that carries the added benefit of better protections for individual patient health records," says attorney Deven McGraw. She's chair of the HIT Policy Committee's Privacy and Security Tiger Team, which advises federal regulators.
The new report is important because "access to the vast amounts of health data increasingly available as the nation continues to roll out its all-digital health information network will provide the opportunity for the kind of rigorous data analysis that is critical if the U.S. is to realize the promise of a lower-cost, better-quality health care system," McGraw notes in a blog.
"It is just as critical that the privacy of the individuals from which that data is drawn is protected in a way that invokes trust in the system. This is where the new guidance comes into play," says McGraw, director of the health privacy project at the Center for Democracy & Technology.
Other legal experts say the guidance is helpful because it fleshes out details for the two methods of data de-identification allowed under HIPAA. For instance, it spells out examples of individually identifiable data that needs to be removed to meet the "safe harbor" method requirements. And it reaffirms that cryptography can be used to de-identify data in the "expert determination" method.
But privacy advocate Deborah Peel, M.D., founder and chair of Patient Privacy Rights, calls the new guidance "wholly inadequate," lamenting that it falls short of establishing any new requirements.
"Nothing is required, and there are no penalties for not following even the flawed approaches set out in the guidance," Peel says. "It's well-known now and was in 2010, but not when HIPAA was drafted, that completely de-identifying health data sets so that they cannot be re-identified is very difficult."
Protecting Research Data
The guidance, issued Nov. 26 by the Department of Health and Human Services' Office for Civil Rights, describes how to comply with the HIPAA Privacy Rule's de-identification requirements for aggregated information used for secondary purposes, including clinical effectiveness studies, policy assessments and life sciences research (see: Data De-Identification Guidance Offered).
The HITECH Act mandated that federal officials issue the guidance by early 2010. "We are disappointed with how long it took the administration to release this guidance," McGraw says. "OCR held a workshop to gather stakeholder input on health data de-identification in March 2010, and this guidance was not released until nearly three years later."
Under the safe harbor de-identification method described in HIPAA, 18 elements of identifying information, such as an individual's name, address, date of birth, Social Security or medical record number, must be removed.
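As a rough illustration of the safe harbor idea, the Python sketch below drops identifier fields from a single patient record. The field names and the sample record are hypothetical, and the set covers only the identifiers named above, not the full list of 18, so it should be read as a sketch rather than a compliant implementation.

```python
# Hypothetical field names for the identifiers named in the article.
# This is NOT the full list of 18 safe harbor identifiers.
IDENTIFIER_FIELDS = {
    "name",
    "address",
    "date_of_birth",
    "ssn",
    "medical_record_number",
}


def strip_identifiers(record: dict) -> dict:
    """Return a copy of the record without the listed identifier fields."""
    return {
        field: value
        for field, value in record.items()
        if field not in IDENTIFIER_FIELDS
    }


# Made-up sample record for illustration only.
sample = {
    "name": "Jane Doe",
    "address": "1 Main St",
    "date_of_birth": "1970-01-01",
    "ssn": "000-00-0000",
    "medical_record_number": "MRN-1",
    "diagnosis_code": "E11.9",
}
cleaned = strip_identifiers(sample)
```

Here `cleaned` keeps only `diagnosis_code`. Removing fields is only part of the method; the guidance also addresses what the covered entity knows about the remaining data.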
The expert determination method described in HIPAA involves an organization using the services of an expert "with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable."
The expert can use methods, such as encryption, to de-identify data, and then conduct an analysis to determine "that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information," the guidance explains. The expert must also document their methods and the results of their analysis.
The de-identification of data is aimed at protecting patient information that's aggregated and then used for purposes other than "treatment, payment or healthcare operations," as specified under HIPAA, explains Kathryn Coburn, a healthcare attorney who specializes in data security and privacy issues at the law firm Cooke Kobrick & Wu LLP. Typical secondary uses for de-identified patient data are pharmaceutical industry and public health research, she notes.
"This is extremely valuable information that's aggregated but is illegal to use unless de-identified," she says.
The new OCR report helps clarify the expert determination method, says Adam Greene, a partner at the law firm Davis Wright Tremaine and a former OCR official.
"The guidance confirms that an expert can rely on techniques such as cryptographic hashing or the use of a data use agreement to determine that the risk of identification is 'very small,'" he notes.
OCR notes in its guidance that "a covered entity may require the recipient of de-identified information to enter into a data use agreement to access files with known disclosure risk. ... This agreement may contain a number of clauses designed to protect the data, such as prohibiting re-identification."
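The cryptographic hashing that Greene mentions can be sketched with a keyed hash. The example below uses HMAC-SHA256 from Python's standard library; the choice of algorithm, the function name and the sample values are assumptions made for illustration, since the guidance as described here does not prescribe a specific technique.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex-encoded).

    The same identifier and key always produce the same code, so records
    for one patient can still be linked without exposing the identifier.
    """
    return hmac.new(
        secret_key, identifier.encode("utf-8"), hashlib.sha256
    ).hexdigest()


# Hypothetical key and identifier, for illustration only.
key = b"example-key-kept-by-the-covered-entity"
code = pseudonymize("MRN-1", key)
```

Anyone who holds the key can recompute the code for a known identifier, so in a scheme like this the key would need to be kept away from recipients of the data.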
The expert method is riskier than the safe harbor method because there is more chance that the data could be re-identified, Coburn contends. That's because this method can involve the use of cryptography, and it's possible that encrypted data could be re-identified with a cryptographic key or through hacking, she says.
"The expert method might be more useful for analytics, but it's riskier," Coburn says. It's also more expensive than the safe harbor method because of the cost of obtaining the professional services of an expert.
And while the expert needs to be qualified, "the guidance doesn't specify what 'expert' means," Coburn says. Typically, an expert involved with de-identification would be a statistician, she says. "If a breach happens after de-identification and re-identification, OCR would look to see if the expert was qualified" to do the work, she stresses.
The guidance also helps clarify some details regarding the more commonly used safe harbor method.
For example, it confirms that retaining identifiers such as the last four digits of a Social Security number, or any date information more specific than the year of an event, is impermissible under the safe harbor method, Greene notes.
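A minimal Python sketch of those two points follows: it drops a partial Social Security number and reduces any date to its year. The field names (`ssn_last4`, `admission_date`) and the `_year` suffix are hypothetical, and the sketch covers only the two cases Greene describes.

```python
from datetime import date


def to_year_only_view(record: dict) -> dict:
    """Drop the partial SSN and keep only the year of any date value."""
    result = {}
    for field, value in record.items():
        if field == "ssn_last4":
            # Even the last four digits may not be retained.
            continue
        if isinstance(value, date):
            # Keep nothing more specific than the year of the event.
            result[field + "_year"] = value.year
        else:
            result[field] = value
    return result


# Made-up sample record for illustration only.
visit = {
    "ssn_last4": "0000",
    "admission_date": date(2012, 11, 26),
    "diagnosis_code": "E11.9",
}
generalized = to_year_only_view(visit)
```

In this sketch `generalized` holds an `admission_date_year` of 2012 and the diagnosis code, with no trace of the partial Social Security number.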
The guidance also clarifies that covered entities qualify for the safe harbor if they strip out 18 identifiers and also have no "actual knowledge" that remaining information could be used to identify an individual.
The OCR report states: "Actual knowledge means clear and direct knowledge that the remaining information could be used, either alone or in combination with other information, to identify an individual who is a subject of the information."
Peel argues that even a small risk of identifying a patient through de-identified information is too high.
"The only acceptable risk to an individual may be zero, not 'small,'" she says. "Each of us has our own tolerance for risk and each of us has the right to control access to our health information. For many of us with health data, we would want zero risk of re-identification."