Security for Kaiser's Genomics Project

Massive Research Effort Includes Multiple Privacy Protections
By late next year, researchers from around the world will be able to use genetic information, as well as data pulled from electronic health records, for thousands of Kaiser Permanente members to support efforts to develop improved treatments. But how will that sensitive information be kept private?

Kaiser Permanente, which operates a health insurance plan and a provider network that together serve 3.5 million members, plans to take many security steps for the massive research project, including:

  • De-identifying all data used for research;
  • Storing all research data on servers separate from clinical databases; and
  • Having only the organization's programmers conduct the queries on behalf of researchers.

Kaiser Permanente recently announced that it had completed genotyping of DNA samples from 100,000 of the 180,000 members in Northern California who have voluntarily submitted samples for the project so far. It hopes to eventually gather samples from 500,000 of its members, says Cathy Schaefer, Ph.D., executive director of Kaiser Permanente's Research Program on Genes, Environment and Health. The University of California, San Francisco's Institute for Human Genetics is a partner in the project, which is funded, in part, by the National Institutes of Health.

The Department of Veterans Affairs recently announced the launch of a similar personalized medicine project.

A Wealth of Information

The Kaiser Permanente research initiative stands out from some others, Schaefer says, because it includes a broad cross-section of adults who have complete EHRs that date back as far as 1995. Those who have provided DNA samples for the bio-bank range in age from 18 to 107, she notes. Volunteers also are answering a detailed survey addressing environmental and behavioral factors that can influence their health.

By taking advantage of this wealth of information, Schaefer says, researchers will be able to pose questions like "What are the genetic and non-genetic factors that influence why one person responds well to a particular medication and another does not?" They'll also be able to study the genetic and environmental factors "that may influence complications of a disease or recurrence of a condition."

In signing up volunteers for the massive research effort, Kaiser Permanente stressed that their genetic information and survey results would not be included in their EHRs or be made available to the insurer or provider arms of the organization. "That makes people free to report information without the fear of it affecting anything in their relationship to Kaiser Permanente," Schaefer says.

The Genetic Information Nondiscrimination Act of 2008 prohibits the use of genetic information to deny health insurance or as a basis for hiring decisions. An omnibus package of healthcare privacy regulations expected in the coming months will include provisions formalizing that the use of genetic information for insurance underwriting is a privacy violation as well as a non-discrimination violation (see: HITECH Mandated Regs Still In Works).

When the research data is made available late in 2012, the Kaiser Permanente program's institutional review board will review each research proposal to ensure it meets its standards for safety and patient rights.

The Security Details

The research program is still ironing out all the security details for the project. But Larry Walter, the program's associate director of informatics, offers an overview.

To enhance security, databases for the project containing genotypes, EHR information and survey results will be stored on servers physically isolated from Kaiser Permanente's clinical databases, Walter notes.

Queries of the databases will be conducted by a handful of programmers. Each genotype will be given a randomly generated subject identifier. And programmers will use a "highly restricted translation table" enabling the identifier to tie together all related information about a patient. "So unless you have access to the translation table, you'd never be able to associate the genotype to any identifiable information or the identity of the individual," Walter says.
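The translation-table idea Walter describes can be illustrated with a minimal Python sketch. It is not Kaiser Permanente's implementation: the class name `TranslationTable`, the function `link_records`, the field names and the 16-character hex identifier format are all assumptions made for illustration. The sketch shows only the general pattern, in which genotypes are keyed by a random subject identifier and only code holding the table can join them to records keyed by medical record number.

```python
import secrets
from typing import Dict, Optional


class TranslationTable:
    """Hypothetical restricted table mapping medical record numbers to random subject IDs."""

    def __init__(self) -> None:
        self._mrn_to_subject: Dict[str, str] = {}

    def assign(self, mrn: str) -> str:
        """Return the subject ID for this record number, creating a random one if needed."""
        if mrn not in self._mrn_to_subject:
            existing = set(self._mrn_to_subject.values())
            subject_id = secrets.token_hex(8)  # 16 hex characters, unrelated to the MRN
            while subject_id in existing:  # regenerate on the unlikely chance of a collision
                subject_id = secrets.token_hex(8)
            self._mrn_to_subject[mrn] = subject_id
        return self._mrn_to_subject[mrn]

    def lookup(self, mrn: str) -> Optional[str]:
        """Return the existing subject ID, or None if the member is not enrolled."""
        return self._mrn_to_subject.get(mrn)


def link_records(table: TranslationTable,
                 genotypes_by_subject: Dict[str, str],
                 ehr_by_mrn: Dict[str, dict]) -> Dict[str, dict]:
    """Join genotype and EHR data under the subject ID; the MRN never enters the output."""
    linked: Dict[str, dict] = {}
    for mrn, ehr in ehr_by_mrn.items():
        subject_id = table.lookup(mrn)
        if subject_id is not None and subject_id in genotypes_by_subject:
            linked[subject_id] = {
                "genotype": genotypes_by_subject[subject_id],
                "ehr": ehr,
            }
    return linked
```

Because the identifier is generated at random rather than derived from the record number, someone holding only the genotype store has nothing to reverse; the link exists solely in the table, which is why access to it would need to be tightly restricted.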

Once programmers link all the information about the patient, they will de-identify it before providing it to the researchers. Information gathered in research queries will be stripped of all personal identifiers, such as medical record numbers, names, addresses and more.

At a minimum, the research program will follow the guidelines set forth in the HIPAA privacy rule's "safe harbor" provision for de-identifying data for research, Walter says. Under that provision, 18 common identifiers must be stripped out of data for it to qualify as de-identified so it can be shared for research purposes.
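As a rough illustration of what stripping identifiers can look like in practice, the Python sketch below removes a handful of direct identifiers from a linked record and generalizes two fields in the way the safe harbor provision describes: dates are reduced to the year, and ages over 89 are grouped into a single "90+" category. The field names, the `deidentify` function and the short identifier list are assumptions for illustration only; the list covers just a few of the 18 safe harbor categories and is not a compliance tool.

```python
from datetime import date

# Illustrative subset only; the safe harbor provision lists 18 identifier categories.
DIRECT_IDENTIFIER_FIELDS = {
    "name",
    "address",
    "medical_record_number",
    "phone",
    "email",
    "ssn",
}


def deidentify(record: dict) -> dict:
    """Return a copy of the record with direct identifiers dropped and
    dates and very high ages generalized."""
    cleaned = {}
    for key, value in record.items():
        if key in DIRECT_IDENTIFIER_FIELDS:
            continue  # drop the field entirely
        if key == "age":
            # Ages over 89 are grouped into one category
            cleaned[key] = "90+" if value >= 90 else value
        elif isinstance(value, date):
            # Keep only the year of any date; rename "<x>_date" to "<x>_year"
            base = key[: -len("_date")] if key.endswith("_date") else key
            cleaned[base + "_year"] = value.year
        else:
            cleaned[key] = value
    return cleaned
```

For example, a record containing a name, a medical record number, an age of 107 and a visit date of March 4, 2010, would come out with no name or record number, an age of "90+" and a visit year of 2010.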

"But we want to go beyond what HIPAA requires," he notes, by investigating ways to address the risk of re-identification posed by, for example, including information on patients with certain rare diseases or specific combinations of characteristics. The goal, he says, is to "minimize risk of re-identification as much as possible while retaining the scientific utility of the data."
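Walter does not say which techniques the program is considering. One common way to reason about the risk he describes is a small-cell check in the spirit of k-anonymity: count how many people share each combination of characteristics and flag the combinations shared by fewer than some threshold. The Python sketch below assumes a hypothetical `rare_combinations` function, made-up field names and an arbitrary threshold of five.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def rare_combinations(records: Iterable[dict],
                      quasi_identifiers: List[str],
                      k: int = 5) -> Dict[Tuple, int]:
    """Return each combination of quasi-identifier values that appears in
    fewer than k records, with the number of records that share it."""
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}
```

Records that fall into a flagged combination, such as the only patient with a given rare disease in a particular age band, could then be generalized further or withheld. That is the trade-off Walter describes between lowering re-identification risk and keeping the data scientifically useful.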

Potential New Regulations

The Department of Health and Human Services and the Food and Drug Administration recently issued an advance notice of proposed rulemaking to solicit ideas for changing the regulations overseeing research on human subjects (see: Research Data Protections Considered). Regulators are seeking feedback on a plan to establish mandatory data security and information protection standards for all studies involving identifiable or potentially identifiable data.

Earlier, federal regulators commissioned a study intended to yield recommended best practices for de-identifying health information for research (see: De-Identified Data: The Security Risks). Regulators ultimately will determine whether existing HIPAA regulations on the issue need to be modified.

Where Are We Headed?

If research on personalized medicine, based on genotypes, EHRs and other relevant information, succeeds, Schaefer foresees a day when patients will routinely undergo tests that yield genotype information that physicians could use in making such decisions as selecting the right prescription. That genetic information might even be stored within an EHR.

But first, research projects like those at Kaiser Permanente, the VA and elsewhere will delve into the mysteries of how genetics as well as environmental factors affect our health and our healthcare, paving the way for a new, more personalized, approach to medicine.


About the Author

Howard Anderson

Former News Editor, ISMG

Anderson was news editor of Information Security Media Group and founding editor of HealthcareInfoSecurity and DataBreachToday. He has more than 40 years of journalism experience, with a focus on healthcare information technology issues. Before launching HealthcareInfoSecurity, he served as founding editor of Health Data Management magazine, where he worked for 17 years, and he served in leadership roles at several other healthcare magazines and newspapers.