Precision Medicine: Data analytics has moved into another vital area: using data to deliver personalized health care to millions of citizens. The term ‘personalized medicine’ has given way to ‘precision medicine’, which describes a transformative model of health care in which diagnostic tests are selected for their potential to identify changes in each individual patient’s cells, and treatment is customized to individual requirements. Ultimately, the goal of precision medicine is to improve patient outcomes. Although big data has been applied with reasonable success, major hurdles to its universal adoption remain, stemming from the economic, technical, regulatory and human aspects involved. The NAS report defines “precision medicine” as the use of genomic, epigenomic, exposure, and other data to define individual patterns of disease, potentially leading to better individual treatment. Precision medicine is much more than personalized medicine and is meant to convey a more accurate image of diagnosis that is person-centered and multifaceted.
Data Analytics and Precision Medicine: In precision medicine, data analytics (or big data) essentially refers to the use of data science techniques to capture and analyze huge and complex datasets in order to improve patient care outcomes and optimize business processes. Although the term ‘big data’ seems to refer only to the volume of data, that is not necessarily the case. It may also refer to the technology an organization requires to handle large amounts of data, as well as the facilities needed to store it. The healthcare industry produces large amounts of clinical, financial, administrative and genomic data and needs big data techniques to manage it.
Healthcare Data: Health care data comes from a variety of sources and includes:
- Web and social media data: interaction data from Facebook, Twitter, LinkedIn, blogs, health plan websites, and smartphone apps.
- Machine-to-machine data: information from sensors, meters, and other devices.
- Transaction data: healthcare claims and billing records in both semi-structured and unstructured formats.
- Biometric data: fingerprints, genetics, handwriting, retinal scans, X-rays and other medical images.
- Human-generated data: Electronic Medical Records (EMRs), physicians’ notes, email, and paper documents.
- Pharmaceutical R&D data: data related to a drug’s mechanism of action, target behavior in the human body and side effects.
The Electronic Health Record (EHR) is a longitudinal, systematic collection of electronic health information for a patient, generated by one or more interactions in any care setting. To be useful, EHRs must be shareable across different health care settings. An EHR typically includes patient demographics, medical history, medications and allergies, immunization status, laboratory test results, radiology images, vital signs, personal statistics such as age and weight, progress notes and problem details, and billing information. The Electronic Medical Record (EMR) is a part of the EHR and refers to the digitized version of the paper chart in clinician offices, clinics, and hospitals. The EMR contains notes and information collected by and for the clinicians in that specific office, clinic, or hospital setting and is mostly used by providers for diagnosis and treatment. The term Personal Health Record (PHR) refers to EHRs that are designed to be set up, accessed, managed and controlled by patients in a private, secure and confidential environment. They usually contain health information generated by clinicians, home-based monitoring devices, and patients themselves.
Big data has been used in the health care industry for a considerable time. Current applications collect and aggregate the vast amounts of patient data produced by a variety of sources in order to analyze hospital systems, the effectiveness of patient care, the return on investment of business processes, and general business intelligence. Data analytics goes a step further and uses the collected data to improve patient outcomes: advanced clinical analytics to enhance proactive care, better clinical decision support through analysis of current knowledge databases, improved clinical trial design through statistical tools and algorithms, and better models of personalized medicine built from the analysis of large data sets.
Applications of Data Analytics in Precision Medicine: It is suggested that data analytics can be used in six very practical cases. They are high-cost patients, readmissions, triage, decompensation (when a patient’s condition worsens), adverse events, and treatment optimization for diseases affecting multiple organ systems (such as autoimmune diseases, including lupus).
- High-Cost Patients: According to a study done in the US, approximately 5% of patients account for 50% of all US health spending, so identifying high-cost patients is paramount. Algorithms that identify high-risk or high-cost patients may need to include attributes such as behavioral health problems and socioeconomic factors such as poverty, racial minority status, and marital and living status. Algorithms perform best when they are derived from, and then used in, similar populations. Behavioral data are especially critical, because a large portion of patients at high risk for hospital admission have some sort of behavioral health issue, with depression being especially frequent. Once patients are stratified by risk, care can be matched to need. For instance, the standard approach may be to give all patients who are discharged from the hospital a follow-up appointment in two weeks. But it might make more sense to ensure that the highest-risk patients are seen within two days, while patients with very low risk might require follow-up care only as needed. Algorithms can help reallocate resources more effectively at both the high-risk and low-risk ends of the spectrum.
- Readmissions: As many as one-third of readmissions are considered preventable and therefore present a significant opportunity for improving care delivery. Health care organizations can use a predictive algorithm to predict who is likely to be readmitted to the hospital. What differentiates results is tailoring the intervention to the individual patient, ensuring that patients actually get the precise interventions intended for them, monitoring specific patients after discharge to find out if they are having problems before they decompensate, and keeping a low ratio of patients flagged for an intervention to patients who experience a readmission (that is, few false positives). It may also soon make sense to ask patients with a smartphone to allow health care organizations to access data from their phones, either to identify patients who are not managing a chronic condition well or to monitor people recently discharged from the hospital, since it appears that patients who are not making calls or sending e-mail with their usual frequency may be depressed or suffering from other issues. Patients may also be asked to wear some type of device that monitors physiological parameters, such as heart rate or rhythm. These data will be most effective in informing health care decisions if they are processed with analytics.
- Triage: Estimating the risk of complications when a patient first presents to a hospital can be useful for a number of reasons, such as managing staffing and bed resources, anticipating the need for a transfer to the appropriate unit, and informing the overall strategy for managing the patient. In newborns and many other populations, modern big-data techniques that combine routinely collected physiological measurements make much more accurate assessments possible with a minimal burden of training and implementation. For example, maternal data can be used to assign a newborn a preliminary probability of early-onset sepsis. Similarly, for adult high-risk patients, clinicians in the emergency department may be provided with two composite scores that have been calibrated using millions of patient records and that are applicable to all hospitalized patients, not just those in intensive care. The first of these scores summarizes a patient’s global comorbidity burden during the preceding twelve months; the second captures a patient’s physiological instability in the preceding seventy-two hours. These two scores, available in real time, are combined with vital signs, trends in vital signs, and other information, such as how long a patient has been in the hospital. If the information collectively indicates that a patient has a risk of 8 percent or more of deteriorating in the next twelve hours, an alert is sent to the responsible providers.
- Decompensation: Decompensation, the worsening of a patient’s condition, is often preceded by a period in which physiological data can be used to determine whether the patient is at risk. Much of the initial rationale for intensive care units (ICUs) was to allow patients who were critically ill to be closely monitored. A host of technologies are now available that can be used to monitor patients who are in general care units, in nursing homes, or even at home but at risk of some sort of decompensation. Monitors are becoming available in which multiple data streams can be compared simultaneously, and analytics can be used in the background to determine whether or not the signal is valid. One example of these new monitors is a device that sits under the mattress and collects data about the patient’s respiratory rate and pulse and whether or not the patient is moving. The data are transmitted to a server, where analytics are used in real time to determine if the patient appears likely to be decompensating. When the system detects a likely decompensation, an e-mail message is sent to an on-duty nurse’s smartphone. With this system, the likelihood that an alert reflects a true decompensation has been increased to approximately 50 percent, far better than for cardiac telemetry, for which it is typically 5–10 percent.
- Adverse Events: Another use case for analytics is predicting which patients are at risk of adverse events of several types. Adverse events are expensive and cause substantial morbidity and mortality, yet many are preventable. It seems likely that analytics could be combined with data about exposures to specific medications and with measures of kidney function, blood pressure, urine output, and other processes to identify patients at risk of decompensation. Analytics can also be effective in managing infection. One example involves monitoring and interpreting changes in heart rate variability to detect major decompensation in infants with very low birthweights before the emergence of an infection. Monitoring the heart-rate characteristics of newborns alone has already resulted in reductions in mortality and increases in the number of ventilator-free days. However, there is room for improvement using increasingly sophisticated analytics that account for subtle signals but also filter out extraneous patterns, such as those that occur when the baby moves.
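Several of the cases above come down to the same mechanics: a model produces a risk estimate, patients above a threshold are flagged, and the alert is judged by how many flagged patients actually have the event. The sketch below illustrates this under stated assumptions. The 8 percent cutoff comes from the triage example; the `Prediction` class, its field names, and the sample records are hypothetical and stand in for whatever a real model and outcome registry would supply.

```python
from dataclasses import dataclass

# The 8 percent twelve-hour risk cutoff mentioned in the triage example.
ALERT_THRESHOLD = 0.08


@dataclass
class Prediction:
    patient_id: str
    risk: float          # hypothetical model output: probability of deterioration
    deteriorated: bool   # observed outcome, known only in retrospect


def flag_patients(predictions, threshold=ALERT_THRESHOLD):
    """Return the predictions whose risk meets or exceeds the alert threshold."""
    return [p for p in predictions if p.risk >= threshold]


def alert_quality(predictions, threshold=ALERT_THRESHOLD):
    """Summarize how often alerts correspond to real events."""
    flagged = flag_patients(predictions, threshold)
    true_alerts = sum(1 for p in flagged if p.deteriorated)
    events = sum(1 for p in predictions if p.deteriorated)
    return {
        "flagged": len(flagged),
        "true_alerts": true_alerts,
        # Share of alerts that were real events (positive predictive value).
        "ppv": true_alerts / len(flagged) if flagged else None,
        # Share of real events that triggered an alert.
        "sensitivity": true_alerts / events if events else None,
        # Patients flagged per true event; lower means fewer false alarms.
        "flagged_per_true_alert": len(flagged) / true_alerts if true_alerts else None,
    }


if __name__ == "__main__":
    # Made-up records for illustration only.
    sample = [
        Prediction("a", 0.02, False),
        Prediction("b", 0.09, True),
        Prediction("c", 0.15, False),
        Prediction("d", 0.30, True),
        Prediction("e", 0.05, True),
        Prediction("f", 0.08, False),
    ]
    print(alert_quality(sample))
```

In this made-up sample, four patients are flagged and two of them deteriorate, a positive predictive value of 0.5. That is the kind of figure quoted for the under-mattress monitor, as opposed to the 5–10 percent typical of cardiac telemetry. Raising the threshold usually improves this value at the cost of missing more real events, which is why both numbers are reported together.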
Challenges for Data Analytics in Precision Medicine: Various factors nevertheless inhibit the use of big data analytics. They include resistance to a systems approach within the medical community; an acute shortage of data analytics professionals in health care; a lack of comparable and transparent health care data; financial constraints on adopting big data analytical systems; and a lack of interoperability between health care systems.
Perhaps the most difficult challenge for a precision medicine research program is that it must offer a value proposition that is compelling to all participants, i.e. data providers, patients and healthcare organizations. With the rapid growth of genomics research, data science and electronic health records, it is now possible to integrate an individual’s information into their clinical care for precision medicine. To seize this opportunity, a large cohort is needed that is participant-centric in concept, so that structured data can be shared with participants, researchers and clinicians, and that is transparent in purpose, management and execution.
Another challenge relates to the lack of a practical mechanism to uniquely identify participants. The integrity and value of EHR data depends fundamentally on being able to unambiguously link it to one and only one individual whose healthcare it represents. This is sometimes achieved by matching records based on several characteristics (probabilistic matching) instead of using a unique identifier, which circumvents but does not eliminate privacy concerns. Unintentional duplicate records, name ambiguities, and utilization of another person’s identity to obtain care all contribute complexity to the seemingly straightforward task of linking data to the correct individual. This problem grows in proportion to the number of individuals who participate and the number of different care organizations that generate data about those individuals.
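Probabilistic matching can be illustrated with a deliberately simplified weighted-agreement score. This is a sketch only, not a production record-linkage method: the field names, weights and thresholds are hypothetical, and real systems typically estimate weights from data and handle many more fields and error types.

```python
from difflib import SequenceMatcher

# Hypothetical field weights; real systems estimate these from data.
WEIGHTS = {"name": 0.4, "birth_date": 0.4, "postal_code": 0.2}
MATCH_THRESHOLD = 0.85    # assumed cutoff for linking records automatically
REVIEW_THRESHOLD = 0.60   # assumed cutoff for sending a pair to manual review


def name_similarity(a, b):
    """Fuzzy similarity in [0, 1], ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(rec_a, rec_b):
    """Weighted agreement across name, date of birth and postal code."""
    score = WEIGHTS["name"] * name_similarity(rec_a["name"], rec_b["name"])
    score += WEIGHTS["birth_date"] * (rec_a["birth_date"] == rec_b["birth_date"])
    score += WEIGHTS["postal_code"] * (rec_a["postal_code"] == rec_b["postal_code"])
    return score


def classify(rec_a, rec_b):
    """Label a record pair as 'match', 'review' or 'non-match'."""
    score = match_score(rec_a, rec_b)
    if score >= MATCH_THRESHOLD:
        return "match"
    if score >= REVIEW_THRESHOLD:
        return "review"
    return "non-match"


if __name__ == "__main__":
    # Made-up records: same person, with a typo in one name.
    clinic = {"name": "Jane Doe", "birth_date": "1980-02-01", "postal_code": "02139"}
    hospital = {"name": "Jane Do", "birth_date": "1980-02-01", "postal_code": "02139"}
    print(classify(clinic, hospital))
```

The middle "review" band reflects the ambiguity described above: duplicate records, similar names and borrowed identities produce pairs that are neither clearly the same person nor clearly different, and a wrong automatic link attaches clinical data to the wrong individual.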
To complicate matters, existing consortia that extract, transform, “clean” and analyze clinical data for research purposes have depended extensively on the local expertise of informatics teams at each of the data-contributing sites. Such expertise is currently unavailable in most healthcare institutions other than large academic medical centers. The paucity of experts constitutes a bottleneck in the development of a national precision medicine cohort that includes all care settings.
Further, certain technical specifications must be drafted methodically, with all possibilities taken into account. To make progress toward a reliable research data resource based on individuals accessing their records as study participants, three things are needed: a common structure for granular research-related data, agreed-upon definitions (semantics) for key research data, and an extensible “comprehensive electronic data format” that accommodates both structured and unstructured data.
Transforming the realities of real-world EHR data into interoperable, standardized data structures requires significant vendor-to-vendor and site-to-site interpretation. EHR vendors and healthcare providers must be given guidance and testing capabilities regarding how to configure systems for these new requirements.