A significant portion of my time at Massachusetts General Hospital (MGH) was spent analyzing a cohort of 65,099 individuals diagnosed with type 2 diabetes mellitus (T2DM). During these training years as a research fellow (2013 to 2016), my colleagues at MGH and Harvard and I implemented a variety of predictive modeling methods and incorporated natural language processing techniques to better understand diseases and their complications. We focused on cardiovascular disease, liver disease, and insomnia.
This T2DM cohort contained complete clinical details as well as demographics of patients who received care at MGH or Brigham and Women’s Hospital (BWH) between 1992 and 2010. The cohort was large, especially given that it included all clinical narrative notes (office, medication management, and operative notes) that accompanied the traditional electronic medical record elements (billing codes and medication prescriptions).
I had the opportunity to present the methods used to create the cohort, as well as its content, in more detail at the American Medical Informatics Association 2015 Annual Symposium [1]. We were interested in analyzing chronic conditions, including the 15 common conditions reported by the Centers for Disease Control and Prevention (CDC) as contributing to multiple chronic comorbidities. We implemented the same comorbidity identification method proposed by the CDC, counting a comorbidity for a patient if it was documented by at least one inpatient code or at least two outpatient codes. We found that more than 87 percent (56,691 patients) had two or more chronic conditions, and, looking at a sicker portion of the population, more than 42 percent (27,806 patients) had five or more chronic conditions.
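The counting rule described above can be illustrated with a minimal Python sketch. It is not the pipeline we used on the cohort: the record layout `(patient_id, condition, setting)`, the setting labels `"inpatient"` and `"outpatient"`, and the function names are assumptions made for illustration, and the mapping from billing codes to conditions is omitted.

```python
from collections import Counter, defaultdict


def chronic_conditions(records):
    """Map each patient to the set of conditions that meet the counting rule.

    `records` is an iterable of (patient_id, condition, setting) tuples;
    this layout is hypothetical, not the cohort's actual schema.
    """
    # Count codes per (patient, condition), split by care setting.
    counts = defaultdict(Counter)
    for patient_id, condition, setting in records:
        counts[(patient_id, condition)][setting] += 1

    # Every patient seen in the records gets an entry, even if no
    # condition ends up qualifying.
    qualified = {patient_id: set() for patient_id, _ in counts}
    for (patient_id, condition), by_setting in counts.items():
        # Rule from the text: >= 1 inpatient code or >= 2 outpatient codes.
        if by_setting["inpatient"] >= 1 or by_setting["outpatient"] >= 2:
            qualified[patient_id].add(condition)
    return qualified


def fraction_with_at_least(qualified, k):
    """Fraction of patients with at least k qualifying conditions."""
    if not qualified:
        return 0.0
    return sum(len(c) >= k for c in qualified.values()) / len(qualified)


# Toy data, invented for illustration only.
records = [
    ("p1", "hypertension", "inpatient"),
    ("p1", "asthma", "outpatient"),
    ("p1", "asthma", "outpatient"),
    ("p2", "depression", "outpatient"),  # single outpatient code: not counted
    ("p2", "diabetes", "inpatient"),
]
qualified = chronic_conditions(records)
share_two_or_more = fraction_with_at_least(qualified, 2)
```

In this toy example, patient `p1` qualifies for two conditions and `p2` for one, so half of the patients have two or more. Applied to a full cohort, the same two calls with `k=2` and `k=5` would yield proportions of the kind reported above.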
I recall that of those 65,099 individuals, one man had the full spectrum of the 15 chronic conditions. He had current or historical diagnoses of hypertension, asthma, osteoporosis, congestive heart failure, atrial fibrillation/atrial flutter, ischemic heart disease, chronic kidney disease, cerebrovascular disease, depression, arthritis, diabetes, hyperlipidemia, chronic obstructive pulmonary disease, Alzheimer’s, and at least one type of cancer. This patient had all conditions—yet, strikingly, he was alive.
I now conduct research that focuses primarily on healthcare at IBM Research, and I often remember that patient and wonder whether he has survived. If so, what spectacular mixture of physical and mental characteristics made him so sick and, at the same time, so invincible? Currently, when I dissect patient profiles of a medical complexity similar to that man’s, I often ask myself whether those situations could be reversed.
Many publications describe novel findings of using nontraditional risk factors to better identify individuals at high risk of developing certain diseases. Recent advances in software, such as code-sharing tools, freely available R and Python machine-learning libraries, rapid communication tools, and new algorithms, have accelerated healthcare applications (such as decision support systems in the ER and remote patient monitoring), as well as research. Such capabilities were not sufficiently available to us a decade ago. Prediction models have gradually become more accurate because of the software and hardware advances that have accumulated over the past few decades. Combining the ability to store and rapidly process the records of hundreds of millions of individuals with current or newly developed analytic techniques may bring prediction accuracy to crystal-ball levels.
Many hardware-based technologies that once existed only in the realm of science fiction are gradually becoming available, including robotic nurse assistants, artificial retinas, and light bulbs that kill bacteria at patients’ bedsides. An even more exotic technology is Caltech’s experimental prototype that demonstrates how a paralyzed man can drink beer with the help of a mind-reading robot. Advances in prosthetics, anti-aging drugs, tooth-regeneration techniques, sensors that allow us to watch veins under the skin in real time, gel that stops bleeding, and cholesterol-removing machines are only a few examples of the disruptive technologies that may reverse chronic conditions.
Most scientists and researchers believe that big data will transform medicine despite current substantial technological barriers. However, when algorithms are applied to clinical data, they are deficient in their ability to make accurate predictions. The algorithms rely on information that occurred in the past, then attempt to create links between covariates and outcomes, and reflect the reasoning about the future at the level of the individual patient. Had the computational capabilities available to us at present existed in the 15th century, would it have been possible for Leonardo da Vinci to develop a machine-learning algorithm to predict the emergence of the 21st century’s new infectious diseases, such as severe acute respiratory syndrome (SARS)? Probably not. Like humans who rely on their intelligence, imagination, and observations of the past to make predictions, even the most advanced algorithms share a similar deficiency. Efficient machine-learning algorithms of the future must incorporate elements unrestricted to computer science or statistics into their functionality, and at this stage such elements are not yet known to science.
Hats off to the man who had all 15 chronic conditions, and hats off to his heroic struggle for survival. His unbreakable soul was enhanced by care from clinicians equipped with not only extensive experience but also state-of-the-art tools and algorithms. For now, technology combined with human care can merely delay the deterioration of his condition, at least temporarily. I believe, however, that some chronic conditions will disappear in coming years, and we—scientists, engineers, and clinicians—are here to make that happen.
1. Kartoun, U., Kumar, V., Cheng, S.C., Yu, S., Liao, K., Karlson, E., Ananthakrishnan, A., Xia, Z., Gainer, V., Cagan, A., Savova, G., Chen, P., Murphy, S., Churchill, S., Kohane, I., Szolovits, P., Cai, T., and Shaw, S.Y. Demonstrating the advantages of applying data mining techniques on time-dependent electronic medical records. Proc. of American Medical Informatics Association 2015 Annual Symposium. Nov. 2015, San Francisco, CA.
Uri Kartoun is a research staff member at IBM Research in Cambridge, MA. Previously he was a research fellow at Harvard Medical School/Massachusetts General Hospital. His Ph.D. from Ben-Gurion University of the Negev, Israel, focused on human-robot collaboration.
Copyright held by author
The Digital Library is published by the Association for Computing Machinery. Copyright © 2017 ACM, Inc.