Good medical practice depends on good clinical research. Without rigorous, replicable, reliable research findings, we cannot trust that our medical decisions are based on truth. To put it bluntly, flawed research leads to bad medicine. It’s essential that we get it right.
In this series, I have argued for a more rigorous approach. The present model of clinical research is expensive and slow, studies insufficient populations of subjects (making generalizability difficult), and lacks the power to examine important variations in the clinical and personal characteristics of individuals. In my biased view, study design determines whether research is being done. Without an appropriate design, we cannot prove whether some “input” makes an independent contribution to some “output,” nor can we adequately estimate the size of that contribution. The only designs that allow for research are the randomized controlled trial (RT) and full population research.
This is not just armchair analysis or ivory tower argument. The need for accurate clinical research is crucial for patients who need reliable medical information, for providers whom patients depend on for that information, and for our society, with its growing distrust of institutions, in general, and the medical profession, in particular.
Research Is More Than Inquiry
To be clear, I am redefining the word “research.” The dictionary definition is broad and lacks context: “investigation into and study of materials and sources in order to establish facts and reach new conclusions.” This, to me, is so nondescript that it allows everything we do to be called research; are we not constantly investigating to learn and reach new conclusions? We consult Yelp to find places to eat, we ask our partners questions to understand their moods, we put our hands out of the window to feel the temperature.
Everything in our lives is inquiry of some sort, but this is not research; research has to have an element of seeking assured truth. And truth requires a representative group of subjects, a controlled comparison, and an accurate measure of differences. Yelp does not assure I will like the food, my questions to my partner do not assure I will know their mood, and I can find the temperature more accurately with a thermometer.
Observational Studies Are Unreliable Predictors of Medical Outcomes
Observational studies (uncontrolled data collection on people who, without any plan, do one thing while others do something else) and secondary analyses of RTs are not research, either. Yes, there is inquiry. Yes, they seek new insights. But those studies do not assure truth, and truth means that if we act on the new insights, things will get better. Too many observational/secondary “research” studies have led us astray (estrogen therapy, radical mastectomy, high-dose chemotherapy and bone marrow transplant for breast cancer, the duration of anticoagulation after a stent, and many other bad ideas arising from bad study designs).
I apologize to many of my friends and teachers for the above paragraph, as many are masters at doing observational studies. In fact, much of the training in epidemiology and statistics is learning how to do observational studies, since those studies are easy to do in comparison to RTs or full population research: just get some data, explore connections between independent (input) and dependent (output) variables, and hope you have the variables on the appropriate side of the equation. Observational studies fill the pages of tens of thousands of publications. So, they serve a purpose, just not the purpose of best research for individuals who are ill.
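A small simulation can make the concern about observational data concrete. The sketch below is an illustration under invented assumptions, not an analysis of any real study: the treatment is assumed to have no effect at all, healthier people are assumed to be more likely to choose it, and every probability and effect size is hypothetical.

```python
import random


def simulate(n=20000, seed=0):
    """Compare an observational estimate with a randomized one.

    All numbers are invented for illustration.
    """
    rng = random.Random(seed)
    true_effect = 0.0  # assumption: the treatment does nothing

    def outcome(healthy, treated):
        # The outcome depends on baseline health, not on treatment.
        return 1.0 * healthy + true_effect * treated + rng.gauss(0, 1)

    def mean(values):
        return sum(values) / len(values)

    # Observational data: healthier people tend to choose the treatment.
    obs_treated, obs_control = [], []
    for _ in range(n):
        healthy = rng.random() < 0.5
        treated = rng.random() < (0.8 if healthy else 0.2)
        group = obs_treated if treated else obs_control
        group.append(outcome(healthy, treated))

    # Randomized trial: a coin flip assigns the treatment.
    rt_treated, rt_control = [], []
    for _ in range(n):
        healthy = rng.random() < 0.5
        treated = rng.random() < 0.5
        group = rt_treated if treated else rt_control
        group.append(outcome(healthy, treated))

    return (mean(obs_treated) - mean(obs_control),
            mean(rt_treated) - mean(rt_control))


observational_diff, randomized_diff = simulate()
print(f"observational difference: {observational_diff:.2f}")
print(f"randomized difference:    {randomized_diff:.2f}")
```

Under these assumptions the observational comparison shows a clear apparent benefit, because the treated group started out healthier, while the randomized comparison stays close to the true effect of zero.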
AI Is Just a Sophisticated Form of Observational Study
Which brings me to the newest, rapidly growing, highly touted form of medical inquiry: artificial intelligence (AI). And a question: Is AI clinical research? A recent review highlighted 32 examples, including faster and better diagnoses, improved radiology and pathology interpretations, novel materials discovered in yeast, new drugs found for rare diseases, better supply chain management, and robotics for surgery. According to one blog on AI, Google Analytics had over 2,400 projects using AI; one in five companies added AI capability; some estimate that AI will be a $150-billion-per-year industry. Investors in AI are brilliant leaders in information sciences, far more advanced in AI than I am, and they consider AI to be a burgeoning research field. It does not, however, fit my definition of clinical research. AI is a sophisticated form of observational study, but it is still observational study. It’s an important distinction. For all its power, AI has yet to evolve to a point where we can rely on its findings for generalizable medical decision-making.
When the discipline began in 1956, AI was defined as the simulation of human cognition. AI has since expanded to include not just cognitive models, but emotional and social models, and their data, as well. AI programs are, basically, statistical programs. “Machine learning” is essentially regression models and “if-then” statements (algorithms) looking for correlations between inputs and outputs, which make sense only in comparison to known relationships between those inputs and outputs. The last part of that sentence is key; computers are remarkable, but AI programs need structure and instruction. The output of an AI model uses some internal representation of associations. Certainly, the statistical models and programs are advancing, but the conceptual idea remains similar to statistical modeling.
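The “regression looking for correlations between inputs and outputs” idea can be sketched in a few lines. This is a deliberately minimal illustration using ordinary least squares on made-up numbers; `fit_line`, `known_inputs`, and `known_outputs` are hypothetical names, and real machine learning systems are far more elaborate.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical "known relationships": inputs paired with known outputs.
# The values are invented and follow y = 2x + 1 exactly.
known_inputs = [1.0, 2.0, 3.0, 4.0, 5.0]
known_outputs = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(known_inputs, known_outputs)

# The fitted association is then applied to an input it has not seen.
prediction = slope * 6.0 + intercept
print(slope, intercept, prediction)
```

The fit is only as good as the known input/output pairs it is given, which is the sense in which such programs need structure and a reference to learn from.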
What makes AI different is that there are now new sources of input variables. Computers can capture optical data, size/volume data, spoken/written words, pixels of radiology data, and data in any sort of unstructured format. Some programs (neural networks) even create new variables by studying multiple layers of a variable, not just its existence. For example, a cancer cell can be modeled by color change, change in the volume of the cell, or growth rates, and all of these can be combined into a new variable to correlate with known examples of cancer cells. Some AI, called “deep learning,” relies on machines to capture, self-refer, and reinforce learning without any external human input (some define machine learning as requiring human tweaking, while deep learning is supposedly on its own, but even that needs a target, or reference, to aim at).
More Data Doesn’t Mean More or Better Knowledge
All this new, or newly formed, or newly reconfigured data, however, does not assure we will learn more. It will depend on how distinctly independent the new input data will be in comparison to older, known relationships (you may now have data to identify the make of my car, but the only thing that needs knowing is that it starts and gets me where I want to go).
Clinical research must demand a planned comparison of AI versus human judgment in a representative, random sample of people to see if AI insights are generalizable. AI may bring ideas to test, but it can’t be the tester. Why? Because AI is working with observational data, gathered for a purpose other than the purpose of the AI program. This is a big problem; this is not research.
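To illustrate why the sample matters in such a comparison, here is a hedged sketch with a simulated population. Every accuracy figure is invented for illustration, the “easy” and “hard” case labels are hypothetical, and nothing here reflects measured performance of any real AI system or clinician.

```python
import random

rng = random.Random(1)


def make_case():
    """Return (hard, ai_correct, human_correct) for one simulated case.

    Assumed accuracies (invented for illustration only):
      AI:    0.95 on easy cases, 0.60 on hard cases
      human: 0.85 on easy cases, 0.80 on hard cases
    """
    hard = rng.random() < 0.5
    ai_correct = rng.random() < (0.60 if hard else 0.95)
    human_correct = rng.random() < (0.80 if hard else 0.85)
    return hard, ai_correct, human_correct


population = [make_case() for _ in range(50000)]


def accuracy_gap(cases):
    """AI accuracy minus human accuracy on the same set of cases."""
    ai = sum(case[1] for case in cases) / len(cases)
    human = sum(case[2] for case in cases) / len(cases)
    return ai - human


# Convenience sample: data that happened to be available (easy cases only).
convenience = [case for case in population if not case[0]][:5000]

# Planned comparison: a simple random sample of the whole population.
random_sample = rng.sample(population, 5000)

gap_convenience = accuracy_gap(convenience)
gap_random = accuracy_gap(random_sample)
print(f"AI minus human, convenience sample: {gap_convenience:+.3f}")
print(f"AI minus human, random sample:      {gap_random:+.3f}")
```

Under these assumptions the convenience sample makes the AI look better than the human, while the random sample of the full population reverses the conclusion.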
I’m willing to concede that I may eventually be proved wrong. Some people even go as far as to suggest that clinical research funding should end until we see what the machines can do. But the machines were made by us, and the rules and data collected are biased by human input variables.
It will be fascinating to see how the focus on AI plays out. I hope we will not bring on another “AI winter” (no AI funding) by overselling the tool before we figure out how to use it. My hope, also, is that we study AI just like we study other things, with random samples of full populations, or full populations, with really smart study designs. Only then will we truly know both the strengths and limitations of AI for good medical decision-making.
Founded as ICLOPS in 2002, Roji Health Intelligence guides health care systems, providers and patients on the path to better health through Solutions that help providers improve their value and succeed in Risk. Roji Health Intelligence is a CMS Qualified Clinical Data Registry.