We all know our American health care system is on an unsustainable cost trajectory. During my adult lifetime the cost of healthcare has exceeded inflation, as well as growth in gross domestic product, nearly every year. The effects of this excessive inflation have been felt hardest by corporations who want to do the best they can for their employees, but remain profitable and competitive.
What if it were possible to cut one third of your spend on health care, while helping your employees live safer, healthier lives? There is strong evidence from studies of utilization, expert panel review, and regional variation in spend that a significant portion of health care does not meet evidence-based standards for its delivery or improve community health. Artificial intelligence applied to utilization review can eliminate this variance. It’s the new knife we need to trim the fat. This approach could result in higher employee satisfaction, as well as a safer health care system we can afford.
Utilization Review isn’t working
Utilization review is the check and balance of our health care system. When you go see a doctor with a problem that requires treatment, she will take your history, perform a physical exam, consider the available data, and make recommendations regarding the risks, benefits, and alternatives to a proposed treatment. The doctor’s recommendation is based on her training, experience, and capabilities. Your insurance company then reviews the recommendation, compares it to internal guidelines, and approves or denies payment for the procedure. This review process is called utilization review, and it’s the way our system protects us from unnecessary procedures by ensuring evidence-based processes are followed. There is strong evidence that utilization review as we now do it is not adequate to get the job done. This evidence comes from utilization studies, expert panel review of cases, and regional variability in health care expenditure.
Utilization studies examine how out-of-pocket cost affects our consumption of health care. If health care were free, how much more of it would you use? And conversely, if it were expensive, would you use less? The Rand Corporation performed the mother of all utilization studies between 1971 and 1986. They studied 2751 families by randomizing them into groups ranging from totally free care to really high out-of-pocket expense. Over the years, people with high out-of-pocket expense used around 30% less care than those for whom the care was free. Except in the case of poor families with certain diseases like high blood pressure, there was no real difference in health or satisfaction. This suggests that there is room to cut healthcare spending without making people sicker or making them suffer more. The place to start would be eliminating procedures that really don’t help that much, and making sure the right patients have the right procedures.
Utilization review is supposed to make sure that the right patients get the right procedures. But retrospective studies applying evidence-based methods show that over a third of the medical procedures examined were not indicated by the best available data. A typical study assembles a panel of experts and has them review charts from a group of patients and opine as to whether the patients met evidence-based criteria for the procedure. You could call this a juried panel of physician experts. Using this paradigm researchers report that 34% of knee replacements, 23% of pacemakers, 12% of cardiac catheterizations, and a whopping 70% of hysterectomies did not follow best practice despite having undergone utilization review.
These studies are too small to be conclusive on their own, but fluctuations in Medicare’s spend across geographies suggest they may be on to something. In 2016 Medicare spent $4,589 per person for hospitalization in Fort Worth, Texas and $2,477 per person in La Crosse, Wisconsin. Are the people of Fort Worth so physiologically different from those of La Crosse that they require nearly twice as much hospital spending? Notably, Medicare does not use utilization review. Kaiser does, and their numbers are a little better, but they still show substantial variance.
Regional fluctuations in health care delivery are so profound in the US that Dartmouth has produced an Atlas of Healthcare in America for the last 20 years. In Elyria, Ohio, the rate of cardiac stents per capita is three times what it is in nearby Cleveland, Ohio, even though the world-renowned Cleveland Clinic cardiovascular center is not in Elyria. These differences in procedure utilization across communities are not explained by poverty, or the degree of illness in the population, or the cost of care delivered. What is really interesting, though, is that the difference in spend is also not clearly associated with greater health or higher patient satisfaction. This type of data, combined with the utilization studies cited above, led the American Academy of Physicians to conclude that “In aggregate health care spending could be reduced by one third without affecting population health.”
If it's not poverty, prevalence of illness, or even cost, then what does explain the regional differences in healthcare utilization? It comes down to doctors. While physicians earn only 20 cents of the health care dollar, it is widely believed that our orders and preferences account for 80% of the spend. While patient preference may be a variable, there’s no real reason to believe people in one area want to be hospitalized, or have cardiac stents, more than people in the next town. Every one of those patients in Fort Worth was admitted to the hospital with an order for services written by a doctor. Why do doctors in Fort Worth order more expensive care than doctors in La Crosse? After all, they all have similar medical school, internship, and residency training.
Training fades as doctors get farther and farther out from residency. I know; I am one. In addition, advances in medical science often make old training obsolete. Doctors try to stay up to date with continuing medical education, and national specialty boards offer certification for passing a test every ten years. But over time doctors tend to take on the characteristics of their communities. As a young surgeon I once saw a patient of mine from clinic in the hospital, recovering from surgery performed by another neurosurgeon. I asked him what he was doing in the hospital and he told me, “When you didn’t want to do the surgery, my doctor sent me to another guy, and I just had the surgery. I’m doing great!” The standard of care is ultimately determined by the community, and therefore can vary between communities.
This problem is not going to be solved by changes in healthcare delivery systems, physician integration to combat fragmentation, better electronic medical records, or changes in payment schemes. Kaiser Permanente is a good example. The Kaiser system is wholly integrated, shares medical records, and has a physician culture dedicated to population health. But studies of the Kaiser system show smaller, but still similar, degrees of geographic variation in spend when compared with Medicare. Similarly, attempts at integrating care through Accountable Care Organizations have resulted in some savings, but nothing like the 30% that utilization and juried panel studies suggest is possible.
We’ve tried the physician education approach before without success. Regional differences in tonsillectomies in children were identified in England in 1938. A 70-year effort to change payment schemes, educate, and even force physicians to reduce the number of tonsillectomies done in children was still without success when last reported in 2010. There’s no reason to believe that changing delivery systems, payment schemes, and delivery structures yet again is going to succeed this time, when these approaches have been failing for 70 years.
Artificial Intelligence can solve this problem
Artificial intelligence applied to utilization review can reduce the number of procedures that don’t meet evidence-based guidelines, increase patient satisfaction, and ultimately reduce variations in healthcare delivery. This can be done by training models to simulate the care recommendations of an expert panel, or by offering decisions for care based on criteria developed through risk stratification.
Let's start with the expert juried panel paradigm. Suppose you go see your primary doctor about a pain in your right groin that is worse with walking and has been present for several years. You did the same thing last year and were referred for physical therapy, but the pain persists. Your primary doctor refers you to an orthopedic surgeon, who explains that your groin pain with walking is due to arthritis of your hip, and that the x-rays show the problem is so advanced that the best treatment is a hip replacement. Of course you trust your doctors, but wouldn't it be great if you could open an app and get a second opinion from a whole panel of expert joint replacement surgeons in a couple of milliseconds? Not only could you double check, your insurance company could do the same thing. This is now possible using the programming techniques of artificial intelligence.
Neural networks are trained by extracting features of the data from the available inputs, then developing an underlying digital model linking the combination of inputs with the output. In the case of medical procedures, the inputs are the patient's history, survey data, physical examination, laboratory studies, and available imaging such as X-ray, ultrasound, nuclear medicine studies, magnetic resonance imaging, and computed tomography. The output of the model is a recommendation: surgery, or no surgery. The network is trained on data obtained from a juried panel of experts. The following figure shows graphically how such a model could be developed.
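To make the training step above a little more concrete, here is a minimal sketch in plain Python of a tiny neural network learning to imitate a panel's vote. Everything in it is an assumption made for illustration: the three inputs (`pain`, `xray`, `failed_pt`), the made-up `panel_label` rule standing in for real expert decisions, and the network size. A real system would train on actual panel-reviewed charts and far richer inputs.

```python
import math
import random

random.seed(0)

def panel_label(pain, xray, failed_pt):
    # Hypothetical stand-in for the expert panel's vote (1 = recommend surgery).
    return 1 if pain + xray + 0.5 * failed_pt > 1.25 else 0

# Synthetic "charts": pain score and X-ray grade scaled to 0..1,
# plus a 0/1 flag for failed physical therapy.
cases = []
for _ in range(200):
    x = [random.random(), random.random(), float(random.randint(0, 1))]
    cases.append((x, panel_label(*x)))

N_IN, N_HID = 3, 4
w1 = [[random.uniform(-0.5, 0.5) for _ in range(N_IN)] for _ in range(N_HID)]
b1 = [0.0] * N_HID
w2 = [random.uniform(-0.5, 0.5) for _ in range(N_HID)]
b2 = 0.0

def forward(x):
    """Return hidden activations and the probability of 'recommend'."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
              for j in range(N_HID)]
    z = sum(w * h for w, h in zip(w2, hidden)) + b2
    z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
    return hidden, 1.0 / (1.0 + math.exp(-z))

def train(epochs=300, lr=0.1):
    """Plain stochastic gradient descent on the cross-entropy loss."""
    global b2
    for _ in range(epochs):
        random.shuffle(cases)
        for x, y in cases:
            hidden, p = forward(x)
            d_out = p - y  # gradient of the loss at the output unit
            for j in range(N_HID):
                # Back-propagate through tanh before updating w2[j].
                d_hid = d_out * w2[j] * (1.0 - hidden[j] ** 2)
                w2[j] -= lr * d_out * hidden[j]
                b1[j] -= lr * d_hid
                for i in range(N_IN):
                    w1[j][i] -= lr * d_hid * x[i]
            b2 -= lr * d_out

def recommend_probability(pain, xray, failed_pt):
    return forward([pain, xray, float(failed_pt)])[1]

train()
accuracy = sum((recommend_probability(*x) > 0.5) == bool(y)
               for x, y in cases) / len(cases)
print(f"agreement with the simulated panel: {accuracy:.0%}")
```

Calling `recommend_probability(0.9, 0.9, 1)` then plays the role of the instant second opinion: a number between 0 and 1 estimating how likely the simulated panel would be to recommend surgery.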
One of the challenges of neural networks is that it’s hard to know exactly how the model came to its recommendation. Alternatively, the decision to recommend or not recommend surgery could be reduced to a series of decision trees arranged into a forest. From a logical standpoint this paradigm more closely resembles how human experts process patient problems and come to a recommendation. Does the history suggest the problem is severe enough to warrant surgery? Does the X-ray meet expert criteria for the procedure? Is the patient healthy enough to undergo the procedure at acceptable risk? All of the relevant variables and decision points can be put together to make a decision tree. Artificial intelligence affords techniques, such as random forest generation, that identify the most important variables in predicting outcome and then optimize the decisions to arrive at the best recommendation. One of the great things about the random forest technique, as opposed to neural networks, is that we can understand how the model came to its recommendation. This is super important in medicine, where we will need to communicate the results to a real patient whose real life is going to be affected by the result.
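The forest idea can be sketched in a few dozen lines of plain Python. This is a simplified illustration under stated assumptions, not a production method: the feature names, the made-up `panel_vote` rule, and the deliberately irrelevant `shoe_size` input are all hypothetical. Each tree is grown on a bootstrap sample, considers a random pair of features at every split, and adds its Gini gain to a running importance score.

```python
import random
from collections import Counter

random.seed(1)

# Hypothetical inputs; shoe_size is deliberate noise the forest should ignore.
FEATURES = ["symptom_severity", "xray_grade", "surgical_fitness", "shoe_size"]

def panel_vote(row):
    # Made-up stand-in for expert consensus: all three criteria must be met.
    return int(row[0] > 0.5 and row[1] > 0.5 and row[2] > 0.5)

rows = [[random.random() for _ in FEATURES] for _ in range(300)]
labels = [panel_vote(r) for r in rows]

def gini(ys):
    p = sum(ys) / len(ys)
    return 2.0 * p * (1.0 - p)

def best_split(idx, feats):
    """Return (gain, feature, threshold) for the best Gini split, or None."""
    parent = gini([labels[i] for i in idx])
    best = None
    for f in feats:
        for k in range(1, 10):
            t = k / 10
            left = [labels[i] for i in idx if rows[i][f] <= t]
            right = [labels[i] for i in idx if rows[i][f] > t]
            if not left or not right:
                continue
            child = (len(left) * gini(left) + len(right) * gini(right)) / len(idx)
            if best is None or parent - child > best[0]:
                best = (parent - child, f, t)
    return best

def grow(idx, depth, importance):
    """Grow one tree; leaves are 0/1, inner nodes are (feature, threshold, left, right)."""
    ys = [labels[i] for i in idx]
    majority = Counter(ys).most_common(1)[0][0]
    if depth == 0 or len(set(ys)) == 1:
        return majority
    feats = random.sample(range(len(FEATURES)), 2)  # random feature subset
    split = best_split(idx, feats)
    if split is None or split[0] <= 0:
        return majority
    gain, f, t = split
    importance[f] += gain * len(idx)
    left = [i for i in idx if rows[i][f] <= t]
    right = [i for i in idx if rows[i][f] > t]
    return (f, t, grow(left, depth - 1, importance), grow(right, depth - 1, importance))

importance = [0.0] * len(FEATURES)
forest = []
for _ in range(25):
    sample = [random.randrange(len(rows)) for _ in rows]  # bootstrap sample
    forest.append(grow(sample, 6, importance))

def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def predict(row):
    """Majority vote of the forest: 1 = recommend surgery."""
    votes = sum(predict_tree(tree, row) for tree in forest)
    return int(votes * 2 > len(forest))

def explain(tree, row):
    """List the questions one tree asked on its way to a leaf."""
    path = []
    while isinstance(tree, tuple):
        f, t, left, right = tree
        went_left = row[f] <= t
        path.append(f"{FEATURES[f]} {'<=' if went_left else '>'} {t}")
        tree = left if went_left else right
    return path

for name, score in sorted(zip(FEATURES, importance), key=lambda p: -p[1]):
    print(f"{name}: {score:.1f}")
```

The `explain` helper is the point of the exercise: for any patient it returns the plain-language chain of questions a tree asked, which is the kind of reasoning a doctor can actually walk a patient through.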
Another approach groups together large numbers of patient histories as inputs and uses the probability of selected functional outcomes as the output. For example, we can train a neural network or random forest on a population of patients who had a procedure, with outputs of patient satisfaction, functional improvement, infection, chronic pain, and continued use of narcotics. Using this model, self-insured companies or commercial insurance payers could calculate criteria for the desired results. For example, we know infection will always be a risk of surgery, but by controlling the input criteria, how low can we possibly make that risk? This is determined mathematically using a receiver operating characteristic (ROC) curve of the data from each decision point. Once an acceptable risk of infection is determined, the model can then predict whether a new patient’s risk meets that criterion for surgery. If a patient’s risk of complication or poor outcome is higher than acceptable, then the carrier could deny the procedure.
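As a rough sketch of how an ROC curve turns predicted risk into a yes-or-no criterion, the Python below walks every possible cutoff, records the true-positive and false-positive rates, and picks a cutoff. The ten risk scores and outcomes are invented for illustration, and choosing the cutoff by Youden's J statistic (true-positive rate minus false-positive rate) is one common convention assumed here; a payer could just as reasonably fix a maximum tolerated infection rate instead.

```python
# Hypothetical model outputs: predicted infection risk for ten past patients,
# and whether each one actually developed an infection (1 = yes).
risk_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.55, 0.70, 0.80, 0.90]
infected    = [0,    0,    0,    0,    1,    0,    1,    0,    1,    1]

def roc_points(scores, outcomes):
    """One (false-positive rate, true-positive rate, cutoff) triple per cutoff."""
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, outcomes) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, outcomes) if s >= t and y == 0)
        points.append((fp / negatives, tp / positives, t))
    return points

def area_under_curve(points):
    """Trapezoid rule, starting from the origin of the ROC plot."""
    area, prev_fpr, prev_tpr = 0.0, 0.0, 0.0
    for fpr, tpr, _ in points:
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_fpr, prev_tpr = fpr, tpr
    return area

points = roc_points(risk_scores, infected)
auc = area_under_curve(points)

# Youden's J: the cutoff that best separates infections from non-infections.
best_fpr, best_tpr, cutoff = max(points, key=lambda p: p[1] - p[0])

def meets_criterion(predicted_risk):
    """True if a new patient's predicted risk falls below the chosen cutoff."""
    return predicted_risk < cutoff

print(f"AUC = {auc:.3f}, cutoff = {cutoff}")  # AUC = 0.875, cutoff = 0.3
```

With this toy data, a new patient with a predicted risk of 0.25 would meet the criterion, and one at 0.45 would not.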
A great feature of this approach is that outcome data will be available for patients and their doctors. In our legacy health care system doctors do their best to inform patients of the risks of procedures compared to the benefits, but they don’t really know the numbers. Except for the very rare physician who participates in a registry, most doctors don’t know with certainty their own complication rates. Medicine is no different than any other human endeavor: you get the performance you measure and manage toward. The models described above will naturally generate the information doctors need to improve, and patients need to make informed choices. By giving patients the actual risks, we will increase their satisfaction with the process by giving them real data on which to make their treatment decisions.
One of the neat things about using artificial intelligence in utilization review is that the “second opinion” could actually be done first. A typical surgeon sees ten patients for every two that require surgery. This is a real misallocation of resources for the health care system, and a waste of time and money for patients. With the models described above a primary doctor could coordinate better, more evidence-based and personalized care for patients. Consider the example of hip pain above. After taking your history and doing the physical exam, your primary doctor could run an app to find out if a juried panel of experts thinks you need hip replacement. If the jury says no, your doctor can recommend additional exercises and stem cell derived protein injection instead. You get the treatment you need, without a wasted trip and the expense of seeing a surgeon. And the surgeon is free to dedicate more time to patients he can really help.
My colleagues and I at Aptus Engineering are developing tools that we believe can save trillions of dollars on healthcare while making it safer for patients. We think we can help large self-insured employers and commercial insurance carriers reduce their spend on procedures by 30%, while leaving employees happier and healthier. We can help large multi-specialty practice organizations empower primary care physicians to practice to the full extent of their license by supplementing their care with artificial intelligence. We are also looking for venture capital to help further fund and develop our capabilities. If you have any questions or comments, please contact me.