Brett Keller, MPA
Back in September the New York Times reported on an unexpected finding from a clinical trial: “A promising but expensive device to prop open blocked arteries in the brain in the hope of preventing disabling or fatal strokes failed in a rigorous study.” Many promising medical innovations fall short when they finally reach clinical trials, but this story was unusual because the stents had already been approved by the FDA under a so-called humanitarian exemption. The FDA approved the stents to reduce the risk of stroke, yet patients who received them had twice as many strokes.
How did this happen? The Times chronicled experts’ puzzlement: “Researchers said the device seemed as if it should work.” And Joseph Broderick, a prominent neurologist, is quoted as saying, “Quite frankly, the results were a surprise.” Researchers are delving into this case to discover why the stent failed, but policymakers in every field should take it as a valuable lesson. This is one more argument for testing policies whenever possible: not only does expert opinion sometimes get things wrong, but without good data there is often no way to know when experts are right.
Similar lessons can be gleaned from the history of surgical responses to breast cancer. In The Emperor of All Maladies (2010), a new history of cancer, oncologist Siddhartha Mukherjee chronicles failed interventions such as the radical mastectomy. Over a period of decades this brutal procedure, which removed the breasts, lymph nodes, and much of the chest muscles, became the tool of choice for surgeons treating breast cancer. Then, in the 1970s, rigorous trials comparing radical mastectomy to more limited procedures showed that this terribly disfiguring operation did not in fact help patients live longer at all. Some surgeons refused to believe the evidence; to believe it would have required them to acknowledge the harm they had done. But eventually the radical mastectomy fell from favor, and today it is quite rare. Many similar stories are collected in a free e-book titled Testing Treatments (2011).
As a society we’ve come to accept that medical devices should be tested by the most rigorous and neutral means possible, because the stakes are life and death for all of us. Thousands of people faced with deadly illnesses volunteer for clinical trials every year. Some of them survive while others do not, but as a society we are better off when we know what actually works. For every downside, like the delay of a promising treatment until evidence is gathered properly, there is an upside: something we otherwise would have assumed was a good idea is revealed not to be helpful at all.
Under normal circumstances most new drugs are weeded out as they face a gauntlet of tests for safety and efficacy required before FDA licensure. The stories of the humanitarian-exemption stent and the radical mastectomy are different because these procedures became widely used before there was rigorous evidence that they helped, though in both cases there were plenty of anecdotes, case studies, and small or non-controlled studies that made it look like they did. This haphazard, post-hoc testing is analogous to how policy is developed in many other fields, from welfare to education. Many public policy decisions have considerable impacts on our livelihoods, education, and health. Why are we not similarly outraged by poor standards of evidence that lead to poor outcomes in other fields?
A recent example from New York City helps illustrate how helpful good evidence can be in shaping policy. A few years ago Mayor Michael Bloomberg rolled out a massive program that seemed to make a lot of sense: pay teachers bonuses based on their students’ performance. The common-sense proposal was hailed as “transcendent” and gained the support of the teachers’ union. It cost $75 million, and it didn’t work. How do we know? The program was designed from the beginning as a pilot in which schools were randomly assigned to the program or to a control group, and the research showing that the program had no effect on outcomes was subsequently published. What would have happened if this policy had been put in place without an effective evaluation plan? In all likelihood New York officials would now be touting its success at conferences and urging other cities to implement similar programs. Instead it was quietly shelved. That this particular program did not have the intended effect is disappointing, but that is much better than if we had believed it worked and continued on unaware.
The pros and cons of randomized trials have been discussed here on 14 Points before; see recent posts by Jake Velker and Shawn Powers. The cases I have presented here are ones where the results were not “no-brainers” at all, and without systematic evaluation bad policies would have been, or tragically were, put in place. While good evidence does not have to come from randomized trials, there are still many areas where they are underused. In areas where they are feasible (i.e., not macroeconomics) such evidence should be the norm, and those who implement policies with great optimism but without planning for thoughtful evaluation should be panned. Even without random assignment of the treatment, the best policy evaluations should involve a serious attempt to estimate the counterfactual: what would have happened in the absence of the intervention. Moving beyond arguments over specific programs and whether they work, policymakers can move us toward better outcomes by creating a culture where strong evidence is valued. After all, the clinical trial as we know it in medicine is a 20th-century innovation; it hasn’t always been this way.