Over a third of Americans in recent surveys say they will decline a vaccine against COVID-19. Both people with vaccine hesitancy and policy-makers need information on vaccine safety.
In response, the FDA has an ambitious program to monitor the safety of COVID-19 vaccines. Unfortunately, the proposed analyses will not isolate the causal effect of vaccination, so they risk erroneously reporting side effects, which may slow progress toward herd immunity.
Fortunately, more rigorous study designs using these data can causally identify side effects. The same data and study designs can also measure the relative value of different vaccines in fighting COVID-19 and in reducing transmission. Importantly, these studies can measure results for different groups (e.g., sex, age, race, and comorbidities), permitting personalized recommendations.
The current plan
The FDA safety surveillance protocol uses an impressive research platform with health data from hundreds of millions of people. The protocol’s primary analysis counts adverse events of interest such as strokes and Bell’s palsy. These counts are typically for the four to six weeks post-vaccination (depending on the event). The analysis then compares those rates with pre-pandemic rates in groups of similar age, sex, and (when available) race/ethnicity.
The protocol notes the challenges of causal inference, given that vaccination will not be randomized. For example, nursing home residents both receive early vaccines and suffer many health problems. If nursing home residents have more vaccinations and more health problems than others of similar ages, that correlation need not indicate causality. Thus, the protocol briefly mentions follow-up analyses that adjust more carefully for observable characteristics. Unfortunately, controlling for observable characteristics is not sufficient to build confidence in results when (1) the vaccination system initially targets those with health problems and (2) those most concerned about their health may seek the vaccine first.
Fortunately, the FDA’s data infrastructure permits study designs that can better measure the effect of the vaccine. For example, the CDC recommends that the vaccine be available first to those 75 and over, and then to those 65 and over. It is straightforward to compare those just too young to be eligible with those just old enough to be eligible. This “regression discontinuity” design can provide evidence almost as reliable as a randomized trial. My colleague Michael Anderson and his coauthors recently used this study design to examine the effectiveness of the flu vaccine in the UK, based on a clinical guideline that substantially increased vaccination rates for those 65 and over.
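To make the logic of the design concrete, here is a minimal simulation sketch: an eligibility cutoff at age 65 sharply raises vaccination rates, and we compare people just below the cutoff with people just above it. All numbers (the 80% vs. 10% take-up, the baseline risk, the assumed effect) are invented for illustration, not estimates from any real study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: eligibility at age 65 raises vaccination from
# roughly 10% to roughly 80%; vaccination lowers an outcome's risk
# by an assumed 2 percentage points (all values are invented).
n = 100_000
age = rng.uniform(55, 75, n)
eligible = age >= 65
vaccinated = rng.random(n) < np.where(eligible, 0.80, 0.10)
outcome = rng.random(n) < (0.10 - 0.02 * vaccinated)

# Compare local means within a narrow bandwidth around the cutoff.
bw = 2.0
just_below = (age >= 65 - bw) & (age < 65)
just_above = (age >= 65) & (age < 65 + bw)

jump_in_outcome = outcome[just_above].mean() - outcome[just_below].mean()
jump_in_vaccination = (
    vaccinated[just_above].mean() - vaccinated[just_below].mean()
)

# Fuzzy-RD (Wald) estimate: scale the outcome jump by the take-up jump.
effect = jump_in_outcome / jump_in_vaccination
print(round(float(effect), 3))  # recovers an estimate near the true -0.02
```

Because people a few months on either side of the cutoff are otherwise nearly identical, the jump at the threshold isolates the vaccine's effect in a way that simple before/after rate comparisons cannot.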
A similar study of teens would have great policy relevance. The FDA has approved or will approve most vaccines for those ages 18 and over, and the Pfizer-BioNTech vaccine for ages 16 and over. During 2020, the CDC reported 4 deaths per million Americans ages 15-17. Thus, fear of side effects that show up (for example) 300 times per million vaccinations could deter many parents from vaccinating their children. At the same time, that rate (3 per 10,000) is too low to detect in a randomized trial with ten thousand or so vaccinated participants. A regression discontinuity comparing a million 16-year-olds with a million 15-year-olds provides the sample size needed to detect such problems, or to reassure parents and policy-makers that no such problems exist.
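The sample-size claim follows from a back-of-envelope Poisson calculation. The background rate below is the CDC figure cited above (4 deaths per million, ages 15-17); the 300-per-million excess is the article's hypothetical side-effect rate, not a real estimate:

```python
from math import sqrt

background_rate = 4 / 1_000_000    # deaths per person, 2020, ages 15-17
excess_rate = 300 / 1_000_000      # hypothetical side-effect rate

z_scores = {}
for n in (10_000, 1_000_000):
    expected_excess = excess_rate * n
    # Poisson standard deviation of the observed event count
    noise = sqrt((background_rate + excess_rate) * n)
    z_scores[n] = expected_excess / noise
    print(n, "excess events:", round(expected_excess, 1),
          "z:", round(z_scores[n], 1))
```

With 10,000 vaccinees the expected 3 excess events sit within statistical noise (z below 2), while with a million per side the signal is overwhelming (z above 17) — which is why the discontinuity design, not a conventional trial, can answer parents' question.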
In addition, states will vary in when they vaccinate different groups. For example, it is likely that some states will start immunizing those ages 65-74 many weeks earlier than other states. If so, we can compare COVID-19 incidence among those ages 65-74 in states with early versus late vaccination (controlling for rates of COVID-19 in other age groups and months). Such a study would measure the protective effect of the first dose of the vaccine.
In addition, it is likely that some states will extend eligibility to African Americans and Hispanics at younger ages than other states will. We can then compare outcomes for that somewhat younger cohort in states where they were eligible earlier versus later.
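The state-comparison logic above is a difference-in-differences calculation: the change in incidence between early and late states for the newly eligible group, net of the change for a group that was not yet eligible anywhere. The rates below are invented per-1,000 figures purely to show the arithmetic:

```python
# Hypothetical incidence per 1,000 residents in month two of the rollout.
# Ages 65-74 were vaccinated only in the "early" states; younger residents
# (not yet eligible anywhere) absorb state-level differences in the epidemic.
early_65_74, late_65_74 = 6.0, 9.0
early_young, late_young = 8.0, 8.5

did = (early_65_74 - late_65_74) - (early_young - late_young)
print(did)  # -2.5: the protective effect attributed to earlier first doses
```

The subtraction of the younger cohort's gap is what guards against the worry that early-vaccinating states simply had milder outbreaks overall.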
These same data and study designs can answer other important questions.
For example, these enormous datasets are well suited for using machine learning to estimate how the safety and effectiveness of each vaccine vary with recipient characteristics. Such studies will also need information on how each brand of vaccine was distributed. Ideally, each person could then receive a personalized recommendation for which vaccine is most likely to be safe and effective.
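A simple stratified comparison illustrates the idea (richer machine-learning models would extend the same logic to many characteristics at once): estimate each brand's adverse-event rate within each subgroup, then recommend the lower-risk brand per group. The brands, age groups, and risk differences below are all simulated, not real results.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
age_group = rng.integers(0, 3, n)   # 0: 16-39, 1: 40-64, 2: 65+
brand = rng.integers(0, 2, n)       # 0 and 1 stand in for two brands

# Invented adverse-event risks that differ by brand within age groups:
# brand 1 is riskier for 65+, brand 0 is riskier for the youngest group.
risk = (0.001
        + 0.002 * brand * (age_group == 2)
        + 0.001 * (1 - brand) * (age_group == 0))
event = rng.random(n) < risk

recommend = {}
for g in range(3):
    rates = [event[(age_group == g) & (brand == b)].mean() for b in (0, 1)]
    recommend[g] = int(np.argmin(rates))
    print(f"age group {g}: lower-risk brand is {recommend[g]}")
```

With tens of thousands of people per cell, even small risk differences between brands become visible within subgroups — the basis for personalized recommendations.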
Understanding which vaccines are most effective at reducing transmission is important for ending the pandemic. Fortunately, if we link people who live in the same home, we can measure the effects of each vaccine on transmission.
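The household-link idea reduces to comparing secondary attack rates: of the household contacts of an infected person, what fraction also became infected, by the index case's vaccination status? The counts below are invented for illustration:

```python
# Hypothetical counts: (secondary cases, household contacts) grouped by
# the first-infected household member's vaccination status.
contacts_by_group = {
    "vaccine brand A": (120, 2_000),
    "vaccine brand B": (200, 2_000),
    "unvaccinated":    (500, 2_000),
}

attack_rate = {g: s / c for g, (s, c) in contacts_by_group.items()}
for group, rate in attack_rate.items():
    print(group, rate)  # secondary attack rate among household contacts
```

A lower secondary attack rate for one brand's index cases, relative to unvaccinated index cases, would be direct evidence that that brand reduces onward transmission and not just disease.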
All of these study designs require that most practitioners follow their state’s guidelines and that most people eligible for the vaccine accept it. All of them can also build on the sophisticated statistical techniques the FDA already uses.
In short, the FDA and its partners have created an invaluable resource for studying vaccine safety. Adding stronger study designs can create more convincing evidence on vaccine safety. As importantly, the same data and study designs can answer crucial questions on extending vaccines to younger populations, the safest and most effective vaccines for different groups, and how well each vaccine reduces transmission.