Today we celebrate US Independence Day with some more Epidemiology 101. If the word “statistics” gives you yucky feelings, nausea, chills, phantom electric shocks, etc., stay right where you are. This is the post for you. I promise that I will not use this post to make you feel worse about your quantitative skills.
Still with me? Good. Let’s move on to this report released out of Brigham & Women’s Hospital in March of 2014 (I’m not gonna lie, I started writing this post like a year ago but was distracted by a spurt of dissertation productivity). The report is titled “Sex-Specific Medical Research: Why Women’s Health Can’t Wait.” There’s a lot in the report, and if health policy is your bag I recommend you go read the original, or at least the executive summary. What I want to focus on here is a recurring theme in the report pertaining to the statistical analysis of data: even studies that collect data on women may not use it. And you’ll be shocked to hear that the same problem is true for data collected on race and ethnicity. So women of color are even more understudied.
At first blush, this is a kind of puzzling finding. Why would you bother to collect data on women and/or people of color if you’re not going to use it? It’s a choice, but it’s often an unconscious choice. Like many areas of discrimination, implicit bias can influence the scientific process. And because it’s unconscious, people will resist naming it as an injustice.
In the bad old days, medical research was often carried out on men alone–often male undergraduates. It was assumed that conclusions drawn from research in men applied equally well to women. That turned out to be a bad assumption, and the repercussions for women were serious. To name the most famous example, the so-called classic symptoms of a heart attack like chest pain are really only classic in men, and are often absent when women have heart attacks (see also this PSA starring Elizabeth Banks).
The history of racism in research is more complex–a whole field of study. White scientists have used black people’s bodies as models to test medical treatment intended for white patients, and the repercussions are still being felt. Yet the opposite problem was also going on at the same time; it was fully acceptable to conduct studies on all-white populations. No surprise, the assumption that conclusions reached from data on white people applied equally to people of color also proved faulty. For example, until a few years ago the treatment of choice for Hepatitis C, a disease which disproportionately affects African-Americans, was five times less likely to cure the disease in black men than white men.
A study population of white men alone is no longer acceptable in the world of medical research. A lot of credit for this change goes to the NIH Revitalization Act, which was passed in 1993, and which requires the inclusion of women and people of color in federally funded studies. So far, so good (mostly). But inclusion of these subpopulations doesn’t actually benefit anyone if the effect of gender or race isn’t examined in the data–and surprisingly often it isn’t.
Let’s focus on race for a moment. When you read about a study that proudly announces the racial diversity of its study population, I encourage you to ask the following question: Was race examined as a factor that could change the effect being studied, or was it treated as a nuisance variable? In epidemiologic lingo I’m talking about the difference between treating race as an effect modifier and treating it as a confounder. Here’s what I mean by that.
Let’s imagine you are a doctor at a teaching hospital with a large and diverse patient population. There is a new drug on the market that is used to treat bad breath (brand name Mintifreshimab). You have noticed that several of your patients that use the anti-halitosis drug have been coming to see you with a strange cough. You are afraid this new drug has a heretofore undiscovered side effect. Coughing was not studied in the large trials that led to the approval of this drug, so you decide to study it yourself. You receive permission to review the medical records of patients at your hospital for research purposes, and you use them to find out how many of the patients who have been prescribed the halitosis drug have returned complaining of cough. For comparison you choose a control group of patients who have been prescribed a new drug to prevent flatulence (brand name Tootnomor), and find out how many of them have returned with coughs as well. This is definitely not the optimal study design for this question, but you do what you can.
You find that patients taking the anti-halitosis drug have a similar number of coughing complaints as patients taking the anti-flatulence drug. Pretty reassuring that the drug doesn’t cause the coughs. But you astutely recognize that 75% of patients being prescribed the anti-halitosis drug are white, but only 25% of patients on the anti-flatulence drug are white. You also recognize that since asthma is less common in whites than in African-Americans, you would expect to find less coughing in a population with proportionately more white people–including the population of anti-halitosis drug users–regardless of their medications. You don’t have good data on respiratory disease for some reason (let’s say someone just released some really weird malware). So you must account for race in your analysis. What do you do?
If you said “control for race” or “adjust for race” (same thing), then you’re thinking like most people in this situation. You choose a model that essentially takes a complicated average of the drug’s effect in whites and its effect in blacks. This model assumes that even if whites have less coughing overall than blacks, it has nothing to do with the anti-halitosis drug–taking the halitosis drug wouldn’t give any more black people coughs than white people, or vice versa. Proceeding under this assumption, the model adjusts for race and calculates once again that there is no more cough in users than in nonusers. This estimated lack of effect applies to the whole population, “independent” of race.
In many contexts the assumption that a drug or exposure affects the health condition you are studying the same way in people of all races is an excellent assumption. But I want to point out that this kind of model implicitly frames the difference between white patients and black patients as a distortion of the “true” effect of the halitosis drug. What if those differences are important? Not a distortion of the effect you’re studying, but an intrinsic part of it?
The whole reason for studying a racially diverse sample is to investigate whether the drug acts differently on different populations. If the effect of the drug is the same for people of all races, then there would be no need for a diverse sample. You could study the effect of the drug just in African-Americans, or just in whites, and arrive at an estimate that was correct for any population. We know already that that is a bad assumption. Yet by “controlling” for race, you have actually removed the effect of race from your analysis instead of studying it. Henceforth I will be referring to this approach as the Misguided Approach, mostly because I put my real name on my blog, and as an aspiring medical professional it wouldn’t be smart for me to fill my blog with bathroom words.
A better plan–I’m going to call it the Astute Approach–would be to analyze the data in a way that allows the effect of the drug to vary between the two populations. If you do it that way, you might just find out that the drug does have an effect after all–two different effects, to be precise. In our scenario, it turns out that race is a strong predictor of what kind of relationship the drug has to coughing, but you have to be looking for it. When you examine white people alone, you find that halitosis drug users have more cough. Important finding–maybe some people should change their medications. But when you examine black people alone, you find that in this population halitosis users have less cough–whoa! So among African-Americans, this drug could actually help with cough? I mean, this is just one retrospective observational study, and also made up, so let’s not get carried away. But my point is that whether or not a clinician might want to prescribe this drug for a given patient might depend a lot on that patient’s race. This is sometimes called statistical interaction, or effect modification. The Misguided Approach fails to look for evidence of this kind of effect modification by race, and just assumes that race is unimportant. It averages the increased cough in whites and the decreased cough in blacks out to no effect at all.
If the assumptions underlying the model that controls/adjusts for race are wrong, and those effects really are different in the two populations, then the estimated average will be correct for neither population. It’s actually worse than studying an all-white sample, because it arrives at an estimated effect that is incorrect for white people, too. An estimate whose generalizability is unknown is better than an estimate that is universally wrong.
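If you’d like to see the two approaches side by side, here is a minimal Python sketch. Every count in it is invented for illustration (the scenario is made up, and so are these numbers). The Mantel-Haenszel summary risk ratio is my stand-in for “a model that adjusts for race”; it is one common way to do that, not necessarily what any given study would use. The names `strata`, `risk_ratio`, and `mantel_haenszel_rr` are hypothetical.

```python
# Hypothetical counts, invented for illustration only.
# Each stratum holds:
#   (coughers on halitosis drug, total on halitosis drug,
#    coughers on flatulence drug, total on flatulence drug)
strata = {
    "white": (60, 300, 10, 100),
    "black": (20, 100, 90, 300),
}


def risk_ratio(a, n1, c, n0):
    """Risk of cough on the halitosis drug divided by risk on the comparison drug."""
    return (a / n1) / (c / n0)


def mantel_haenszel_rr(strata):
    """Summary risk ratio adjusted for the stratifying variable (here, race).

    This is a weighted average of the stratum-specific risk ratios, so it
    assumes the drug's effect is the same in every stratum.
    """
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata.values())
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata.values())
    return num / den


# Astute Approach: let the effect differ by race.
stratum_rr = {race: risk_ratio(*counts) for race, counts in strata.items()}

# Misguided Approach: one adjusted number for everyone.
adjusted_rr = mantel_haenszel_rr(strata)

for race, rr in stratum_rr.items():
    print(f"{race}: risk ratio = {rr:.2f}")
print(f"adjusted for race (Mantel-Haenszel): risk ratio = {adjusted_rr:.2f}")
```

With these invented counts the adjusted risk ratio comes out to 1.0, while the stratum-specific ratios sit on opposite sides of 1. The weighted average hides both effects.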
So now we’ve got a plan for a more fair application of statistics in medical science. And really, as fruit goes, this is very low hanging. You only have to use the data you already have! But…this analysis plan will only work if you have a large enough sample of people of color to investigate your research question separately for each population. This is what is meant by statistical power; the larger the sample, the better your chance of detecting an effect that really exists. In the scenario I laid out above, the study data comes from preexisting medical records in a health care system serving a diverse population, so that’s not especially difficult. But for a study that’s gathering new data, study volunteers have to be recruited with particular attention to recruiting enough people from the relevant subpopulation.
Time to turn our attention to some fine print. The NIH continues to be the driver of most research in the U.S., and their policies are incredibly important. To receive NIH funding, a human subjects study has to include women and people of color unless there’s some specific reason not to (if you’re studying prostates, for example, you don’t need to recruit cis women). If the study is a Phase III drug trial, AND if there is a preexisting body of research suggesting that the effect being studied is different in men and women, then you also have to recruit a study population large enough to allow you to analyze the effect of gender. Ditto for the effect of race. If no one has looked for evidence of a gender or race effect, or if you are conducting some other kind of trial, then studies are not required to recruit a large enough sample size to look for differences by gender or race.
The default is to recruit a population that mirrors the 2010 census. So let’s say you take that approach, and recruit a sample of volunteers, 13% of whom identify as African-American. You can have a nice healthy sample of 200 subjects, but you’ll still only have 26 black subjects. So when you study the effect on black people alone, you will have low statistical power, and limited ability to draw any conclusions about the effect of the drug in black people.
Hey, it turns out women are underrepresented as research participants. And you know I wouldn’t leave out the intersectional issues here. If the study is powered to investigate the effect of race, and it’s powered to investigate the effect of gender, is it powered to investigate whether the effect of gender is different in different racial/ethnic groups and vice versa? If in the example above you have half men and half women, you’ve got at most 13 black women from which to draw your conclusions. Will women of color really know if these research findings apply to them?
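The arithmetic above, plus a rough sense of what it does to statistical power, can be sketched in a few lines of Python. The sample of 200 and the 13% figure come from the example; everything else is an assumption made for illustration: the hypothetical cough risks of 10% and 20%, the even split between two drugs, the 5% significance level, and the normal-approximation power formula, which is crude at subgroup sizes this small. The function name `approx_power` is hypothetical.

```python
from statistics import NormalDist


def approx_power(n_per_arm, p_control, p_treated, alpha=0.05):
    """Rough power of a two-sided test comparing two proportions.

    Uses a normal approximation, which is unreliable for very small groups;
    treat the result as a ballpark figure only.
    """
    se = (p_control * (1 - p_control) / n_per_arm
          + p_treated * (1 - p_treated) / n_per_arm) ** 0.5
    z_effect = abs(p_treated - p_control) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(z_effect - z_crit)


total = 200
n_black = round(total * 0.13)   # 26 subjects
n_black_women = n_black // 2    # 13 subjects if half are women

groups = {
    "everyone": total,
    "black subjects": n_black,
    "black women": n_black_women,
}

# Assumption: each group is split evenly between the two drugs, and the
# true cough risks are 10% and 20% (both numbers are invented).
power_by_group = {
    name: approx_power(n // 2, 0.10, 0.20) for name, n in groups.items()
}

for name, n in groups.items():
    print(f"{name}: n = {n}, approximate power = {power_by_group[name]:.2f}")
```

Under these assumptions the approximate power drops sharply as you move from the full sample to black subjects to black women, which is the whole problem in miniature.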
The first time my PhD adviser pointed this fine print out to me, it blew my mind in a way I had never imagined fine print could. You’re required to have women and people of color in your study, but you’re not required to recruit enough of them to look for evidence of gender and race effects? Why even bother then? It practically mandates the Misguided Approach.
Every additional subject makes the study more expensive, and since people of color are less likely to agree to participate in research (whole other blog post there someday I think), there is an associated cost to making greater recruiting efforts. Where I live, researchers find themselves competing for study volunteers, to the point where one of the local clinics serving primarily people of color will only help with recruitment if the investigators can convince them that this research will actually benefit underserved populations–and good on them. Hey, science isn’t easy. Unless you’re doing it really badly.
This is a matter of putting our money where our mouths are. If we are going to use our money, as a nation, to produce research that will benefit more populations, then we have to spend more money on research, or we have to conduct fewer studies. You’ve heard this before with respect to health care, but it applies to research too. When there’s a shortage–and just in case you haven’t heard, there has been a dismal shortage of funds for research for a while now–there is rationing. Conducting fewer studies would mean rationing on the basis of research agendas, and people who could benefit from more kinds of research questions being answered will lose out. People with a particular disease will not get that one other clinical trial that could help treat them. People in a particular job will not get the study that demonstrates that their work is unsafe, and they’ll keep doing the job. That stinks.
Right now, research findings need only be true in white people to become the new paradigm that applies to all people. A drug only has to work in white people in order to get approved, and then prescribed to people of all races. That’s rationing on the basis of race, and in my opinion it stinks worse, because it is not just frustrating and sad, it is also unjust.
The pervasive failure to examine race effects in research, and the failure to prioritize the investigation of race effects in health, is a research-specific manifestation of colorblind racism. Researchers who take the Misguided Approach aren’t intentionally setting out to commit discrimination. They’d probably be the first to tell you that using an all-white population is wrong–after all, they’re working with data on studies that actually recruited a racially diverse sample. But choosing an analytic approach that fails to “see” race produces research that still leaves people of color underserved.
So, here is your action plan when evaluating human subjects research:
- What is the proportion of men and women, or whites and non-whites? Did they classify race and/or ethnicity in a useful way?
- Did the investigators look for evidence that the effect they are studying is different in men and women, or different in whites and non-whites? If you see a sentence that says something like “there was no evidence of effect modification by race,” that means that they looked, and it turned out the effect really was the same regardless of race. In which case controlling/adjusting for race is Astute. If they forgot to check that assumption, it is Misguided.
- Are women of color being lumped in with men of color or with white women?
Go forth and critique. This is one area of social justice with a comparatively simple solution. Let’s demand it.