The Epidemiology of Abortion: A Primer

A few days ago I posted about the proposed ban here in Wisconsin on abortion after 20 weeks’ gestation–and about how honestly, we’d rather not talk about it. Today’s post involves not one but two things most people would rather not discuss: abortion and statistics. Wait, wait, don’t go! Here, have a calming manatee.


I wasn’t especially interested in the issue of abortion until I went to graduate school for my Master’s in Public Health. In my time there I came to appreciate the control of fertility as a public health issue. But it was really the training in epidemiologic methods that got me interested in the issue.

Research findings are a big part of the public discourse around abortion, specifically of the argument that undergoing an abortion is bad for your health. The safety of legal abortion is just one piece of a much larger debate, but it is an important piece. Unfortunately, since most people are unfamiliar with the methodological issues involved, we have to rely on experts. Each side of any debate inevitably has its own experts, all wielding citations to published articles, so the lay public may be left with no clear idea of the state of the evidence, and could understandably conclude that this debate is just more divisive ideology with a scientific facade.

I think we can do better, though. As one of my statistics professors at Berkeley was fond of saying, “Why be a slave to some little number cruncher?” I happen to believe most people can understand the science around this issue just fine. This post is an attempt to provide a basic understanding of how people go about trying to prove or disprove a connection between abortion and subsequent health problems, and some tools to help you evaluate research claims for yourselves.

My intent is that this will be useful no matter what your perspective on the legality or accessibility of abortion. I don’t think a pretense of neutrality contributes constructively to the public conversation, and I make no attempt to hide the conclusions I myself have drawn. These issues have become so intertwined with social identity that it is rare to have a productive conversation about abortion among people with very different positions, but I’ve had such conversations and I want to have more of them. I’m not oblivious to the state of U.S. politics right now, but I continue to believe it’s possible to move the public debate away from “What side are you on?” and toward “What works?”


“Well I read it on HuffPo so shut your cake hole, Phyllis!”

The skills that allow a critical reading of scientific evidence are relevant to anyone who has or would like to have an opinion on the ethics of abortion. As I said in my last post, we have enough consensus to start a conversation as long as we agree on the precept that good ethics begin with good facts. If you disagree with that crucial premise, however, this is not the post for you. Perhaps you would like to pass the time on this excellent site instead.

Time for some Epidemiology 101. In order to keep this simple, I’m going to focus on just one hypothesis: that abortion causes depression. It’s plausible; having an abortion is often a difficult experience, accompanied by feelings of sadness and stress. Dozens of studies have examined the question. However, the principles below apply equally well to other outcomes.

Let’s say that you know a person–we are going to call this imaginary person Veronica–who had an abortion at ten weeks of pregnancy, and six months later was diagnosed with depression. You suspect Veronica’s depression was caused by the abortion. However, you already know about Rooster Syndrome (just because the rooster crowed and then the sun came up doesn’t mean the rooster made the sun rise), so you know that just because Veronica had first an abortion and then an episode of depression doesn’t mean the two events are related. So how could you prove or disprove your suspicions?

What you really want to know is this: if Veronica had not had an abortion, would she still have depression now? The only way you could actually prove such a thing is with time travel. Since you have observed what happened to Veronica after having an abortion, you could get in your time machine, travel back to the moment before, and intervene so that in this version of history she never gets the abortion. You then stay in this alternate universe to observe her for six months and compare the two Veronicas. Did she still get depression? If she did, then you know that her abortion did not cause it. If she did not get depression in the alternate universe, then you know that in the real universe her abortion set into motion a chain of events that led to her depression.

You may recall that this alternate universe is what is known in theoretical circles as the counterfactual scenario (thanks, David Hume!). Obviously, it is not real–boy I hope that’s obvious. But I bring it up because I think it helps to understand how studies are designed. A useful way to think about critiquing a study is to ask how close or far it gets us to the counterfactual scenario, given our ongoing shortage of time machines.

Even before we leave the alternate universe, though, we already have some methodological issues. You can’t simply erase Veronica’s abortion; if she doesn’t have an abortion she has some other experience instead. You are always comparing the results of the abortion to the results of something else. So how did you intervene to prevent the abortion? Did you go yet further back in time and provide her with a condom at the appropriate moment so that she never got pregnant at all? Did you take her on a pub crawl before she ever missed her first period, causing a miscarriage through heavy drinking? Did you provide support for her during her pregnancy in exchange for agreeing to adopt the child? Did you remove some of the barriers to giving birth, allowing her to choose to raise a child herself right now (you can probably come up with $245,000 considering you came up with a time machine)?

We can’t know without testing them what the outcome of any of these scenarios would be, but it seems plausible that they could make Veronica less vulnerable to depression. But on the other hand…

Did you make her an appointment at a Crisis Pregnancy Center where she was misinformed about how far along she was in her pregnancy so that she couldn’t schedule the abortion until past the time when the clinics in her area could perform it? Did you call up a group of your friends to stand in front of the clinic shouting and thrusting pictures of mutilated children at her, until Veronica was too afraid to enter the clinic? Did you close the only clinic she could get to, by lobbying your state legislators to pass laws requiring abortion providers to have hospital admitting privileges? Did you kill the only abortion provider in the area? Did you get the US Supreme Court to overturn Roe v. Wade so that Veronica would–let’s say she lives here in Wisconsin–face three years in prison for getting an abortion, while her doctor would face ten?

Personally, I find it much harder to believe that these experiences would be less traumatic and less likely to cause depression than choosing an abortion. Plenty of people would disagree with me, though. But I’m an empiricist by nature and training–we can’t actually know what works by reasoning it out in our heads. The point is that none of these interventions are equivalent to each other, and none carries an equivalent risk of depression. You’re going to have to get in your time machine and go back at least nine times in order to conduct this counterfactual experiment. At least you will if you want to know if any of these strategies would actually have made Veronica healthier than she is here in the universe in which she got an abortion.

We can’t actually observe the counterfactual, so what’s the next best thing? The next best thing would be to observe two identical people, one of whom has an abortion and one of whom doesn’t. I’m not talking twins here, cause as Orphan Black teaches us, genetics do not make a person. I mean two people who are actually the same people. So we’ll get a big tub of programmable flesh, make ten pregnant doppelgängers of Veronica and…


Okay, okay. Enough about time machines and programmable flesh; let’s take this back to reality. When it comes to abortion, you can’t observe the same person under two conditions, and you can’t observe two copies of the same person. What you can do is compare two groups. The people in the groups are not the same, but the groups are the same on average. In particular they are the same with respect to everything that predicts depression–same number of people of each race and ethnicity, same number of people at each age, same number of people with children, same number of married people, same number of people living in poverty, same number of people with a prior history of mental illness, etc. If in fact your hypothesis is correct and abortion causes depression, that doesn’t mean that you would expect to find that everyone in the abortion group has depression, or that zero people in the comparison group have it, because lots of other factors cause depression in some people and protect against it in others. But if there is a causal connection, you would expect to see meaningfully (how meaningful? ymmv) more cases of depression in the abortion group.

In real studies, investigators attempt to make groups that are the same on average by randomly assigning subjects to one group or the other. Obviously no one will be conducting a study in which every pregnant person who enters the study is randomly assigned to Group A or Group B, and then the Principal Investigator assigns Group A to give birth and Group B to have an abortion. Boy I hope that’s obvious, cause if it’s not I hope none of you are conducting any experiments. So when it comes to studying the effect of abortion, that study design is out, too. Abortion is hard to study.
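If you want to see why random assignment makes groups "the same on average," here is a minimal sketch in Python. Everything in it is made up for illustration: the function name, the 20% rate of prior depression, and the group labels are my assumptions, not numbers from any study. It flips a coin for each imaginary person and then checks how a background trait ends up distributed across the two groups.

```python
import random

def simulate_randomization(n=10000, prior_depression_rate=0.2, seed=0):
    """Randomly split n hypothetical people into two groups and return the
    share with a prior history of depression in each group.

    All numbers are invented for illustration only.
    """
    rng = random.Random(seed)
    # Each made-up person either has a prior history of depression or not.
    people = [rng.random() < prior_depression_rate for _ in range(n)]
    group_a, group_b = [], []
    for has_history in people:
        # A coin flip, not the person's circumstances, decides the group.
        if rng.random() < 0.5:
            group_a.append(has_history)
        else:
            group_b.append(has_history)
    share_a = sum(group_a) / len(group_a)
    share_b = sum(group_b) / len(group_b)
    return share_a, share_b

share_a, share_b = simulate_randomization()
print(f"Prior depression in Group A: {share_a:.3f}")
print(f"Prior depression in Group B: {share_b:.3f}")
```

The two shares should land close to each other, and the same would hold for any other trait you added, measured or not. That is the property observational studies of abortion have to do without.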

Given that there will be no randomized groups, which groups can you compare? Now at last we arrive at real-world research. Here are a few strategies researchers have employed to try to study the effects of abortion.

Skip the comparison group. One strategy is to study only people who have had abortions, and draw conclusions from just one group (example here, admittedly an old one). But you already know what is wrong with this reasoning. This is Rooster Syndrome again. Without a comparison group of people who did not have an abortion, there is nothing to suggest that the abortions are responsible for the prevalence of depression in this group. This is the weakest kind of evidence.

Compare people to themselves. An alternative strategy is to measure the same people twice (example here), before and after the abortion, and compare how much they’ve changed. That’s more convincing, but it doesn’t help with Rooster Syndrome.

Compare pregnancies ending in abortions to wanted pregnancies or miscarriages. Most studies on this topic have compared a group of people who chose to end unwanted pregnancies to a group of people who chose to continue planned or wanted pregnancies (example here). This type of study is limited by the fact that it is impossible to know how much of the difference between the two groups results from abortions and how much results from the higher proportion of unwanted pregnancies among people who choose abortions. An unwanted or mistimed pregnancy is an extremely stressful experience, no matter how the pregnancy ends, and could explain different amounts of depression in the two groups. In epidemiologic terms, this type of bias is called confounding. The relationship of abortion to depression is likely to be confounded by the “wantedness” of the pregnancies in studies using this type of comparison group.

Compare pregnancies ending in abortions to unwanted pregnancies carried to term. Other studies have compared a group of people who had abortions to a group of people who chose to continue unintended pregnancies (example here). That removes the issue of confounding by “wantedness,” but there are still important sources of bias here.

One of the most important issues is that people who have had depression in the past are more likely than people who have never had it to have another depressive episode–no surprise there. So one important predictor of depression after an abortion is depression before the abortion. Since one’s current and past mental health can factor into how capable one feels of continuing a pregnancy, failure to account for mental health history can lead to what is known as indication bias. In this case the perceived need for an abortion (the indication) is associated both with the probability of choosing an abortion and with subsequent depression–potentially creating a spurious association.

There are other sources of indication bias, as well. In this study, some of the reasons people in the U.S. cited for choosing an abortion included “Can’t afford the basic needs of life,” “Not enough support from husband or partner,” “Physical problem with my health,” and “Became pregnant as a result of incest.” These are things that can contribute to depression no matter how the pregnancy ends.

How close do you think those two groups of people are to being the same on average at the start of the study? If people who continue their pregnancies are more likely on average to have supportive partners, adequate financial resources, good physical health, and good mental health, they are less likely to have depression. You may find more depression in the abortion group, but it is impossible to know whether that is a result of their having had abortions or of the preexisting issues that led to them feeling they needed abortions.
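To make that concrete, here is a small simulation sketch in Python. It is a toy world, not data: the variable `hardship` stands in for any preexisting issue (no partner support, not enough money, poor health), and every probability is a number I invented. Crucially, in this toy world abortion has no effect at all on depression; only hardship does.

```python
import random

def simulate_confounded_study(n=100000, seed=1):
    """Return (hardship, abortion, depression) tuples for n imaginary people.

    Hypothetical setup: hardship raises both the chance of choosing an
    abortion and the chance of depression. Abortion itself has NO effect
    on depression here, by construction.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        hardship = rng.random() < 0.4
        # People facing hardship are more likely to choose an abortion.
        abortion = rng.random() < (0.6 if hardship else 0.2)
        # Depression risk depends only on hardship, never on abortion.
        depression = rng.random() < (0.30 if hardship else 0.10)
        rows.append((hardship, abortion, depression))
    return rows

def depression_risk(rows, had_abortion):
    """Share of people with depression among those with the given abortion status."""
    outcomes = [d for (_, a, d) in rows if a == had_abortion]
    return sum(outcomes) / len(outcomes)

rows = simulate_confounded_study()
risk_abortion = depression_risk(rows, True)
risk_no_abortion = depression_risk(rows, False)
crude_difference = risk_abortion - risk_no_abortion
print(f"Depression risk, abortion group:   {risk_abortion:.3f}")
print(f"Depression risk, comparison group: {risk_no_abortion:.3f}")
print(f"Crude difference:                  {crude_difference:.3f}")
```

The abortion group shows several percentage points more depression even though abortion does nothing in this simulation. The whole gap comes from who ends up in which group.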

Statistically adjust or “control” for sources of bias. Any of the above methods that use a control group can use statistical methods to reduce bias from measured confounders. This is a complex topic that I’ll have to cover another day, but I’ll just give a brief overview. You know that people living in poverty are more likely to have abortions, and more likely to have depression, so you think socioeconomic status could be biasing your estimate of the relationship between abortion and depression. So you use statistical techniques to adjust your estimate for socioeconomic status, so that the estimated association between abortion and depression can be interpreted as independent of class. At least you can try. How well you can adjust for potential confounders depends on how well you can measure them (socioeconomic status is challenging to characterize well). But most importantly, you can’t measure every kind of confounder. There will always be some patterning in who gets an abortion and who gets a different pregnancy outcome, and some of that patterning will lead to distortion. It’s an inherent limitation of observational epidemiologic studies.
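Here is a rough sketch in Python of one of the simplest adjustment techniques, stratification: compare people only with others at the same level of the confounder, then average the within-level comparisons. This is my own illustration, not the method of any particular study; real papers more often use regression models. The data are simulated, the `poverty` flag is a deliberately crude stand-in for socioeconomic status, all probabilities are invented, and abortion is given zero effect on depression so we know what the right answer is.

```python
import random

def simulate_people(n=100000, seed=2):
    """Return (poverty, abortion, depression) tuples for n imaginary people.

    Hypothetical numbers: poverty raises the chance of abortion and of
    depression; abortion has no effect on depression by construction.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        poverty = rng.random() < 0.4
        abortion = rng.random() < (0.6 if poverty else 0.2)
        depression = rng.random() < (0.30 if poverty else 0.10)
        rows.append((poverty, abortion, depression))
    return rows

def risk_difference(rows):
    """Depression risk in the abortion group minus risk in the comparison group."""
    exposed = [d for (_, a, d) in rows if a]
    unexposed = [d for (_, a, d) in rows if not a]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)

def adjusted_risk_difference(rows):
    """Average the within-stratum risk differences, weighting each poverty
    stratum by its share of the whole sample."""
    total = len(rows)
    adjusted = 0.0
    for level in (True, False):
        stratum = [r for r in rows if r[0] == level]
        adjusted += (len(stratum) / total) * risk_difference(stratum)
    return adjusted

people = simulate_people()
crude = risk_difference(people)
adjusted = adjusted_risk_difference(people)
print(f"Crude risk difference:    {crude:.3f}")
print(f"Adjusted risk difference: {adjusted:.3f}")
```

The crude difference is clearly positive, while the adjusted one should sit near zero, which is the truth in this made-up world. Notice that it only works because `poverty` was measured, and measured perfectly. Delete that column from the data and there is nothing to stratify on.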

Given all these different approaches, there has been a lot of conflicting evidence about the relationship of abortion to depression. These three influential systematic reviews found that there was no evidence of an association between abortion and depression and that most of the research on the topic was fundamentally flawed, but the author of a fourth review disagreed, and by choosing a different set of studies to review found evidence that people who had abortions did have more depression.


Right about now you might be asking me what was the point of telling you all this if there’s no clear answer, and it all just depends on who you ask. But we’re not done yet, because someone came up with an extremely elegant solution to the limitations of the methods described above.

Compare people who chose abortions and obtained them to people who chose abortions but were denied them. Now we arrive at the highest quality evidence available. In comparing two groups of people that both sought out abortions, you solve the problems of confounding by “wantedness” and confounding by indication. When people are recruited at the same clinics you remove many other sources of confounding by socioeconomic and demographic characteristics, and you remove some of the influence of selection bias (when people who had abortions and have depression are more likely to join or stay in your study than people who don’t). You also get an assessment that is more relevant to the question of whether abortion should be legal and/or accessible.

There is only one study that has ever used this design. It is called the Turnaway Study, and you should remember that name because it is very important. The subjects in this study were all at very similar stages of pregnancy, but some were just over the limit of when abortions could be performed at their clinic. Subjects did not self-select into the two groups. This is about as close to random assignment as an observational study can get, and thus the best approximation of the counterfactual scenario. Incidentally, the Turnaway Study findings suggest that having an abortion was associated with a risk of depression similar to, or even lower than, that of carrying an unwanted pregnancy to term.

Clearly I can’t cover all of epidemiology in one blog post, or even all of the methodological issues involved in studies of the effects of abortion. But I hope this has given you an action plan:

1. When you read an article claiming a study says “abortion causes ______” or “abortion is associated with ______,” you ask “Compared to what?”  Likewise, when a study says “women who had abortions have more/less ______,” you ask, “Compared to whom?”

2. When you read study results, think critically about how much the observed differences between comparison groups are telling you about abortion, as opposed to other things correlated with abortion like socioeconomic status, past mental health history, lack of access to contraception, lack of health literacy, etc.

3. Read a lot about the Turnaway Study. They are publishing a lot of results, all of them interesting. Right now this is the best kind of evidence we have.

Class dismissed.

Image: An Argument from Opposite Premises, Follower of Ralph Hedley [Public domain], via Wikimedia Commons