Does regular church attendance lengthen people’s lives? Do doctors discriminate against women in treating heart disease? Does talking on a cell phone while driving increase the risk of having an accident? These are cause-and-effect questions, so we reach for our favorite tool, the randomized comparative experiment. Sorry. We can’t randomly assign people to attend church or not, because going to religious services is an expression of beliefs or their absence. We can’t use random digits to assign heart disease patients to be men or women. We are reluctant to require drivers to use cell phones in traffic, because talking while driving may be risky. The best data we have about these and many other cause-and-effect questions come from observational studies. We know that observation is a weak second best to experiment, but good observational studies are far from worthless. What makes a good observational study?
First, good studies are comparative even when they are not experiments. We compare random samples of people who do and who don’t attend religious services regularly. We compare how doctors treat men and women patients. We might compare drivers talking on cell phones with the same drivers when they are not on the phone. We can often combine comparison with matching in creating a control group. To see the effects of taking a painkiller during pregnancy, we compare women who did so with women who did not. From a large pool of women who did not take the drug, we select individuals who match the drug group in age, education, number of children, and other lurking variables. We now have two groups that are similar in all these ways, so that these lurking variables should not affect our comparison of the groups. However, if other important lurking variables, not measurable or not thought of, are present, they will affect the comparison, and confounding will still be present. Matching does not entirely eliminate confounding.
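The matching step described above can be sketched in code. The sketch below is a hypothetical illustration, not part of the study design in the text: the variable names, the data, and the simple distance rule are invented, and real studies typically match on many more variables (or on a propensity score). For each woman in the drug group, we pick the untreated woman from the pool who is closest on age, education, and number of children.

```python
def match_controls(treated, pool, keys=("age", "education", "children")):
    """For each treated subject, pick the closest unmatched subject from
    the pool, measured by summed absolute distance on the matching keys."""
    available = list(pool)
    controls = []
    for t in treated:
        best = min(available,
                   key=lambda c: sum(abs(t[k] - c[k]) for k in keys))
        controls.append(best)
        available.remove(best)  # match without replacement
    return controls

# Invented example data: two women who took the drug, three who did not.
treated = [
    {"id": 1, "age": 29, "education": 16, "children": 1},
    {"id": 2, "age": 35, "education": 12, "children": 3},
]
pool = [
    {"id": 10, "age": 30, "education": 16, "children": 1},
    {"id": 11, "age": 50, "education": 18, "children": 0},
    {"id": 12, "age": 34, "education": 12, "children": 3},
]

controls = match_controls(treated, pool)
print([c["id"] for c in controls])  # → [10, 12]
```

The resulting control group mirrors the drug group on the measured variables, which is exactly why matching helps with measured lurking variables yet, as the text notes, cannot remove confounding from variables we failed to measure.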