A new study published in the Dec. 14, 2015 issue of JAMA Pediatrics claims a link between prenatal use of antidepressants, particularly in the second and third trimesters, and an 87% increased risk of later development of autism. As is often the case, the media have already taken up the cause and run with it:

The Huffington Post has declared: Major New Study Links Autism to Antidepressant Use During Pregnancy. Wow! Guess it’s settled then. Rest assured, this and other pieces will have women (many of whom already have underlying anxiety disorders) clamoring to stop their medicines and angry at their doctors for prescribing them in the first place. Lawsuits will be filed.

Not all of the media outlets have jumped on the bandwagon. In fact, Slate has an excellent exposé entitled Another Misguided Panic About Autism, which is well worth reading and makes a number of good counterpoints.

So, luckily, in this case, a great deal of excellent counter-information has already been generated (in just the first two days). There is even an accompanying editorial in the same issue of JAMA Pediatrics that puts many of the issues in perspective and urges caution in interpreting the report. It is a shame that the authors of the original article didn’t make many of the same excellent points that the editorialist did. But Bérard has an agenda. She has stated in the media:

Depression needs to be treated during pregnancy but with something other than antidepressants in the majority of cases. The risk/benefit ratio is clearly leaning towards no use.

Bérard, a paid consultant for plaintiffs’ attorneys, is biased, and bias is a huge factor in scientific research. Because she already has a reputation for bias, and because the editorialist in JAMA Pediatrics provided counterpoints to the article, journalists at Slate, Science, Wired, and NPR quickly and readily delved in and attacked. But what if this had not been the case? How would we, the readers, interpret and utilize an article of this type, published in a respected, peer-reviewed journal? This is why we must have a process for analyzing scientific literature and at least a basic understanding of statistics.

So let’s look at this article as if we were reading it for the first time. Using the Primer: How to Systematically Read A Scientific Paper from howardisms, let’s look at some major points. I won’t include every question from the primer, since not all of them apply:

  • What is the quality of the journal in which the article is published?
    • High quality journal. The accompanying editorial should give us perspective about the article and needs to be read with the article.
  • Do the authors have any conflicts of interest or other biases?
    • Yes. The article discloses: “Dr Bérard reported serving as a consultant for plaintiffs in the litigations involving antidepressants and birth defects.” Unfortunately, this is at the end of the article with the footnotes and hard to see unless you are deliberately looking for it.
  • What question is being asked by the researchers?
    • Is there an association between antidepressant (AD) use in pregnancy and the later development of an autism spectrum disorder (ASD)?
  • What is the null hypothesis?
    • This is not stated explicitly. Presumably, it is that there is no association between AD use and ASD.
  • What is already known about this topic?
    • Previous studies have shown mixed results regarding an association between AD and ASD. Maternal psychiatric disorders have already been linked to ASD, which doesn’t have a well-understood etiology.  
  • Do the stated alternative hypotheses make sense based on what we already know? What is the pre-study probability that the hypotheses are true?
    • The authors’ alternative hypothesis is, essentially, that antidepressants are somehow a cause of autism. Other hypotheses to consider include: errors in data collection or measurement; errors in statistical analysis; that women with mood disorders severe enough to require antidepressants (compared with those who have mood disorders but can get by without one) are at increased risk of ASD because of their more severe disease; that more severe psychiatric mood disorders (associated with increased AD use) share a common genetic etiology with ASD; that women with mood disorders and frequent access to mental health practitioners may be more likely to have ASD over-diagnosed in their children; etc.
  • What would be the best way to design a study to answer the research question?
    • A prospective, randomized, placebo-controlled trial.
  • How did the authors design the study?
    • This is a registry study, retrospective and case-controlled in nature. There is no information about the severity or accuracy of the diagnoses considered (depression, ASD), or about whether the prescribed medications were even taken and, if so, for how long. Beyond basic demographic information, there are few reliable data about confounders.
  • What are the weaknesses of the study design?
    • All of the above. There was no attempt to identify individuals in the cohort and interview them or verify information.
  • What were the inclusion and exclusion criteria? Did these criteria make sense? How might the study have been affected by different choices?
    • They are not explicitly stated, but basically the characteristics available in the database were used.
  • Does the study method address the most important sources of bias for the question?
    • No.
  • Was a power analysis done? If not, why not?
    • No. At best, information from this study could be used to perform a power analysis for a future study. This is typical of data-mining, retrospective, epidemiological studies.
  • Was the study IRB approved? Did the IRB controls adversely affect the study design?
    • The authors do not comment on IRB approval.
  • Were the statistical methods used appropriate?
    • No. The authors admit that “No adjustment was made for multiple comparisons; hence, we cannot rule out chance findings given the number of comparisons made.” In other words, Type I errors (false positives) cannot be excluded. And since no power analysis was done, we also have no way of knowing whether the negative subgroup findings simply reflect a Type II error.
  • How closely matched were the patient populations in the different groups of the study?
    • In the primary analysis, the groups were not matched with respect to prior exposure to antidepressants, maternal age, educational level, socioeconomic status, living alone, gestational age at delivery, birth weight, history of depression, anxiety, or other psychiatric disorders, maternal diabetes, or maternal hypertension. And these are only the characteristics we know about; dozens more potentially relevant characteristics were not considered. This makes the non-adjusted data next to worthless.
  • Were all of the relevant confounders accounted for?
    • No. Other relevant confounders like maternal drug use, severity of psychiatric illness, tobacco abuse, BMI, ethnicity, family history of ASD, etc. were not considered.
  • What did the study find?
    • That use of ADs during pregnancy was associated with ASD with a hazard ratio (HR) of 1.87; in particular, SSRIs were associated with an HR of 2.17. When the control group was restricted to women with a history of depression, the HR for AD use fell to 1.75. When the analysis was limited to children whose diagnosis was confirmed by a psychiatrist or a neurologist, there was no statistically significant increased risk of ASD with AD use during pregnancy. The authors claimed that the study was insufficiently powered to find such a difference (though no power analysis was performed), despite studying the records of 289,688 women.
    • The study population had an ASD rate of only 0.7%, whereas most other studies show a rate of about 1.0%. Under-diagnosis of ASD in the cohort alone is more than enough to skew these statistics; in fact, it’s plausible that with appropriately controlled diagnoses of ASD, a study might find that AD use protects against ASD. Children under two years of age and preterm births should have been excluded from the study but were not, since ASD cannot be reliably diagnosed before age two and preterm birth is a significant risk factor for the development of ASD.
  • Were the findings statistically significant?
    • Well, no. Two things to consider: first, since there was no power analysis and appropriate adjustments for multiple comparisons were not made, tests of significance in general (confidence intervals, as the data are presented in the paper) are irrelevant in this study. Second, if we believe it is important that autism be accurately diagnosed, recall that the authors found no significant increased risk when the diagnosis was confirmed (and in that subgroup they did not control for maternal history of depression, which would have made the finding even less significant).
  • How certain is the measured effect?
    • Not at all.
  • Were conclusions about subset or secondary outcomes statistically relevant? Were appropriate statistical methods used for them? Was the study powered sufficiently to address all of the secondary outcomes?
    • The authors admit they did not use appropriate statistical methods for subset analyses.
  • If the data suggests a correlation, is there any evidence of causation?
    • Judged against Bradford Hill’s criteria, no claim that SSRIs cause autism can be made from the data in this study, even if the correlation holds up statistically.
  • Were all of the appropriate outcomes addressed?
    • No. We do not know if the women who didn’t take SSRIs in the control group (but who might have needed them) had, for example, higher levels of suicides or injurious behaviors. We know nothing about rates of pulmonary hypertension in the newborn in either arm, or many other neonatal and childhood outcomes of interest. Untreated maternal depression has been linked to numerous and profound poor neurologic outcomes. These were not considered.
  • What are other alternative hypotheses that fit with the data?
    • There are dozens of other explanations, as discussed above.
  • Are all of the authors’ conclusions supported by their data?
    • Not at all. In the paper, they accept an unconfirmed diagnosis of ASD in order to reach any statistically significant conclusion; outside of the paper, in the media, the senior author has used these data to claim that women should stop using SSRIs in pregnancy.
  • Is the outcome studied clinically relevant? What is the magnitude of the effect, the number needed to treat, etc.?
    • No. Even if the data are accepted as legitimate, the number of women who would need to forgo treatment to prevent a single case is so large that much more data are needed to determine what harms might come from non-treatment.
  • Is the result of the study broadly applicable, or applicable to your patient population?
    • No.
  • What do others think about the study?
    • We have seen what others think at the beginning of the post.  
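
The multiple-comparisons problem raised above is easy to quantify. The sketch below shows how quickly the chance of at least one spurious “significant” finding grows when no adjustment is made; the figure of 20 comparisons is a hypothetical illustration, not the actual count from the paper:

```python
# Family-wise error rate with unadjusted multiple comparisons.
# Assumes each test is independent and the null hypothesis is true
# for all of them. The 20 comparisons here are a hypothetical number
# for illustration, not the count from the JAMA Pediatrics study.

alpha = 0.05        # per-comparison significance threshold
comparisons = 20    # hypothetical number of independent tests

# Probability of at least one false positive (Type I error)
family_wise_error = 1 - (1 - alpha) ** comparisons
print(f"Chance of at least one chance finding: {family_wise_error:.0%}")  # → 64%
```

With 20 unadjusted comparisons, the chance of at least one purely spurious “significant” result is roughly 64%, which is exactly why the authors’ admission matters.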
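To put the headline hazard ratio in absolute terms, here is a rough back-of-the-envelope sketch using the numbers reported above (baseline ASD rate 0.7%, HR 1.87) and treating the hazard ratio as an approximate relative risk, which is reasonable for a rare outcome. The “number needed to harm” figure is my own illustrative calculation, not one reported by the authors:

```python
# Rough conversion of the reported hazard ratio into absolute risk.
# Treats the HR as an approximate relative risk (acceptable for a
# rare outcome). The resulting NNH is an illustrative estimate only,
# not a figure reported in the study.

baseline_risk = 0.007   # ASD rate in the study cohort (0.7%)
hazard_ratio = 1.87     # reported HR for antidepressant exposure

exposed_risk = baseline_risk * hazard_ratio        # ~1.31%
absolute_increase = exposed_risk - baseline_risk   # ~0.61%
nnh = 1 / absolute_increase                        # "number needed to harm"

print(f"Absolute risk increase: {absolute_increase:.2%}")
print(f"Approximate NNH: {nnh:.0f}")
```

Even taking the study’s numbers at face value, roughly 160 women would have to forgo treatment to prevent one additional case; the absolute effect, not the 87% relative figure, is what matters clinically.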

So what’s the lesson? We should evaluate every paper with this same rigor so that we can value its conclusions appropriately, and we should not rely on Slate or NPR to tell us whether a paper is good or bad. Learning to read and critically analyze scientific papers is key to using the literature appropriately and staying relevant and up-to-date in our clinical practice.