Averaged across all live births, the mean duration of pregnancy for first babies is 38.6 weeks, compared to 38.5 weeks for other babies.

Those means include pre-term babies, which affect the averages in a way that understates the difference. For full-term babies, the differences are a little bigger.

For example, if you are at the beginning of Week 36, the average time until delivery is 3.4 weeks for first babies and 3.1 weeks for others, a difference of 1.8 days. The gap is about the same for weeks 37 through 40. After that, there is no consistent difference between first babies and others.
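As a minimal sketch of how such a conditional average could be computed: keep only the pregnancies that reach a given week, then average the time left. It assumes durations are recorded in whole weeks and that remaining time is the recorded length minus the current week. The function name `mean_remaining` and the toy data are illustrative; they are not taken from the notebook or from the NSFG.

```python
import numpy as np

def mean_remaining(lengths, week):
    """Mean remaining duration (weeks) among pregnancies that reach `week`."""
    lengths = np.asarray(lengths, dtype=float)
    # Keep only pregnancies still in progress at the start of `week`.
    ongoing = lengths[lengths >= week]
    if len(ongoing) == 0:
        return float("nan")
    return float(np.mean(ongoing - week))

# Toy pregnancy lengths in weeks (made up for illustration).
toy_lengths = [35, 38, 39, 39, 40, 40, 41, 42]
for week in range(36, 42):
    print(week, round(mean_remaining(toy_lengths, week), 2))
```

Running the same function separately on first babies and on others would give the two curves being compared.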

The following figure shows average remaining duration in weeks, for first babies and others, computed for weeks 36 through 43.

The gap between first babies and others is consistent until Week 41. As an aside, this figure also shows a surprising pattern: after Week 38, the expected remaining duration levels off at about one week. For more than a month, the finish line is always a week away!

Looking at the probability of delivering in the next week, we see a similar pattern: from Week 38 on, the probability is almost the same, with some increase after Week 41.
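The probability of delivering in the next week can be sketched the same way: among pregnancies that reach a given week, count the fraction that end during it. This assumes whole-week durations, so that "delivers in the next week" means the recorded length equals the current week. The name `prob_next_week` and the data are hypothetical.

```python
import numpy as np

def prob_next_week(lengths, week):
    """Fraction of pregnancies reaching `week` that end during that week."""
    lengths = np.asarray(lengths)
    at_risk = np.sum(lengths >= week)    # still pregnant at the start of `week`
    if at_risk == 0:
        return float("nan")
    delivered = np.sum(lengths == week)  # deliver during `week`
    return float(delivered / at_risk)

# Toy pregnancy lengths in weeks (made up for illustration).
toy_lengths = [35, 38, 39, 39, 40, 40, 41, 42]
for week in range(38, 43):
    print(week, round(prob_next_week(toy_lengths, week), 2))
```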

In summary, among full-term pregnancies, first babies arrive a little later than others, by about two days. After Week 38, the expected remaining duration is about one week.

**Methods**

The code I used to generate these results is in this IPython Notebook. I used data from the National Survey of Family Growth (NSFG). During the last three survey cycles, the NSFG interviewed more than 25,000 women and collected data about more than 48,000 pregnancies. Of those, I selected the 30,110 pregnancies whose outcome was a live birth.

Among those live births, there were 13,864 first babies and 16,246 others. The mean duration of pregnancy for first babies is 38.61 weeks, with standard error (SE) 0.024; for others it is 38.52 weeks, with SE 0.019. The difference is statistically significant with p < 0.001.
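The article does not say which test produced that p-value. One common way to test a difference in means is a permutation test, sketched here under that assumption. The function name `permutation_pvalue` and the synthetic data are illustrative only and do not reproduce the NSFG figures.

```python
import numpy as np

def permutation_pvalue(a, b, iters=1000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(iters):
        # Shuffle the group labels by permuting the pooled values.
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[:len(a)].mean() - shuffled[len(a):].mean())
        if diff >= observed:
            count += 1
    return count / iters

# Synthetic durations in weeks (made up; not NSFG data).
rng = np.random.default_rng(1)
first = rng.normal(38.6, 2.5, size=500)
others = rng.normal(38.5, 2.5, size=500)
print(permutation_pvalue(first, others))
```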

However, those means could be misleading for two reasons. First, they include pre-term babies, which bring down the averages for both groups. Second, they do not take into account the stratified survey design.

To address the second point, I use weighted resampling, running each analysis 101 times and selecting the 10th, 50th, and 90th percentiles of the results. The lines in the figure above show median values (50th percentile). The gray areas show an 80% confidence interval (between the 10th and 90th percentiles).
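The resampling procedure can be sketched roughly as follows. This assumes each respondent has a sampling weight and that a resample draws rows with replacement, with probability proportional to that weight. The names `resample_weighted` and `percentile_summary`, and the synthetic values and weights, are hypothetical; the notebook's actual code may differ.

```python
import numpy as np

def resample_weighted(values, weights, rng):
    """Draw len(values) items with replacement, proportional to `weights`."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    p = weights / weights.sum()
    idx = rng.choice(len(values), size=len(values), replace=True, p=p)
    return values[idx]

def percentile_summary(values, weights, stat=np.mean, iters=101, seed=0):
    """Run `stat` on `iters` weighted resamples; return 10th, 50th, 90th percentiles."""
    rng = np.random.default_rng(seed)
    estimates = [stat(resample_weighted(values, weights, rng))
                 for _ in range(iters)]
    return np.percentile(estimates, [10, 50, 90])

# Synthetic durations and weights (made up; not NSFG data).
rng = np.random.default_rng(2)
durations = rng.normal(38.6, 2.5, size=1000)
sample_weights = rng.uniform(1, 10, size=1000)
low, median, high = percentile_summary(durations, sample_weights)
print(low, median, high)
```

The median would be plotted as the line, and the range from `low` to `high` as the 80% interval.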

**Limitations**

This analysis is based on data reported by respondents, so it includes errors due to inaccurate memory and reporting. In most cases respondents are reporting estimates made by doctors, but some might be reporting their own estimates.

The observed differences between first babies and others might be caused by differences in measurement error. For example, estimates for first-time mothers might be less accurate. Based on this data, we can't tell whether the observed differences are due to biological factors or procedural factors.

But for purposes of prediction, it doesn't matter. If you are a first-time mother and your doctor estimates that you are at Week 36, your chance of delivering in the next week is lower, relative to other mothers, and your expected time until delivery is longer, regardless of what causes the difference.

**Background**

I use this question, whether first babies are more likely to be late, as a case study in my book, *Think Stats*. There, I used data from only one cycle of the NSFG. I reported a small difference between first babies and others, but it was not statistically significant.

I also wrote about this question in a previous blog article, "Are first babies more likely to be late?", which has been viewed more than 100,000 times, more than any other article on this blog.

I am reviewing the question now for two reasons:

1) I worked on another project that required me to load data from other cycles of the NSFG. Having done that work, I saw an opportunity to run my analysis again with more data.

2) Since my previous articles were intended partly for statistics education, I kept the analysis simple. In particular, I ignored the stratified design of the survey, which made the results suspect. Fortunately, it turns out that the effect is small; the new results are consistent with what I saw before.

Since I've been writing about this topic and using it as a teaching example for more than 5 years, I hope the question is settled now.

Two confounders that I can think of:

1. Later children are more likely to be caesarean. This is because there are usually more complications (older mothers), but also because, if there have already been two c-sections, the doctors will not allow a natural birth.

2. Younger mothers are more likely to give birth to healthier children. Therefore, premature babies are more likely to survive if the mother is younger. Therefore older premature children might have died in pregnancy.

Thanks for your comment. I added a "Limitations" section to the article to address possible causes of the observed difference. I should clarify that the difference might not be biological; with this data, we can't tell.

But for purposes of prediction, it mostly doesn't matter what the cause is. See above.

I got the following note from a friend:

"I just read your latest blog post, and I'll admit that I didn't dig deeply into the data, but I wonder just how much this is a matter of large amounts of bad data producing statistically significant results.

"Case story, from Jennifer, many years ago. For one of her babies (maybe the first), the OB was estimating gestational age and asked Jennifer when her last period was. I don't remember the numbers, but let's say Jennifer answered "12 or 13 weeks ago," and the doc wrote down 13. Later, the doc was adamant that Jennifer needed to be induced because "the baby is a week overdue!" when Jennifer said "maybe not; maybe she's exactly on time!" Jennifer won, and the baby arrived in a couple of days.

"What I learned from that story is that pregnancy lengths are like first downs in football: the officials measure with extreme accuracy (chains and such) exactly 10 yards from a wild-ass guess.

"Do you know more about how pregnancy lengths are measured?"

To answer these questions, I added a Limitations section to the article. And here's the reply I sent:

"You are right that these numbers are not super precise. They are based on respondents' recollections of estimates that are only approximate.

"But for the use case I presented, I think the results stand. If you are nominally at Week 37 and you want to know the expected remaining duration, or the probability of delivering in the next week, the answer is different for first babies and others. Some part of that difference might be due to measurement error, but any variation, regardless of the cause, affects the predictions.

"Maybe I should be more careful about the wording. To say that first babies are "more likely to be late" makes it sound like I am talking about biology. It's possible that there are biological differences, but I would not make a strong claim based on this data.

"But I think it is fair to say that the nominal gestation period, as it is defined in medicine, is a little longer for first babies than others. Part of that might be due to measurement error, and it is plausible that measurement error is higher for less experienced mothers, but it is not obvious why first-time mothers (and their doctors) would tend to start the clock earlier rather than later."