Monday, February 7, 2011

Are first babies more likely to be late?

UPDATE: The version of this article with the most recent data is here.

When my wife and I were expecting our first child, we heard from several people that first babies tend to be late.  If you Google this question, you will find plenty of discussion.  Some people claim it's true, others say it's a myth, and some people say it's the other way around: first babies come early.

What you will probably not find is any data to support these claims.  Except this kind of data:
"My two friends that have given birth recently to their first babies, BOTH went almost 2 weeks overdue before going into labour or being induced."

"I don't think that can be true because my sister was my mother's first and she was early, as with many of my cousins."

If you don’t find those arguments compelling, you might be interested in the National Survey of Family Growth (NSFG), a survey conducted by the U.S. Centers for Disease Control and Prevention (CDC) to gather "information on family life, marriage and divorce, pregnancy, infertility, use of contraception, and men's and women's health."

Their 2002 dataset includes 7643 women, who reported the gestational age for their 9148 live births.  This figure shows the distribution of pregnancy length:

The distributions are similar for first babies and others.  The mode is 39 weeks; the distribution is skewed to the left, with some births as early as 24 weeks, but almost none later than 44 weeks.
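If you want to build this kind of distribution yourself, here is a minimal sketch in Python. The function names `pmf` and `mode` are my own for this illustration, and the two lists are made-up pregnancy lengths, not NSFG records; reading the actual survey file is left out.

```python
from collections import Counter

def pmf(lengths):
    """Map each pregnancy length (in weeks) to the fraction of births with that length."""
    counts = Counter(lengths)
    total = len(lengths)
    return {week: n / total for week, n in sorted(counts.items())}

def mode(lengths):
    """Return the most common pregnancy length."""
    return Counter(lengths).most_common(1)[0][0]

# Made-up lengths for illustration only; these are NOT NSFG records.
first = [36, 38, 39, 39, 39, 40, 40, 41, 42]
others = [35, 37, 38, 39, 39, 39, 39, 40, 40, 41]

first_pmf = pmf(first)
others_pmf = pmf(others)

print("mode (first babies):", mode(first))
print("mode (others):", mode(others))
```

Plotting `first_pmf` and `others_pmf` side by side, week by week, gives the kind of comparison described above.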

On average, first babies are 0.078 weeks later than others.  This difference, 13 hours, is not statistically significant.  But there are differences in the shape of the distribution.  This figure shows the percent difference between first babies and others for weeks 34 through 46:

The general pattern is that first babies are more likely to be early (37 weeks or less), less likely to be on time (38-40), and more likely to be late (41 or more).  In terms of relative risk, first babies are 8% more likely to be born early and 66% more likely to be late.  And those differences are statistically significant (using a chi-square test, p < 0.001).
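To make "relative risk" and the chi-square test concrete, here is a rough sketch. It assumes the births are binned into three categories (early, on time, late) for each group, which gives a 2x3 table with 2 degrees of freedom; this may not match the exact test behind the numbers above. The counts are hypothetical, and the helper names `relative_risk` and `chi_square_2x3` are invented for this example.

```python
import math

def relative_risk(events_a, total_a, events_b, total_b):
    """Ratio of the event probability in group A to that in group B."""
    return (events_a / total_a) / (events_b / total_b)

def chi_square_2x3(row_a, row_b):
    """Pearson chi-square statistic and p-value for a 2x3 contingency table.

    A 2x3 table has (2-1)*(3-1) = 2 degrees of freedom, and for 2 degrees
    of freedom the chi-square survival function is exactly exp(-x / 2).
    """
    total_a, total_b = sum(row_a), sum(row_b)
    grand = total_a + total_b
    stat = 0.0
    for obs_a, obs_b in zip(row_a, row_b):
        col = obs_a + obs_b
        exp_a = total_a * col / grand  # expected count if the groups did not differ
        exp_b = total_b * col / grand
        stat += (obs_a - exp_a) ** 2 / exp_a + (obs_b - exp_b) ** 2 / exp_b
    return stat, math.exp(-stat / 2)

# Hypothetical counts in the order (early, on time, late); NOT the NSFG numbers.
first = (100, 300, 100)
others = (90, 340, 70)

rr_early = relative_risk(first[0], sum(first), others[0], sum(others))
rr_late = relative_risk(first[2], sum(first), others[2], sum(others))
stat, p_value = chi_square_2x3(first, others)

print(f"relative risk, early: {rr_early:.2f}")
print(f"relative risk, late:  {rr_late:.2f}")
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```

A relative risk of 1.08 corresponds to "8% more likely", and 1.66 to "66% more likely".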

So far, none of this is useful for planning, so let’s consider a scenario: suppose you are in week 38 and you want to know your chances of delivering during the next three weeks.  For first babies, the chance is 81%, for others it is 89%.  So yes, in this scenario, first babies are more likely to be late.  Sorry.
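The scenario above is a conditional probability: among pregnancies that reach week 38, what fraction end in week 38, 39, or 40? Here is a small sketch under that reading of "the next three weeks". The function name `prob_delivery_within` is hypothetical, and the lists are made-up lengths rather than survey data, so the printed values will not match the 81% and 89% above.

```python
def prob_delivery_within(lengths, current_week, horizon):
    """P(birth in the next `horizon` weeks | still pregnant at the start of `current_week`)."""
    # Condition on the pregnancies that have lasted at least this long.
    remaining = [w for w in lengths if w >= current_week]
    if not remaining:
        raise ValueError("no pregnancies reached the given week")
    # Count those that end within the window [current_week, current_week + horizon).
    delivered = [w for w in remaining if w < current_week + horizon]
    return len(delivered) / len(remaining)

# Made-up pregnancy lengths in weeks; NOT NSFG records.
first = [36, 38, 39, 39, 40, 40, 41, 41, 42, 42]
others = [35, 37, 38, 39, 39, 39, 40, 40, 40, 41]

p_first = prob_delivery_within(first, current_week=38, horizon=3)
p_others = prob_delivery_within(others, current_week=38, horizon=3)

print(f"first babies: {p_first:.0%}")
print(f"others:       {p_others:.0%}")
```

Changing `current_week` lets you ask the same question from any point in the pregnancy.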


If you find this sort of thing interesting, you might like my free statistics textbook, Think Stats. You can download it or read it at


  1. Quite a while ago, I remember discussing another childbirth-related question with you: correlations between the sexes of siblings. Given that your first child is a boy, what's the probability that your next child will be? (And so forth -- given the sexes of the first n children, what can you say about the (n+1)st?) I seem to recall that you had a data-based answer to this. Future blog post?

  2. Yes, I hope so. I had a model that worked well for the 2002 dataset, but when I tested it on earlier data (also from the NSFG) it all came crashing down. I will try to get back to it for a future post.

  3. Interesting. Does the dataset have any information on the nature of the births? It seems that the spike at 39 would be in part due to medical intervention after 38 weeks (which ends up getting pegged as the "due date" and creates a mindset that 39 is "late" even though 38-40 is normal). Elective C's are usually done at 39 weeks, iirc.

    So, it might be that fewer first babies are the subject of medical intervention (induction or c-section) due to desire to push for a "natural" birth process for the first child and more >1st babies are the subject of elective C's or other interventions, reducing the >39 dataset for the >1sts.

    I recall the discussion Ted mentioned - and that there was a skewing issue caused by parental choices - which actually created interesting stats not about birth gender, but about family size based on gender balance. That is, the distribution was uneven: bb and gg families were more likely to go for a third child than bg and gb families. Or something like that.

  4. Could be scheduled C-sections, but I doubt it is because of other interventions, which tend to be later (40+).

    The dataset does have other medical information about the births, so I could investigate your theory.

    As for gender skew due to parental choices, I did a bunch of work on that, but in the end I didn't find anything that held up to statistical scrutiny.

  5. Actually, the medical intervention I was primarily thinking of was induction, which makes up a significant portion of birth events - possibly more than c-sections - and is subject to non-medical election.

    If election of induction was uniform across birth order, it wouldn't have a significant effect on the lateness theory.

    It would be interesting to see if there is any correlation between the timing of a first baby and whether or not the mom has a second baby. Not that I'm trying to generate more work for you, I just find these things fascinating.

  6. While interesting, I can't help but think you need to compare the first baby and later babies for the same woman. While it may be unlikely, there could still be a tendency for a woman's second, third, etc. child to come earlier.

    1. Good question. I just ran a quick experiment and posted the results here:

  7. Just as the nature of the birth (natural vs. induced) could affect the data set, it might be interesting to check whether the time of birth (as in the season of birth, summer or winter) also influences it. For example, could we infer that winter babies are more likely to be late than summer babies?

    1. Very interesting! When you ask if time of birth affects the data set, you might be asking one of two questions: (1) is there a difference between babies born in summer and babies born in winter, for example, or (2) if so, does this difference explain a substantial part of the effect I wrote about?

      I haven't checked either one, but my subjective prior probability for (1) is 5%. There might be a difference in gestation period between summer and winter babies, but I doubt it.

      And my prior for (2) is less than 1%. Even if there is a seasonal effect, it could only explain the first baby effect if first babies are substantially more likely to be born during particular seasons.

      But if your subjective priors are higher than mine, you should check it out!