What’s Going on with Twin Rates?

Posted in science at 4:57 pm by danvk

I recently built a version of the CDC’s Vital Statistics database for Google’s BigQuery service. You can read more in my post on the Google Research Blog.

The Natality data set is one of the most fascinating I’ve ever worked with. It is an electronic record which goes back to 1969. Every single one of the 68 million rows in it represents a live human birth. I can’t imagine any other data set which was more… laborious… to create. :)

But beyond the data itself, the processes surrounding it also tell a fascinating story. The yearly user guides are a tour de force in how publishing has changed over the last forty years. The early manuals were clearly written on typewriters. To make a table, you spaced things out just right, then used a ruler and a pen to draw in the lines. Desktop publishing is so simple now that it’s easy to forget how much standards have improved in the last few decades.

The CDC has had to balance the statistical benefits of gathering a uniform data set year after year with the need to track a society which has evolved considerably. In 1969, your race was either “Black”, “White” or “Other”. There was a question about whether the child was “legitimate”. There were no questions about alcohol, smoking or drug use. And there was no attempt to protect privacy: most of these early records contain enough information to uniquely identify individuals (though doing so is a federal crime).

I included four example analyses on the BigQuery site. I’ll include one more here: it’s a chart of the twin rate over thirty years as a function of age.

A few takeaways from this chart:

  • The twin rate is clearly a function of age.
  • It used to be that older women were less likely to have twins.
  • Starting around 1994, this pattern reversed itself (likely due to IVF).
  • The y-axis is on a log scale, so this effect is truly dramatic.
  • There has been an overall increase in the twin rate in the last thirty years.
  • This increase spans all ages.
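For anyone who wants to reproduce a chart like this, here is a minimal sketch of the underlying calculation. It assumes each birth record has been reduced to a `(year, mother_age, plurality)` tuple, with plurality 2 meaning a twin. That layout and the function name are my own assumptions for illustration, not the data set's actual schema.

```python
from collections import defaultdict


def twin_rate_by_year_and_age(births):
    """Return {(year, mother_age): fraction of births that were twins}.

    `births` is an iterable of (year, mother_age, plurality) tuples, where
    plurality is 1 for a singleton and 2 for a twin. This record layout is
    a hypothetical one chosen for the example.
    """
    totals = defaultdict(int)
    twins = defaultdict(int)
    for year, mother_age, plurality in births:
        key = (year, mother_age)
        totals[key] += 1          # every row is one live birth
        if plurality == 2:
            twins[key] += 1       # count only twin births
    return {key: twins[key] / totals[key] for key in totals}
```

Plotting one line per year, with mother's age on the x-axis and a log-scaled y-axis, would give a chart of the same general shape as the one described here.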

The increase in the twin rate is often attributed to IVF, but the last two points indicate that this isn’t the whole story. IVF clearly has a huge effect on the twin rate for older (40+) women, but it can’t explain the increase for younger women. A 21-year-old mother was 40% more likely to have twins in 2002 than she was in 1971.
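A figure like "40% more likely" is a relative change between two rates. A small sketch of that arithmetic follows; the function name is hypothetical, and the numbers in the usage note are placeholders, not values from the Natality data.

```python
def relative_increase(old_rate, new_rate):
    """Fractional change from old_rate to new_rate (0.4 means 40% higher)."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return (new_rate - old_rate) / old_rate
```

For example, if a twin rate went from 10 to 14 per 1,000 births (made-up numbers), `relative_increase(10, 14)` would be 0.4, a 40% increase.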

My guess is that this is ultimately because of improved neonatal care. Twin pregnancies are more likely to have complications, and those complications are less likely to lead to miscarriages than they were in the past. If this interpretation is correct, then there were just as many 21-year-olds pregnant with twins forty years ago. It’s just that fewer of those pregnancies led to births.

Chart credits: dygraphs and jQuery UI Slider.


  1. Robert Konigsberg said,

    January 14, 2012 at 11:23 pm

    My guess is an increased and more effective use of fertility drugs in older women, which, when they work, can sometimes _really_ work. The older you are, the more likely you need help with fertility, the more likely you get the powerful drugs.

    The graph also shows how older women are even _having_ babies.

    I love the infographic, I kinda want it to animate on a timer.

  2. danvk said,

    January 15, 2012 at 12:39 am

    IVF certainly explains the dramatically increased twin rate amongst older women. But the real surprise to me was that the twin rate has increased across the board. Surely 21-year-olds aren’t using fertility treatment?

  3. kakaz said,

    January 15, 2012 at 8:01 am

    “Surely 21-year-olds aren’t using fertility treatment?” – don’t they? Are you sure?

  4. danvk said,

    January 16, 2012 at 12:50 pm

    I’d be shocked if enough 21-year-olds were using fertility treatment to produce a 40% increase in the twin rate nationwide. If you don’t believe me on 21-year-olds, the numbers are quite similar for 16-year-olds. Do you really think they’re using assisted reproduction techniques?