Wednesday, April 25, 2012

Fog warning system: part three


Background

I am trying to evaluate the effect on traffic safety of a fog warning system deployed in California in November 1996.  The system was installed by Caltrans on a section of I-5 and SR-120 near Stockton where the accident rate is generally high, particularly during the morning commute when ground fog is common.  The warning system consists of (1) weather monitoring stations that detect fog and (2) changeable message signs that warn drivers to reduce speed.


I will post my findings as I go in order to solicit comments from professionals and demonstrate methods for students.  If I can get permission, I will also post my data and code so you can follow along at home.

Previously: In the first installment I reviewed the first batch of data I am working with, and ran some tests to confirm that Poisson regression is appropriate for modeling the number of accidents in a given day.  In part two I ran Poisson regressions to identify factors that influence the number of accidents per day.

Critical events

I have been waiting to get more details about several events that affected traffic safety during the observation period.  I was able to get in touch with a Transportation Engineer in the Traffic Safety Branch of Caltrans District 10, which includes the study area.  According to Caltrans records, the speed limit on the relevant section of I-5 was increased from 55 to 70 mph on March 25, 1996.  The speed limit on SR-120 was increased from 55 to 65 mph about a month later, on April 22, 1996.  Many thanks to my correspondent for this information!

The automated warning system was activated in November 1996.  My collaborator has collected data on the weather measurements made by the system and the warnings it displayed.  I hope to get this data processed soon.

Accidents per million vehicles

In the previous article, I ran models with raw accident counts as the dependent variable, and found that traffic volume is a significant explanatory variable.  Not surprisingly, more cars yield more accidents.

Rather than use volume as an explanatory variable, an alternative is to express the dependent variable in terms of accidents per million vehicles.  As a reminder, here's what the traffic volume (in thousands of cars per day) looks like during the observation period:


And here are the raw accident counts:


I divided counts by volume and converted to accidents per million cars.  At the same time I smoothed the curves by aggregating quarterly.  Here's what that looks like:

The vertical red lines show major events expected to affect traffic safety: increased speed limits in March and April 1996, and the activation of the warning system in November 1996.
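
As a rough sketch of the computation described above (not the code used to make the figure), the following divides quarterly accident totals by quarterly vehicle totals and scales to accidents per million vehicles.  The function name `quarterly_rates`, the input layout, and all of the numbers are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def quarterly_rates(daily_accidents, annual_volume):
    """Return {(year, quarter): accidents per million vehicles}.

    daily_accidents: dict mapping date -> accident count for that day
                     (days with zero accidents must be included so their
                     traffic still counts toward exposure)
    annual_volume:   dict mapping year -> vehicles per day, in thousands
    """
    accidents = defaultdict(int)    # total accidents in each quarter
    vehicles = defaultdict(float)   # total vehicles (exposure) in each quarter

    for day, count in daily_accidents.items():
        key = (day.year, (day.month - 1) // 3 + 1)
        accidents[key] += count
        # Convert thousands of vehicles per day to vehicles for this day.
        vehicles[key] += annual_volume[day.year] * 1000

    return {key: accidents[key] / vehicles[key] * 1e6
            for key in sorted(accidents)}


# Made-up numbers for three observed days, only to show the arithmetic.
daily = {
    date(1996, 1, 10): 2,
    date(1996, 2, 20): 1,
    date(1996, 5, 5): 3,
}
volume = {1996: 50}   # hypothetical: 50 thousand vehicles per day

rates = quarterly_rates(daily, volume)
for quarter, rate in rates.items():
    print(quarter, round(rate, 1))
```

Summing accidents and vehicles separately before dividing weights every vehicle equally, which is not the same as averaging daily rates when volume changes within a quarter.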

This graph suggests several observations:

  1. In the control directions, the accident rate was flat from 1992 through 1994, increased quickly in 1995 (before the speed limits were increased) and has been flat ever since.
  2. In the treatment directions, the accident rate was trending down until late 1996, including three quarters after the speed limit was increased.  The accident rate increased sharply in 1997 and possibly again in 2000.
  3. The accident rate in both directions was unusually low during the third quarter of 1996, when the warning system was activated.  Other than that, there is no obvious relationship between accident rates and the events of 1996.

Since we don't expect the warning system to have much effect on the control directions (that's why they're called "control"), the speed limit changes are by far the most likely explanation for the accident rate changes.  But it is puzzling that a large part of the change occurred before the new speed limits went into effect.  One possibility is that as new speed limits were rolled out throughout California, drivers became accustomed to higher speeds and drove faster even on roads where the new limits were not in effect.  But if that's true, it doesn't explain the continuing decline in the treatment directions.

My collaborator has some data on actual driving speeds before and after 1996.  Once I process that data, I will be able to get back to this puzzle.

Injuries and fatal accidents

In response to a previous post, a reader suggested that if the warning system causes drivers to slow down, it might affect the severity of accidents more than the raw number.  To investigate that possibility, I also plotted the rates for injury accidents (including fatalities) and fatal accidents.

Here is the graph for injury accidents:

The patterns we saw in the previous graph appear here, too.  In addition, this graph suggests, more strongly, the possibility of a second changepoint in late 1999 or 2000.

And here is the graph for fatal accidents:


The number of fatal accidents is, fortunately, small.  During more than 10 years of observation, there were only 26 in the study area.  The trends in the other graphs are not apparent here, other than the general increase in the rate of fatal accidents in the second half of the observation period.

Summary


  1. Accident rates in the control and treatment directions increased sharply around 1996, but neither effect is related in an obvious way to increased speed limits or deployment of the warning system.
  2. Accident rates were unusually low in the quarter the warning system was activated; other than that, no effect of the warning system is apparent.
  3. It looks like there was a second increase in accident rates in late 1999 or 2000.  I will ask my correspondent at Caltrans if he has an explanation.

Next steps


There's not much more I want to do with this data.  Now I need more numbers!  In particular, I will be able to get data from the warning system itself, including:

  1. Conditions measured at roadside weather stations, which should be better than the data I have from the airport 8 miles away, and
  2. Messages displayed when the warning system was active.

If the warning system has an effect, it should be apparent on the days it is active.  By comparing the treatment and control directions, it should be possible to quantify the effect.
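
One possible way to frame that comparison is a difference-in-differences on accident rates: how much the rate changes on active days in the treatment directions, minus the same change in the control directions.  This is only a sketch of the idea, not an analysis that has been run; the function names and every count below are hypothetical.

```python
def rate_per_million(accidents, vehicles):
    """Accidents per million vehicles."""
    return accidents / vehicles * 1e6


def diff_in_diff(counts):
    """Difference-in-differences of accident rates.

    counts maps (direction, condition) -> (accidents, vehicles), where
    direction is "treatment" or "control" and condition is "active"
    (days the warning system displayed a message) or "inactive".
    """
    rate = {key: rate_per_million(*value) for key, value in counts.items()}
    treatment_change = (rate[("treatment", "active")]
                        - rate[("treatment", "inactive")])
    control_change = (rate[("control", "active")]
                      - rate[("control", "inactive")])
    return treatment_change - control_change


# Hypothetical counts, chosen only to illustrate the calculation.
hypothetical = {
    ("treatment", "active"): (12, 4_000_000),
    ("treatment", "inactive"): (40, 20_000_000),
    ("control", "active"): (15, 4_000_000),
    ("control", "inactive"): (40, 20_000_000),
}

effect = diff_in_diff(hypothetical)
print(round(effect, 2))
```

A negative value would mean the accident rate rose less on active days in the treatment directions than in the control directions.  With real data the estimate would also need an uncertainty interval, since the daily counts are small.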

Also, I have permission now to share the data; I will try to get it posted, along with my code, before the next update.

[UPDATE April 26, 2012]

A reader asked:

"I can think of two ways that overall traffic volume affects accident rates: (1) more cars = more accidents overall, which you control for by measuring accident rates, and now you're seeing rising accident rates per car. So this raises the next thought, (2) more cars = more traffic density, which raises accident rates per car for each car on the road.

What happens if you regress on traffic volume squared, or include traffic volume as an independent variable in the accident rate regression? The density effect is likely nonlinear but it's a thought."

This is a great question.  If there is a non-linear relationship between traffic volume and the raw number of accidents, then even after we switch to accident rates, there might still be a positive relationship between traffic volume and accident rates.
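
To make that concrete, suppose (purely as an illustration; the constant c and exponent k are assumptions, not estimates from this data) that the expected number of accidents follows a power law in traffic volume V:

```latex
\mathbb{E}[\text{accidents}] = c\,V^{k}
\quad\Longrightarrow\quad
\frac{\mathbb{E}[\text{accidents}]}{V} = c\,V^{k-1}
```

If k = 1, the accident rate per vehicle does not depend on volume.  If k > 1, the rate itself increases with volume; for example, k = 2 (accidents proportional to volume squared) makes the rate proportional to V.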

I ran these regressions, and in fact there is a relationship, but with the limitations of the data I have, I don't think it means much.  Specifically, I only have annual estimates for traffic volume, so there's no fluctuation over time; traffic volume increases at a nearly constant rate for the entire observation period (see the figure above).

So traffic volume will have a positive relationship with anything else that's increasing, and a negative relationship with anything decreasing.  And that's what I see in the regressions:



All of the relationships are statistically significant, but notice that in the treatment directions, before 1996 when the accident rate was declining, the relationship with traffic volume is negative!

I don't think this variable has any explanatory content; any other ramp function would behave the same way.  If I can get finer-grain data on traffic volume, I might be able to look for a more meaningful effect.
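
The point about ramp functions can be checked with a small synthetic example.  Everything below is made up for illustration (none of it is the study data): a steadily growing "volume" series correlates strongly and positively with a rising series, strongly and negatively with a falling one, and a plain year counter gives the same correlation as volume does.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


years = list(range(1992, 2003))

# Synthetic ramp: volume grows by a constant amount each year.
volume = [40 + 2 * i for i in range(len(years))]

# Another ramp with no connection to traffic: years since 1992.
calendar = [y - 1992 for y in years]

# Synthetic accident rates: one trending up, one trending down.
rising = [10, 11, 13, 12, 15, 16, 18, 17, 20, 21, 23]
falling = [30, 28, 29, 26, 25, 23, 24, 21, 20, 18, 17]

r_up = pearson(volume, rising)       # strongly positive
r_down = pearson(volume, falling)    # strongly negative
r_cal = pearson(calendar, rising)    # matches r_up: any ramp behaves alike

print(round(r_up, 2), round(r_down, 2), round(r_cal, 2))
```

Because correlation is unchanged by a linear rescaling with positive slope, the year counter and the volume ramp are interchangeable here, which is why the volume coefficient carries no information beyond "time passed".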

10 comments:

  1. Fascinating project. Could more widespread use of cell phones leading to more distracted drivers cause the jump you see at the end of the 1990s?

    1. I'm open to all theories, although in that case I might expect a trend rather than a jump.

      (Keeping in mind that a trend with noise might look like a jump.)

  2. Allen - I can think of two ways that overall traffic volume affects accident rates: (1) more cars = more accidents overall, which you control for by measuring accident rates, and now you're seeing rising accident rates per car. So this raises the next thought, (2) more cars = more traffic density, which raises accident rates per car for each car on the road.

    What happens if you regress on traffic volume squared, or include traffic volume as an independent variable in the accident rate regression? The density effect is likely nonlinear but it's a thought.

    1. Also are we confident that the control is a good control? Traffic volumes and fog are both heavily dependent on time-of-day, and the control direction could conceivably have a different traffic pattern from the treatment direction during foggy times, I would venture.

      That is, if traffic volumes are higher or lower in the treatment direction than in the control direction during the likely fog time, and we find that accident rates are themselves dependent on density, then this potentially has a significant effect.

      I suppose we could study the control direction to examine whether density affects accident rates overall, and then figure out what to do from there.

    2. Great questions!

      1) About the non-linear effect of traffic volume: I have looked into this; since you asked, I will post an update with the results. Short answer: I can't say much with the data I have now, but might be able to do more later.

      2) About whether fog affects the treatment direction more because it happens during the morning commute: yes, this is almost certainly something I will have to deal with. I think I can get at it by comparing four conditions, (fog and no fog), (treatment and control). Will try to get to this in the next update.

      Thanks!

    3. I am looking forward to seeing this. I wonder if there is a way to look at time-specific accident rates.

    4. Yes, I have time of day for all accidents, and (sort of) time of day for low visibility due to fog. So we should be able to get into this in some detail.

      The next installment will be delayed while I work on another project, but I will get back to it as soon as I can.

  3. Really nice idea to blog on the research as it unfolds.

    Speaking of Poisson processes, perhaps some of the readers here might like this simple puzzle I recently posted.

    1. Just read your article. It is excellent! Thanks for letting me know about it.

    2. Thanks for the encouraging feedback!
