The warning system consists of (1) weather monitoring stations that detect fog and (2) changeable message signs that warn drivers to reduce speed. Among people who study traffic safety, there are two theories about these kinds of systems:
1) The mainstream theory is that when drivers are warned to slow down, they slow down, and lower speeds reduce the accident rate.
2) The heterodox theory, which my collaborator holds, is that warning signs introduce perturbations into the flow of traffic, so they can cause more accidents than they prevent.
My job is to evaluate which theory the data support. Here's what I have to work with:
1) The warning system was activated in November 1996.
2) My collaborator and his students collected data from the CalTrans Traffic Accident Surveillance and Analysis System (TASAS). It includes all fatal and injury accidents and a large portion of "property damage only" accidents. Their data runs from Jan 1, 1992 to March 31, 2002, roughly five years before the warning system was activated and a little more than five years after.
3) NOAA's National Climatic Data Center (NCDC) operates a weather station at Stockton Airport (KSCK), about 8 miles from the study area. I downloaded daily weather data from 1992 through 2002.
4) California DOT publishes average daily traffic volume (ADT) through several locations in the study area. I downloaded their reports from 1992 through 2002.
But there are some challenges:
1) The biggest problem is that the speed limit in the study area changed in January 1996, 11 months before the warning system was deployed. It will be difficult to separate the effect of the warning system from the effect of the speed limit. I have exchanged email messages with someone at CalTrans who might be able to tell me exactly when the speed limit signs were changed on each stretch of road.
2) The traffic volume data is annualized, so it doesn't account for variation on smaller time scales, and it does not distinguish between traffic in different directions.
3) The weather station at the airport is about 8 miles from the study area, and at a higher altitude, so the fog data may not capture the actual conditions in the study area.
However, we have one ace in the hole: the system only shows warnings to traffic moving in one direction (toward the merger of the two highways), so traffic in the other direction acts as a natural control. I'll refer to the segments with warning signs, 5S and 120W, as the "treatment directions," and the others, 5N and 120E, as the "control directions."
To get a quick feel for the data, I plotted the number of accidents per month in the treatment and control directions.
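The monthly counts can be computed with pandas; here's a sketch with a few made-up records (the column names are assumptions, and the real TASAS extract will differ):

```python
import pandas as pd

# Hypothetical TASAS-style records: one row per accident, with a date
# and a route/direction code. Column names and values are made up.
accidents = pd.DataFrame({
    "date": pd.to_datetime(["1995-12-03", "1995-12-20", "1996-01-05",
                            "1996-01-07", "1996-02-11"]),
    "route": ["5S", "5N", "120W", "120E", "5S"],
})

# The warning signs face only one direction of travel, so 5S and 120W
# are the treatment directions; 5N and 120E are the controls.
treatment = {"5S", "120W"}
accidents["group"] = ["treatment" if r in treatment else "control"
                      for r in accidents["route"]]

# Count accidents per calendar month in each group.
monthly = (accidents.set_index("date")
           .groupby("group")
           .resample("MS")
           .size())
```

Plotting `monthly.unstack(0)` gives one line per group, which is the shape of the figure above.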
In both groups of directions, the number of accidents increases in 1996. The change is larger in the treatment directions: before 1996, they were safer than the control directions; after 1996, they were more dangerous. It looks like the treatment directions might have become more dangerous again in 2000, but from this figure alone it's hard to say whether that effect is significant.
Based on raw number of accidents, there is no evidence that the warning system is effective. But there are several other factors to consider, including traffic volume, speed, and weather.
This plot shows annualized estimates of average daily traffic volume (ADT) through the study area.
The traffic volume on SR-120 increases consistently during the observation period; volume on I-5 is mostly flat; volume after the merge point increases substantially. Since many accidents occur near the merge point, I will use the estimates from after the merge point for analysis.
These estimates include both directions of travel, so we can't distinguish volumes in the treatment and control directions.
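Since the estimates are annual, the natural way to use them is to attach each year's value to every day in that year. A minimal sketch, with invented numbers:

```python
import pandas as pd

# Hypothetical annual ADT estimates after the merge point;
# the values are made up for illustration.
adt = pd.DataFrame({"year": [1995, 1996], "adt": [68000, 71000]})

daily = pd.DataFrame({
    "date": pd.to_datetime(["1995-06-01", "1996-06-01", "1996-06-02"]),
    "accidents": [1, 0, 2],
})
daily["year"] = daily["date"].dt.year

# Each day inherits its year's estimate; both directions share one value,
# since the published figures don't distinguish direction of travel.
daily = daily.merge(adt, on="year", how="left")
```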
This plot shows the number of days per month with fog, heavy fog, or more than 3mm of precipitation, as observed at Stockton Airport:
Not surprisingly, all three variables show seasonal variability. Other than that, there are no obvious trends.
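Counting days per month with each condition is a one-liner once the daily data is in a DataFrame; a sketch with toy values (the column names are assumptions about the NCDC daily summaries):

```python
import pandas as pd

# Hypothetical daily summaries in the style of the NCDC data;
# column names and values are made up.
wx = pd.DataFrame({
    "date": pd.date_range("1996-01-01", periods=6, freq="D"),
    "fog": [True, True, False, False, True, False],
    "heavy_fog": [False, True, False, False, False, False],
    "precip_mm": [0.0, 5.2, 0.0, 1.1, 4.0, 0.0],
})
wx["wet"] = wx["precip_mm"] > 3   # more than 3 mm of precipitation

# Booleans sum as 0/1, so this yields days-per-month for each variable.
days_per_month = (wx.set_index("date")[["fog", "heavy_fog", "wet"]]
                  .resample("MS").sum())
```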
Having cleaned and processed the data, we can look for factors that contribute to accidents. In total, there were 932 accidents in the control directions and 968 in the treatment directions. Over 3744 days, the average number of accidents per day is 0.51. Many days have no accidents. On the worst day in the observation period, there were nine!
I'll use Poisson regression to model the number of accidents in each day as a function of the explanatory factors. A requirement of Poisson regression is that the distribution of the dependent variable should be (wait for it) Poisson. To check this requirement, I computed the number of accidents each day for the control and treatment directions, before and after November 15, 1996 (roughly when the warning system was activated).
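Computing daily counts needs a little care: days with no accidents have to appear as explicit zeros, or the distribution will be wrong. Resampling handles that (toy dates here, standing in for one group):

```python
import pandas as pd

# Toy accident dates for one group; the real counts come from TASAS.
dates = pd.to_datetime(["1996-11-20", "1996-11-20", "1996-11-23"])

# Resampling fills in the empty days, so zero-accident days are counted.
daily = pd.Series(1, index=dates).resample("D").size()
```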
To check whether these distributions are Poisson, I plotted the complementary CDF on a log-y scale; under this transform, a straight line is characteristic of a Poisson distribution.
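Here is one way to compute the complementary CDF of the daily counts; plotted on a log y-axis (for example with `plt.semilogy`), it produces the kind of figure described above. The counts below are toy data:

```python
import numpy as np

def ccdf(counts):
    """Empirical P(X >= k) for k = 0, 1, ..., max(counts) + 1."""
    counts = np.asarray(counts)
    ks = np.arange(counts.max() + 2)
    return ks, np.array([(counts >= k).mean() for k in ks])

# Toy daily accident counts standing in for one group/period.
daily = np.array([0, 0, 1, 0, 2, 0, 1, 0, 0, 1])
ks, tail = ccdf(daily)
# e.g. plt.semilogy(ks, tail, drawstyle="steps-post")
```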
In all four cases the transformed CDF is roughly a straight line, so Poisson regression with this data should be just fine.
The other characteristic of the Poisson distribution is that the mean and variance are the same; we can check that, too.
                    mean  variance
control    before   0.11      0.14
           after    0.37      0.46
treatment  before   0.06      0.07
           after    0.44      0.62
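These grouped statistics fall out of a single groupby; a sketch with made-up daily counts (in the real analysis there is one count per day per group, split at Nov 15, 1996):

```python
import pandas as pd

# Toy daily counts labeled by group and period; values are invented.
df = pd.DataFrame({
    "group":  ["control"] * 4 + ["treatment"] * 4,
    "period": ["before", "before", "after", "after"] * 2,
    "count":  [0, 1, 1, 2, 0, 0, 1, 3],
})

# Mean and (sample) variance per group/period, as in the table above.
stats = df.groupby(["group", "period"])["count"].agg(["mean", "var"])
```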
In each case, the variance is a little higher than the mean, which suggests that there are more multi-accident days than we would expect from a Poisson process (it's easy to imagine a mechanism; for example, one accident can cause another nearby). But the difference is small, so again I think we are safe using Poisson regression.
This table also demonstrates the effect I mentioned earlier. Before the changes in 1996 (the increased speed limit and the activation of the warning system), there were fewer accidents in the treatment directions, about half as many as in the control directions. After 1996 it's the other way around: there are more accidents in the treatment directions.
That's enough with the preliminaries. Next time we'll get into the analysis and see what factors contribute to the accident rate.