Probably Overthinking It: February 2015

Tuesday, February 24, 2015

Upcoming talk on survival analysis in Python

On March 2, 2015 I am presenting a short talk for the Python Data Science meetup. Here is the announcement for the meetup.

And here are my slides:

The code for the talk is in an IPython notebook you can view on nbviewer. It is still a work in progress!

And here's the punchline graph:

Each line corresponds to a different cohort: 50s means women born during the 1950s, and so on. The curves show the probability of being unmarried as a function of age; for example, about 50% of women born in the 50s had been married by age 23. For women born in the 80s, only about 25% were married by age 23.

The gray lines show predictions based on applying patterns from previous cohorts.
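For readers who want to see how a curve like this could be computed, here is a minimal sketch of a Kaplan-Meier style estimate. It is one standard way to estimate a survival curve and is not necessarily what the notebook does. The function name and the tiny data set are hypothetical; `ages_censored` holds the age at interview of women who had not married yet.

```python
def survival_curve(ages_married, ages_censored):
    """Kaplan-Meier estimate of P(still unmarried just after age t)."""
    surv = 1.0
    curve = []
    for t in sorted(set(ages_married)):
        # Everyone whose marriage age or interview age is at least t is still at risk.
        at_risk = (sum(a >= t for a in ages_married)
                   + sum(a >= t for a in ages_censored))
        married_now = sum(a == t for a in ages_married)
        surv *= 1.0 - married_now / at_risk
        curve.append((t, surv))
    return curve

# Made-up data for illustration only.
married = [20, 22, 22, 25, 30]
still_single = [21, 24, 35]
curve = survival_curve(married, still_single)
for age, surv in curve:
    print(age, round(surv, 3))
```

Respondents who are still unmarried count in the denominator up to their interview age and then drop out, which is how incomplete cohorts such as women born in the 80s can still contribute to the estimate.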

A few patterns emerge from this figure:

1) Successive generations of women are getting married later and later. No surprise there.

2) For women born in the 50s, the curve leveled off; if they didn't marry early, they were unlikely to marry at all. For later generations, the curve keeps dropping, which indicates some women getting married for the first time at later ages.

3) The predictions suggest that the fraction of women who eventually marry is not changing substantially. My predictions for Millennials suggest that they will end up marrying, eventually, at rates similar to previous cohorts.

Tuesday, February 10, 2015

Bayesian analysis of match rates on Tinder

Last fall I taught an introduction to Bayesian statistics at Olin College. My students worked on some excellent projects, and I invited them to write up their results as guest articles for this blog. Just in time for Valentine's Day, here's the second in the series:

It’s (Not) a Match!

Ankur Das and Mason del Rosario

All code for this post can be found on Github: https://github.com/mdelrosa/tinderStudy

Valentine’s day is fast approaching, and if you are anything like us engineering students, you will most likely be dateless on the 14th of February. Luckily, there are lots of great apps geared towards bringing singles together. One of the most popular is Tinder. For those unfamiliar with the app, Tinder presents the user with other users’ profiles, and the user then either swipes right for yes or left for no. When two people have swiped right on one another, they become a “match” and can chat with each other via text through the app.

Images of our Tinder profiles. Aren’t we a couple of handsome devils?

Some time ago, two of our friends decided to conduct an experiment on Tinder. The friends, one male and one female, each swiped right on the next 100 profiles and recorded how many matches they got. Our female friend received over 90 matches, while the male garnered only one, giving a match rate above 90% for the female and an abysmal 1% match rate for the male! The discrepancy we saw is not an isolated incident. This article from Godofstyle.com reports an experiment where male profiles at best garnered 7% match rates while a female profile received a 20% match rate.

Why such poor odds for men on Tinder? Women may be more discriminating with their swipes, but another possibility is that women are simply less active on Tinder. Pursuing the less depressing option, we set about trying to answer the following question: how does the activity rate of Tinder’s population affect a given user’s match rate?

p q r

Luckily for all us dateless men and women out there, the number of responses we get is not an accurate measure of our attractiveness on Tinder. Adapting Allen Downey’s work on “The Volunteer Problem,” we model a user’s total response rate p as the product of r, the true attractiveness rating, and q, the activity rating. For every 100 profiles we swipe on, only some fraction q of those users will even see our profiles, let alone swipe back. With that in mind, we have something to attribute our Tinder failures to: the inactivity of users who may not have logged in recently.
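As a quick illustration of the relationship p = q × r, here is a minimal sketch that solves for r when p and q are treated as known point values. The function name and the numbers are hypothetical and are not measurements from the study.

```python
def true_response_rate(p, q):
    """Solve p = q * r for r, the response rate among users who saw the profile."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    if not 0 <= p <= q:
        raise ValueError("p must be in [0, q]")
    return p / q

# Hypothetical numbers: 10 matches per 100 swipes, with 40% of users active.
r = true_response_rate(p=0.10, q=0.40)
print(f"true response rate r = {r:.2f}")
```

Because q is at most 1, r can never be smaller than p under this model.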

Still, it would be nice to know what our actual response rate, r, is. Finding p is simple: just measure the number of return swipes. But we need to determine q to solve for r. With our set of Bayesian tools, Python scripts, and swiping fingers, we set out to estimate the activity rate, q, of other Tinder users.

How often do we Tinder?

For some of us, Tinder is a way of life. For reasons we won’t get into here, we check Tinder hourly, keeping up to date with the latest swipes. But many users check the app less frequently, going days or weeks without looking at new potential matches. Unfortunately, Tinder’s best selling points also make life difficult for the amateur Bayesian statistician. The app provides minimal information on profiles and has no public history of past swipes and encounters. When seeing a new profile, the user’s last log-in time provides the only hint of her future activity, leaving us guessing if she chose to ignore our profile or simply never saw it.

Luckily, we can adapt Prof. Downey’s solution to “The Red Line Problem” to turn this sparse data into a predictive model. We start with an assumption that the average time between Tinder activity follows a normal distribution centered at 10 hours. Even then, someone who uses Tinder every 10 hours on average will occasionally check twice an hour or just once a week. We model this variance with an exponential distribution for each average activity period to show different activity rates for the same user.

(Note: all times and arrival rates are given in hours unless otherwise noted)

While some other user may go 10 hours without checking Tinder, our last-activity data only gives us a single point in that interval, which skews our results. We are 5 times more likely to observe someone during a 5 hour interval than during a 1 hour interval, simply due to its length. We account for this bias towards longer intervals by multiplying each probability by the length of the interval.

Exponential PDF scaled by time for a user with an average of 10 hours between activity
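The length bias described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code from the project repository: the function names, the grid step, and the 200 hour cutoff are all hypothetical choices.

```python
import math

def exponential_pdf(t, mean_gap):
    """Density of a gap of length t (hours) for a user with the given mean gap."""
    lam = 1.0 / mean_gap
    return lam * math.exp(-lam * t)

def length_biased_pmf(mean_gap, t_max=200.0, step=0.1):
    """Discretized distribution of the gap that a random observer lands in."""
    ts = [step * (i + 0.5) for i in range(int(t_max / step))]
    # Weight each gap length by its duration, then renormalize.
    weights = [t * exponential_pdf(t, mean_gap) for t in ts]
    total = sum(weights)
    return ts, [w / total for w in weights]

ts, probs = length_biased_pmf(mean_gap=10.0)
observed_mean = sum(t * p for t, p in zip(ts, probs))
print(f"mean gap as seen by an observer: {observed_mean:.1f} hours")
```

For a user who averages 10 hours between visits, the observed mean should come out close to 20 hours, because scaling an exponential density by t gives a gamma distribution with shape 2, whose mean is twice the original.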

Even with these likelihoods, our data comes in the form of time since last activity. 15 minutes ago could indicate a future log in 10 minutes or 10 days later, so we need to adapt our distribution to this type of input. For every gap between activity, we have an equal chance of making our observation at any point in between. Combining the chances of each activity period happening with the chances of finding a time in that period produces a mixture of various observed last log-in times, specific to each hypothetical user with a different average activity rate.

Exponential CDF for observed last activity time for a user with an average of 10 hours between activity
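The mixture described above can be checked numerically. The sketch below assumes exponential gaps and a uniformly placed observation point inside the gap. The function name and grid parameters are hypothetical.

```python
import math

def elapsed_time_cdf(mean_gap, y, t_max=300.0, step=0.05):
    """P(time since last activity <= y hours) for one hypothetical user."""
    lam = 1.0 / mean_gap
    total = 0.0
    below = 0.0
    for i in range(int(t_max / step)):
        x = step * (i + 0.5)              # candidate gap length
        w = x * lam * math.exp(-lam * x)  # length-biased weight of that gap
        total += w
        # The observation point is uniform in the gap, so the elapsed
        # time is at most y with probability min(y, x) / x.
        below += w * min(y, x) / x
    return below / total

cdf_at_10 = elapsed_time_cdf(mean_gap=10.0, y=10.0)
print(f"P(last seen within 10 hours) = {cdf_at_10:.3f}")
print(f"exponential CDF at 10 hours  = {1 - math.exp(-1):.3f}")
```

The two printed values should agree closely. For exponential gaps, the time since last activity seen by a random observer is itself exponential with the same mean, which is consistent with the exponential CDF in the figure.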

Now what?

That was a lot of work to get to a distribution within a distribution within a distribution, but now we finally have a chance to use our data! Looking back at our prior distribution of average activity rates, we now have a corresponding distribution of observed past log-in times for each. We sampled 100 users to record their activity times, giving us accurate information for the users relevant to us. Our findings vary between genders, areas, and age ranges, so new time data must be collected for each aspiring Tinder statistician. Updating each of our hypotheses with the chance of observing the times we saw produces the graph below, a steep distribution centered at 10 hours.

Exponential PDF scaled by time for a user with an average of 10 hours between activity

The posterior distribution peaks at the same central value as the prior, but with a smaller standard deviation, due to the large amount of data. The small standard deviation likely indicates more about Tinder’s sorting algorithm than the true userbase, however. Given the number of people who have likely stopped using the app, we should see a much longer tail for users checking Tinder every 24 hours or longer. This data actually indicates that Tinder only shows us users that tend to be fairly active. This behavior makes sense: what dating app would match you with inactive users?
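The update described in this section can be sketched as a grid computation. Everything specific here is an assumption made for illustration: the prior spread of 3 hours, the grid of 1 to 30 hours, the function names, and the ten observed times, which are made up. The likelihood uses the fact that, for exponentially distributed gaps, the time since last activity seen by a random observer is exponential with the same mean.

```python
import math

def normal_prior(mean_gaps, center, spread):
    """Discretized normal prior over hypothesized mean gaps (hours)."""
    weights = [math.exp(-0.5 * ((m - center) / spread) ** 2) for m in mean_gaps]
    total = sum(weights)
    return {m: w / total for m, w in zip(mean_gaps, weights)}

def update(prior, elapsed_times):
    """Posterior over mean gaps after observing times since last activity."""
    log_post = {}
    for mean_gap, prob in prior.items():
        # Exponential density with this mean, evaluated at each observation.
        log_like = sum(-math.log(mean_gap) - y / mean_gap for y in elapsed_times)
        log_post[mean_gap] = math.log(prob) + log_like
    biggest = max(log_post.values())  # subtract the max to avoid underflow
    weights = {m: math.exp(lp - biggest) for m, lp in log_post.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}

def mean_and_std(pmf):
    mean = sum(m * p for m, p in pmf.items())
    var = sum((m - mean) ** 2 * p for m, p in pmf.items())
    return mean, math.sqrt(var)

prior = normal_prior(range(1, 31), center=10.0, spread=3.0)  # spread is assumed
observed = [2, 5, 9, 14, 20, 3, 11, 7, 1, 28]                # made-up hours
posterior = update(prior, observed)

print("prior mean, std:    ", mean_and_std(prior))
print("posterior mean, std:", mean_and_std(posterior))
print("posterior peak:     ", max(posterior, key=posterior.get))
```

With these made-up observations, whose average is 10 hours, the posterior keeps its peak at 10 hours and is narrower than the prior, which mirrors the qualitative behavior described above.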

Predictive Modeling

After a bit of data collection and analysis, we now know the breakdown of our potential matches. Most of them check Tinder about every 10 hours on average, so they should respond promptly to our swipes. For the sake of modeling ease, let’s say that when someone checks Tinder, they instantly see our profile, regardless of the time they spend active or Tinder’s display algorithm. Not only does this make our analysis much easier, it also gives us a worst case estimate. If our results for Tinder success assume extra people see our profile, our true response rate can only be better.

Using the spread of potential Tinder matches, we can predict how many will use Tinder within some threshold time. If we check again in 5 hours, for instance, we will of course have received fewer views than if we wait 10 hours. To find the distribution of q, we looked at each hypothetical average activity rate and found the chance of activity at that rate within some threshold time. Weighting these probabilities by the likelihood of each average rate produces a distribution of q, the portion of users active within that time.

q distributions for active users within 5 and 10 hours
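Here is a minimal sketch of how a distribution of q could be derived from a distribution over average activity periods. It assumes exponential gaps, so that a user with mean gap m is active within a threshold T with probability 1 - exp(-T / m). The three-point posterior and the function names are hypothetical stand-ins, not results from the study.

```python
import math

def q_distribution(posterior, threshold):
    """Map each hypothesized mean gap to q = P(active within threshold hours)."""
    dist = {}
    for mean_gap, prob in posterior.items():
        q = 1.0 - math.exp(-threshold / mean_gap)
        dist[q] = dist.get(q, 0.0) + prob
    return dist

def expected_value(dist):
    return sum(value * prob for value, prob in dist.items())

# Hypothetical posterior over mean gaps, in hours.
posterior = {8.0: 0.2, 10.0: 0.6, 12.0: 0.2}

q5 = q_distribution(posterior, threshold=5.0)
q10 = q_distribution(posterior, threshold=10.0)
print(f"expected q within 5 hours:  {expected_value(q5):.3f}")
print(f"expected q within 10 hours: {expected_value(q10):.3f}")
```

A longer threshold shifts the whole distribution of q upward, since every hypothetical user is more likely to have checked in.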

True Response Rate

Now that we are able to generate a distribution for q, we are halfway towards generating a distribution for a given user’s true response rate, r. Next, we need to know what the distribution of perceived response rate, p, looks like. Recalling that the product of q and r gives us p, we can reverse engineer our desired r distribution by taking the quotient of p over q! We’re working on that part of the problem now.

Conclusion

Clearly, we still have some work to do before we can say anything quantitative regarding a user’s true match rate, but we can say some positive things without going much further. One encouraging observation is that a user’s true match rate will always be better than the perceived match rate. While a user may see that only 10 out of every 100 profiles become matches, a percentage of those users have not even been on Tinder for quite some time. The upshot of this: you’re probably more attractive than Tinder makes you think you are!

That’s all for now. Happy swiping!

Thursday, February 5, 2015

Godless freshmen: now more Nones than Catholics

This article is an update to my annual series on one of the most under-reported stories of the decade: the fraction of college freshmen who report no religious preference has more than tripled since 1985, from 8% to 27%, and the trend is accelerating.

In last year's installment, I made the bold prediction that the trend would continue, and that the students starting college in 2014 would, again, be the most godless ever. It turns out I was right for the fifth year in a row. The number of students reporting no religious preference increased to 27.5%, a substantial increase from last year's record-high 24.6%. Also, for the first time in the history of the survey, the number of "Nones" exceeds the number of Catholics (25.3%).

The number of people reporting that they never attended a religious service also reached an all-time high at 29.3%, up from 27.3% last year.

Of course, we should not over-interpret a single data point, but generally:

1) This year's data points are consistent with previous predictions, and

2) Data since 1990 support the conclusion that the number of incoming college students with no religious preference is increasing and accelerating.

This analysis is based on survey results from the Cooperative Institutional Research Program (CIRP) of the Higher Education Research Institute (HERI). In 2014, more than 153,000 students at 227 colleges and universities completed the CIRP Freshman Survey, which includes questions about students’ backgrounds, activities, and attitudes.

In one question, students select their “current religious preference,” from a choice of seventeen common religions, “Other religion,” or “None.”

Another question asks students how often they “attended a religious service” in the last year. The choices are “Frequently,” “Occasionally,” and “Not at all.” Students are instructed to select “Occasionally” if they attended one or more times.

The following figure shows the fraction of Nones over more than 40 years of the survey:

Fraction of college Freshmen with no religious preference.

The blue line shows actual data through 2013; the blue square shows the new data point for 2014. The gray regions show the predictions I generated last year based on data through 2013. The new data point falls at the high end of the predicted interval.

The red line shows a quadratic fit to the data. The dark gray region shows a 90% confidence interval, which quantifies sampling error, so it reflects uncertainty about the parameters of the fit. The light gray region shows a 90% confidence interval taking into account both sampling error and residual error. So it reflects total uncertainty about the predicted value, including uncertainty due to random variation from year to year.
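To make the two kinds of interval concrete, here is a minimal sketch that fits a quadratic by least squares and builds a predictive interval by resampling residuals. Resampling residuals is an assumption chosen for illustration and may differ from the method behind the figure. The data are synthetic, x is a hypothetical "years since 2000" axis, and all function names are made up.

```python
import random

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * 3
    for r in reversed(range(3)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - s) / m[r][r]
    return x

def fit_quadratic(xs, ys):
    """Least-squares coefficients (c0, c1, c2) of y = c0 + c1*x + c2*x**2."""
    sums = [sum(x ** k for x in xs) for k in range(5)]
    a = [[sums[i + j] for j in range(3)] for i in range(3)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    return solve3(a, b)

def predict(coefs, x):
    return coefs[0] + coefs[1] * x + coefs[2] * x * x

def predictive_interval(xs, ys, x_new, iters=1001, seed=1):
    """90% interval for a new observation at x_new, via residual resampling."""
    rng = random.Random(seed)
    coefs = fit_quadratic(xs, ys)
    fitted = [predict(coefs, x) for x in xs]
    resid = [y - f for y, f in zip(ys, fitted)]
    preds = []
    for _ in range(iters):
        # Refit on resampled data: uncertainty about the fitted parameters.
        fake_ys = [f + rng.choice(resid) for f in fitted]
        refit = fit_quadratic(xs, fake_ys)
        # Add one more residual: random variation from year to year.
        preds.append(predict(refit, x_new) + rng.choice(resid))
    preds.sort()
    return preds[int(0.05 * iters)], preds[int(0.95 * iters)]

# Synthetic data for illustration only.
noise = random.Random(0)
xs = list(range(-10, 15))
ys = [20 + 0.5 * x + 0.02 * x * x + noise.gauss(0, 0.5) for x in xs]
low, high = predictive_interval(xs, ys, x_new=15)
print(f"90% predictive interval at x=15: ({low:.1f}, {high:.1f})")
```

Dropping the extra residual inside the loop would leave only the uncertainty from refitting, which corresponds to the narrower of the two regions.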

Here is the corresponding plot for attendance at religious services:

Fraction of college Freshmen who report no attendance at religious services.

Again, the new data point for 2014, 29.3%, falls in the predicted range, although somewhat ahead of the long term trend.

Predictions for 2015

Using the new 2014 data, we can generate predictions for 2015. Here is the revised plot for "Nones":

Predictive interval for 2015.

This year's measurement is ahead of the long-term trend, so next year's is likely to regress, slightly, to 27.1% (down 0.4 percentage points).

And here is the prediction for "No attendance":

Predictive interval for 2015.

Again, because this year's value is ahead of the long term trend, the center of the predictive distribution is slightly lower, at 28.6% (down 0.7 percentage points).

I'll be back next year to check on these predictions.

Comments

1) As always, more males than females report no religious preference, and the gender gap appears to be growing.

Difference between men and women, fraction reporting no religious preference.

Evidence that the gender gap is increasing is strong. The p-value of the slope of the fitted curve is less than 1e-6.

2) I notice that the number of schools and the number of students participating in the Freshman Survey have been falling for several years. I wonder if the mix of schools represented in the survey is changing over time, and what effect this might have on the trends I am watching. The percentage of "Nones" is different across different kinds of institutions (colleges, universities, public, private, etc.). If participation rates are changing among these groups, that would affect the results.

3) Obviously college students are not representative of the general population. Data from other sources indicate that the same trends are happening in the general population, but I haven't been able to make a quantitative comparison between college students and others. Data from other sources also indicate that college graduates are slightly more likely to attend religious services, and to report a religious preference, than the general population.

Data Source
Eagan, K., Stolzenberg, E. B., Ramirez, J. J., Aragon, M. C., Suchard, M. R., & Hurtado, S. (2014). The American freshman: National norms fall 2014. Los Angeles: Higher Education Research Institute, UCLA.

This and all previous reports are available from the HERI publications page.