## Friday, December 27, 2013

### Leslie Valiant is probably British. Or old.

I got Leslie Valiant's new book, Probably Approximately Correct, for Christmas.  I'm embarrassed to admit that I was not familiar with the author, especially since he won the Turing Award in 2010.  But I wasn't, and that led to a funny sequence of thoughts, which leads to an interesting problem in Bayesian inference.

When I saw the first name "Leslie," I thought that the author was probably female, since Leslie is a primarily female name, at least for young people in the US.  But the dust jacket identifies the author as a computer scientist, and when I read that I saw blue and smelled cheese, which is the synesthetic sensation I get when I encounter high-surprisal information that causes large updates in my subjective probabilistic beliefs (or maybe it's just the title of a TV show).

Specifically, the information that the author is a computer scientist caused two immediate updates: I concluded that the author is more likely to be male and, if male, more likely to be British, or old, or both.

A quick flip to the back cover revealed that both of those conclusions were true, but it made me wonder if they were justified.  That is, was my internal Bayesian update system (IBUS) working correctly, or leaping to conclusions?

Part One: Is the author male?

To check, I will try to quantify the analysis my IBUS performed.  First let's think about the odds that the author is male.  Starting with the name "Leslie" I would have guessed that about 1/3 of Leslies are male.  So my prior odds were 1:2 against.

Now let's update with the information that Leslie is a computer scientist who writes popular non-fiction.  I have read lots of popular computer science books, and of them about 1 in 20 were written by women.  I have no idea what fraction of computer science books are actually written by women.  My estimate might be wrong because my reading habits are biased, or because my recollection is not accurate.  But remember that we are talking about my subjective probabilistic beliefs.   Feel free to plug in your own numbers.

Writing this formally, I'll define

M: the author is male
F: the author is female
B: the author is a computer scientist
L: the author's name is Leslie

then

odds(M | L, B) = odds(M | L) like(B | M) / like(B | F)

If the prior odds are 1:2 and the likelihood ratio is 20, the posterior odds are 10:1 in favor of male.  Intuitively, "Leslie" is weak evidence that the author is female, but "computer scientist" is stronger evidence that the author is male.
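The update above can be checked with a few lines of Python. This is only a minimal sketch of Bayes's rule in odds form using the guesses from the text (prior odds of 1:2 and a likelihood ratio of 20); the helper names `update_odds` and `odds_to_prob` are made up for illustration.

```python
from fractions import Fraction


def update_odds(prior_odds, likelihood_ratio):
    """Bayes's rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_prob(odds):
    """Convert odds in favor (o:1) to a probability."""
    return odds / (1 + odds)


# Subjective guesses taken from the text; plug in your own numbers.
prior_odds_male = Fraction(1, 2)   # odds(M | L): 1:2 against
likelihood_ratio = Fraction(20)    # like(B | M) / like(B | F)

posterior_odds_male = update_odds(prior_odds_male, likelihood_ratio)
prob_male = odds_to_prob(posterior_odds_male)

print(posterior_odds_male)          # posterior odds in favor of male
print(round(float(prob_male), 3))   # the same belief expressed as a probability
```

With these inputs the posterior odds come out to 10:1, which corresponds to a probability of 10/11, or about 91%.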

Part Two: Is the author British?

So what led me to think that the author is British?  Well, I know that "Leslie" is primarily female in the US, but closer to gender-neutral in the UK.  If someone named Leslie is more likely to be male in the UK (compared to the US), then maybe men named Leslie are more likely to be from the UK.  But not necessarily.  We need to be careful.

If the name Leslie is much more common in the US than in the UK, then the absolute number of men named Leslie might be greater in the US.  In that case, "Leslie" would be evidence in favor of the hypothesis that the author is American.

I don't know whether "Leslie" is more popular in the US.  I could do some research, but for now I will stick with my subjective update process, and assume that the number of people named Leslie is about the same in the US and the UK.

So let's see what the update looks like.  I'll define

US: the author is from the US
UK: the author is from the UK

then

odds(UK | L, B) = odds(UK | B) like(L | UK) / like(L | US)

Again thinking about my collection of popular computer science books, I guess that one author in 10 is from the UK, so my prior odds are about 10:1 against.

To compute the likelihoods, I use the law of total probability, conditioning on whether the author is male (using the probability I just computed).  So:

like(L | UK) = prob(M) like(L | UK, M) + prob(F) like(L | UK, F)

and

like(L | US) = prob(M) like(L | US, M) + prob(F) like(L | US, F)

Based on my posterior odds from Part One:

prob(M) = 90%
prob(F) = 10%

Assuming that the number of people named Leslie is about the same in the US and the UK, and guessing that "Leslie" is gender neutral in the UK:

like(L | UK, M) = 50%
like(L | UK, F) = 50%

And guessing that "Leslie" is primarily female in the US:

like(L | US, M) = 10%
like(L | US, F) = 90%

Taken together, like(L | UK) = 50% and like(L | US) = 18%, so the likelihood ratio is about 3:1, which means that knowing L and suspecting M is evidence in favor of UK.  But not very strong evidence.
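Here is a rough sketch of the Part Two arithmetic in Python, using only the guesses stated above. The function name `total_likelihood` is hypothetical, and the prior is read as 1:10 in favor of UK (that is, about 10:1 against), which is how "one author in 10" is interpreted here.

```python
def total_likelihood(p_male, like_given_male, like_given_female):
    """Law of total probability over the author's sex."""
    p_female = 1 - p_male
    return p_male * like_given_male + p_female * like_given_female


p_male = 0.9  # posterior from Part One, rounded as in the text

# Guesses from the text: gender-neutral in the UK, primarily female in the US.
like_uk = total_likelihood(p_male, like_given_male=0.5, like_given_female=0.5)
like_us = total_likelihood(p_male, like_given_male=0.1, like_given_female=0.9)

likelihood_ratio = like_uk / like_us

# Assumption: "one author in 10" is treated as prior odds of 1:10 for UK.
prior_odds_uk = 1 / 10
posterior_odds_uk = prior_odds_uk * likelihood_ratio

print(round(like_uk, 2), round(like_us, 2))
print(round(likelihood_ratio, 2))    # close to 3:1
print(round(posterior_odds_uk, 2))   # still less than 1, so still against UK
```

Under these numbers the name roughly triples the odds of UK, but because the prior was about 10:1 against, the posterior odds are still around 3:10.  So the evidence points toward UK without making it the more likely hypothesis.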

Summary

It looks like my IBUS is functioning correctly or, at least, my analysis can be justified provided that you accept the assumptions and guesses that went into it.  Since any of those numbers could easily be off by a factor of two, or more, don't take the results too seriously.
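To get a feel for what "off by a factor of two" does to the result, here is a small sensitivity sketch for Part One.  It is only an illustration: it scales the prior odds and the likelihood ratio by 0.5, 1, and 2 independently, which is one arbitrary way of modeling the uncertainty, not a claim about how wrong the guesses actually are.

```python
from itertools import product

# Guesses from Part One.
prior_odds_male = 0.5     # 1:2 against
likelihood_ratio = 20.0   # like(B | M) / like(B | F)

# Assumption: each guess might be half or double the stated value.
factors = [0.5, 1.0, 2.0]

posterior_odds = [
    (prior_odds_male * f_prior) * (likelihood_ratio * f_like)
    for f_prior, f_like in product(factors, factors)
]

lowest, highest = min(posterior_odds), max(posterior_odds)
print(f"posterior odds range from {lowest}:1 to {highest}:1")
```

Even at the low end of this sweep the posterior odds favor male (2.5:1), so the qualitative conclusion of Part One survives factor-of-two errors in both inputs, while the exact odds clearly do not.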

1. By the way, another computer scientist Leslie is Leslie Lamport.

1. Indeed. He is not British, but he is old :)

2. Entertaining analysis, Allen! Cheers!

2. Allen

Very nice writeup. As Bayes and Turing have articulated, intuition, indeed, turns out to be deep calculation! True, earlier philosophers suspected and explained this, but Bayesian theory illuminates it so well by allowing us to simulate it mathematically.

I am relatively new to Bayesian thinking and two weeks ago, I had a very similar epiphany.

I drive towards a main intersection and see that the traffic light is green but the guys in front are stopped.

Then I notice a cop car with blinkers on going fast across, so I think, 'Aha! A cop chasing someone to give a ticket.'

Then I notice yet another cop car chasing behind. I am a bit puzzled, but think, 'Aha, seems to be a serious case of someone running away and now a team of cops in pursuit.'

Then I notice a few more cop cars, all with blinkers, so my hypothesis seems absurd. Now I think 'Maybe there is fire or some other incident and all the cops are going there quick'.

Then I notice that the cop cars are from various local cities, not just from the city where I live, but neighboring cities too. This evidence forces me to question my hypothesis: it seems impossible that all these cops from other cities were summoned so quickly to my city to proceed from here to the incident venue!

So, puzzled, I think there is some event that all the cops are going to, but why are the blinkers on?

I call my wife and tell her, 'Oh, there are so many cop cars in a procession,' and as I am saying it I realize it probably is a procession. She immediately adds, 'Oh, I have not seen something like that before, but I think they are all going in respect of a fallen officer, and as a mark of respect the blinkers are all on.'

This event showed me, like the NewtonApple, how beautifully our mind uses evidence as soon as it arrives to form a hypothesis about what is happening, yet is very humble (and prudent) about giving up its predispositions and searching for a more appropriate explanation as more and more evidence is revealed. So not only are beliefs originally learned (induced) bit by bit as evidence comes in, they are also applied to deduce what is happening in a similar way! In other words, the inductive and deductive processes of learning and applying are not two different processes at all, which explains how we can explain a novel happening as it happens for the very first time and be so quick to jettison a strong belief when it fails to explain even one event. We just accept that 'that is plausible, but not what is happening HERE.'

I consider it my 'NewtonApple' moment, not to imply that I am as insightful as Newton but the moment was a NewtonApple moment - where my mind could use an everyday(!) event to point me to a theoretical explanation!