tag:blogger.com,1999:blog-6894866515532737257.comments2016-10-26T06:10:15.186-07:00Probably Overthinking ItAllen Downeyhttps://plus.google.com/111942648516576371054noreply@blogger.comBlogger676125tag:blogger.com,1999:blog-6894866515532737257.post-6883500172906903422016-10-14T15:04:15.174-07:002016-10-14T15:04:15.174-07:00It's hard to want to get married when I saw so...It's hard to want to get married when I saw so many people in the previous generation or two got divorced. As a man, the idea of losing half my stuff sucks. On top of that if there are kids involved I would only get to see them on Tuesdays and every other weekend, and I'd also get the pleasure of watching some other guy raise them. This makes me very cautious about anyone I'd consider marrying. Greg Bhttp://www.blogger.com/profile/08033946748326670909noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-72125112234101430192016-10-14T13:26:45.120-07:002016-10-14T13:26:45.120-07:00Excellent data viz! As a married 32yo millennial, ...Excellent data viz! As a married 32yo millennial, I can see the shift in my friend group, many of whom have never been married. My wedding was in 2014 when I was 30.Joshua Oliveirahttp://www.blogger.com/profile/04949363293575817646noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-36946818009232125082016-10-13T06:06:30.186-07:002016-10-13T06:06:30.186-07:00Hi. I just added a worksheet that shows how to so...Hi. I just added a worksheet that shows how to solve this problem using Bayes's Theorem.Allen Downeyhttp://www.blogger.com/profile/01633071333405221858noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-56656754707304130282016-10-13T05:50:47.838-07:002016-10-13T05:50:47.838-07:00Hi Emile, I'm glad it is making sense.
But I...Hi Emile, I'm glad it is making sense.<br /><br />But I want to clarify the point of my article. I am not saying that there are two answers to this question, one Bayesian and one frequentist, and the Bayesian one is right.<br /><br />I am saying that there is only one answer to this question, and it is neither Bayesian nor frequentist. It is just a consequence of the laws of probability.<br /><br />The issue Dimiter raised is called the reference class problem: https://en.wikipedia.org/wiki/Reference_class_problem<br /><br />And while the reference class problem is relevant to the problem, it is a general problem for all of probability, and not specifically Bayesian or frequentist, either.Allen Downeyhttp://www.blogger.com/profile/01633071333405221858noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-25120124062450485922016-10-12T18:28:14.111-07:002016-10-12T18:28:14.111-07:00I deleted my initial comment because I was probabl...I deleted my initial comment because I was probably overthinking it. I grappled with your explanation and the other posts, but now understand your statement that only a Bayesian approach can lead to the right answer. <br /><br />We want Prob(raining|YYY).<br />This is equal to Prob(raining and YYY)/Prob(YYY). We don't know Prob(raining and YYY).<br /><br />However Prob(raining and YYY) also equals Prob(raining)*Prob(YYY|raining)<br /><br />So we need Prob(raining)but don't know this. So as already mentioned by Dimiter, there are infinitely many answers in the absence of information. <br />However, we do know Prob(YYY|raining)=8/27<br /><br />Furthermore, <br />Prob(YYY) = Prob(YYY|raining)*Prob(raining) + Prob(YYY|not raining)* Prob (not raining)<br /><br />We also know Prob(YYY|not raining) = 1/27. <br /><br />The frequentist solution is really a relative odds ratio: Prob(YYY|raining)/{ Prob(YYY|raining) + Prob(YYY|not raining)}. It is not really a probability at all. 
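The calculation laid out above can be sketched in a few lines of Python. This is only an illustration, not code from the post: the function names `posterior_rain` and `relative_likelihood` are made up here, and the sketch assumes, as in the puzzle, that each friend independently tells the truth with probability 2/3. The prior is left as a parameter because the problem statement does not supply one.

```python
from fractions import Fraction


def posterior_rain(prior, n_yes=3, p_truth=Fraction(2, 3)):
    """Prob(raining | n_yes friends all say yes), via Bayes's theorem.

    `prior` is Prob(raining) before asking anyone; it is an assumption
    supplied by the caller, not something the puzzle provides.
    """
    prior = Fraction(prior)
    like_rain = p_truth ** n_yes         # Prob(YYY | raining) = 8/27 for 3 friends
    like_dry = (1 - p_truth) ** n_yes    # Prob(YYY | not raining) = 1/27 for 3 friends
    # Law of total probability: Prob(YYY)
    total = like_rain * prior + like_dry * (1 - prior)
    return like_rain * prior / total


def relative_likelihood(n_yes=3, p_truth=Fraction(2, 3)):
    """Prob(YYY|raining) / (Prob(YYY|raining) + Prob(YYY|not raining)).

    This ignores the prior entirely.
    """
    like_rain = p_truth ** n_yes
    like_dry = (1 - p_truth) ** n_yes
    return like_rain / (like_rain + like_dry)


print(posterior_rain(Fraction(1, 2)))   # 8/9
print(relative_likelihood())            # 8/9
print(posterior_rain(Fraction(1, 10)))  # 8/17
```

With a prior of 1/2 the two functions agree, which is the special case in which the prior-free ratio coincides with the posterior probability; with any other prior they differ.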
<br /><br />Now when Prob(raining)=1/2 (the best guess if you have no information) then the frequentist relative odds will equal the Bayesian probability. <br /><br />If instead of Seattle the trip was to Los Angeles, and the Prob(raining)=0.01, then having three friends saying "Yes it's raining" will result in Prob(raining|YYY) =~0.075.<br /><br />However if you asked 14 of your mostly truthful friends, and all 14 said, "yes dude, it's raining", then the conditional probability increases to ~0.994 (frequentist relative odds = ~0.99994). <br /><br />This makes sense.<br /> <br />Thanks for posting this.<br /><br />Emile Elefteriadishttp://www.blogger.com/profile/06682927722971051966noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-71304155594207886632016-10-11T23:24:50.902-07:002016-10-11T23:24:50.902-07:00Hello Allen, I'm having difficulty solving thi...Hello Allen, I'm having difficulty solving this problem using Bayes Theorem but have no idea where I'm going wrong. Could you please shed some light?<br /><br />We have:<br /><br />P(rain|YYY) = P(YYY|rain)*P(rain) / P(YYY)<br /><br />P(YYY|rain) = 8/27 ?<br />P(rain) = 0.1<br />P(YYY) = 8/27 ?<br /><br />Are the previous values correct? If they are then P(rain|YYY) = 0.1*8/27 / 8/27 = 0.1<br /><br />This would imply that P(rain|YYY) is not dependent on P(YYY) at all. What am I doing wrong?Unknownhttp://www.blogger.com/profile/15299634454831159125noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-88742291334172286062016-10-11T13:20:44.579-07:002016-10-11T13:20:44.579-07:00The issue with Lewis' argument, and Elga's...The issue with Lewis' argument, and Elga's section 3, is that "new information" is not defined. Or more accurately, what allows a Bayesian update, usually called "new information" without definition, is never described in a robust manner.<br /><br />IMO, "new" is the wrong concept for this; "changed" is better. 
It's just that in the vast majority of examples, "changed" means "has something added," which is "new."<br /><br />A partition of a discrete sample space is a set of disjoint events whose probabilities sum to 1. So if you roll a die, {1,2,3,4,5,6}, {even,odd}, and {prime, non-prime} are all partitions. But {1,2,3} is not because the probabilities do not sum to 1, and {4,5,6,odd} is not because the events are not disjoint.<br /><br />Events that have zero probability can be included in a sample space, so {1,2,3,4,5,6,7} is also a partition. To express the idea I want, I need to introduce a "minimal partition." That's one that includes no zero-probability events.<br /><br />What allows an update is any information state change that alters what is, or is not, a minimal partition. On Sunday, {Heads&Monday, Tails&Monday, Heads&Tuesday, Tails&Tuesday} is a minimal partition for potential experiment states in the future. Each - including Heads&Tuesday - has a 1/4 probability to represent the experiment's state at any moment in the next two days. But when SB is wakened, Heads&Tuesday is disqualified as a possible game state. This creates a change in the minimal partition, and so allows for an update.<br /><br />He said it differently, and I don't agree with how he said it, but I think this is what Elga meant in his Section 3. 
The idea is correct.<br /><br />To update a probability based on changed information, you divide the previous probability of the event in question by the sum of previous probabilities of the new minimal partition:<br /><br />Pr(Heads|Awake) = (1/4)/(1/4+1/4+1/4) = 1/3.JeffJohttp://www.blogger.com/profile/09110352332876400907noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-52070592745979605252016-10-10T08:18:47.642-07:002016-10-10T08:18:47.642-07:00This comment has been removed by the author.Emile Elefteriadishttp://www.blogger.com/profile/06682927722971051966noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-76766455963870996912016-10-04T16:03:50.474-07:002016-10-04T16:03:50.474-07:00Frankly, I have long thought of the bayesian appro...Frankly, I have long thought of the bayesian approach as one that is best digested as a cup of morning joe. Frankly, the frequentists are more like a protein shake with chopped up celery and spinach.<br /><br />However, non-parametric methods are the best. I love basic arithmetic. <br /><br />Sir Thomas was a priest after all..Thomas Baymanhttp://www.blogger.com/profile/08031681715056987599noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-80343390080598779312016-10-03T04:54:25.222-07:002016-10-03T04:54:25.222-07:00Great question, thanks! Yes, the chi2 would be co...Great question, thanks! Yes, the chi2 would be correct (and just as easy to implement). I think my normal approximation is pretty close, but when I get a chance, I will run both and see how close. 
If I did this again, I would use the chi2 distribution, since the normal approximation doesn't provide any advantage for this problem.Allen Downeyhttp://www.blogger.com/profile/01633071333405221858noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-47825582850333937752016-10-03T00:51:10.618-07:002016-10-03T00:51:10.618-07:00In your book Think Bayes, you use the same example...In your book Think Bayes, you use the same example to illustrate Approximate Bayesian Computation. And you use `scipy.stats.norm.logpdf(s, sigma, sigma/math.sqrt(2*(n-1)))` for the likelihood of the sample standard deviation under the (mu, sigma) hypothesis. And I wonder, wouldn't it be more appropriate to use the sampling distribution of the sample variance for that likelihood instead? Something like `loglike += scipy.stats.chi2.logpdf((s**2*(n-1))/(sigma**2),df=(n-1))`.<br /><br />I posted this as a question on Cross Validated: http://stats.stackexchange.com/questions/238046/what-is-the-likelihood-of-drawing-a-sample-with-standard-deviation-s-from-a-noecerulmhttp://www.blogger.com/profile/02005618894883008001noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-60050865019228855542016-09-28T00:09:19.649-07:002016-09-28T00:09:19.649-07:00Allen: I think the probability question, as stated...Allen: I think the probability question, as stated, has not one and not two, but infinitely many answers. Why?<br />1) If we stick to the problem as stated, there is no information whatsoever that we can use to pick a prior probability of rain. So any prior must be as good as any other. By implication, the posterior probability of rain can take any value between 0 and 1 (including these). So in the absence of any further information/ assumptions, the correct answer is probability of rain = [0;1]. 
We can say that the probability has significantly increased on hearing the signal from the friends, but we cannot say by how much, since we do not know the prior.<br />2) If we assume that if the probability was known with certainty in advance there would be no need for asking, we can exclude the boundaries of the interval, so p = (0, 1)<br />2b) If, in addition, we assume that probability is measured with finite precision (say, as whole percentages), the minimum prior probability becomes 0.01, so the interval for the posterior becomes p = (0.07, 1).<br />3) If we make the alternative assumption that the traveler would only ask for the solicited information if it would have a chance to change the decision (interpreted as moving the posterior above or below 0.5), the (posterior) probability becomes p = (0.5, 1). The friends send a signal that is as strong as it gets, so in the most adversarial prior it must just about move the probability above 0.5 so that it can satisfy the assumption of having a chance to influence the decision.<br />4) Now, if we want to make even further assumptions about what is known and what can be known in the framework of the problem, we can find reasons to pin the prior and, by implication the posterior, to any number within this range but a) this is not obviously warranted by the setup of the problem, and b) we enter the decision-theoretic problem mentioned in my previous post.<br />In sum, there might be one correct way to update probabilities based on prior information and new data. But in the absence of prior information and assumptions, any posterior probability must be correct.Dimiter Toshkovhttp://www.blogger.com/profile/09098718685262606935noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-71351507574032455592016-09-27T17:32:47.682-07:002016-09-27T17:32:47.682-07:00And I didn't mean to sound snarky towards you....And I didn't mean to sound snarky towards you. 
Thanks for your comments!Allen Downeyhttp://www.blogger.com/profile/01633071333405221858noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-19752113461864377832016-09-27T17:18:23.267-07:002016-09-27T17:18:23.267-07:00[Allen, I didn't mean for my comment to sound ...[Allen, I didn't mean for my comment to sound snarky towards you! My cheekiness was directed at the hypothetical interviewer asking this question :) ]<br /><br />Indeed, the question ends by asking for a probability. I totally agree with your solution to treating this as a toy probability puzzle. I agree there's no Bayes-vs-frequentist difference there. And I agree it's very important to distinguish Bayes' theorem from Bayesian inference.<br /><br />On the other hand, the question *starts* with "You want to know if you should bring an umbrella."<br />Let's take this seriously as an interview question, meant to help the interviewer decide which candidates would bring the most value to the company.<br /><br />What are they really getting at?<br />If they were asking:<br />"We want to know if our company should take action U. It only makes sense to invest in performing U if there's at least 50% chance that R is true. We can (just barely) afford to run 3 expensive tests. Each test independently has 2/3 chance of correctly identifying whether R is true. If a sensible prior on R is 10%, and all 3 tests come back True, what will be the probability that R is indeed true?"<br /><br />...then who would you rather hire?<br />Candidate A, who is content to stop after getting an answer of 47%?<br />Or Candidate B, who goes on to say:<br />"Look, even if all 3 tests agree in claiming that R is true, our estimated probability that R is true will *still* be under 50%. In all other cases, it'll be even lower. You're wasting money by running these 3 tests. 
If we can't afford more tests, let's just skip them and spend that money somewhere useful."<br /><br />The spirit of frequentism is to step back from the given data and understand the operating characteristics of your statistical procedures---not just the analyses but the data-collection (design) too. It's a useful habit of mind, whether your final inferences/analyses end up Frequentist or Bayesian. And yep, sometimes that means refusing to answer a particular question, when that question isn't what's actually needed.Jerzy Wieczorekhttp://www.blogger.com/profile/03611849167717252118noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-14353809901579712992016-09-27T12:57:40.183-07:002016-09-27T12:57:40.183-07:00Thanks, Dimiter. You make some great points and I...Thanks, Dimiter. You make some great points and I won't address them all, but I want to clarify one. The goal of my article is not to demonstrate the "added value of a Bayesian vs frequentist answer". The point I am trying to make is that the probability question, as stated, does not have two answers, one Bayesian and one frequentist answer. It has one answer that can be computed without any commitment to a Bayesian or frequentist interpretation of probability, and without any commitment to Bayesian or frequentist inference.<br /><br />It does, as you point out, require the choice of a reference class, but that is a general difficulty with many probability problems; it is not a special difficulty for Bayesianism or frequentism.Allen Downeyhttp://www.blogger.com/profile/01633071333405221858noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-11569655580038207642016-09-27T12:48:10.263-07:002016-09-27T12:48:10.263-07:00two long related comments: 1) Why not use the prio...two long related comments: 1) Why not use the priors for the particular days you will be visiting (I guess the chance of rain varies significantly over the year)? 
But then why not use as a prior the conditional probability of rain in Seattle given the atmospheric conditions at the moment? Or even, why not commission specific research to inform your prior: why stop at consulting the Western Regional Climate Center or the weather forecast? You can say - not worth it for the problem at hand, but this requires a separate analysis of how much effort it is worth spending on establishing a good prior for this problem. In our case, the answer is probably 'close to zero', as the costs of taking an umbrella are negligible. But then for somebody with zero knowledge about the weather in Seattle a flat prior of rain/no rain, or equivalently a frequentist analysis, would seem as justified as any other. 2) Which brings me to the second point. The problem is introduced as a decision-theoretic one (bring an umbrella or not) but then it asks for a probability that, however defined and computed, is not sufficient to answer the decision-theoretic motivating problem. And it seems to me that the added value of a Bayesian vs. a frequentist answer to the probability question cannot be demonstrated outside of a decision-theoretic setup in which the costs of establishing a prior are compared to the benefits of increased precision of the answer. (And you cannot just say, oh but everybody knows the prior chance of rain in Seattle is 0.5 or 0.1 or whatever, as this info is not provided in the set-up of the problem). Dimiter Toshkovhttp://www.blogger.com/profile/09098718685262606935noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-28998916468175189542016-09-27T11:48:02.294-07:002016-09-27T11:48:02.294-07:00Thanks, Jerzy. The question asks for a probabilit...Thanks, Jerzy. 
The question asks for a probability; I think your analysis answers a different question.<br /><br />But refusing to answer the question is certainly in the spirit of frequentism.Allen Downeyhttp://www.blogger.com/profile/01633071333405221858noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-79765744841293372322016-09-27T11:19:18.817-07:002016-09-27T11:19:18.817-07:00Perhaps a more Frequentist-spirited answer would b...Perhaps a more Frequentist-spirited answer would be to discuss the study design:<br /><br />Your prior prob. of rain is under 50%, and in fact it's so low (at 10%) that *nothing* your 3 friends say could convince you the (posterior) prob. of rain is over 50%, even when they all agree it is raining.<br /><br />In other words, your study has no power to change your mind! (from the prior decision that it's probably not raining.)<br /><br />So why did you hassle your friends by asking them in the first place? Maybe that's why they lie to you 1/3 of the time :)Jerzy Wieczorekhttp://www.blogger.com/profile/03611849167717252118noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-36817919654265621822016-09-27T09:14:51.665-07:002016-09-27T09:14:51.665-07:00Hi Russ. A helpful reader submitted the following...Hi Russ. A helpful reader submitted the following explanation, which I accidentally rejected instead of publishing. 
So, with apologies to the helpful reader:<br /><br />my test blog has left a new comment on your post "Bayes's Theorem is not optional": <br /><br />If you look at the answer linked to in the post: https://www.glassdoor.com/Interview/You-re-about-to-get-on-a-plane-to-Seattle-You-want-to-know-if-you-should-bring-an-umbrella-You-call-3-random-friends-of-y-QTN_519262.htm<br /><br />and substitute 10% chance of rain for 25%, you should get the answer listed here:<br />0.1*(8/27) / ( 0.1*8/27 + 0.9*1/27 )<br />(8/270) / (8/270 + 9/270)<br />8/17<br />47.06%<br /><br />In Downey's version, he uses both odds and probabilities, which makes the calculations in this case easier but, maybe, harder to follow. <br /><br />Here's the top answer from the linked post:<br /><br />Bayesian stats: you should estimate the prior probability that it's raining on any given day in Seattle. If you mention this or ask, the interviewer will tell you to use 25%. Then it's straight-forward:<br /><br />P(raining | Yes,Yes,Yes) = Prior(raining) * P(Yes,Yes,Yes | raining) / P(Yes, Yes, Yes)<br /><br />P(Yes,Yes,Yes) = P(raining) * P(Yes,Yes,Yes | raining) + P(not-raining) * P(Yes,Yes,Yes | not-raining) = 0.25*(2/3)^3 + 0.75*(1/3)^3 = 0.25*(8/27) + 0.75*(1/27)<br /><br />P(raining | Yes,Yes,Yes) = 0.25*(8/27) / ( 0.25*8/27 + 0.75*1/27 )<br /><br />**Bonus points if you notice that you don't need a calculator since all the 27's cancel out and you can multiply top and bottom by 4.<br /><br />P(raining | Yes,Yes,Yes) = 8 / ( 8 + 3 ) = 8/11<br /><br />But honestly, you're going to Seattle, so the answer should always be: "YES, I'm bringing an umbrella!"<br />(yeah yeah, unless your friends mess with you ALL the time ;)<br /><br />Interview Candidate on Sep 12, 2013 Allen Downeyhttp://www.blogger.com/profile/01633071333405221858noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-36036828069056310512016-09-27T02:24:01.914-07:002016-09-27T02:24:01.914-07:00Thanks a lot for sharing 
this!!Thanks a lot for sharing this!!Biagio Chiricohttp://www.blogger.com/profile/18044407043088889045noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-32960971274446582392016-09-26T22:15:26.763-07:002016-09-26T22:15:26.763-07:00I was hoping to understand how to use Bayesian rea...I was hoping to understand how to use Bayesian reasoning, but I was completely lost by the Bayesian argument. Would you mind elaborating the reasoning behind the segment of the post that starts at "A base rate of 10 ... " and ends at "Probability(8 Odds(p))". (I don't even understand how to read that final bit of notation!) Thanks.Russ Abbotthttp://www.blogger.com/profile/15431389045571531450noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-61692278484356774412016-09-19T18:04:57.020-07:002016-09-19T18:04:57.020-07:00I thought this is how you get tenure...I thought this is how you get tenure...Jason Moorehttp://www.blogger.com/profile/15362357639624306439noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-22300994389967245382016-09-19T06:11:03.899-07:002016-09-19T06:11:03.899-07:00I agree with Creosote. Halfers and thirders can no...I agree with Creosote. Halfers and thirders can not be separated by any decision problem - if they could, the puzzle would indeed be trivial. <br /><br />Both halfers and thirders would bet on 1/3 if asked to make one bet per wake-up, and on 1/2 if asked to make one bet per experiment (although their argumentation might be slightly different.)<br /><br />Further, if the experiment is modified so that we have different people (or Sleeping Beauty-clones) waking up at the possible wake-up events, everyone agrees on the 1/3 answer. <br /><br />The halfer position is an ordinary proposition about an ordinary probability. 
It is consistent with the facts and leads to correct decisions.Mikkel Schmidthttp://www.blogger.com/profile/06840892229326863085noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-87872123183485905252016-09-19T05:54:20.567-07:002016-09-19T05:54:20.567-07:00Regarding the red dice problem, a halfer would cla...Regarding the red dice problem, a halfer would claim (correctly I think) that once the experimental procedure has been explained, one should believe there is a 2/3 chance the last die is mostly red.<br /><br />When you then say that the actual outcome is red (an event which happens with probability one) that provides no new information, and the belief remains the same.Mikkel Schmidthttp://www.blogger.com/profile/06840892229326863085noreply@blogger.comtag:blogger.com,1999:blog-6894866515532737257.post-73964928618752496382016-09-18T16:11:20.168-07:002016-09-18T16:11:20.168-07:00Hi Jason, Good to hear from you, and thanks for t...Hi Jason, Good to hear from you, and thanks for this comment. But maybe you should wait until you have tenure before you say things like this.<br /><br />Just kidding! (I hope.)Allen Downeyhttp://www.blogger.com/profile/01633071333405221858noreply@blogger.com