Comments feed for "Probably Overthinking It" by Allen Downey (last updated 2016-11-30).

Allen Downey (2016-11-30):
Yes, this "natural frequency" way of solving problems like this is excellent.

Unknown (2016-11-30):
The simplest way to answer this question is to assume a number for the total population.
Assume that 400 is the combined population of all three states. Then state1, state2, and state3 have populations 160, 100, and 140 respectively. From the data provided, the numbers of people supporting party1 in state1, state2, and state3 are 0.5*160, 0.6*100, and 0.35*140 (= 80, 60, 49).
From this we can evaluate the probability that a party1 supporter is from state2: 60/(80+60+49) = 0.317 approx.

张亮 (2016-11-29):
^_^

Allen Downey (2016-11-29):
At least some of the forecasters had the uncertainty of the future included in their models, so they generated two predictions: one if the election were held today, and another that took into account the remaining time until the election.
The second was generally closer to 1/2 because of the greater uncertainty.

But I don't think it's right to say that if the level of uncertainty is high, the probability is necessarily 1/2.

To put that differently: a year ago I would not have bet on Trump with even odds.

Allen Downey (2016-11-29):
I think your 41% differs from my 30% because you actually computed your answer, while I just made mine up as an example. I'm actually mildly surprised that I was as close as I was.

Stan (2016-11-29):
What do you think about the criticism which insists that the probability must be 1/2 until the election approaches, since polls are so stochastic that we can never predict the future in the long term?

Unknown (2016-11-26):
This comment has been removed by the author.

vonjd (2016-11-23):
I think the biggest problem was the bogus precision: "71.4%". This implies that the model could give a precision of one part per thousand, which is of course
bs.

Adam Levin (2016-11-16):
Always enjoy your writing Allen, thank you.
For me, this whole post screamed 'Probably Overthinking It'. It seems like the problem fits neatly into the use case of a discriminative statistical model, such as logistic regression. I think anyone who claims that the results of such a model are unfair or not useful is probably overthinking it.

Creosote (2016-11-02):
TheRingshifter> Does the "amnesia" element really change it that much?

The amnesia is necessary in order to argue about Beauty's credence within the experiment itself, i.e. to prevent Tuesday Beauty from saying "oh, hang on, you woke me yesterday ... this must be a Tails time-line".

But should Beauty feel any differently about the coin flip within the experiment than she did prior to it? That's a key sticking point of the Sleeping Beauty problem.

TheRingshifter (2016-11-01):
Am I wrong saying this...
Probabilistically, the question is quite simple. Every time the coin is flipped, it IS 1/2 odds (it's a coin). But every time she is WOKEN UP it's 2/3 tails and 1/3 heads. Is it really that different from just saying something like, "I'm going to slap you twice on a tails, and once on a heads. How likely is it, at each point I've slapped you, that you got a heads or a tails?" Does the "amnesia" element really change it that much? It's the same here: if you are betting on the coin flip, it's still 1/2, but if you are being slapped, then it's more likely that it's because you got a tails, since 2/3 of the slaps will end up being attributable to tails and only 1/3 of them to heads.

So to me... it seems like although the "wager" problem is interesting, you are indeed "overthinking" it. Surely the thing that makes the difference is that she is possibly being asked twice about what the coin landed on when she wakes up, but when the bet is resolved, she is only being asked once.

Simply, if you do this, say, 6 times, and get 3 heads and 3 tails, and she bets on heads, she will earn money by being correct about the wager, but will also be MOSTLY correct in saying tails is more likely, because she has been asked about the coin NINE times (3 on heads, 6 on tails) and been correct SIX times.

Anonymous (2016-10-31):
Prof Downey,
Thanks for the article. How did you get these values: P(H|FF) = 0.26 after the second, and P(H|FFF) = 0.07 after the third?

Thanks

Greg B (2016-10-14):
It's hard to want to get married when I saw so many people in the previous generation or two get divorced. As a man, the idea of losing half my stuff sucks. On top of that, if there are kids involved I would only get to see them on Tuesdays and every other weekend, and I'd also get the pleasure of watching some other guy raise them. This makes me very cautious about anyone I'd consider marrying.

Joshua Oliveira (2016-10-14):
Excellent data viz! As a married 32-year-old millennial, I can see the shift in my friend group, many of whom have never been married. My wedding was in 2014, when I was 30.

Allen Downey (2016-10-13):
Hi. I just added a worksheet that shows how to solve this problem using Bayes's Theorem.

Allen Downey (2016-10-13):
Hi Emile, I'm glad it is making sense.
But I want to clarify the point of my article. I am not saying that there are two answers to this question, one Bayesian and one frequentist, and that the Bayesian one is right.

I am saying that there is only one answer to this question, and it is neither Bayesian nor frequentist. It is just a consequence of the laws of probability.

The issue Dimiter raised is called the reference class problem: https://en.wikipedia.org/wiki/Reference_class_problem

And while the reference class problem is relevant to this problem, it is a general problem for all of probability, not specifically Bayesian or frequentist either.

Emile Elefteriadis (2016-10-12):
I deleted my initial comment because I was probably overthinking it. I grappled with your explanation and the other posts, but now understand your statement that only a Bayesian approach can lead to the right answer.

We want Prob(raining|YYY). This is equal to Prob(raining and YYY)/Prob(YYY). We don't know Prob(raining and YYY).

However, Prob(raining and YYY) also equals Prob(raining)*Prob(YYY|raining).

So we need Prob(raining), but we don't know it. So, as already mentioned by Dimiter, there are infinitely many answers in the absence of information. However, we do know Prob(YYY|raining) = 8/27.

Furthermore,
Prob(YYY) = Prob(YYY|raining)*Prob(raining) + Prob(YYY|not raining)*Prob(not raining)

We also know Prob(YYY|not raining) = 1/27.

The frequentist solution is really a relative odds ratio: Prob(YYY|raining)/{Prob(YYY|raining) + Prob(YYY|not raining)}. It is not really a probability at all.
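This arithmetic is easy to sanity-check with a few lines of Python (a sketch: the 2/3 chance that each friend tells the truth comes from the original puzzle, while the function name and the particular priors below are illustrative choices, not anything from the post):

```python
from fractions import Fraction

def posterior_rain(prior, n_yes):
    """P(raining | n_yes independent friends all say 'yes'),
    where each friend tells the truth with probability 2/3."""
    like_rain = Fraction(2, 3) ** n_yes   # P(all say yes | raining)
    like_dry  = Fraction(1, 3) ** n_yes   # P(all say yes | not raining)
    p = Fraction(prior)
    # Bayes's theorem with the total probability of YYY in the denominator
    return like_rain * p / (like_rain * p + like_dry * (1 - p))

# With a flat prior of 1/2, the posterior equals the relative odds 8/9.
print(float(posterior_rain(Fraction(1, 2), 3)))   # 8/9 ~= 0.889
# With a prior of 0.1, the posterior is 8/17 ~= 0.471, not 8/9.
print(float(posterior_rain(Fraction(1, 10), 3)))
```

The same function reproduces the low-prior cases discussed in this thread: with a 0.01 prior, three yeses only lift the posterior to about 0.075, while fourteen yeses lift it above 0.99.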
Now when Prob(raining) = 1/2 (the best guess if you have no information), the frequentist relative odds will equal the Bayesian probability.

If instead of Seattle the trip was to Los Angeles, with Prob(raining) = 0.01, then having three friends say "yes, it's raining" results in Prob(raining|YYY) =~ 0.075.

However, if you asked 14 of your mostly truthful friends, and all 14 said "yes dude, it's raining", then the conditional probability increases to ~0.994 (frequentist relative odds =~ 0.99994).

This makes sense.

Thanks for posting this.

Unknown (2016-10-11):
Hello Allen, I'm having difficulty solving this problem using Bayes's Theorem but have no idea where I'm going wrong. Could you please shed some light?

We have:

P(rain|YYY) = P(YYY|rain)*P(rain) / P(YYY)

P(YYY|rain) = 8/27?
P(rain) = 0.1
P(YYY) = 8/27?

Are the previous values correct? If they are, then P(rain|YYY) = 0.1*(8/27)/(8/27) = 0.1.

This would imply that P(rain|YYY) does not depend on P(YYY) at all. What am I doing wrong?

JeffJo (2016-10-11):
The issue with Lewis's argument, and Elga's Section 3, is that "new information" is not defined. Or, more accurately, what allows a Bayesian update, usually called "new information" without definition, is never described in a robust manner.

IMO, "new" is the wrong concept for this; "changed" is better.
It's just that in the vast majority of examples, "changed" means "has something added," which is "new."

A partition of a discrete sample space is a set of disjoint events whose probabilities sum to 1. So if you roll a die, {1,2,3,4,5,6}, {even, odd}, and {prime, non-prime} are all partitions. But {1,2,3} is not, because the probabilities do not sum to 1; and {4,5,6,odd} is not, because the events are not disjoint.

Events that have zero probability can be included in a sample space, so {1,2,3,4,5,6,7} is also a partition. To express the idea I want, I need to introduce a "minimal partition": one that includes no zero-probability events.

What allows an update is any information state change that alters what is, or is not, a minimal partition. On Sunday, {Heads&Monday, Tails&Monday, Heads&Tuesday, Tails&Tuesday} is a minimal partition for potential experiment states in the future. Each state, including Heads&Tuesday, has a 1/4 probability of representing the experiment's state at any moment in the next two days. But when SB is wakened, Heads&Tuesday is disqualified as a possible game state. This creates a change in the minimal partition, and so allows for an update.

He said it differently, and I don't agree with how he said it, but I think this is what Elga meant in his Section 3.
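The minimal-partition bookkeeping described above is mechanical enough to check by enumeration (a sketch of the idea as described in this comment; the state labels and variable names are an editor's illustration, not JeffJo's code):

```python
# Four equally likely (coin, day) states before the experiment starts.
prior = {("Heads", "Mon"): 0.25, ("Heads", "Tue"): 0.25,
         ("Tails", "Mon"): 0.25, ("Tails", "Tue"): 0.25}

# On waking, Heads&Tuesday is disqualified: Beauty sleeps through it.
awake = {s: p for s, p in prior.items() if s != ("Heads", "Tue")}

# Renormalize over the new minimal partition.
total = sum(awake.values())
p_heads = sum(p for (coin, day), p in awake.items() if coin == "Heads") / total
print(p_heads)  # 0.25 / 0.75 = 1/3
```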
The idea is correct.

To update a probability based on changed information, you divide the previous probability of the event in question by the sum of the previous probabilities of the new minimal partition:

Pr(Heads|Awake) = (1/4)/(1/4 + 1/4 + 1/4) = 1/3.

Emile Elefteriadis (2016-10-10):
This comment has been removed by the author.

Thomas Bayman (2016-10-04):
Frankly, I have long thought of the Bayesian approach as one that is best digested as a cup of morning joe. The frequentists, frankly, are more like a protein shake with chopped-up celery and spinach.

However, non-parametric methods are the best. I love basic arithmetic.

Sir Thomas was a priest after all..

Allen Downey (2016-10-03):
Great question, thanks! Yes, the chi2 would be correct (and just as easy to implement). I think my normal approximation is pretty close, but when I get a chance, I will run both and see how close.
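For anyone who wants to run that comparison themselves, here is a rough standard-library sketch (the sample size n, observed s, and the sigma grid are made-up illustrations; the first likelihood is the normal approximation to the sample standard deviation, the second evaluates the chi-square density at (n-1)s²/σ²):

```python
import math

def norm_logpdf(x, mu, sd):
    # log density of a Normal(mu, sd) at x
    return -0.5 * ((x - mu) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))

def chi2_logpdf(x, df):
    # log density of a chi-square with df degrees of freedom at x
    return ((df / 2 - 1) * math.log(x) - x / 2
            - (df / 2) * math.log(2) - math.lgamma(df / 2))

n, s = 10, 15.0  # hypothetical sample size and sample standard deviation

def loglike_normal(sigma):
    # Normal approximation: s ~ N(sigma, sigma / sqrt(2(n-1)))
    return norm_logpdf(s, sigma, sigma / math.sqrt(2 * (n - 1)))

def loglike_chi2(sigma):
    # (n-1) s^2 / sigma^2 ~ chi2(n-1)
    return chi2_logpdf(s**2 * (n - 1) / sigma**2, df=n - 1)

for sigma in (12.0, 15.0, 18.0):
    print(sigma, loglike_normal(sigma), loglike_chi2(sigma))
```

One caveat worth noting: evaluating the chi-square density of the transformed statistic is not quite the density of s itself, because a proper change of variables would add a Jacobian factor that also depends on sigma; that is one reason the two curves are not expected to agree exactly.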
If I did this again, I would use the chi2 distribution, since the normal approximation doesn't provide any advantage for this problem.

cerulm (2016-10-03):
In your book Think Bayes, you use the same example to illustrate Approximate Bayesian Computation, and you use `scipy.stats.norm.logpdf(s, sigma, sigma/math.sqrt(2*(n-1)))` for the likelihood of the sample standard deviation under the (mu, sigma) hypothesis. I wonder, wouldn't it be more appropriate to use the sampling distribution of the sample variance for that likelihood instead? Something like `loglike += scipy.stats.chi2.logpdf((s**2*(n-1))/(sigma**2), df=(n-1))`.

I posted this as a question on stats.stackexchange: http://stats.stackexchange.com/questions/238046/what-is-the-likelihood-of-drawing-a-sample-with-standard-deviation-s-from-a-no

Dimiter Toshkov (2016-09-28):
Allen: I think the probability question, as stated, has not one and not two, but infinitely many answers. Why?

1) If we stick to the problem as stated, there is no information whatsoever that we can use to pick a prior probability of rain, so any prior must be as good as any other. By implication, the posterior probability of rain can take any value between 0 and 1 (inclusive). So, in the absence of any further information or assumptions, the correct answer is: probability of rain = [0, 1].
We can say that the probability has significantly increased on hearing the signal from the friends, but we cannot say by how much, since we do not know the prior.

2) If we assume that, had the probability been known with certainty in advance, there would have been no need to ask, we can exclude the boundaries of the interval, so p = (0, 1).

2b) If, in addition, we assume that probability is measured with finite precision (say, as whole percentages), the minimum prior probability becomes 0.01, so the interval for the posterior becomes p = (0.07, 1).

3) If we make the alternative assumption that the traveler would only ask for the solicited information if it had a chance to change the decision (interpreted as moving the posterior above or below 0.5), the (posterior) probability becomes p = (0.5, 1). The friends send a signal that is as strong as it gets, so under the most adversarial prior it must just about move the probability above 0.5 in order to satisfy the assumption of having a chance to influence the decision.

4) Now, if we want to make even further assumptions about what is known and what can be known in the framework of the problem, we can find reasons to pin the prior, and by implication the posterior, to any number within this range. But a) this is not obviously warranted by the setup of the problem, and b) we enter the decision-theoretic problem mentioned in my previous post.

In sum, there might be one correct way to update probabilities based on prior information and new data. But in the absence of prior information and assumptions, any posterior probability must be correct.

Allen Downey (2016-09-27):
And I didn't mean to sound snarky towards you.
Thanks for your comments!