Probably Overthinking It (comments feed)
Allen Downey, https://plus.google.com/111942648516576371054
2017-02-21T12:48:02.271-08:00

2017-02-18T17:26:27.722-08:00
I have a question though. I mentioned that this question was taken from the MIT 2006 OCW final exam for computer science. The solution is also available on the internet. Although they give the answer as 5/13, i.e. 10/26, there's a calculation error in their method. The correct answer according to their method is 25/63. But I really don't understand why they did the question that way. I'm adding the links to both the question and the answer here. It's the fourth question.
answer: https://www.google.co.in/url?sa=t&source=web&rct=j&url=https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-042j-mathematics-for-computer-science-fall-2010/exams/MIT6_042JF10_fnl_2006_sol.pdf&ved=0ahUKEwiWmOT3-5rSAhVCpo8KHbR6DbMQFgglMAA&usg=AFQjCNGh_rIRn7AyI4WO9Paj9m-iq_riUw&sig2=ZbyrZqKgd6YJaSjmAEWhIg
question: https://www.google.co.in/url?sa=t&source=web&rct=j&url=https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-042j-mathematics-for-computer-science-fall-2010/exams/MIT6_042JF10_final_2006.pdf&ved=0ahUKEwiWmOT3-5rSAhVCpo8KHbR6DbMQFggoMAE&usg=AFQjCNGfuqKhxH7H6Wpbt4nB9DjAWN4JvA&sig2=GmFaDfgVkq0T3cb3O3BeuA
Riya, http://www.blogger.com/profile/05095295993427070799

2017-02-18T10:34:12.984-08:00
This comment has been removed by the 
author.
Riya, http://www.blogger.com/profile/05095295993427070799

2017-02-17T12:24:31.688-08:00
I have corrected that error. Thanks!
Allen Downey, http://www.blogger.com/profile/01633071333405221858

2017-02-17T10:13:13.953-08:00
I enjoyed the problem, but I got a different answer. I thought the prior for TEST1 was 2/3.
David Body, http://www.blogger.com/profile/09987602296796651688

2017-02-17T07:26:52.253-08:00
Thanks for letting me know about the source of the problem. If you have a link, I'll include it in the article.
Allen Downey, http://www.blogger.com/profile/01633071333405221858

2017-02-17T07:16:54.521-08:00
Hey, thanks Mr. Allen for taking up the question. I would just like to mention that I came across this question in the MIT OCW final exam for computer science.
I have thought of two approaches and of course they're giving different answers, although I'm a little biased towards one of the approaches. I would be really glad if I could finally understand what's happening in this question.
Riya, http://www.blogger.com/profile/05095295993427070799

2017-02-16T07:39:09.813-08:00
Excellent question! I just turned it into a blog post. 
I'll give readers a few days before I post a solution: http://allendowney.blogspot.com/2017/02/a-nice-bayes-theorem-problem-medical.html
Allen Downey, http://www.blogger.com/profile/01633071333405221858

2017-02-16T04:53:17.056-08:00
I have a question. Exactly 1/5th of the people in a town have Beaver Fever. There are two tests for Beaver Fever, TEST1 and TEST2. When a person goes to a doctor to test for Beaver Fever, with probability 2/3 the doctor conducts TEST1 on him and with probability 1/3 the doctor conducts TEST2 on him. When TEST1 is done on a person, the outcome is as follows: If the person has the disease, the result is positive with probability 3/4. If the person does not have the disease, the result is positive with probability 1/4. When TEST2 is done on a person, the outcome is as follows: If the person has the disease, the result is positive with probability 1. If the person does not have the disease, the result is positive with probability 1/2. A person is picked uniformly at random from the town and is sent to a doctor to test for Beaver Fever. The result comes out positive. What is the probability that the person has the disease?
Riya, http://www.blogger.com/profile/05095295993427070799

2017-02-12T16:05:30.273-08:00
Interesting! For a recent working paper on the effect of Photo ID laws on turnout, see here [1]. They find an average suppression of 7.7% among Democrats and 4.6% among Republicans. 
[1] Voter Identification Laws and the Suppression of Minority Votes
http://pages.ucsd.edu/~zhajnal/page5/documents/voterIDhajnaletal.pdf

Abstract:

The proliferation of increasingly strict voter identification laws around the country has raised concerns about voter suppression. Although there are many reasons to suspect that these laws could harm groups like racial minorities and the poor, existing studies have been limited, with most occurring before states enacted strict identification requirements, and they have uncovered few effects. By using validated voting data from the Cooperative Congressional Election Study for several recent elections, we are able to offer a more definitive test. The analysis shows that strict identification laws have a differentially negative impact on the turnout of racial and ethnic minorities in primaries and general elections. We also find that voter ID laws skew democracy toward those on the political right.
Anon, http://www.blogger.com/profile/15513912296075560498

2017-01-18T21:13:56.190-08:00
Why is it that when calculating the variance in Kaplan Meier survival curves, the underlying distribution of the population is not taken into account--only the number at risk is? This is counter-intuitive. The only references are to a report by Greenwood in 1926 which doesn't really seem to answer the question.
Mark Phillips, http://www.blogger.com/profile/03120283972507328308

2017-01-18T10:46:50.416-08:00
There was disagreement among statisticians about how to do hypothesis testing; what we ended up with is actually a weird hybrid that makes less sense than the alternatives. 
See https://en.wikipedia.org/wiki/Statistical_hypothesis_testing#Origins_and_early_controversy

Regarding precision and recall, those are the terms most often used in the context of information retrieval. In the context of stats, it's usually false positives and false negatives (or type I and type II errors). And in the context of machine learning, it's usually a confusion matrix. But they are all representations of the same information.
Allen Downey, http://www.blogger.com/profile/01633071333405221858

2017-01-17T15:02:17.176-08:00
I think there was some other place where I found an explanation that there are two approaches to hypothesis validation. One used p-values. The other one was something like precision and recall (false positives and negatives) or something similar. I think it mentioned some disagreement between statisticians before p-values became popular. May this be possible?
trylks yeah, http://www.blogger.com/profile/14821108955806430210

2017-01-05T09:34:34.881-08:00
Great examples. Congratulations!
Hugo Pires, http://www.blogger.com/profile/00400200472463525909

2017-01-02T09:20:23.283-08:00
Have you seen my four-volunteer version? Each will be wakened at least once, and maybe twice, based on the same coin flip. Each will be left asleep in only one set of conditions, different for each, as defined by the cross product {Monday,Tuesday}x{Heads,Tails}. 
Each will be asked for her credence that the coin landed on the side that would let her sleep through one day.

One of these volunteers is undergoing the identical problem as in the original problem. Three are undergoing a functionally equivalent one, that must have the same answer.

On any day, exactly three volunteers will be wakened. On any day, exactly one of the three awake volunteers is in the set of conditions she is asked about. On any day, each of the three awake volunteers has the same information upon which to base her credence.

Yet that credence is found by a "bunch of probability calculations." It is 1/3.
JeffJo, http://www.blogger.com/profile/09110352332876400907

2016-12-20T14:23:45.564-08:00
Christopher> Am I missing something here?

No-one's disputing the maths:
 * [A] One half of experiments in which Beauty wakes are those in which the coin is Heads.
 * [B] One third of Beauty's experimental wake-ups are those in which the coin is Heads.

Beauty knows the facts, but she is obliged to pick just one of them as the basis of her "credence for Heads". Halfers insist she pick [A], Thirders insist on [B]. She's not allowed to state the reference class (w.r.t. experiments/wakeups) in her answer: her "credence" must be unqualified and absolute.

So there's an interesting philosophical question lurking at the heart of this: when Beauty can see both perspectives, [A] and [B], what then do we mean, fundamentally, by her "credence"? If there is a correct answer to the Sleeping Beauty problem then it's a philosophical one, not a bunch of probability calculations.
Creosote, http://www.blogger.com/profile/02571941331642177202

2016-12-17T21:28:06.322-08:00
Alright, let's assume Beauty believes the thirder perspective and decides that she will answer "I think it landed on Tails, the odds are 2 in 3" every time she is questioned because she thinks she'll be correct more often. She gets asked twice if it comes up Tails after all, and all these mathematicians can't be wrong.

As soon as she enters sleep/stasis and the coin is flipped there emerge two timelines, T1 where the fair coin came up Heads and T2 where the fair coin came up Tails. Since it's a fair coin T1 happens 50% of the time and T2 happens 50% of the time. This means that in T1 Beauty says Tails and is incorrect, whereas in T2 Beauty says Tails and is correct.

Doesn't this mean that she's correct in 50% of timelines and incorrect in 50% of timelines? 
There are only two possible timelines and they happen 50% of the time each since it's a fair coin. In fact she could be asked a thousand times in T2 and since her memories are wiped and reset she'll always answer Tails. Furthermore, doesn't this mean that if you repeat the experiment 1,000 times, you get 500 Heads and 500 Tails, and she answers Tails every time... that she's correct in 500 experiments and incorrect in 500 experiments?

The fact that she answers incorrectly once in T1 and correctly twice in T2 doesn't make her answer twice as correct. You're only tracking whether she answered correctly, not how many times she gets it right. I mean, you're not giving her $100 every time she gets the answer right. If you did then she'd always say Tails because she has a 50% chance of getting $200 and a 50% chance of getting $0, whereas if she said Heads she'd have a 50% chance of getting $100 and a 50% chance of getting $0. What you're looking at is her certainty that the coin came up Heads or Tails. And if my previous logic is correct, that means that guessing Tails makes her correct in 50% of timelines... so she'd break even instead of being ultimately correct approximately 666-667 times out of 1,000 experiments.

I can't believe that the process of being interviewed would lead her to believe that the odds of it being Heads or Tails had changed. This isn't a Monty Hall problem, she's not getting any new information. She knows that she's going to be woken up at least once and has no capacity to distinguish between Heads(Monday), Tails(Monday), and Tails(Tuesday)... 
or the 997th Tails day for that matter.

Am I missing something here?
Christopher, http://www.blogger.com/profile/16259364826167235453

2016-11-30T13:28:12.526-08:00
Yes, this "natural frequency" way of solving problems like this is excellent.
Allen Downey, http://www.blogger.com/profile/01633071333405221858

2016-11-30T13:16:22.615-08:00
The simplest way to answer this question is to assume a number for the total population.
Assume that 400 is the combined population of all three states. Then state1, state2, and state3 have populations of 160, 100, and 140 respectively. From the data provided we can derive that the numbers of people supporting party1 in state1, state2, and state3 are 0.5*160, 0.6*100, and 0.35*140 (= 80, 60, 49).
From this we can evaluate the probability that a party1 supporter is from state2 as 60/(80+60+49) = 0.317 approx.
Unknown, http://www.blogger.com/profile/04089618677994568066

2016-11-29T18:25:50.126-08:00
^_^
张亮, http://www.blogger.com/profile/11113984241425650034

2016-11-29T13:26:31.343-08:00
At least some of the forecasters had the uncertainty of the future included in their models, so they generated two predictions: one if the election were held today and another that took into account the remaining time until the election. 
The second was generally closer to 1/2 because of the greater uncertainty.

But I don't think it's right to say that if the level of uncertainty is high, the probability is necessarily 1/2.

To put that differently: a year ago I would not have bet on Trump with even odds.
Allen Downey, http://www.blogger.com/profile/01633071333405221858

2016-11-29T08:24:53.292-08:00
I think your 41% differs from my 30% because you actually computed your answer, while I just made mine up as an example. I'm actually mildly surprised that I was as close as I was.
Allen Downey, http://www.blogger.com/profile/01633071333405221858

2016-11-29T03:46:24.607-08:00
What do you think about the criticism which insists that their probability must be 1/2 until the election approaches, since the poll is so stochastic that we can never predict the future in the long term?
Stan, http://www.blogger.com/profile/16725096203497460549

2016-11-26T06:21:48.920-08:00
This comment has been removed by the author.
Unknown, http://www.blogger.com/profile/02468670897396780927

2016-11-23T03:10:59.223-08:00
I think the biggest problem was the bogus precision: "71.4%" - this implies that the model could give a precision of one per thousand - which is of course 
bs.
vonjd, http://www.blogger.com/profile/12488764399725481497

2016-11-16T18:06:17.255-08:00
Always enjoy your writing Allen, thank you.

For me, this whole post screamed 'Probably Overthinking it'. It seems like the problem fits neatly into a use case of a discriminative statistical model like logistic regression for example. I think anyone who claims that the results of such a model are unfair or not useful is probably overthinking it.
Adam Levin, http://www.blogger.com/profile/07717303642060714074
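
For the Beaver Fever question Riya posted earlier in this thread, here is a minimal sketch of the direct Bayes' theorem calculation. It assumes the choice of test (TEST1 with probability 2/3, TEST2 with probability 1/3) is independent of whether the person has the disease; that assumption and the function name `posterior_disease` are illustrative and not taken from the comments. Under this reading the result is 5/13, the value Riya reports the MIT solution gives. Other readings of the problem, which the thread suggests exist, would give different numbers.

```python
from fractions import Fraction as F


def posterior_disease(prior, p_test1, sens1, fpr1, sens2, fpr2):
    """P(disease | positive), assuming the test is chosen independently of disease status.

    sens1/sens2: P(positive | disease) for TEST1/TEST2.
    fpr1/fpr2:   P(positive | no disease) for TEST1/TEST2.
    """
    p_test2 = 1 - p_test1
    # Marginalize over which test the doctor happened to use.
    p_pos_given_sick = p_test1 * sens1 + p_test2 * sens2
    p_pos_given_well = p_test1 * fpr1 + p_test2 * fpr2
    # Bayes' theorem: joint probability of (sick, positive) over total P(positive).
    numer = prior * p_pos_given_sick
    denom = numer + (1 - prior) * p_pos_given_well
    return numer / denom


# Numbers from the question: prior 1/5, TEST1 used with probability 2/3.
result = posterior_disease(F(1, 5), F(2, 3), F(3, 4), F(1, 4), F(1), F(1, 2))
print(result)  # 5/13
```

Exact fractions are used instead of floats so the result can be compared directly with the 5/13 and 25/63 values mentioned in the thread.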