Probably Overthinking It
Allen Downey (https://plus.google.com/111942648516576371054)
Comments feed (Blogger), updated 2016-05-26T11:29:15.747-07:00, comments 1 to 25 of 613

[2016-05-26T11:18:04.755-07:00]
Is "condition" as defined by Trivers (biology) related to any definition of socioeconomic condition? By any definition of "condition", people in the United States are at peak economic condition. Trivers did not define "condition" as anything close to a value that can be picked off a database.
-- Vijay (http://www.blogger.com/profile/04598434756892672717)

[2016-05-22T13:14:37.288-07:00]
> For the 18 SOBs we have actually observed

Fantastic variable choice!
-- Alex Riina (http://www.blogger.com/profile/16798593482677040333)

[2016-05-20T06:31:23.268-07:00]
This is great! We have been using Bayesian methods in economic science for quite some time. I am glad to see that engineering is, finally, starting to pick up on it.
-- Richard Sessa (http://www.blogger.com/profile/02172477842431895627)

[2016-05-18T03:04:06.249-07:00]
Andrew Gelman wrote an article on a similar topic a few years ago. If you've not seen it, it's an interesting take on the problem: http://andrewgelman.com/2009/06/21/of_beauty_sex_a/
-- Sam Mason (http://www.blogger.com/profile/17261968597468673085)

[2016-05-16T07:46:12.411-07:00]
Good questions. What I have done so far is one attempt to find a generational effect, which failed.
So there could still be an effect, but my experiment missed it.

I am planning a follow-up that will try harder to show the effect, by searching specifically for questions that show a generational pattern and aggregating them.

Watch this space.
-- Allen Downey (http://www.blogger.com/profile/01633071333405221858)

[2016-05-16T07:38:11.344-07:00]
If it did, how would you reconcile it with the results of this analysis?
-- Arthur Grabovsky (http://www.blogger.com/profile/06197260440427038143)

[2016-05-13T14:24:06.848-07:00]
Interesting approach! My first instinct would have been to plot age on the x-axis and an index of social and political attitudes on the y-axis, fit a spline, and examine jumps at the lower and upper bounds of each generational interval. I wonder if this would produce different results.
-- Y.R. Velez (http://www.blogger.com/profile/16807475997597988399)

[2016-05-10T07:18:47.260-07:00]
Your correction on Scenario C is correct, and your answer on Scenario D was correct all along.

Interestingly, I made the same mistake on Scenario C (but it took me longer to catch it).

Nice job!
-- Allen Downey (http://www.blogger.com/profile/01633071333405221858)

[2016-05-09T14:00:20.197-07:00]
Let me see if I can get it right this time. Feel free to leave my earlier wrong answer up.
I deserve to be at least mildly shamed.

Scenario C:

Let t1 = 0.2 and t2 = 0.4. Here are the ways to get a positive test result, with their associated probabilities:

Sick: p s
Not sick, t = t1: (1-p) t1 / 2
Not sick, t = t2: (1-p) t2 / 2

The probability that one person is sick, given a positive test result, is

p s / (sum of all three terms above),

which is 1/4.

That's the probability that the first positive-tester is sick, in scenario C (as it was in A and B).

In scenario C, each trial is independent, so the answer to question 2 is the square of the answer to question 1, i.e., 1/16.

Scenario D:

If we hypothesize any given value of t, then we can calculate the probability that any given positive-tester is in fact sick. That turns out to be

psick = 1/3 if t = 0.2
psick = 1/5 if t = 0.4.

In scenario D, we never find out any information that tells us which value of t is correct, so they remain equiprobable. There's a 50% chance that psick = 1/3 and a 50% chance that psick = 1/5, so when you meet that first positive-tester, the probability that he's sick is the average of the two.

psick = 0.5 (1/3 + 1/5) = 4/15. [Scenario D, question 1]

Under each hypothesis for t, the probability that any given positive-tester is sick is independent of the others, so the probability that the first two testers are both sick is the square of the probability that one is sick.
We still have no information about which value of t is correct, so the probability that both folks are sick is the average of the probabilities for each of the two t's:

0.5 (1/9 + 1/25) = 17/225 [Scenario D, question 2].
-- Ted Bunn (http://www.blogger.com/profile/12230509214302717664)

[2016-05-09T13:20:48.420-07:00]
Dammit, I got C wrong, didn't I? More of the positive results come from hypothesis t = 0.4 than from t = 0.2. I'll revise, but I wanted to get this comment into the record right away.
-- Ted Bunn (http://www.blogger.com/profile/12230509214302717664)

[2016-05-09T12:28:12.361-07:00]
For any given t, the probability that a person who tests positive is actually sick is

psick(t) = p s / (p s + (1-p) t).

For the given parameters, the two relevant values are

psick(0.2) = 1/3
psick(0.4) = 1/5

In both scenarios C and D, the probability that the first person is sick is simply the average of the two:

P1 = (1/3 + 1/5)/2 = 4/15.

That's because the two possible values of t remain equiprobable throughout these scenarios.

In scenario C, each new positive-testing person is an independent event with this same probability, so the probability that the first two people are both sick is

P2C = P1^2 = 16/225.

In scenario D, the two events are not independent, because they have the same underlying value of t. But for any given t, they would be independent.
So we can compute the probability that both people are sick under hypothesis t = 0.2, and the probability that both people are sick under hypothesis t = 0.4, and average the two:

P2D = (psick(0.2)^2 + psick(0.4)^2) / 2 = 17/225.
-- Ted Bunn (http://www.blogger.com/profile/12230509214302717664)

[2016-05-09T08:07:39.631-07:00]
So far you are 4 for 4 (two scenarios, two questions each). Want to lock in your answers for C and D before I publish solutions?
-- Allen Downey (http://www.blogger.com/profile/01633071333405221858)

[2016-05-06T15:16:56.099-07:00]
I got the same answers as you for scenario A. The probability that the first patient is sick is

p s / (p s + (1-p) t1 / 2 + (1-p) t2 / 2).

The three terms in the denominator are the probabilities associated with the three ways of getting a positive test: true positive, false positive with test t1, false positive with test t2.

This works out to 1/4.

In scenario A, the second patient's outcome is independent of the first, so the probability that both are sick is just the square of the above probability.

In scenario B, the answer to question 1 is the same. If I'm not mistaken, the answer to question 2 is

p^2 s^2 / ((p s + (1-p) t1)^2 / 2 + (p s + (1-p) t2)^2 / 2)

Sorry that's a bit hard to read. The numerator is the probability of getting two true positives. The denominator is of the form A^2/2 + B^2/2, where A is the probability of getting a single positive result (true or false) under hypothesis t1, and B is the probability under hypothesis t2.
Under either hypothesis, the outcomes for patients 1 and 2 are independent, so the probability of getting two positive results under hypothesis t1 is A^2, and similarly for hypothesis t2.

Anyway, the numerical value of the above expression is 1/17, so I claim that that's the answer to scenario B, question 2.
-- Ted Bunn (http://www.blogger.com/profile/12230509214302717664)

[2016-05-06T12:02:56.410-07:00]
It gives me some hope that even you guys are having some difficulty moving the complexity up a level. I can understand the base case for Bayes fine, but I keep getting lost as I try to move on. So, glad to hear it is not just me! I really thought I was not bright enough to "get it".
-- dartdog (http://www.blogger.com/profile/15756184463450075364)

[2016-05-06T07:09:32.966-07:00]
For what it's worth, I was one of those who were puzzled (I wouldn't say annoyed) by the apparent mismatch between the question as posed and Scenarios C and D.

At the moment, I'm mostly curious about where this is going. All of the calculations in the notebook seem correct to me, and none seem particularly counterintuitive or surprising.
I gather that you're building to something surprising, and I look forward to seeing what it is.
-- Ted Bunn (http://www.blogger.com/profile/12230509214302717664)

[2016-05-06T06:29:36.554-07:00]
That is correct in Scenario B (where I choose a die once and then roll it repeatedly).
-- Allen Downey (http://www.blogger.com/profile/01633071333405221858)

[2016-05-06T06:26:28.545-07:00]
If A is the event that you have the R-favorable die, then, as above, its probability P(A) after one roll is now 2/3.
So P(R) = P(R|A)P(A) + P(R|Ac)P(Ac) = (2/3)*(2/3) + (1/3)*(1/3) = 5/9
-- Harold Ship (http://www.blogger.com/profile/10470259780261074552)

[2016-05-05T14:05:00.275-07:00]
Good so far!

But I added a followup question: what's the probability that the outcome of the next roll is red, too? Hint: the scenario is deliberately ambiguous.
-- Allen Downey (http://www.blogger.com/profile/01633071333405221858)

[2016-05-05T13:46:48.561-07:00]
In this case, you can straightforwardly calculate the favorable probability as a fraction of the total probability, so

[ (1/2)*(4/6) ] / [ (1/2)*(4/6) + (1/2)*(2/6) ]
= [ 4/6 ] / [ (4/6) + (2/6) ]
= [2] / [2 + 1] = 2/3

But you already knew that. Anyways, it's nice to be able to comment on something; most of your posts are way above my paygrade.
-- Nuno (http://www.blogger.com/profile/09089984219014317886)

[2016-04-07T08:45:01.158-07:00]
What you are proposing is what I called E-fairness in this article. I explained two problems with E-fairness, and then proposed two alternative definitions of "fair". And I discuss the question of what the relevant population of comparison should be. By proposing alternatives and evaluating their consequences, I am not assuming anything.
-- Allen Downey (http://www.blogger.com/profile/01633071333405221858)

[2016-04-07T08:06:51.750-07:00]
If the male record is about 2:02 and the female record is about 2:15, then the gender gap should only be about 13 minutes. It seems like BQ is currently unfair to men. I think you need to be careful with your statistics.
Are we comparing to pure ability or are we comparing to the pool of people who currently run? There may be a statistically significant difference between the percentage of men who are talented at running and actually run and the corresponding percentage of women. Your analysis assumes a similar spread.
-- Unknown (http://www.blogger.com/profile/18005421939677771382)

[2016-03-29T13:27:38.508-07:00]
I'd go with option 3 and add a sub-reason 3A, that of training in an authoritarian setting. Not that the profs are clones of the North Korean leader or anything, but there is no dispute that the profs know the material and the undergrads don't. Nor is it a matter of debate like in many of the liberal arts, where a well-crafted argument presents another point of view even if it disagrees with the prof. Your beam calculation is right or it isn't. Your transistor is properly biased or it isn't. Year after year, students are under the tutelage of those with much more knowledge than they have, who cannot be meaningfully challenged or questioned. And perhaps those with personalities attracted to, or at least compatible with, this stick it out for their degree. The transfer of loyalties to an extremist group that has unchallenged leaders and claims to have all the answers should be apparent...
-- miket29 (https://miket29.wordpress.com/)

[2016-03-08T06:06:25.751-08:00]
Yes, good point. Time series data is particularly good at producing spurious correlations.
-- Allen Downey (http://www.blogger.com/profile/01633071333405221858)

[2016-03-08T05:52:10.704-08:00]
Hello Allen

I have known "Think Python" for a long time, and recently I discovered the rest of your books, which are great, thank you.

I just wanted to comment on the "correlation does not imply causation" thing. As I see it, this statement usually refers to heavily autocorrelated series, that is, series with only a few independent points. It is very easy to find spurious correlations in this kind of series, as in the global warming and number of pirates example. When you have two samples of n = 1000 points each, with no autocorrelation, and find a 0.9 correlation, then there is almost certainly a causal link behind it.
-- Markel (http://www.blogger.com/profile/02405351151568774377)

[2016-03-04T07:39:16.466-08:00]
Fun problem, thanks! I posted a solution here: https://github.com/AllenDowney/ThinkBayes2/blob/master/code/examples/voter.ipynb
-- Allen Downey (http://www.blogger.com/profile/01633071333405221858)
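
The arithmetic for Scenarios A to D in the comments above can be checked with exact fractions. This is only a sketch: the thread never states the prevalence p or the sensitivity s, so p = 0.1 and s = 0.9 are assumed here because they reproduce the quoted psick(0.2) = 1/3 and psick(0.4) = 1/5. The names `psick`, `positive`, and the `q...` variables are illustrative, and t1 = 0.2 and t2 = 0.4 are taken from the comments.

```python
from fractions import Fraction as F

# Assumed values (not given in the thread): prevalence p and sensitivity s.
p, s = F(1, 10), F(9, 10)
# False positive rates quoted in the comments.
t1, t2 = F(1, 5), F(2, 5)


def psick(t):
    """P(sick | positive test) when the false positive rate t is known."""
    return p * s / (p * s + (1 - p) * t)


def positive(t):
    """P(positive test) when the false positive rate t is known."""
    return p * s + (1 - p) * t


# Scenarios A, B and C, question 1: true positive over all three ways
# of getting a positive result.
q1_mixed = p * s / (p * s + (1 - p) * t1 / 2 + (1 - p) * t2 / 2)

# Scenarios A and C, question 2: independent patients, so square q1.
q2_independent = q1_mixed ** 2

# Scenario B, question 2: both patients share one unknown t.
q2_shared = (p * s) ** 2 / (positive(t1) ** 2 / 2 + positive(t2) ** 2 / 2)

# Scenario D: the two values of t are treated as equiprobable.
q1_d = (psick(t1) + psick(t2)) / 2
q2_d = (psick(t1) ** 2 + psick(t2) ** 2) / 2

print(q1_mixed, q2_independent, q2_shared, q1_d, q2_d)
```

Using `Fraction` rather than floats keeps the results in the same form as the answers quoted in the thread, so they can be compared term by term.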