Monday, November 21, 2011

Comment on "Racism and Meritocracy"

WARNING:  This article is on a topic that elicits emotional reactions.  I welcome comments, but please make them thoughtful and keep them civil.

Eric Ries wrote an article for TechCrunch last week, talking about racism and meritocracy among Silicon Valley entrepreneurs.  It's a good article; you should read it and then come back.

Although I mostly agree with him, Ries undermines his argument with a statistical bait-and-switch: he starts out talking about race, but most of the article (and the slide deck he refers to) is about gender.  Unfortunately, for both his argument and the world, the race gap is bigger than the gender gap, and it is compounded because racial minorities, unlike women, are minorities.

To quantify the size of the gap, I use data from Academically Adrift, a recent book that reports the results from the Collegiate Learning Assessment (CLA) database, collected by the Council for Aid to Education, "a national nonprofit organization ... established in 1952 to advance corporate support of education and to conduct policy research on higher education..."

"The CLA consists of three types of prompts within two types of task: the Performance Task and the Analytic Writing Task...The Analytic Writing Task includes a pair of prompts called Make-an-Argument and Critique-an-Argument.  
"The CLA uses direct measures of skills in which students perform cognitively demanding tasks... All CLA measures are administered online and contain open-ended prompts that require constructed responses. There are no multiple-choice questions. The CLA tasks require that students integrate critical thinking and written communication skills. The holistic integration of these skills on the CLA tasks mirrors the requirements of serious thinking and writing tasks faced in life outside of the classroom."
This is not your father's SAT.  The exam simulates realistic workplace tasks and assesses skills that are relevant to many jobs, including (maybe especially) entrepreneurship.

On this assessment, the measured differences between black and white college students are stark.  For white college students, the mean and standard deviation are 1170 ± 179.  For black students, they are 995 ± 167.

To get a sense of what that difference looks like, suppose there are just two groups, which I call "blue" and "green" as a reminder that I am presenting an abstract model and not a realistic description.  This figure shows Gaussian distributions with the parameters reported in Academically Adrift:

The difference in means is 175 points, which is about one standard deviation.  If we select people from the upper tail, the majority are blue.  But the situation is even worse if greens are a minority.  If greens make up 20% of the population, the picture looks like this:

The fraction of greens in the upper tail is even smaller.  If, as Ries suggests, "Here in Silicon Valley, we’re looking for the absolute best and brightest, the people far out on the tail end of aptitude," the number of greens in that tail is very small.

How small?  That depends on where we draw the line.  If we select people who score above 1200, which includes 37% of the population, we get 6% greens (remember that they are 20% of the hypothetical population).  Above 1300 the proportion of greens is 3%, and above 1400 only 2%.

And that's not very "far out on the tail end of aptitude."  Above 1500, we are still talking about 3% of the general population, but more than 99% of them are blue.  So in this hypothetical world of blues and greens, perfect meritocracy does not lead to proportional representation.
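The tail fractions quoted above can be checked with a short sketch. It is not necessarily how the original figures were computed; it simply assumes that both groups follow Gaussian distributions with the means and standard deviations reported above, and that greens are 20% of the hypothetical population. The names `BLUE`, `GREEN`, and `tail_composition` are illustrative, not from any library.

```python
from statistics import NormalDist

# Mean and standard deviation of CLA scores for the two hypothetical groups,
# taken from the figures quoted in the post (Gaussian shape is an assumption).
BLUE = NormalDist(mu=1170, sigma=179)
GREEN = NormalDist(mu=995, sigma=167)


def tail_composition(threshold, green_fraction=0.2):
    """Return (share of the population above threshold,
    share of that upper tail that is green)."""
    blue_tail = (1 - green_fraction) * (1 - BLUE.cdf(threshold))
    green_tail = green_fraction * (1 - GREEN.cdf(threshold))
    total = blue_tail + green_tail
    return total, green_tail / total


for threshold in (1200, 1300, 1400, 1500):
    selected, green_share = tail_composition(threshold)
    print(f"above {threshold}: {selected:.1%} of population, "
          f"{green_share:.1%} green")
```

Changing `green_fraction` shows how the composition of the tail shifts with the size of the minority group; the cutoffs are the same ones used in the text.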

Ries suggests that blind screening of applicants might help.  I think the system he proposes is a good idea, because it improves fairness and also the perception of fairness.  But if the racial gap in Y Combinator's applicant pool is similar to the racial gap in CLA scores, making the selection process more meritocratic won't make a big difference.

These numbers are bad.  I'm sorry to be reporting them, and if I know the Internet, some people are going to call me a racist for doing it.  But I didn't make them up, and I'm pretty sure I did the math right.  Of course, you are welcome to disagree with my conclusions.

Here are some of the objections I expect:

1) The CLA does not capture the full range of skills successful entrepreneurs need.

Of course it doesn't; no test could.  But I chose the CLA because I think it assesses thinking skills better than other standardized tests, and because the database includes "over 200,000 student results across hundreds of colleges."  I can't think of a better way to estimate the magnitude of the racial gap in the applicant pool.

2) The application process is biased against racial minorities and women.

The statistics I am reporting here, and my analysis of them, don't say anything about whether or not the application process is biased.  But they do suggest (a) we should not assume that because racial minorities are underrepresented among Silicon Valley entrepreneurs, racial bias explains a large part of the effect, and (b) we should not assume that eliminating bias from the process will have a large effect.

Of course, trying to eliminate bias is the right thing to do, whether the effect is big or small.


NOTE: The range of scores for the CLA was capped at 1600 until 2007, which changed the shape of the distribution at the high end.  For those years, the Gaussian distributions in the figures are not exactly right, but I don't think it affects my analysis much.  Since 2007, scores are no longer capped, but I don't know what the tail of the distribution looks like now.

EDIT 11-28-11: I revised a few sentences to clarify whether I was talking about representation or absolute numbers.  The fraction of greens in the population affects the absolute numbers in the tail but not their representation.


  1. The education gap is not the fault of individuals within minority groups; rather, it reflects the inequality that still persists within the educational and socio-economic system.

    Under this analysis it is hard to assign blame for the disparities. It's not enough of an answer to just say that minorities are less educated (which as a whole is unfortunately true) than their white counterparts. Access to quality education is still weighted heavily towards the majority, and we are fighting an uphill battle if we are to compete for the same "meritocracy" assigned jobs. At the same time, we cannot discount the natural biases that still persist, and we should understand that even though we're not the sharpest tools in the shed, entrepreneurship isn't based on who has the highest IQ.

    This is not to say that as minorities we need hand-outs in Silicon Valley to perform. All we're asking for is a chance to prove the naysayers wrong. However, that "chance" comes few and far between for minorities in an environment where failure is tolerated.

  2. @Unknown: These are good points. The analysis here says nothing about causes. And entrepreneurship is not an IQ test -- although one of the things I like about the CLA is that it is easy to see connections between the assessment tasks and important job skills, unlike Raven's matrices, for example. Thanks for the comment.

  3. Your comments aren't racist, far from it. They are devoid of critical nuances. These nuances are effectively covered in Eric's piece.

    Race is a complex matter. Bias is just as complex. The key item in all of this is "pattern recognition". We humans are "beasts of bias".

    The solution still boils down to underprivileged people or minorities understanding that, to bootstrap equality, one has to work harder and overachieve. That's the only solution!

  4. Of course the most disturbing data in Academically Adrift is the demonstration that college does *very little* to improve the performance of blacks relative to whites; I think it shows that the disparity goes *up* during college.

    1. That's true. The study tested first year students in 2005, and the same students again after two years of college. The average change was about 34 points, which is less than 1/5 of a standard deviation. For black students the average change was only six points. So after two years of college, the gap between black students and others had grown.

  5. What makes you suspect the scores are normally distributed? They could follow a power law distribution: it's certainly more similar to how income ends up distributed.
    IQ is only distributed normally because they explicitly adjust it to be so.

    1. Since the range of scores is restricted, it can't really be a power law. I don't know that it is normal, but I think it is a reasonable assumption. In this article I report raw SAT scores, which are a little flatter than Gaussian:

  6. The scores can't be power law distributed, but the underlying attributes (which is after all what we're really interested in) could be.

    Entrepreneurial skill could be power law distributed, and this is a censored measurement of its distribution. Your arguments about meritocracy rest on this being a reasonable picture of true ability in this domain.

    1. I think you would have a hard time constructing believable distributions, with the observed mean and standard deviation for the two groups, that yield results substantially different from what I reported. So I don't think my argument rests on the choice of the Gaussian model.

      It does depend on my claim that the CLA is measuring important skills that are relevant to entrepreneurship. You can read about the CLA and see if you agree.

  7. Near the end of this criticism [1] of the way the authors of Academically Adrift report their conclusions is a hint that the CLA may not be a "reliable" test. Would this affect your argument?

    If the standard deviations reported are not only due to properties of the population but also to properties of the test, that would complicate the interpretation especially of the differences between groups or differences between years, right?


    1. Thanks for the pointer to the Chronicle article -- it's interesting and (in my opinion) right. One of the results from Academically Adrift that gets the most press is that 45% of students show no statistically significant improvement after two years of college. That is, if you run a test for each student, many of the results are negative.

      But as the author of the Chronicle article points out, if repeated tests for the same person yield substantial variation, the negative test doesn't mean much. So I agree that the authors of Academically Adrift should be more careful about the way they present this result.

      But for the argument I made above, none of this matters very much. I was making comparisons between groups, and the difference is large and statistically significant. So the reliability of the test is not really an issue.

      If the test is biased, of course, that would be an issue. But I have not seen anyone make that argument.