How To Lie With Statistics Summary
1. The Sample with the Built-in Bias
Response Bias: Tendency for people to over- or under-state the truth
Non-response: People who complete surveys are systematically different from those who fail to
respond. Accessibility/Pride.
Representative Sample: One where all sources of bias have been removed. (Counterexample: the Literary Digest poll)
Questionnaire wording/Interviewer effects
Recall Bias: Tendency for one group to remember prior exposure more readily than another in
retrospective studies
The sample with a built-in bias: the origin of most statistics problems is the sample. Any statistic is
based on some sample (because the whole population can't be tested), and every sample has some
sort of bias, even if the person wanting the statistic tries hard not to create any. Respondents not
replying honestly, the market researcher picking a sample that gives better numbers, personal biases
based on the respondent's perception of the market researcher, and data not being available for a
certain past period are a few of the biases that creep in when building a statistic. One example (from
the 1950s) that the author mentions is a readership survey of two magazines. Respondents were asked
which magazine they read the most: Harper's or True Story. Most respondents said that they read
Harper's, but the publishers' figures showed that True Story had a much higher circulation than
Harper's, refuting the results from the sampling. The reason for this discrepancy: people were not
willing to answer honestly because of their own bias. As Dr. House says: everybody lies! Summary of
the chapter: given any statistic, question the sample that was taken. Assume that there is always a
bias in the sample.
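The effect of dishonest answers on a readership survey can be sketched with a few lines of arithmetic. The numbers below are invented purely for illustration (they come neither from the book nor from any real survey), and the names `true_readers` and `honest_pct` are hypothetical: a prestigious magazine has few real readers, a popular one has many, and most readers of the popular one claim to read the prestigious one instead.

```python
# Hypothetical readership, chosen only to illustrate response bias.
true_readers = {"prestigious": 200, "popular": 800}

# Assumption: only this percentage of "popular" readers answer honestly;
# the rest claim to read the "prestigious" magazine.
honest_pct = 20

honest = true_readers["popular"] * honest_pct // 100
misreported = true_readers["popular"] - honest

# What the survey records after people shade their answers.
reported = {
    "prestigious": true_readers["prestigious"] + misreported,
    "popular": honest,
}

total = sum(true_readers.values())
for title in true_readers:
    true_share = true_readers[title] / total
    reported_share = reported[title] / total
    print(f"{title}: true {true_share:.0%}, reported {reported_share:.0%}")
```

Under these assumed numbers the survey ranks the two magazines in the opposite order from their real circulation, even though every household was reached and every household answered.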
2. The Well-Chosen Average
Arithmetic Mean: Evenly distributes the total among individuals. Can be unrepresentative when
measurements are highly skewed right. (e.g. per capita income)
Median: Value dividing distribution into two equal parts. 50th percentile. (e.g. median household
income)
Mode: Most frequently observed outcome (rarely reported with numeric data)
The well-chosen average: how not qualifying an average can change the meaning of the data. Before I
delve into this, quickly: when I say "average", what comes to your mind? Sum(x1...xn) / n, right? The
arithmetic mean. But I said average, not arithmetic average, didn't I? Not many people know that there
are 3 averages:
Arithmetic average / mean - sum of quantities / number of quantities
Median - the midpoint of the data when it is sorted, separating it into two equal halves
Mode - the data point that occurs the most in a given set of data
And when someone says average, leaving it unqualified, there is a lot of room for juggling. The author
mentions a very simple example. If an organization publishes a statistic that the average pay of its
employees is $1000, what does this mean? This makes most of us think that almost everyone makes
around $1000 - the reader thinks it is the median. But the corporation can be talking about an
arithmetic mean, where the boss might be earning say $10,500 and the other 19 employees earn $500
each: a total of $20,000 across 20 people, or $1000 on average, while the median and the mode are
both $500. Just by not qualifying the average, the published fact can be completely twisted out of
shape from the real facts. The way out: always ask what kind of average someone is talking about.
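The payroll example above can be checked directly. This is a minimal sketch using Python's standard `statistics` module; the list `salaries` is just the example's boss at $10,500 plus 19 employees at $500 each.

```python
from statistics import mean, median, mode

# Payroll from the example: one boss and 19 employees.
salaries = [10_500] + [500] * 19

mean_pay = mean(salaries)      # total pay divided by head count
median_pay = median(salaries)  # middle value of the sorted list
mode_pay = mode(salaries)      # most frequently occurring value

print(f"mean:   ${mean_pay:,.0f}")    # $1,000
print(f"median: ${median_pay:,.0f}")  # $500
print(f"mode:   ${mode_pay:,.0f}")    # $500
```

All three numbers are legitimately "the average" pay, yet only the mean reaches $1000; the median and the mode show what a typical employee actually takes home.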