An IQ test is a standardised assessment designed to measure cognitive abilities and produce a score that represents an individual's intellectual capacity relative to the general population. The letters stand for Intelligence Quotient, a term coined by German psychologist William Stern in 1912. Modern IQ tests no longer calculate a literal quotient (they use deviation scoring), but the name has stuck. What they actually measure, how the scoring works, and what a result means in practice are considerably more nuanced than the shorthand suggests.
The History Behind the Test
The first practical intelligence test was commissioned in 1904 by the French Ministry of Education, which asked Alfred Binet and Théodore Simon to develop a tool for identifying children who needed additional educational support. The Binet–Simon scale, published in 1905, assigned tasks of increasing difficulty to age levels — a child who passed tasks typical of older children was said to have a higher "mental age."
William Stern's contribution was simple but consequential: he proposed dividing mental age by chronological age and multiplying by 100 to produce a single score. A child whose mental age matched their chronological age had an IQ of 100. Lewis Terman at Stanford adapted Binet's test for American use in 1916 — producing the Stanford–Binet — and the framework became dominant. David Wechsler introduced the deviation IQ method in 1939, replacing the age-ratio formula with a system that compared an individual's score to the distribution of scores for their age group. That method remains standard today.
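Stern's ratio formula is simple enough to write out directly. The following is a minimal sketch of that historical calculation only; the function name `ratio_iq` and the example ages are illustrative assumptions, and modern tests do not score this way.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Historical ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return 100.0 * mental_age / chronological_age


# Illustrative ages (hypothetical children, not data from any study).
print(ratio_iq(10, 10))  # 100.0 (mental age matches chronological age)
print(ratio_iq(12, 10))  # 120.0 (passes tasks typical of older children)
print(ratio_iq(8, 10))   # 80.0
```

The ratio behaves sensibly for children but breaks down for adults, whose chronological age keeps increasing while measured mental age does not, which is one reason the deviation method replaced it.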
What IQ Tests Actually Measure
IQ tests do not measure intelligence as a single entity. They measure a sample of cognitive abilities, and the composite score reflects how that sample compares to population norms. The most widely used batteries — the Wechsler Adult Intelligence Scale (WAIS), the Cattell Culture Fair Intelligence Test, the Stanford–Binet 5, and the Woodcock–Johnson IV — each measure somewhat different combinations of:
- Fluid reasoning — solving novel problems without relying on prior knowledge (matrix patterns, analogical reasoning)
- Verbal comprehension — vocabulary, verbal concept formation, the ability to reason with language
- Working memory — holding and manipulating information simultaneously
- Processing speed — how quickly cognitive operations are executed
- Spatial/perceptual reasoning — reasoning about visual patterns, orientations, and spatial relationships
The composite IQ score aggregates these separate indices. Profiles across the indices can differ substantially — a person might have a very high fluid reasoning score and average processing speed, or excellent verbal comprehension and lower working memory. The overall score obscures these distinctions.
How IQ Scoring Works
Modern IQ scores use a deviation system in which the mean score for any age group is set to 100, and the standard deviation is set to 15 — producing the familiar IQ bell curve. This means:
- About 68% of the population scores between 85 and 115 (within one standard deviation of the mean)
- About 95% scores between 70 and 130 (within two standard deviations)
- About 99.7% scores between 55 and 145 (within three standard deviations)
A score of 130 or above is conventionally used as a threshold for "gifted" classifications. A score of 70 or below is one criterion (alongside adaptive functioning assessments) for intellectual disability diagnoses. The vast majority of scores fall in the middle range, where differences of 10–15 points have modest practical implications.
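The percentages and thresholds above follow from the normal distribution with mean 100 and standard deviation 15. The sketch below reproduces them with Python's standard library. The helper names and the raw-score norm values (mean 50, SD 8) are hypothetical, and the linear z-score conversion is a simplification: published tests convert raw scores through age-specific norm tables rather than a single formula.

```python
from statistics import NormalDist

IQ_MEAN = 100.0
IQ_SD = 15.0
iq_dist = NormalDist(mu=IQ_MEAN, sigma=IQ_SD)


def deviation_iq(raw_score: float, norm_mean: float, norm_sd: float) -> float:
    """Rescale a raw score to the IQ metric using the age group's mean and SD."""
    z = (raw_score - norm_mean) / norm_sd
    return IQ_MEAN + IQ_SD * z


def share_between(low: float, high: float) -> float:
    """Fraction of the population expected to score between low and high."""
    return iq_dist.cdf(high) - iq_dist.cdf(low)


def percentile(iq: float) -> float:
    """Percentage of the population expected to score at or below iq."""
    return 100.0 * iq_dist.cdf(iq)


# Hypothetical age-group norms: raw mean 50, raw SD 8.
print(deviation_iq(62, norm_mean=50, norm_sd=8))  # 122.5

print(round(share_between(85, 115), 3))  # 0.683
print(round(share_between(70, 130), 3))  # 0.954
print(round(share_between(55, 145), 3))  # 0.997
print(round(percentile(130), 1))         # 97.7
```

On this model a score of 130 sits at roughly the 98th percentile, and a score of 70 at roughly the 2nd.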
Tests are re-normed periodically because average performance rises over time, a trend known as the Flynn Effect. A test standardised in 1990 and used unchanged today would produce inflated scores, because today's test-takers would be compared against the lower average performance of the 1990 norming sample.
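To get a rough sense of the size of this inflation, the sketch below assumes a constant drift of about 3 points per decade, the commonly cited 20th-century average. That constant rate is an assumption made for illustration: the real rate has varied by country, period, and ability domain, so the output is a back-of-the-envelope estimate and not a correction to apply to any actual score. The function names are hypothetical.

```python
# Assumed average drift; real-world rates vary and are not constant.
FLYNN_POINTS_PER_DECADE = 3.0


def norm_inflation(norm_year: int, test_year: int,
                   points_per_decade: float = FLYNN_POINTS_PER_DECADE) -> float:
    """Estimated points by which outdated norms inflate a score."""
    return points_per_decade * (test_year - norm_year) / 10


def adjusted_score(observed: float, norm_year: int, test_year: int) -> float:
    """Observed score minus the estimated inflation from outdated norms."""
    return observed - norm_inflation(norm_year, test_year)


# A test normed in 1990 and taken in 2020, under the constant-rate assumption.
print(norm_inflation(1990, 2020))       # 9.0
print(adjusted_score(109, 1990, 2020))  # 100.0
```

Under these assumptions, an observed 109 on thirty-year-old norms would correspond to roughly average performance against a current reference group.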
Major IQ Tests in Use
Several batteries dominate professional and research use:
| Test | Age range | Primary use |
|---|---|---|
| WAIS-IV / WAIS-5 | 16–90 | Clinical assessment of adults; most widely used |
| WISC-V | 6–16 | Clinical assessment of children and adolescents |
| Stanford–Binet 5 | 2–85+ | Broad age range; used in gifted identification |
| Woodcock–Johnson IV | 2–90+ | CHC-theory-aligned; educational and clinical contexts |
| Raven's Progressive Matrices | 5–65 | Culture-fair fluid intelligence measure; research and selection |
| Cattell Culture Fair Test | Adults | Minimises verbal and educational bias; industrial use |
Each battery has different strengths. The Wechsler scales are the most clinically comprehensive. Raven's Matrices are valuable when language or cultural factors could bias a broader battery.
What IQ Scores Predict
The predictive validity of IQ scores is among the most thoroughly replicated findings in psychology. IQ correlates meaningfully with:
- Academic achievement — correlation around 0.5, making it the single strongest individual-difference predictor of school performance
- Job performance — particularly for complex jobs; correlations typically 0.3–0.5
- Income — modest but consistent positive correlation
- Health outcomes — higher IQ is associated with lower risk of several chronic diseases, though the mechanisms are debated
- Longevity — replicated across multiple large samples, particularly in Scottish cohort studies
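One way to read the correlations in this list is to square them: the square of a correlation coefficient gives the share of variance in the outcome that is statistically associated with IQ. The short sketch below does that arithmetic for the figures quoted above; the function name is illustrative, and the result describes association in a population, not causation or individual prediction.

```python
def variance_explained(r: float) -> float:
    """Share of outcome variance associated with a predictor, given correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


# Correlation values taken from the list above.
for label, r in [("Academic achievement", 0.5),
                 ("Job performance (low end)", 0.3),
                 ("Job performance (high end)", 0.5)]:
    print(f"{label}: r = {r}, about {variance_explained(r):.0%} of variance")
```

A correlation of 0.5 corresponds to about a quarter of the variance, which is large by the standards of psychology but leaves most of the variation in outcomes to other factors.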
What IQ does not predict well: creativity beyond a threshold, practical social intelligence, emotional regulation, specific artistic or athletic talent, or life satisfaction. These require different assessments or simply can't be captured in a cognitive battery.
The Limits and Criticisms of IQ Testing
IQ testing has attracted persistent criticism, some well-founded and some overstated:
- Cultural and educational bias — verbal and knowledge-dependent subtests disadvantage people from cultures or educational backgrounds less represented in the norming samples. This is a genuine concern, which is why culture-fair measures (Raven's, Cattell) were developed.
- Narrow sampling — the abilities measured are a subset of cognitive capacity. Musical ability, kinaesthetic intelligence, and several other capacities are not captured. Howard Gardner's multiple intelligences theory was largely a response to this narrowness.
- Contextual variability — IQ scores are not perfectly stable. They are affected by test anxiety, sleep deprivation, nutritional status, and test familiarity. The score reflects performance at a point in time, not a fixed biological constant.
- Misuse in policy — the history of IQ testing includes serious abuse: forced sterilisation programmes, discriminatory immigration policy, and racially motivated research agendas. This history doesn't invalidate the measurement, but it warrants vigilance about application.
The mainstream scientific consensus is that IQ tests measure real and consequential aspects of cognitive ability, that they have meaningful predictive validity, and that they are imperfect instruments requiring careful interpretation — not oracles of destiny. A score is a calibrated estimate, not a verdict. If you're curious about your own cognitive profile, our free IQ test gives a structured assessment across several ability dimensions.
Frequently Asked Questions
What does IQ stand for and what does it mean?
IQ stands for Intelligence Quotient. It was originally calculated by dividing mental age by chronological age and multiplying by 100. Modern tests no longer use this formula — instead, they compare an individual's score to the distribution for their age group, with 100 set as the population mean and 15 as the standard deviation.
Is an IQ test the same as an intelligence test?
An IQ test is one type of intelligence test; the category of intelligence tests is broader. IQ tests measure a sample of cognitive abilities and produce a standardised overall score. Some intelligence tests focus on specific abilities (like fluid reasoning or working memory) without producing an overall IQ score. All IQ tests are intelligence tests; not all intelligence tests produce an IQ score.
What is a good IQ score?
By design, 100 is the population average. Scores between 90 and 110 are considered the average range. Above 120 is often described as superior. Above 130 is typically used as a threshold for gifted programmes. The meaning of a score depends heavily on what you're comparing it to and what you're trying to predict; for most everyday purposes, the differences between scores in the average range are modest.
Can IQ change over time?
Yes, though scores are relatively stable in adulthood. Raw test performance rises steeply during childhood, but because scores are normed by age group, a child's IQ reflects their standing relative to same-age peers rather than that growth. Fluid intelligence peaks in early adulthood and declines gradually with age. Crystallised intelligence continues growing into middle age. Scores can also be affected by education, health interventions, and cognitive engagement. At the population level, the Flynn Effect describes a rise of about 3 points per decade over the 20th century.
Are online IQ tests accurate?
Most online IQ tests are not clinically validated and should not be taken as authoritative. They vary considerably in quality. Tests that use items similar to professional batteries (matrix reasoning, analogies, pattern completion) tend to be more informative than tests that measure only vocabulary or numerical sequences. Professional clinical assessment by a psychologist using a standardised battery remains the most accurate option when clinical-level precision matters.
