What Makes a Personality Test Scientifically Valid?
Before comparing specific personality assessments, it's worth understanding what scientific validity actually means in this context. A personality test earns scientific credibility through two core properties:
- Reliability — does the test produce consistent results? If you take the test twice two weeks apart without any meaningful life change, you should get the same or very similar results. This is measured as test-retest reliability, with values above 0.80 considered excellent.
- Validity — does the test measure what it claims to measure, and do those measurements predict real-world outcomes? Predictive validity is most important for professional applications: does a high Conscientiousness score actually predict better job performance in practice?
The history of personality assessment is littered with tools that feel insightful but fail one or both of these criteria. Understanding the difference protects you from acting on personality data that doesn't reliably represent your actual traits.
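Test-retest reliability, as described above, is usually reported as the correlation between scores from two administrations of the same test. The following is a minimal sketch of that calculation. The scores are made-up illustration data, and `pearson_r` is a helper written for this example, not part of any specific assessment.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical Conscientiousness scores (0-100) for six people,
# tested twice, two weeks apart. Invented for illustration only.
first_session = [62, 45, 78, 55, 90, 33]
second_session = [65, 43, 75, 58, 88, 36]

test_retest_r = pearson_r(first_session, second_session)
print(f"test-retest r = {test_retest_r:.2f}")
print("meets the 0.80 benchmark" if test_retest_r > 0.80 else "below the 0.80 benchmark")
```

A value close to 1.0 means people kept roughly the same rank order across sessions, which is what a stable trait measure should show.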
The Big Five: The Scientific Gold Standard
The Big Five model (also called OCEAN or the Five-Factor Model) is the most empirically validated personality framework in the history of psychology. It emerged independently from multiple research traditions starting in the 1960s — lexical studies of personality-describing words, factor analyses of questionnaire data, and clinical observation — and consistently converged on the same five dimensions: Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism.
Key validation evidence:
- Cross-cultural replication — the five-factor structure has been replicated in over 50 countries and languages (McCrae & Costa, 2003)
- Test-retest reliability — 0.85–0.90 over 4-week intervals; this stability indicates that the scales measure enduring characteristics rather than passing states
- Heritability — twin studies show 40–60% heritability for each trait, consistent with a substantial biological basis
- Predictive validity — meta-analyses consistently show meaningful prediction of job performance (r ≈ 0.23 for Conscientiousness), relationship satisfaction, physical health outcomes, and longevity
- Incremental validity — Big Five scores add meaningful prediction beyond IQ alone in most outcome domains
The free Big Five assessment on JobCannon measures all five dimensions with approximately 50 items, providing reliable trait estimates comparable to commercial versions.
MBTI: Useful but Scientifically Limited
The Myers-Briggs Type Indicator (MBTI) is the world's most commercially used personality assessment, with over 2.5 million administrations per year, and also one of the most academically criticized. Understanding both sides guards against dismissing it outright and against giving it too much credit.
Scientific limitations of MBTI:
- Forced categorical typing — the binary I/E, N/S, T/F, J/P system doesn't match empirical evidence that personality traits are continuous, not categorical. People near the middle of any dimension get inconsistent type assignments on retesting.
- Test-retest reliability concerns — studies show 39–76% type consistency over 4-week intervals, meaning 24–61% of people get a different 4-letter type on retest. This primarily affects people near type midpoints.
- Weaker predictive validity — MBTI types predict job performance less reliably than Big Five trait scores in most meta-analyses
- Jungian theoretical foundation — the underlying theory predates modern empirical personality science and hasn't been directly validated
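One way to see why four-letter type consistency can be low even when each individual dimension is fairly stable: all four letters have to match for the type to match. The sketch below assumes, purely for illustration, that each dimension agrees on retest with the same probability and that the four dimensions are independent. Neither assumption comes from the studies cited above, and `four_letter_consistency` is a hypothetical helper.

```python
def four_letter_consistency(per_dimension_agreement: float, dimensions: int = 4) -> float:
    """Chance that every letter matches on retest.

    Simplifying assumptions (not from the source): each dimension agrees
    with the same probability, and dimensions are independent.
    """
    if not 0.0 <= per_dimension_agreement <= 1.0:
        raise ValueError("agreement must be a probability between 0 and 1")
    return per_dimension_agreement ** dimensions

for p in (0.80, 0.85, 0.90, 0.95):
    full_type = four_letter_consistency(p)
    print(f"per-dimension agreement {p:.0%} -> same 4-letter type {full_type:.1%}")
```

Under these assumptions, per-dimension agreement of 80% to 90% yields four-letter consistency of about 41% to 66%, which sits inside the 39–76% range quoted above.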
Genuine practical strengths of MBTI:
- High face validity — most people find their type descriptions accurate and resonant, which drives engagement with self-reflection
- Accessible framework — four dimensions with memorable 4-letter codes are easier for teams to internalize than five continuous traits
- Substantial dimensional convergence — MBTI I/E correlates at r ≈ 0.72 with Big Five Extraversion; N/S correlates with Openness; J/P correlates with Conscientiousness
- Widespread organizational familiarity — MBTI language is understood in most large organizations, making it a useful common vocabulary
Take the free MBTI assessment for self-reflection and communication use; use Big Five for consequential career and clinical decisions.
Enneagram: Insight Without Strong Scientific Validation
The Enneagram is the least scientifically validated of the three major frameworks discussed here. Its theoretical origins are not empirically derived — the system emerged from spiritual and contemplative traditions and was formalized as a personality system in the 1970s. Peer-reviewed validation research is limited relative to Big Five, and test-retest reliability varies significantly across different Enneagram instruments.
Despite these limitations, the Enneagram provides genuine psychological insight that users report as valuable for self-understanding and growth work. Its strength is motivational depth — explaining the underlying fears and desires that drive behavior patterns, rather than just describing the behavior. This makes it a useful complement to Big Five and MBTI for developmental work, even if it's not appropriate for predictive or selection contexts.
What Personality Tests Can and Cannot Predict
Understanding the limits of personality assessment prevents misapplication:
| Outcome | Predictor | Effect size (correlation r) |
|---|---|---|
| Job performance (overall) | Big Five Conscientiousness | r ≈ 0.23 (moderate) |
| Leadership effectiveness | Big Five Extraversion + Openness | r ≈ 0.25–0.31 |
| Team harmony | Big Five Agreeableness | r ≈ 0.20 |
| Burnout risk | Big Five Neuroticism | r ≈ 0.35 (strong) |
| Creative achievement | Big Five Openness | r ≈ 0.45 (strong) |
| Sales performance | Big Five Extraversion + Conscientiousness | r ≈ 0.28 |
Effect sizes in personality prediction are typically smaller than those for IQ in cognitive tasks. This doesn't mean personality is unimportant; it means personality is one of several important predictors, not the only one. The most predictive approach combines cognitive assessment, personality assessment, structured interviews, and work sample tests (Roberts, 2009).
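To put the correlations in the table in perspective, squaring r gives the share of outcome variance that the predictor accounts for. The sketch below applies that standard conversion to the single-value rows of the table. The dictionary layout and the `variance_explained` helper are choices made for this example.

```python
def variance_explained(r: float) -> float:
    """Share of outcome variance accounted for by a correlation of r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r ** 2

# Values copied from the table above. The leadership row is omitted
# because it reports a range rather than a single value.
effect_sizes = {
    "Job performance (Conscientiousness)": 0.23,
    "Team harmony (Agreeableness)": 0.20,
    "Burnout risk (Neuroticism)": 0.35,
    "Creative achievement (Openness)": 0.45,
    "Sales performance (Extraversion + Conscientiousness)": 0.28,
}

for outcome, r in effect_sizes.items():
    print(f"{outcome}: r = {r:.2f}, variance explained = {variance_explained(r):.1%}")
```

Read this way, r ≈ 0.23 corresponds to roughly 5% of the variance in job performance and r ≈ 0.45 to roughly 20% of the variance in creative achievement, which is why personality works best alongside other predictors.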
Why People Get Different Results on Retesting
Many people notice they get different results when retaking personality assessments, which understandably creates skepticism. The variation has several explanations:
- State vs. trait measurement confusion — if you take a test when stressed, depressed, or in an unusual emotional state, your responses reflect your current state, not your stable traits. Validated assessments minimize this through question design, but can't eliminate it completely.
- Near-midpoint instability (especially MBTI) — if your true score is near the midpoint of a dimension, small random variation in responses produces type-switching. This is a measurement artifact, not personality instability.
- Genuine trait development — Big Five traits do slowly change with age and life experience. Conscientiousness increases significantly between ages 20 and 40; Neuroticism tends to decrease with age. Retesting years later may reflect real change.
- Test quality differences — different implementations of the "same" framework vary in reliability. A 10-question free quiz and a 300-item validated instrument measure the same constructs with very different accuracy.
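The near-midpoint instability described in the list above can be illustrated with a simple measurement model. The sketch assumes that an observed score equals a person's true score plus normally distributed noise, and that the type letter depends only on which side of the midpoint the observed score falls. This is a simplified model chosen for illustration, not how any particular instrument is scored, and `type_switch_probability` is a hypothetical helper.

```python
from statistics import NormalDist

def type_switch_probability(true_score: float, noise_sd: float) -> float:
    """Probability that two independent administrations give opposite letters.

    Model assumptions (illustrative only): observed = true score + normal
    noise, and the letter is set by whether observed is above the midpoint 0.
    """
    p_above = 1.0 - NormalDist(mu=true_score, sigma=noise_sd).cdf(0.0)
    # Opposite letters means (above, below) or (below, above).
    return 2.0 * p_above * (1.0 - p_above)

# Hypothetical true scores, expressed in units of the noise SD.
for true_score in (0.0, 0.5, 1.0, 2.0, 3.0):
    p = type_switch_probability(true_score, noise_sd=1.0)
    print(f"true score {true_score:.1f} SD from midpoint: "
          f"{p:.1%} chance of a different letter on retest")
```

In this model a person exactly at the midpoint has a 50% chance of getting a different letter on retest, and the chance falls quickly as the true score moves away from the midpoint, even though the underlying trait has not changed at all.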
How to Get the Most From Personality Assessment
Practical guidelines for using personality data wisely:
- Use validated instruments — longer assessments (40+ items) consistently outperform short quizzes in reliability. The Big Five assessment at JobCannon uses 50 items for this reason.
- Answer in a neutral state — take assessments when not stressed, sleep-deprived, or emotionally activated for the most stable results
- Read trait descriptions as tendencies, not destiny — a score represents a statistical tendency, not a fixed behavior. Everyone can act against their tendencies; personality only predicts which behaviors come most naturally.
- Use results for environmental matching, not limitation — personality data is most valuable when used to identify high-fit environments, not to justify avoiding growth or label other people
- Cross-validate across frameworks — if your Big Five Extraversion score and your MBTI I/E dimension agree, your confidence in that trait should be higher than if they conflict
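The first guideline above, that longer assessments tend to be more reliable, matches a classic psychometric result: the Spearman-Brown prediction formula, which estimates how reliability changes when a test is lengthened with comparable items. The sketch below applies it to a hypothetical 10-item quiz. The starting reliability of 0.60 is an invented figure, and the formula assumes the added items are of the same quality as the existing ones.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after multiplying test length by length_factor.

    Assumes the added items are parallel to (as good as) the existing ones.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)

# Hypothetical example: a 10-item quiz with reliability 0.60,
# extended to 20, 50, and 100 items.
short_quiz_reliability = 0.60
for items in (10, 20, 50, 100):
    predicted = spearman_brown(short_quiz_reliability, items / 10)
    print(f"{items:>3} items -> predicted reliability {predicted:.2f}")
```

The gains flatten out as the test grows, so length helps most when moving from a very short quiz to a moderately long instrument.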
The combination of Big Five and MBTI assessments provides both the scientific rigor of trait-based measurement and the accessible self-reflection framework that helps people actually use the results in daily professional life.