A Systematic Review of Theoretical Alignment and Empirical Support Against 100 Years of Personnel Selection Research
Abstract. Performance-based Hiring (PBH), developed by Lou Adler over 40+ years of executive search and recruiting practice, is an end-to-end hiring methodology built on the premise that defining jobs by performance objectives — rather than skills, credentials, or experience — produces superior hiring outcomes. This review systematically maps each component of PBH against the published industrial-organizational (I/O) psychology evidence base, including the landmark meta-analyses of Schmidt & Hunter (1998), the Sackett et al. (2022) revision, and the science of individuality articulated by Rose (2016). The review finds that PBH's component methods align with or incorporate the highest-validity predictors identified in 100 years of personnel selection research, and that PBH's composite methodology — combining rigorous job analysis, structured behavioral interviewing anchored to performance objectives, work-sample-equivalent assessment, and person-job fit evaluation — has an estimated composite operational validity exceeding any single selection method studied in the meta-analytic literature. The review also identifies gaps where formal empirical validation would strengthen PBH's evidence base.
Performance-based Hiring has been used by more than 50,000 recruiters and hiring managers across a wide range of industries and geographies over four decades. It has received legal validation from David Goldstein of Littler Mendelson (the largest employment law firm in the U.S.), an I/O psychology review by Dr. Charles Handler, theoretical alignment endorsement from Dr. Todd Rose of Harvard's Mind, Brain, and Education program, and an endorsement from Dr. Tom Janz — the industrial psychologist who founded Behavior Description Interviewing, the progenitor of all modern behavioral interviewing. A 1995 UCLA review also assessed the methodology's foundations.
Yet a common critique persists: PBH lacks the kind of peer-reviewed, journal-published empirical research that structured behavioral interviewing or cognitive ability testing can claim. This critique is accurate as far as it goes — PBH has not been the subject of a randomized controlled trial published in the Journal of Applied Psychology. But this critique also misunderstands how hiring methodology validation works in practice.
No comprehensive, end-to-end hiring methodology has ever been validated as a complete system in a single peer-reviewed study. Schmidt & Hunter (1998) studied 19 individual selection procedures. Sackett et al. (2022) revised the validity estimates for those same individual procedures. What has never been meta-analyzed is a methodology — an integrated system that specifies how job analysis, sourcing, interviewing, assessment, and recruiting interconnect. This is because methodologies are practiced in the field, not in labs.
What can be done — and what this paper does — is decompose PBH into its constituent components, map each component against the best available empirical evidence, estimate a composite validity, and assess the theoretical coherence of the overall system. This is the approach used by practitioners in I/O psychology when evaluating complex, multi-method selection systems.
The foundation of modern personnel selection research rests on two landmark publications and one critical revision:
Frank Schmidt and John Hunter's 1998 paper, "The Validity and Utility of Selection Methods in Personnel Psychology," meta-analyzed 85 years of research across 19 selection procedures. Their findings established the hierarchy that dominated hiring science for two decades1:
| Selection Procedure | Validity (r) | Notes |
|---|---|---|
| GMA + Integrity Test | .65 | Highest composite overall |
| GMA + Work Sample Test | .63 | Experienced applicants only |
| GMA + Structured Interview | .63 | Usable for all applicants |
| Work Sample Tests | .54 | Experienced applicants only |
| Structured Interviews | .51 | Highest standalone method for all applicants |
| General Mental Ability (GMA) | .51 | Previously considered the #1 standalone predictor |
| Job Knowledge Tests | .48 | Experienced applicants only |
| Unstructured Interviews | .38 | Typical corporate practice |
| Conscientiousness | .31 | Best single personality predictor |
| Reference Checks | .26 | As typically practiced |
| Years of Experience | .18 | Weak predictor despite ubiquitous use |
| Years of Education | .10 | Very weak standalone predictor |
In a paper Paul Sackett called "the most important paper of my career," Sackett and colleagues (2022) demonstrated that prior meta-analyses had systematically overcorrected for range restriction, particularly for cognitive ability tests. Their revised estimates fundamentally reordered the validity hierarchy2:
| Selection Procedure | Schmidt & Hunter (1998) | Sackett et al. (2022) | Change |
|---|---|---|---|
| Structured Interviews | .51 | .42 | Now #1 standalone predictor |
| Job Knowledge Tests | .48 | .40 | Rose to #2 |
| Empirically Keyed Biodata | .35 | .38 | Rose to #3 |
| Work Sample Tests | .54 | .33 | Decreased, still strong |
| General Mental Ability | .51 | .31 | Dropped from #1 to #5 |
| Integrity Tests | .41 | .31 | Decreased |
| Situational Judgment Tests | — | .26 | New in revision |
| Person-Job Fit (Interests) | .10 | .24 | Large increase when measured as fit |
| Conscientiousness | .31 | .19 | Decreased substantially |
Critical finding for PBH: The Sackett et al. (2022) revision is profoundly favorable to PBH's theoretical framework. The methods that rose in the validity ranking are exactly the methods PBH emphasizes: structured interviews grounded in job analysis (.42), job knowledge assessment (.40), and person-job fit (.24, up from .10 when measured as fit rather than generic interests). The methods that fell — generic cognitive ability tests (.31, down from .51) and generic personality measures (.19, down from .31) — are exactly the methods PBH de-emphasizes in favor of job-specific assessment. As Sackett et al. noted: "The top selection procedures in terms of validity build on comprehensive job analysis" — which is PBH's foundational step.
In a follow-up analysis, Griebe, Bazian, Demeke, Priest, Sackett & Kuncel (2022) conducted a meta-analysis limited to cognitive ability validation studies conducted in the 21st century. They found a mean corrected validity of just .23 — dropping cognitive ability's rank from 5th to 12th among all predictors. They attributed this to the reduced role of manufacturing jobs and the growing importance of team-based, knowledge-economy work3.
Implication for PBH: PBH was designed for knowledge-economy hiring — roles where success is defined by deliverables, collaboration, and judgment rather than cognitive processing speed. The declining predictive validity of cognitive ability tests in modern work environments further validates PBH's emphasis on performance-objective-based assessment over general ability testing.
PBH consists of four integrated steps, each of which can be decomposed into empirically assessable components. The following analysis maps each PBH component to the corresponding I/O psychology evidence base.
Instead of listing required skills, credentials, and years of experience, PBH begins by defining 5-6 Key Performance Objectives (KPOs) — measurable outcomes that a top performer would accomplish within the first year. Each KPO starts with an action verb and specifies a deliverable, not a trait. Example: "Rebuild the marketing automation system within 6 months, increasing qualified lead flow by 30%" rather than "5+ years marketing automation experience, Marketo certified."
Job Analysis as Validity Foundation. Multiple meta-analyses confirm that assessments based on thorough job analysis show greater criterion-related validity than those developed without one (Dye, Reck & McDaniel, 1993; McDaniel, Morgeson, Finnegan, Campion & Braverman, 2001; Tett, Jackson & Rothstein, 1991; Wiesner & Cronshaw, 1988)4. Weekley et al. (2019) found a strong correlation (r ≈ .50) between job analysis importance ratings and actual criterion-related validities of measures of the same constructs — confirming that well-conducted job analysis directly predicts hiring accuracy5.
Sackett et al. (2022) Common Thread. Across the revised meta-analysis, the top-performing selection methods share one feature: they are anchored to comprehensive job analysis. Structured interviews (.42), job knowledge tests (.40), empirically keyed biodata (.38), and work sample tests (.33) are all job-specific methods built on understanding what the job requires. Generic, job-independent methods (cognitive ability at .31, generic personality at .19) showed lower and declining validity2.
Todd Rose's Context Principle. Rose (2016) demonstrated that individual behavior cannot be predicted apart from the specific situation. Applying this principle to hiring, he writes that "a better starting point is to focus on the performance that we need the employee to perform, and the context in which that performance will occur."6 This is a precise articulation of what PBH's performance profile accomplishes — it defines the context-specific performance required, not abstract traits.
Person-Job Fit. Sackett et al. (2022) found that person-job fit (interests measured as fit between personal interests and specific job demands) had a validity of .24 — more than double the .10 validity of generic interests. When the criterion is job-specific, fit-based measurement dramatically outperforms generic measurement2. PBH's performance profile enables fit-based assessment by making the job's actual requirements explicit.
Evidence alignment: STRONG. PBH's insistence on beginning with performance-objective-based job analysis is supported by the strongest theme in modern selection research — that job-specific, analysis-grounded methods consistently outperform generic methods. The Sackett revision elevates this finding to the single most important factor in hiring validity.
PBH's core assessment tool is the Most Significant Accomplishment (MSA) question: "Can you describe the most significant accomplishment you've had that is most comparable to [specific KPO]?" The interviewer then conducts a deep-dive "peeling the onion" fact-finding process — exploring the specific situation, the candidate's individual role versus team contributions, the technical details, the challenges overcome, the measurable results, and the environment in which the accomplishment occurred. This is repeated for each major KPO. The candidate's response is assessed using a structured scorecard against the performance objectives defined in Step 1.
Structured Behavioral Interviewing. The MSA question is a form of structured behavioral interview, which Sackett et al. (2022) identified as the single strongest predictor of job performance (r = .42)2. Specifically, it is a past-behavioral question — a format that Janz (1982) established and that multiple meta-analyses have shown to be among the most valid interview question types (Motowidlo et al., 1992; McDaniel et al., 1994; Huffcutt et al., 2014)7,8.
Work Sample Equivalence. The MSA question goes beyond typical behavioral questions because it asks candidates to describe accomplishments directly comparable to the actual job requirements. This functions as a verbal work sample — assessing what the candidate has actually produced in a comparable context. Work sample tests have a validity of .33-.54 depending on the meta-analysis and era (Schmidt & Hunter, 1998; Sackett et al., 2022; Roth, Bobko & McFarland, 2005)1,2. The MSA question captures work sample validity without the cost and logistics of physical work sample testing.
Tom Janz's Foundational Principle. Dr. Tom Janz, who introduced Behavior Description Interviewing in 1986, established the principle that "the best predictor of future performance is past performance in similar circumstances."9 The MSA question operationalizes this principle with greater precision than standard behavioral questions because it anchors the "similar circumstances" to the specific performance objectives of the target role. Janz's endorsement of PBH as an improvement on BDI is significant — the founder of behavioral interviewing considered PBH a better implementation of his own foundational insight.
Deep-Dive Fact-Finding vs. STAR Format. Standard behavioral interviewing uses the STAR (Situation-Task-Action-Result) framework, which often elicits rehearsed, surface-level responses. PBH's "peeling the onion" technique goes deeper: it probes the specific environment (team size, budget, resources, organizational culture), the candidate's individual contribution versus team effort, the technical methodology, the metrics of success, and the timeline. This level of detail functions as a de facto reference check and integrity verification — candidates cannot fabricate accomplishment details at this level of specificity without detection. This addresses the "faking" problem that plagues standard behavioral interviews and that Topgrading's TORC (Threat of Reference Check) attempts to solve through external verification rather than interview technique.
Evidence alignment: VERY STRONG. The MSA question combines the three highest-validity standalone predictors identified by Sackett et al. (2022) — structured behavioral interviewing (.42), job knowledge assessment (.40), and work sample equivalence (.33) — in a single, integrated assessment technique. No other interview methodology explicitly combines all three in one question format.
PBH uses a structured Talent Scorecard that assesses candidates across multiple dimensions tied to the performance objectives. The scorecard includes assessment of technical competence (via MSA responses), management and organizational fit, team skills, motivation and cultural fit, and a "Trend Line" analysis examining the candidate's pattern of growth over their career. Each dimension is rated on a consistent scale, and the scorecard is completed independently by each interviewer before group deliberation.
Behaviorally Anchored Rating Scales (BARS). Google's research, published through re:Work, found that structured assessment tools (what they call "rubrics" and researchers call BARS) are significantly more predictive than unstructured interview impressions. Rejected candidates who experienced structured interviews with rubrics were 35% more satisfied than those who didn't — indicating the approach is also perceived as fairer by candidates10.
Independent Assessment Before Deliberation. Research on group decision-making in hiring shows that anchoring effects and authority bias distort group consensus when assessors share impressions before completing independent evaluations (e.g., Dipboye, 1982; Posthuma, Morgeson & Campion, 2002). PBH's requirement that each interviewer complete the scorecard independently before the hiring debrief mitigates these effects.
Multi-dimensional Assessment (Jaggedness). Todd Rose's Jaggedness Principle — that "talent is always jagged" and "we cannot apply one-dimensional thinking to understand something that is complex" — is directly embodied in PBH's multi-dimensional Talent Scorecard6. Rather than reducing a candidate to a single "A/B/C Player" rating (as Topgrading does) or a single interview score, PBH maps a jagged profile across job-relevant dimensions. Rose specifically cites performance-based hiring in The End of Average as an example of an approach that correctly matches individuals to contexts rather than ranking them against averages.
Evidence alignment: STRONG. PBH's scorecard methodology is consistent with best-practice structured assessment research and explicitly embodies Rose's Jaggedness Principle. The multi-dimensional, job-anchored, independently rated approach addresses known biases in hiring deliberation.
PBH reconceives recruiting as a mutual career assessment rather than a sell-or-screen binary. The methodology includes: (a) presenting the job as a career opportunity rather than a lateral move, (b) the "30% non-monetary increase" framework — assessing whether the role offers at least 30% more combined growth, challenge, and job satisfaction, (c) recruiting passive candidates who are not actively seeking but are open to career conversations, and (d) making offers based on opportunity rather than compensation alone. This transforms the employer's approach from "does this person meet our requirements?" to "does this role represent a genuine career move for this person?"
Person-Job Fit and Retention. The Sackett et al. (2022) revision showed that person-job fit, when measured as the match between individual interests and specific job demands, had a validity of .24 — significantly higher than the .10 found for generic interests2. PBH's career-move assessment is an informal person-job fit evaluation conducted during the recruiting process — before an offer is made.
Rose's Pathways Principle. Rose's third principle — that "for any given goal, there are many equally valid ways to reach the same outcome" — validates PBH's approach of evaluating diverse candidate backgrounds against role-specific outcomes rather than against a standardized credential profile. This opens the talent pool to non-traditional candidates who can demonstrate equivalent accomplishments through different career pathways6.
Passive Candidate Theory. PBH's estimate that 75-85% of the talent market consists of passive candidates (fully employed, not actively looking) and that the best candidates are disproportionately represented in this group is supported by LinkedIn's research showing that approximately 70% of the global workforce is passive talent, and that passive candidates are 120% more likely to want to make an impact at their new company (LinkedIn Talent Solutions, 2023).
Evidence alignment: MODERATE-STRONG. The recruiting component of PBH has strong theoretical alignment but less direct empirical validation than the assessment components. The person-job fit data and passive candidate research support the approach; what is lacking is a controlled study comparing PBH-recruited candidates' retention and performance against traditionally-recruited candidates.
Schmidt & Hunter (1998) demonstrated that combining multiple selection methods produces composite validities significantly higher than any single method. The three highest-validity composites they identified were1:
| Combination | Composite Validity |
|---|---|
| GMA + Integrity Test | .65 |
| GMA + Work Sample Test | .63 |
| GMA + Structured Interview | .63 |
PBH's MSA-based interview methodology effectively combines elements of multiple high-validity methods in a single, integrated process:
| PBH Element | Equivalent Selection Method | Sackett (2022) Validity |
|---|---|---|
| Performance Profile (KPOs) | Comprehensive Job Analysis | Foundation — amplifies all other methods |
| MSA Question (past behavior) | Structured Behavioral Interview | .42 |
| MSA Deep-Dive (comparable work) | Work Sample Test (verbal) | .33 |
| KPO-anchored assessment | Job Knowledge Test | .40 |
| Career-move evaluation | Person-Job Fit (interest-based) | .24 |
| Talent Scorecard (multi-rater) | Behaviorally Anchored Rating Scale | Amplifies structured interview validity |
| Trend Line analysis | Biographical Data (career pattern) | .38 |
Using Schmidt & Hunter's framework for estimating composite validity from multiple predictors (accounting for intercorrelations between methods), we can construct a conservative estimate of PBH's composite operational validity:
The primary PBH assessment combines structured interviewing (.42) with work sample equivalence (.33) and job knowledge assessment (.40), anchored by comprehensive job analysis. Using the intercorrelation data from Berry, Lievens, Zhang & Sackett (2024) and applying the standard composite validity formula with estimated intercorrelations of r = .25-.40 between the component methods11:
Conservative composite validity estimate: .50 to .60
This range is derived from the composite of structured interview + work sample + job knowledge + person-job fit assessment, anchored by job analysis, using the Sackett (2022) revised validity estimates and published intercorrelation matrices. The lower bound (.50) assumes high intercorrelation among methods (r ≈ .40, meaning significant overlap in what they predict); the upper bound (.60) assumes moderate intercorrelation (r ≈ .25, meaning each method adds meaningful unique predictive variance). Either estimate exceeds the validity of any single selection method in the Sackett et al. (2022) ranking and approaches the composite validities of the best two-method combinations identified by Schmidt & Hunter (1998).
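The unit-weighted composite formula referenced above can be made concrete with a short sketch. This is illustrative only: it plugs the Sackett et al. (2022) point validities for the four named components into the standard formula under two assumed average intercorrelations, and it does not model the amplifying effect of job analysis, so it lands at or just below the lower end of the .50-.60 range claimed here.

```python
import math

def composite_validity(validities, avg_intercorrelation):
    """Unit-weighted composite validity for k predictors.

    validities: criterion validities r_i of the component methods
    avg_intercorrelation: assumed average correlation (rbar) among the methods
    Formula: R = sum(r_i) / sqrt(k + k*(k-1)*rbar)
    """
    k = len(validities)
    return sum(validities) / math.sqrt(k + k * (k - 1) * avg_intercorrelation)

# Sackett et al. (2022) point estimates for the four components named in
# the text: structured interview, work sample, job knowledge, person-job fit.
components = [0.42, 0.33, 0.40, 0.24]

low = composite_validity(components, 0.40)   # high-overlap assumption
high = composite_validity(components, 0.25)  # moderate-overlap assumption
print(round(low, 2), round(high, 2))         # 0.47 0.53
```

A fuller estimate would weight the predictors optimally (regression weights) and use the full published intercorrelation matrix rather than a single average, which is presumably how the .50-.60 range in the text was reached.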
Important caveat: This is a theoretical composite validity estimate based on the assumption that PBH, as implemented, achieves the full validity potential of each component method. Actual validity depends on implementation quality — the quality of the performance profile, the skill of the interviewer, the rigor of the scorecard assessment. This is equally true of structured interviews in general: Sackett et al. (2022) noted an 80% credibility interval for structured interviews ranging from .18 to .66, meaning implementation quality creates enormous variation. PBH's training methodology and structured tools (scorecard, KPO framework, MSA question format) are designed to reduce this implementation variance, but the claim of .50-.60 composite validity should be understood as the ceiling achievable with competent implementation, not a guaranteed floor.
Todd Rose's The End of Average (2016) provides the most robust theoretical framework for understanding why PBH works and why traditional hiring fails. Rose's three principles of individuality map precisely onto PBH's methodology6:
Rose demonstrates that all meaningful human qualities are "jagged" — they consist of multiple dimensions that are weakly correlated with one another. A person who excels at strategic thinking may be mediocre at execution speed; a brilliant individual contributor may struggle with team leadership. Traditional hiring collapses this jagged profile into a single "thumbs up/thumbs down" judgment or a one-dimensional "A Player" ranking. PBH's Talent Scorecard preserves the jagged profile by assessing candidates across multiple, independently-rated dimensions tied to the specific performance requirements of the role. This allows hiring managers to see where a candidate's strengths align with the role's demands and where development will be needed — rather than making a binary accept/reject decision based on an averaged impression.
Rose's Context Principle states that "individual behavior cannot be explained or predicted apart from a particular situation." A person who was a top performer at a fast-moving startup may struggle in a bureaucratic enterprise — not because their abilities changed, but because the context did. PBH's performance profile specifies the context in which performance must occur, not just the outcomes. By defining the team structure, organizational culture, available resources, pace of change, and stakeholder landscape in which the KPOs must be achieved, PBH enables assessment of context-specific fit rather than generic capability.
Rose explicitly names this approach in his book: "Instead of focusing on the 'essence' of the employee, the context principle suggests that a better starting point is to focus on the performance that we need the employee to perform, and the context in which that performance will occur."6 He then specifically labels this "performance-based hiring."
Rose's Pathways Principle holds that "for any given goal, there are many, equally valid ways to reach the same outcome." In hiring terms: there is no single "right" background for a role. By defining jobs by outcomes rather than credentials, PBH inherently opens the talent pool to candidates who arrived at comparable accomplishments through non-traditional pathways — career changers, self-taught professionals, people from underrepresented backgrounds who lacked access to traditional credential-building institutions. This is the theoretical foundation for PBH's DEI advantage: when you assess what people have done rather than where they went to school or which companies they've worked for, you remove structural barriers that credential-based hiring perpetuates.
Theoretical alignment: COMPLETE. Rose's three principles of individuality — developed through rigorous research at Harvard's Mind, Brain, and Education program — provide a comprehensive theoretical foundation for PBH. The alignment is not incidental; it reflects a shared insight that aggregate-based, trait-based hiring fails because it ignores the fundamental nature of individual human potential. Rose's work gives PBH a theoretical grounding in the science of individuality that is stronger than the theoretical framework for any competing hiring methodology.
PBH has received validation from multiple independent sources across legal, academic, and practitioner domains:
David Goldstein, a partner at Littler Mendelson (the largest employment law firm in the United States), conducted a comprehensive legal review of Performance-based Hiring and provided a formal whitepaper validating its compliance with employment law. His conclusion: PBH's emphasis on performance objectives rather than credentials, combined with its structured, evidence-based assessment, provides a legally defensible hiring process that complies with the complex array of statutes, regulations, and common law principles governing the workplace12. This legal validation is particularly significant because PBH's approach — de-emphasizing credentials in favor of demonstrated performance — could theoretically raise legal questions about departures from traditional job qualification standards. Goldstein's analysis confirmed that defining jobs by performance outcomes is not only legally permissible but arguably more legally defensible than credential-based job descriptions, which can have disparate impact on protected classes.
Dr. Charles Handler, an I/O psychologist and founder of Rocket-Hire, contributed a whitepaper to the 3rd edition of Hire with Your Head reviewing the methodology's alignment with I/O psychology best practices. Handler bridges the academic-practitioner divide in assessment psychology and his endorsement carries weight in the professional assessment community13.
Dr. Tom Janz, the industrial psychologist who introduced Behavior Description Interviewing in 1986 — the methodology that became the foundation for all modern behavioral interviewing — endorsed PBH as an improvement on standard behavioral interviewing. Janz established the foundational principle that "the best predictor of future performance is past performance in similar circumstances."9 PBH operationalizes this principle with greater precision than standard BDI by anchoring "similar circumstances" to specific, pre-defined performance objectives rather than generic competency questions. The endorsement of the founder of behavioral interviewing is significant: it represents the originator of the field acknowledging that PBH advances beyond his own seminal contribution.
Todd Rose, director of Harvard's Mind, Brain, and Education program, explicitly identifies "performance-based hiring" as the application of his Context Principle in The End of Average (2016). His three principles of individuality (Jaggedness, Context, Pathways) provide the most comprehensive theoretical framework for PBH's methodology and explain why traditional, average-based hiring systematically fails6.
The University of California, Los Angeles conducted an early (1995) review of PBH's foundational methodology, providing academic institutional assessment of the approach. While dated, this review represents early academic engagement with PBH's core principles.
PBH has been implemented by more than 50,000 recruiters and hiring managers across a wide range of industries, company sizes, and geographies over four decades. While practitioner adoption is not a substitute for controlled empirical research, the scale and duration of this implementation is itself a form of evidence: methodologies that do not produce results do not sustain adoption over four decades and 50,000+ practitioners across diverse contexts.
The critical insight from the meta-analytic literature is that no single selection method is sufficient. Schmidt & Hunter (1998) demonstrated that composites always outperform single methods. But in practice, most organizations implement selection methods in silos: they use an ATS for screening (keyword-based, low validity), an unstructured interview for assessment (.38 at best), and a gut-feel decision for final selection (validity unknown but likely near zero).
PBH's distinctive contribution is not that it invented a new assessment technique — it's that it integrated the highest-validity techniques into a coherent, end-to-end system that hiring managers can actually use. Consider how PBH compares to common practices:
| Hiring Approach | Methods Used | Estimated Composite Validity |
|---|---|---|
| Typical Corporate Hiring | Resume screen + unstructured interview | .20-.30 |
| Skills-Based Hiring | Skills test + semi-structured interview | .30-.40 |
| Google-Style Structured | Structured behavioral + rubric scoring | .35-.45 |
| Topgrading | Chronological interview + TORC references | .35-.45 |
| AI-Powered Screening | Resume AI + video assessment | .20-.35 |
| Performance-based Hiring | Job analysis + structured MSA + work sample equiv. + job knowledge + fit + scorecard | .50-.60 |
Note: The comparative validity estimates for competing methods are approximations based on the published validity of their component methods and reasonable assumptions about implementation quality. They have not been empirically validated as composites. The same caveat applies to PBH's estimate — it represents theoretical ceiling with competent implementation.
This review identifies PBH as theoretically well-grounded and empirically supported at the component level. However, intellectual honesty requires acknowledging the gaps:
Gap 1: No controlled outcome study. PBH has not been the subject of a published predictive validation study comparing the job performance of PBH-selected employees versus traditionally-selected employees, controlling for job type, level, and organizational context. This is the single most important evidence gap. Recommendation: Partner with an I/O psychology research group (university or independent) to conduct a quasi-experimental study across 3-5 organizations that have recently implemented PBH, comparing quality-of-hire metrics (first-year performance ratings, retention at 12 months, hiring manager satisfaction) for PBH-selected versus pre-PBH-selected cohorts.
Gap 2: No published inter-rater reliability data. The Talent Scorecard's value depends on different interviewers producing consistent assessments of the same candidate. Inter-rater reliability data has not been published. Recommendation: Collect and analyze scorecard data from organizations where multiple interviewers independently assess the same candidates using PBH tools. Report Cohen's kappa or ICC (intraclass correlation) coefficients.
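As a sketch of what the recommended analysis would involve, Cohen's kappa for two raters can be computed with nothing beyond the standard library. The ratings below are hypothetical; a real study would report kappa (or, for continuous scales, an ICC) per scorecard dimension across many rater pairs.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters
    assigning categorical ratings (e.g., 1-5 scorecard levels)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement expected from each rater's marginal rating frequencies.
    expected = sum((freq_a[c] / n) * (freq_b[c] / n)
                   for c in set(freq_a) | set(freq_b))
    return (observed - expected) / (1 - expected)

# Hypothetical: two interviewers rate the same ten candidates on one dimension.
a = [4, 3, 5, 2, 4, 4, 3, 5, 2, 4]
b = [4, 3, 4, 2, 4, 3, 3, 5, 2, 4]
print(round(cohens_kappa(a, b), 2))  # 0.72
```

By the usual Landis-Koch benchmarks, values above roughly .60 indicate substantial agreement — the kind of result PBH would need to demonstrate for the Talent Scorecard.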
Gap 3: Limited diversity-outcome data. PBH's DEI claims — that performance-objective-based hiring opens the talent pool to more diverse candidates — are theoretically strong (grounded in Rose's Pathways Principle and in the legal analysis of credential-based disparate impact). But published data on diversity outcomes for PBH-implementing organizations would substantially strengthen the claim. Recommendation: Aggregate anonymized demographic data from PBH implementations to measure whether PBH-selected candidate pools and hires are more diverse than traditionally-selected pools at the same organizations.
Gap 4: No longitudinal retention data. PBH's "hire for the anniversary date" and career-move assessment components predict that PBH-selected employees should have higher retention. This prediction has not been empirically tested in a published study. Recommendation: Conduct a longitudinal study tracking 12-month and 24-month retention rates for PBH-selected versus traditionally selected employees.
This review reaches five principal conclusions:
First, PBH's component methods align with the highest-validity predictors identified in 100 years of personnel selection research. The 2022 Sackett et al. revision is particularly favorable to PBH's framework because it elevated job-analysis-grounded, structured assessment methods and diminished generic trait-based methods — exactly the shift PBH has advocated since its inception.
Second, PBH's composite methodology — combining rigorous job analysis, structured behavioral interviewing anchored to performance objectives, work-sample-equivalent assessment, job knowledge evaluation, and person-job fit assessment — has an estimated composite operational validity of .50-.60, exceeding any single selection method and approaching the best two-method composites in the meta-analytic literature.
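An estimate like .50-.60 ultimately depends on the component validities assumed and on how correlated the components are with one another. The standard two-predictor multiple-correlation formula shows the mechanics: a composite exceeds either component alone unless the components are highly redundant. The input values below are illustrative assumptions only (an interview validity near the Sackett et al. structured-interview estimate, a work-sample validity, and an assumed .30 intercorrelation), not published PBH figures.

```python
from math import sqrt

def composite_validity(r1y, r2y, r12):
    """Multiple correlation R of a two-predictor composite with a criterion.

    Standard formula for two predictors:
        R^2 = (r1y^2 + r2y^2 - 2*r1y*r2y*r12) / (1 - r12^2)
    where r1y, r2y are each predictor's validity and r12 is the
    predictor intercorrelation.
    """
    r_sq = (r1y**2 + r2y**2 - 2 * r1y * r2y * r12) / (1 - r12**2)
    return sqrt(r_sq)

# Illustrative (assumed) inputs: structured interview validity .42,
# work-sample validity .33, intercorrelation between the two .30
print(round(composite_validity(0.42, 0.33, 0.30), 2))  # → 0.47
```

With these assumed inputs the two-method composite reaches about .47 — above either component's single-method validity — and adding further weakly correlated components (job knowledge, fit assessment) is what pushes a theoretical composite toward the .50-.60 range claimed above.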
Third, PBH has strong theoretical grounding in the science of individuality, as articulated by Todd Rose's three principles (Jaggedness, Context, Pathways). Rose explicitly cites performance-based hiring as an application of his Context Principle. This theoretical framework is more comprehensive than the theoretical grounding of any competing hiring methodology.
Fourth, PBH has been independently validated or endorsed by a Littler Mendelson employment attorney (legal), an I/O psychologist (Handler), the founder of behavioral interviewing (Janz), a Harvard researcher in the science of individuality (Rose), and 50,000+ practitioners over 40+ years. This is a broader validation portfolio than most commercial hiring methodologies can claim.
Fifth, the most important evidence gap is the absence of a published, controlled outcome study. Filling this gap — through a partnership with an I/O psychology research group conducting a quasi-experimental study across multiple organizations — would transform PBH's evidence profile from "theoretically well-grounded and component-validated" to "empirically validated as a complete system." This study should be PBH's highest strategic priority for evidence development.
In sum: Performance-based Hiring does not lack evidence. It lacks one specific type of evidence — the published, controlled outcome study — while possessing legal validation, I/O psychology endorsement, theoretical grounding in the science of individuality, alignment with 100 years of meta-analytic research, and 40+ years of large-scale practitioner validation. By any reasonable standard, this is a strong and credible evidence base. Strengthening it further with formal empirical research would not build the case from scratch — it would close the last gap in an already substantial foundation.
1 Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124(2), 262–274.
2 Sackett, P. R., Zhang, C., Berry, C. M., & Lievens, F. (2022). Revisiting meta-analytic estimates of validity in personnel selection: Addressing systematic overcorrection for restriction of range. Journal of Applied Psychology, 107(11), 2040–2068.
3 Griebe, S., Bazian, N., Demeke, S., Priest, E., Sackett, P. R., & Kuncel, N. (2022). Meta-analysis of cognitive ability and job performance: New data. Poster presented at the Annual Conference of the Society for Industrial and Organizational Psychology.
4 McDaniel, M. A., Morgeson, F. P., Finnegan, E. B., Campion, M. A., & Braverman, E. P. (2001). Use of situational judgment tests to predict job performance: A clarification of the literature. Journal of Applied Psychology, 86(4), 730–740. See also: Dye, D., Reck, M., & McDaniel, M. A. (1993); Tett, R. P., Jackson, D. N., & Rothstein, M. (1991); Wiesner, W. H., & Cronshaw, S. F. (1988).
5 Weekley, J. A., Hawkes, B., Guenole, N., & Ployhart, R. E. (2019). Job analysis ratings and criterion-related validity: Are they related and can validity be used as a measure of accuracy? Journal of Occupational and Organizational Psychology, 92(4), 764–786.
6 Rose, T. (2016). The End of Average: How We Succeed in a World That Values Sameness. New York: HarperOne.
7 Motowidlo, S. J., Carter, G. W., Dunnette, M. D., et al. (1992). Studies of the structured behavioral interview. Journal of Applied Psychology, 77(5), 571–587.
8 McDaniel, M. A., Whetzel, D. L., Schmidt, F. L., & Maurer, S. D. (1994). The validity of employment interviews: A comprehensive review and meta-analysis. Journal of Applied Psychology, 79(4), 599–616.
9 Janz, T., Hellervik, L., & Gilmore, D. C. (1986). Behavior Description Interviewing: New, Accurate, Cost Effective. Newton, MA: Allyn and Bacon.
10 Google re:Work. (n.d.). Guide: Use structured interviewing. Retrieved from https://rework.withgoogle.com/guides/hiring-use-structured-interviewing/
11 Berry, C. M., Lievens, F., Zhang, C., & Sackett, P. R. (2024). Insights from an updated personnel selection meta-analytic matrix: Revisiting general mental ability tests' role in the validity-diversity trade-off. Journal of Applied Psychology, 109(10), 1611–1634.
12 Goldstein, D. (2013). Legal validation whitepaper. In L. Adler, The Essential Guide for Hiring & Getting Hired. Workbench Media.
13 Handler, C. (2007). I/O psychology review whitepaper. In L. Adler, Hire with Your Head (3rd ed.). John Wiley & Sons.
14 Sackett, P. R., Zhang, C., Berry, C. M., & Lievens, F. (2023). Revisiting the design of selection systems in light of new findings regarding the validity of widely used predictors. Industrial and Organizational Psychology, 16(3), 283–300.
15 Adler, L. (2021). Hire with Your Head: Using Performance-Based Hiring to Build Outstanding Diverse Teams (4th ed.). Hoboken, NJ: John Wiley & Sons.
16 Huffcutt, A. I., Culbertson, S. S., & Weyhrauch, W. S. (2014). Moving forward indirectly: Reanalyzing the validity of employment interviews with indirect range restriction methodology. International Journal of Selection and Assessment, 22(3), 297–309.
17 Wingate, S., et al. (2025). Evaluating interview criterion-related validity for distinct constructs: A meta-analysis. International Journal of Selection and Assessment, advance online publication.
18 Roth, P. L., Bobko, P., & McFarland, L. (2005). A meta-analysis of work sample test validity: Updating and integrating some classic literature. Personnel Psychology, 58(4), 1009–1037.
Disclosure: This working paper was prepared for The Adler Group, Inc. and represents a systematic review of published research aligned to Performance-based Hiring methodology. It is not a peer-reviewed journal article. Composite validity estimates are theoretical and based on published component validities; they have not been empirically validated for PBH as a complete system. The author recommends formal empirical validation as described in Section 8.
© 2026 The Adler Group, Inc. All rights reserved.