Fact-checked persuasion playbook with effect sizes — and the popular ideas that failed checking.

Making Leaders Willing to Dig In — The Evidence

The hardest problem in Jer's market isn't teaching character — it's getting successful, defended 45–60-year-olds to want the mirror. This is the fact-checked playbook: each approach verified against meta-analyses (empirical) and named theory (theoretical). Effect sizes quoted where they exist; weak or failed-replication claims flagged honestly — including the popular ones we should NOT use.

How to read the evidence

Every claim below carries the statistics it rests on. The letters, once:

d (Cohen's d) — the size of a difference between two groups, measured in standard deviations: (treated group's average − control group's average) ÷ the spread of scores. Rule of thumb: 0.2 = small, 0.5 = medium, 0.8 = large. Concretely: d = .65 means the average person who got the intervention ended up better off than roughly 3 out of 4 people who didn't.
g (Hedges' g) — the same number with a correction for small studies. Read it exactly like d.
r (correlation) — how tightly two things move together, from −1 to +1: .1 weak, .3 moderate, .5 strong. r says "these travel together," not "one causes the other."
N — the total number of people studied. "(94 studies, N=8,461)" means a meta-analysis pooling 94 separate experiments covering 8,461 people.
meta / meta-analysis — a study of studies: instead of trusting any single experiment, it pools every available one on a question and reports the combined effect. Every headline number in this document is meta-analytic unless noted.
ns — not statistically significant: the data can't rule out a zero effect.
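
A minimal sketch of how d, g, and the "3 out of 4" reading fit together, in Python. The function names are illustrative (they come from no cited paper), and the percentile reading assumes roughly normal scores with equal spread in both groups.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def cohens_d(treated, control):
    """Difference in group means divided by the pooled standard deviation."""
    n1, n2 = len(treated), len(control)
    s1, s2 = stdev(treated), stdev(control)
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treated) - mean(control)) / pooled_sd


def hedges_g(d, n1, n2):
    """Cohen's d shrunk by the usual small-sample factor 1 - 3 / (4*df - 1)."""
    df = n1 + n2 - 2
    return d * (1 - 3 / (4 * df - 1))


def share_of_controls_beaten(d):
    """Fraction of the control group scoring below the average treated person.

    Assumption: both groups are normally distributed with the same spread.
    """
    return NormalDist().cdf(d)


# d = .65 puts the average treated person ahead of about 74% of controls.
print(round(share_of_controls_beaten(0.65), 2))  # 0.74
```

With large samples the correction in hedges_g is negligible, which is why g can be read exactly like d.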

Calibration: behavioral interventions rarely exceed d = .6 in honest meta-analyses, so on this list .3 is respectable and .65 is top of the class. Be suspicious of any program quoting d > 1 from a single study — that's usually how the claims this document rejects were sold.
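
Why a single study quoting d > 1 deserves suspicion can be sketched with the simplest form of pooling, an inverse-variance (fixed-effect) average. Everything here is an assumption for illustration: the three studies are made-up numbers, the function names are hypothetical, and published meta-analyses typically use random-effects models rather than this simplified version.

```python
from math import sqrt


def d_variance(d, n1, n2):
    """Common large-sample approximation to the sampling variance of d."""
    return (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))


def pool_fixed_effect(studies):
    """Inverse-variance weighted mean of (d, n1, n2) tuples, plus its SE."""
    weights = [1 / d_variance(d, n1, n2) for d, n1, n2 in studies]
    total = sum(weights)
    pooled = sum(w * d for w, (d, _, _) in zip(weights, studies)) / total
    return pooled, sqrt(1 / total)


# Hypothetical data: one tiny study with a huge effect, two large modest ones.
studies = [(1.20, 12, 12), (0.30, 150, 150), (0.25, 200, 200)]
pooled, se = pool_fixed_effect(studies)
print(f"pooled d = {pooled:.2f} (SE {se:.2f})")
```

The small study gets little weight because its estimate is noisy, so the pooled value stays close to the two large studies rather than the headline d = 1.2.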

Ranked: what works for this audience

1. Motivational interviewing principles (Miller & Rollnick) — STRONG

Evidence: Lundahl et al. 2010 meta (119 studies): g = 0.28 vs no-treatment; parity with longer treatments in less time. Mechanism metas (Magill 2014, 2018): MI-consistent behavior → client "change talk" (r = .55), which predicts change; confrontation evokes "sustain talk," which predicts failure.
The key concept — the righting reflex: telling a defended person what's wrong and what to do generates counter-argument. Evoking their own reasons generates commitment.
Apply: every results page and enrollment call asks, never asserts: "Which of these rings most true?" / "You said you've won at everything that doesn't matter — say more." Jer's natural style is already close to MI; this just bans the one failure mode (prescribing before evoking).

2. Self-affirmation before threat (Steele; Cohen & Sherman 2014) — STRONG

Evidence: two independent metas (Epton et al. 2015; Sweeney & Moyer 2015): affirmation before threatening feedback → behavior d ≈ .27–.32. Boundary condition that decides the design: works only BEFORE the threat (Critcher et al. 2010 — once someone has rationalized, affirmation does nothing), and the affirmed value must be UNRELATED to the threatened domain.
Apply: the Scorecard's affirmation screen (values question before results); never "you're clearly a strong leader, but…" — affirm family/faith/craft, then deliver the hard result.

3. Autonomy-supportive framing + real deselection (Self-Determination Theory) — STRONG

Evidence: Ng et al. 2012 (184 datasets): autonomy support → need satisfaction → autonomous motivation → maintained behavior; Ntoumanis et al. 2021 intervention meta: effects persist at follow-up via autonomous motivation. Controlling language produces introjected motivation, which decays — fatal for a 9–15-month formation arc.
The deselection paradox, corroborated: "but you are free to refuse" roughly doubled compliance in Carpenter's 2013 meta (honesty note: g = 0.11 and not statistically significant among the 7 most rigorous studies — treat as plausible, not proven); scarcity/commodity theory (Lynn 1991 meta) confirms restricted availability raises perceived value. Boundary: the screening and caps must be REAL — sophisticated buyers detect manufactured exclusivity and trust collapses.
Apply: keep and sharpen Jer's "this isn't for you if…" — it's structurally autonomy-supportive. Add the explicit free-choice close everywhere: "Whatever you decide, the diagnostic is yours to keep."

4. Psychological safety engineering + the AA-derived group structure — STRONG

Evidence: Frazier et al. 2017 meta (136 samples): psych safety robustly predicts voice, learning, performance; chief antecedent = leader behavior. The striking comparison: Cochrane 2020 (Kelly, Humphreys & Ferri), where manualized 12-step facilitation beat CBT for continuous abstinence (RR 1.21, a risk ratio meaning participants were about 21% more likely to stay continuously abstinent; high-certainty evidence, persisting at 24–36 months). The rare case where a structured peer confession/accountability group outperforms professional treatment.
Transferable mechanisms (structure, not outcome claims): leader discloses first; senior members model self-disclosure; accountability dyads (the sponsor analogue); long duration; structured moral inventory + disclosure (Steps 4–5 are literally an immunity map plus a Dinner of Truth).
Apply: two cohort norms — (1) Jer (and later alumni) open every session with their own current character failure before anyone else speaks; (2) assigned accountability dyads between sessions. This converts the course's length from a marketing liability into its active ingredient.

5. Implementation intentions / facilitated WOOP (Gollwitzer; Oettingen) — STRONG

Evidence: Gollwitzer & Sheeran 2006 (94 studies, N=8,461): if-then planning d = .65 on goal attainment, the largest reliable effect on this list. WOOP meta (Wang et al. 2021): g = .34 overall, g = .47 live-facilitated vs .28 self-guided, so facilitation raises the effect by roughly two-thirds (.47 ÷ .28 ≈ 1.7). Boundary: amplifies existing commitment; doesn't create it; pure positive visualization reduces effort (Oettingen).
Apply: every session's application step ends with a WOOP done live in cohort: wish → outcome → the inner obstacle (their named pattern from the Scorecard/immunity map) → if-then plan ("If I feel the urge to dominate the meeting, then I ask a question and wait").

Worth using, with eyes open

Anticipated inaction regret — STRONG correlations (Brewer et al. 2016 meta, N=45,618: intentions r = .50): frame regret of not acting ("who will you have become at 70 if nothing changes?"), always paired with a feasible next step.
Effort-justified admission (Aronson & Mills 1959; replicated; IKEA-effect descendants) — MODERATE: voluntary, meaning-linked effort at entry raises valuation and commitment. Apply: application essay + interview + pre-work before acceptance. Jer's mutuality screening, made effortful, becomes a commitment device.
Public, difficult, group goal-setting (Epton et al. 2017 meta, d ≈ .34; strongest exactly when goals are public + difficult + group) — Apply: cohort launch ritual — each member states one formation goal aloud.
Adult character change is real, so cite the right literature. Hudson & Fraley's volitional-change studies (2020 mega-analysis): adults of all ages who set trait-change goals and practice trait-consistent behavior show measured trait growth in ~16 weeks. This is the honest answer to "I am who I am at 55"; use it instead of pop growth-mindset claims.
Narrative identity / life review (McAdams; Adler 2012: agency themes rise before mental-health improvement) — MODERATE, perfect demographic fit: the "re-author your first mountain" session (what it cost, what it was for) is empirically aligned, and it's the Clinton timeline exercise's research twin.
Best Possible Self (Carrillo et al. 2019 meta: optimism d = .33, positive affect d = .51) — the hope counterweight after hard feedback; feeds WOOP's wish/outcome steps. Apply: "best possible self at 70" written exercise, session 1.

Do NOT build on these (fact-check results)

Academic growth-mindset effect sizes — Sisk et al. 2018: d = 0.08; best-practice replications null (Macnamara & Burgoyne 2023). Real but tiny and concentrated in at-risk adolescents. Don't quote Dweck numbers at executives; use Hudson & Fraley instead.
Noun/identity micro-copy ("be a leader who…" vs "lead") — the famous "be a voter" effect failed large-scale replication (Gerber et al. 2018). Fine as brand poetry; never claim it as a mechanism.
Scarcity theatrics — only real caps, real screening.
Certificates as incentive — SDT warning: external rewards can undermine autonomous motivation for intrinsically-framed work. Keep Jer's certificate but reframe it as a commemoration of a publicly stated commitment (ties to the Epton public-goal moderator), not a credential.

How the persuasion architecture matches Jer's instincts (the happy audit)

Jer already does	The evidence says
Deselection ("not for you if…")	SDT autonomy support + choice framing — keep, sharpen, add free-choice close
Mutuality/fit screening	Effort-justification — make entry more effortful, not less
Crock-pot length	Psych-safety antecedents + AA duration — the length IS the mechanism; add leader-discloses-first + dyads to cash it in
"Heavy lifting" language	Anticipated-regret + difficult-goal moderators — heavy is attractive to the right buyer
Rohr-adjacent soul framing	Honor → reframe → invite (validate the first mountain as required curriculum, then make falling the elite move: "most successful people never make it to the second half")

The three highest-leverage, lowest-cost additions (nothing existing changes): pre-results affirmation screen, MI-style reveal/enrollment copy, and facilitated WOOP as the homework spine.
