
Trust Calibration: The UX Problem That Breaks AI Adoption
The Feature That No One Uses
Metrics After 3 Months:
- AI accuracy: 92% (exceeds target)
- User adoption: 18% (misses target by 62 percentage points)
User interview #1: "I don't trust it. What if it's wrong?"
User interview #2: "I trust it completely. It's AI!"
User interview #3: "I tried it once. It gave a weird answer. Never used it again."
The diagnosis: Not an accuracy problem. A trust calibration problem.
Your users don't know when to trust the AI and when to double-check. So they default to extremes: never trust, or always trust. Both kill adoption.
The Trust Calibration Spectrum
- Under-Reliance (Zero Adoption): User ignores the AI even when it's correct.
- Appropriate Reliance (Goldilocks Zone): User checks the AI on hard cases, accepts it on easy cases.
- Over-Reliance (Dangerous): User blindly accepts all AI outputs, including errors.
The Goal: Design UX that pushes users toward appropriate reliance. Trust when the AI is confident and correct; double-check when it's uncertain or error-prone.
Why Trust Calibration Fails (Three Anti-Patterns)
Anti-Pattern 1: No Confidence Signal
Bad UX:
AI Result: "The patient likely has Type 2 Diabetes." [No indication of confidence]
User Mental Model: "Is this 60% confident or 99% confident? I have no idea. Better ignore it."
Good UX:
AI Result: "The patient likely has Type 2 Diabetes."
Confidence: High (94%)
Reasoning: Elevated HbA1c (7.2%), fasting glucose (140 mg/dL), BMI 32
Why It Works: User knows this is a high-confidence prediction. They can trust without blind acceptance (they see the reasoning).
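To make this concrete, here is a minimal sketch of how a prediction might carry its confidence and reasoning into the UI. The `Prediction` shape, the 0.85/0.6 thresholds, and the labels are illustrative assumptions, not any product's actual API.

```typescript
// Hypothetical shape for an AI prediction surfaced in the UI.
interface Prediction {
  statement: string;   // e.g. "The patient likely has Type 2 Diabetes."
  confidence: number;  // model probability in [0, 1]
  reasoning: string[]; // human-readable evidence, e.g. lab values
}

// Map a raw score to a label the user can act on.
// Thresholds are assumptions; tune them per product and risk level.
function confidenceLabel(confidence: number): "High" | "Medium" | "Low" {
  if (confidence >= 0.85) return "High";
  if (confidence >= 0.6) return "Medium";
  return "Low";
}

function renderPrediction(p: Prediction): string {
  const pct = Math.round(p.confidence * 100);
  return [
    `AI Result: "${p.statement}"`,
    `Confidence: ${confidenceLabel(p.confidence)} (${pct}%)`,
    `Reasoning: ${p.reasoning.join(", ")}`,
  ].join("\n");
}

// Reproduces the "Good UX" example above.
console.log(renderPrediction({
  statement: "The patient likely has Type 2 Diabetes.",
  confidence: 0.94,
  reasoning: ["Elevated HbA1c (7.2%)", "fasting glucose (140 mg/dL)", "BMI 32"],
}));
```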
Anti-Pattern 2: Invisible Errors
Bad UX:
- AI makes mistake on edge case
- User discovers error during critical moment (e.g., client meeting)
- User loses trust permanently
User Mental Model: "It was wrong once. I can't trust it anymore."
Good UX:
- AI flags uncertain predictions: "Low Confidence (61%)—manual review recommended"
- User expects occasional low-confidence outputs
- Trust isn't binary (perfect or broken)—it's calibrated per prediction
Why It Works: Users develop mental model: "Green = trust, yellow = verify, red = don't use." They don't abandon the tool after one error.
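One way to encode that mental model directly in the interface is a traffic-light banding of the confidence score. The band boundaries and copy below are assumptions for illustration; in practice each color should be calibrated against observed accuracy so "green" empirically means "safe to accept."

```typescript
// Traffic-light states for "green = trust, yellow = verify, red = don't use".
type TrustBand = { color: "green" | "yellow" | "red"; action: string };

// Boundaries are illustrative assumptions; calibrate them against measured
// accuracy per band before shipping.
function trustBand(confidence: number): TrustBand {
  if (confidence >= 0.9) return { color: "green", action: "Safe to accept" };
  if (confidence >= 0.7) return { color: "yellow", action: "Verify before using" };
  return { color: "red", action: "Low confidence: manual review recommended" };
}

// A 61% prediction renders as red with an explicit review prompt,
// so an occasional miss reads as expected behavior, not a betrayal.
console.log(trustBand(0.61));
```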
Anti-Pattern 3: No Feedback Loop
Bad UX:
- User corrects AI mistake
- AI doesn't learn
- Same mistake repeats
User Mental Model: "Why bother correcting it if nothing changes?"
Good UX:
- User marks AI output as incorrect
- System logs feedback: "Thanks! We'll improve this prediction type."
- Next week, similar case → AI gets it right
- User sees: "We improved accuracy on [case type] based on your feedback"
Why It Works: User feels agency. Trust isn't "take it or leave it"—it's a partnership.
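A sketch of the feedback plumbing this implies: record the correction, acknowledge it immediately, and aggregate error rates by prediction type so improvements can be prioritized and reported back. `FeedbackStore` and its fields are hypothetical names, not a real API.

```typescript
// Hypothetical record tying a user correction to a prediction type.
interface Feedback {
  predictionId: string;
  predictionType: string; // e.g. "diagnosis", "case-law-match"
  correct: boolean;
  submittedAt: Date;
}

class FeedbackStore {
  private entries: Feedback[] = [];

  // Log the correction and immediately acknowledge it in the UI,
  // so the user sees their input registered.
  record(fb: Feedback): string {
    this.entries.push(fb);
    return "Thanks! We'll improve this prediction type.";
  }

  // Aggregate error rate per prediction type; this feeds both retraining
  // priorities and the "we improved accuracy on X" follow-up message.
  errorRateByType(): Map<string, number> {
    const totals = new Map<string, { errors: number; total: number }>();
    for (const fb of this.entries) {
      const t = totals.get(fb.predictionType) ?? { errors: 0, total: 0 };
      t.total += 1;
      if (!fb.correct) t.errors += 1;
      totals.set(fb.predictionType, t);
    }
    const rates = new Map<string, number>();
    for (const [type, t] of totals) rates.set(type, t.errors / t.total);
    return rates;
  }
}
```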
Real Example: Legal Research AI
Feature: AI suggests relevant case law for attorneys.
Initial Design (Under-Reliance):
- AI returns 20 cases
- No confidence scores
- No reasoning
- Attorneys ignore AI, manually search Westlaw (zero adoption)
Redesign 1: Add Confidence:
- AI returns 20 cases with confidence scores (High/Medium/Low)
- Attorneys trust High-confidence cases (75% adoption on those)
- Still ignore Medium/Low (overall adoption: 35%)
Redesign 2: Show Reasoning:
- High-confidence cases show why (keyword match, citation frequency, jurisdiction)
- Medium-confidence cases flag risk: "This case is from a different jurisdiction—verify applicability"
- Attorneys now use Medium-confidence cases as research leads (adoption: 62%)
Redesign 3: Feedback Loop:
- Attorneys mark cases as "relevant" or "not relevant"
- AI learns: "Cases from 9th Circuit often irrelevant for this attorney (practices in 2nd Circuit)"
- Precision improves from 68% → 79% over 3 months
- Adoption hits 81% (attorneys trust the AI because it adapts to their practice)
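One plausible mechanism behind Redesign 3 is reranking: blend the model's relevance score with a per-attorney prior learned from their "relevant" / "not relevant" marks, so jurisdictions an attorney consistently rejects sink in the list. The field names and the 0.7/0.3 blend below are assumptions, not the actual system.

```typescript
interface CaseCandidate {
  id: string;
  jurisdiction: string; // e.g. "9th Cir.", "2nd Cir."
  baseScore: number;    // model relevance score in [0, 1]
}

// Per-attorney prior: fraction of past suggestions from each
// jurisdiction that this attorney marked "relevant".
type JurisdictionPrior = Map<string, number>;

// Blend the model score with the attorney's feedback history.
// The 0.7 / 0.3 weighting is an illustrative assumption.
function score(c: CaseCandidate, prior: JurisdictionPrior): number {
  const p = prior.get(c.jurisdiction) ?? 0.5; // neutral prior if unseen
  return 0.7 * c.baseScore + 0.3 * p;
}

function rerank(cases: CaseCandidate[], prior: JurisdictionPrior): CaseCandidate[] {
  return [...cases].sort((a, b) => score(b, prior) - score(a, prior));
}

// A 2nd Circuit attorney who rarely marks 9th Circuit cases relevant
// will see those suggestions sink, mirroring the precision gain above.
const prior: JurisdictionPrior = new Map([["9th Cir.", 0.1], ["2nd Cir.", 0.8]]);
console.log(rerank([
  { id: "a", jurisdiction: "9th Cir.", baseScore: 0.75 },
  { id: "b", jurisdiction: "2nd Cir.", baseScore: 0.7 },
], prior));
```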

