The Klarna case: what it actually proves (and what it doesn't)
In 2024, Klarna announced its AI support agent handled 2.3 million customer conversations in its first month — equivalent to 700 full-time employees — cutting average resolution time from 11 minutes to 2 minutes. It became the most cited AI support case study in business media.
What most summaries omit: Klarna also reported that the AI produced higher error rates on complex cases and had to be pulled back from certain query types. The lesson isn't "AI replaces support teams." It's "AI handles 60–80% of ticket volume extremely well, and the other 20–40% still needs humans — but those humans are now handling only interesting, complex problems."
That's the realistic frame for this guide. AI customer support agents work when designed with clear scope, proper escalation, and ongoing calibration.
Step 1: Audit your ticket types before building anything
Export the last 90 days of support tickets and categorize them. For most businesses, the breakdown looks like this:
- 40–55%: order status, shipping questions, account access — answerable from a database lookup
- 15–20%: product questions — answerable from a knowledge base
- 10–15%: returns and refunds — answerable with policy, may require system action
- 10–15%: complaints and escalations — requires human judgment and empathy
- 5–10%: complex, unusual, or high-value cases — requires senior human involvement
The first three categories — 65–90% of volume — are strong candidates for automation. The last two categories should always go to humans. Design your agent around this split from the start.
If you skip this audit and build a generic AI agent, you'll hit two problems: the agent will handle easy tickets well but fail publicly on hard ones, and you won't have a baseline to measure improvement against.
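A first pass at the audit can be scripted before anyone labels tickets by hand. The sketch below is a minimal illustration, not a production classifier: the category names, the keyword lists, and the `categorize`, `audit`, and `automatable_share` helpers are all assumptions made for this example, and plain keyword matching will mislabel some tickets. It only shows how to turn an export of ticket texts into the percentage breakdown described above.

```python
from collections import Counter

# Hypothetical keyword rules; replace with phrases from your own tickets.
# Human-routed categories come first so that a complaint mentioning
# "refund" is not counted as a routine refund request.
CATEGORY_KEYWORDS = {
    "complaint": ["unacceptable", "complaint", "manager", "disappointed"],
    "order_account": ["where is my order", "tracking", "shipping", "password", "login"],
    "product": ["compatible", "dimensions", "specs"],
    "returns_refunds": ["return", "refund", "exchange"],
}

# The three categories the guide treats as candidates for automation.
AUTOMATABLE = {"order_account", "product", "returns_refunds"}


def categorize(text: str) -> str:
    """Return the first category whose keywords appear in the ticket text."""
    lowered = text.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "complex_other"  # anything unmatched goes to human review


def audit(tickets: list[str]) -> dict[str, float]:
    """Percentage of tickets per category, rounded to one decimal place."""
    if not tickets:
        return {}
    counts = Counter(categorize(t) for t in tickets)
    return {cat: round(100 * n / len(tickets), 1) for cat, n in counts.items()}


def automatable_share(tickets: list[str]) -> float:
    """Percentage of tickets that fall into an automatable category."""
    if not tickets:
        return 0.0
    hits = sum(1 for t in tickets if categorize(t) in AUTOMATABLE)
    return round(100 * hits / len(tickets), 1)
```

Run `audit()` on the 90-day export, then spot-check a sample from each category by hand. The unmatched `complex_other` bucket is usually where the keyword lists need the most work.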
Step 2: Build the knowledge base (this is where projects succeed or fail)
An AI support agent is only as good as the information it can access. The knowledge base needs:
- Product catalog — full specs, compatibility information, variants, pricing
- Policy documents — return policy, shipping policy, warranty terms, refund timelines
- FAQ content — but written in natural language, not Q&A pairs
- Procedural knowledge — "how to track an order," "how to reset your password," etc.
- Edge cases and exceptions — what to do when standard policy doesn't apply
Most teams underestimate how long this takes. A well-maintained knowledge base for a mid-size e-commerce operation takes 2–4 weeks to build properly. Document everything in plain language, not internal jargon. The AI interprets what you've written; if the document says "contact T2 escalation" it won't know what that means.
Plan for maintenance: product changes, policy updates, and new exception types need to be added to the knowledge base within 24–48 hours of taking effect, or the agent starts giving wrong answers.
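One way to keep the 24–48 hour window honest is a scheduled check that compares when a source document changed with when its knowledge base entry was last updated. The sketch below assumes each entry is a dictionary with `id`, `source_changed_at`, and `kb_updated_at` fields. Those field names and the `overdue_updates` helper are hypothetical, and your helpdesk or CMS will expose this data differently.

```python
from datetime import datetime, timedelta

MAX_LAG = timedelta(hours=48)  # upper end of the 24-48 hour window


def overdue_updates(entries: list[dict], now: datetime) -> list[str]:
    """Return ids of entries whose source changed more than MAX_LAG ago
    without a matching knowledge base update."""
    overdue = []
    for entry in entries:
        changed = entry["source_changed_at"]
        updated = entry["kb_updated_at"]
        # Stale only if the KB copy predates the change AND the window has passed.
        if updated < changed and now - changed > MAX_LAG:
            overdue.append(entry["id"])
    return overdue
```

Running a check like this daily and posting the result to the support team's channel turns the maintenance rule into something that gets noticed when it slips.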
Step 3: Design the escalation paths (before you write a single message)
Escalation design is the most important architecture decision in AI support. You need explicit rules for:
- Immediate escalation — keywords: "lawyer," "refund now," "this is illegal," "I want to speak to a manager," "I'm going to post about this" → route to senior human instantly
- After-3-turns escalation — if the agent has tried 3 different answers and the customer still expresses dissatisfaction → human handoff
- Confidence threshold — if the AI isn't confident in its answer (score below a threshold you set), it should say "let me connect you with a specialist" rather than guessing
- High-value customer escalation — connect to CRM to detect VIP customers and route them preferentially
The handoff message matters. "Let me transfer you to a human agent" is better than "I can't help with that." Give the human agent the full conversation context automatically — they should never have to ask the customer to repeat themselves.
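The four rules above can be written as a single routing function evaluated on every customer message. This is a minimal sketch under stated assumptions: the `Turn` fields, the queue names returned by `route`, and the 0.75 confidence threshold are all hypothetical, and the confidence score is whatever your model or retrieval layer provides. Substring matching on phrases is deliberately crude, so expect to extend the list from real transcripts.

```python
from dataclasses import dataclass

# Phrases taken from the immediate-escalation rule; extend from real transcripts.
IMMEDIATE_PHRASES = (
    "lawyer",
    "refund now",
    "this is illegal",
    "speak to a manager",
    "post about this",
)
MAX_FAILED_TURNS = 3
CONFIDENCE_THRESHOLD = 0.75  # assumed value; tune it on your own data


@dataclass
class Turn:
    customer_message: str
    failed_attempts: int   # AI answers so far that left the customer dissatisfied
    confidence: float      # hypothetical 0..1 score for the drafted answer
    is_vip: bool = False   # hypothetical flag looked up from your CRM


def route(turn: Turn) -> str:
    """Decide who handles the next reply. Rules are checked in priority order."""
    text = turn.customer_message.lower()
    if any(phrase in text for phrase in IMMEDIATE_PHRASES):
        return "senior_human"
    if turn.is_vip:
        return "priority_human"
    if turn.failed_attempts >= MAX_FAILED_TURNS:
        return "human"
    if turn.confidence < CONFIDENCE_THRESHOLD:
        return "human"
    return "ai"
```

For example, `route(Turn("I'm calling my lawyer", 0, 0.99))` returns `"senior_human"` even though the model is confident, because the keyword rule is checked first. Keeping the rules in one ordered function makes that priority explicit and easy to adjust.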
Step 4: Choose your architecture
There are three main technical approaches, each with different trade-offs:
Option A: Plug-in to existing helpdesk (fastest, least flexible)
Tools like Intercom, Zendesk, and Freshdesk have built-in AI features or AI app integrations. Setup time: 1–2 weeks. Best for: teams already using these platforms who want AI on top of existing workflows. Limitation: you're bound by the platform's AI capabilities and pricing.
Option B: Purpose-built AI agent with helpdesk integration (balanced)
Build an AI agent on a platform like n8n, Make, or a custom stack, integrated with your existing helpdesk via API. This is more flexible and allows custom escalation logic and deeper knowledge base integration. Setup time: 3–5 weeks. Best for: businesses with specific policies or multi-channel needs.
Option C: Fully custom AI agent (most powerful, highest investment)
Custom LLM-powered agent with direct system integrations (ERP, CRM, order management). Handles complex multi-step tasks like initiating returns, updating orders, or applying discount codes. Setup time: 6–12 weeks. Best for: high-volume operations where the AI needs to take actions, not just answer questions.
Step 5: Run shadow mode before going live
Shadow mode means the AI generates responses to every incoming ticket but doesn't send them — your team reviews and approves each one. Run shadow mode for 1–2 weeks. During this period:
- Measure what percentage of AI responses your team would have sent as-is (target: 70%+ before going live)
- Identify the top 10 ticket types where the AI consistently fails or needs editing
- Update the knowledge base for each failure category
- Adjust escalation triggers based on what you see
Teams that skip shadow mode and go straight to live often have a rough first week that damages customer satisfaction scores and makes the human support team distrust the AI system.
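The two shadow-mode measurements, the send-as-is rate and the list of worst-performing ticket types, can be computed from a simple review log. The sketch below assumes reviewers record a `(ticket_type, verdict)` pair for each AI draft, with verdicts `"as_is"`, `"edited"`, or `"rejected"`. That schema and the `shadow_report` helper are illustrative assumptions, not part of any particular helpdesk tool.

```python
from collections import Counter

GO_LIVE_TARGET = 70.0  # percent of drafts sent unchanged, per the target above


def shadow_report(reviews: list[tuple[str, str]], top_n: int = 10):
    """Return (as-is rate in percent, most common failing ticket types).

    Any verdict other than "as_is" counts as a failure for that ticket type.
    """
    if not reviews:
        return 0.0, []
    as_is = sum(1 for _, verdict in reviews if verdict == "as_is")
    rate = round(100 * as_is / len(reviews), 1)
    failures = Counter(ticket_type for ticket_type, verdict in reviews if verdict != "as_is")
    return rate, failures.most_common(top_n)


def ready_for_live(reviews: list[tuple[str, str]]) -> bool:
    """True once the as-is rate meets the go-live target."""
    rate, _ = shadow_report(reviews)
    return rate >= GO_LIVE_TARGET
```

The failure list feeds directly into the knowledge base updates: each ticket type near the top of it is a candidate for a new or rewritten document.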
Cost benchmarks
Industry data on cost per interaction:
- Human support agent handling ticket: $6–15 per interaction (fully loaded cost including overhead)
- AI agent handling ticket: $0.10–0.70 per interaction (LLM API cost + platform cost + maintenance allocation)
- Hybrid (AI handles first contact, human closes complex): $1.50–4.00 per interaction
For a support operation handling 5,000 tickets/month at $8 average cost, moving 70% (3,500 tickets) to AI at $0.40/interaction saves approximately $26,600/month, before accounting for faster resolution and higher availability.
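The savings figure is simple arithmetic: the number of tickets moved to AI multiplied by the per-ticket cost difference. A small helper makes it easy to rerun with your own numbers. The function name and parameters below are illustrative, and the model deliberately ignores setup cost and the hybrid tickets that still need a human to close.

```python
def monthly_savings(tickets: int, human_cost: float, ai_cost: float, ai_share: float) -> float:
    """Estimated monthly savings from moving a share of tickets to AI.

    ai_share is a fraction between 0 and 1. Setup and maintenance project
    costs are not included.
    """
    ai_tickets = tickets * ai_share
    return round(ai_tickets * (human_cost - ai_cost), 2)


# The example from the text: 5,000 tickets, $8 human cost, $0.40 AI cost, 70% moved.
example = monthly_savings(5000, 8.00, 0.40, 0.70)
```

With the inputs from the example, 3,500 tickets each save $7.60, which works out to about $26,600 per month.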
The 30/60/90-day optimization plan
Days 1–30 (stabilize): target 50% autonomous resolution rate. Track daily: escalation rate, customer satisfaction score (CSAT), false escalation rate (customer escalated but issue was resolvable by AI). Fix knowledge base gaps weekly.
Days 31–60 (expand scope): add 2–3 new ticket categories to AI scope based on what you've learned. Target 65% autonomous resolution. Start tracking containment rate (how often AI fully resolves without any human touchpoint).
Days 61–90 (optimize): analyze the most common human-handled tickets. Which ones can be automated with new knowledge base content? Which truly require human judgment? Set final scope. Target 70–80% autonomous resolution for stable operation.
Most implementations that fail do so because they stop at day 30 — the initial setup — and don't do the iterative improvement that generates most of the ROI. The first month is setup. The second and third months are where efficiency gains compound.
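Tracking the plan is easier if the daily numbers are computed the same way every time. The sketch below assumes you can export four daily counts from your helpdesk. The `DailyCounts` fields and helper names are hypothetical, false escalation rate is defined here as the share of escalations that the AI could have resolved, and the day 61–90 target uses the low end of the 70–80% range.

```python
from dataclasses import dataclass


@dataclass
class DailyCounts:
    total: int              # all tickets received that day
    resolved_by_ai: int     # closed with no human touchpoint
    escalated: int          # handed to a human
    false_escalations: int  # escalated, but judged resolvable by the AI on review


def target_for_day(day: int) -> float:
    """Autonomous resolution target (percent) for a given day of the plan."""
    if day <= 30:
        return 50.0
    if day <= 60:
        return 65.0
    return 70.0  # low end of the 70-80% stable-operation range


def daily_metrics(counts: DailyCounts) -> dict[str, float]:
    """Compute the tracked rates as percentages, guarding against empty days."""
    def pct(part: int, whole: int) -> float:
        return round(100 * part / whole, 1) if whole else 0.0

    return {
        "autonomous_resolution": pct(counts.resolved_by_ai, counts.total),
        "escalation_rate": pct(counts.escalated, counts.total),
        "false_escalation_rate": pct(counts.false_escalations, counts.escalated),
    }


def on_track(counts: DailyCounts, day: int) -> bool:
    """True if the day's autonomous resolution rate meets the phase target."""
    return daily_metrics(counts)["autonomous_resolution"] >= target_for_day(day)
```

A day with 200 tickets, 110 resolved by AI, and 90 escalated (18 of them unnecessarily) gives a 55% autonomous resolution rate and a 20% false escalation rate. That is on track for the first 30 days but below the 65% target for days 31–60.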
Realistic expectations by business type
- E-commerce (high volume, structured queries): 75–85% autonomous resolution achievable by month 3
- SaaS / tech product: 60–70% autonomous resolution; technical complexity limits higher rates
- Professional services: 40–55% autonomous; most value comes from triage and scheduling, not resolution
- B2B enterprise accounts: 30–45% autonomous; relationship complexity means humans stay central
For a complete ROI model tailored to your ticket volume, see our AI automation ROI calculation guide. For total project cost benchmarks, the AI automation pricing guide has current market rates.