Finance, Twilio

Cutting Voice AI Costs 34% by Exposing Hidden Duplicate Calls

A leading digital mortgage broker's voice AI spend was climbing without explanation — Sherlock revealed duplicate calls and inefficient TTS usage draining their budget.

  • 34% cost reduction
  • <2 min average debug time
  • $12K monthly savings

TL;DR — What Sherlock found

  1. A digital mortgage broker's voice AI spend was climbing without explanation. The team suspected pricing changes but had no way to audit thousands of daily calls.

  2. Sherlock analyzed per-call cost breakdowns, surfacing duplicate outbound attempts and excessive TTS tokens consumed on calls that failed before completing.

  3. Redundant calls were eliminated, prompt lengths optimized, and the team renegotiated their plan, saving $12K per month immediately.

The Problem

Monthly voice AI costs were climbing with no clear explanation. The team suspected provider pricing changes, but had no way to audit thousands of daily calls.

The Process

Sherlock analyzed cost breakdowns per call, identified duplicate outbound attempts and excessive TTS token usage on failed calls, and surfaced optimization opportunities.

The Solution

Armed with Sherlock's per-call cost data, the team eliminated redundant calls, optimized prompt lengths, and renegotiated their plan — saving $12K/month.

Results

  • 34% reduction in monthly voice costs
  • Average debug time dropped to under 2 minutes
  • $12K saved every month

Use Cases

Questions the team asked Sherlock

  • Why is my Twilio bill higher than expected this month?
  • Which calls are consuming the most TTS tokens?
  • Show me all duplicate outbound attempts in the last 7 days
  • What is my per-call cost breakdown by agent?
  • Find calls where TTS was used but the call failed before connecting

Deep Dive

The full story

For a digital mortgage broker processing thousands of AI-assisted calls each day, unexpected cost growth is an operational alarm bell. When the team noticed their voice AI spend climbing month-over-month without a corresponding increase in successful conversations, they suspected provider pricing changes — but had no way to verify the hypothesis across thousands of individual call records.

Sherlock's cost investigation started with a per-call breakdown across all outbound attempts. Two patterns stood out immediately. First, a significant number of calls were being attempted two or three times in rapid succession, duplicates triggered by a race condition in the orchestration layer. Second, many failed calls were consuming substantial TTS tokens before the connection error occurred. Together, these two inefficiencies accounted for a large share of the unexplained cost growth.

Armed with Sherlock's evidence, the team made three changes: eliminating the race condition that caused duplicate outbound attempts, shortening the prompts used during connection negotiation to minimize TTS consumption on failed calls, and renegotiating their usage plan based on corrected volume projections. The combined result was a 34% reduction in monthly voice AI spend, approximately $12,000 saved every month. The average time to debug a call cost question also dropped from over an hour to under two minutes.

Cost visibility in voice AI operations doesn't require engineering overhead. Sherlock makes per-call cost data queryable from Slack in plain English — so operations and finance teams can audit spend, spot anomalies, and take action without waiting on engineering cycles.

FAQ

Frequently asked questions

How does Sherlock surface hidden voice AI costs that standard dashboards miss?

Standard provider dashboards show aggregate cost metrics — total spend, average call cost — but don't expose per-call cost breakdowns or cross-call patterns like duplicate attempts. Sherlock queries the full call record set, computes per-call costs including TTS usage, and lets you ask questions like 'which calls consumed the most tokens without converting' or 'show me calls where we paid for TTS on a call that failed' in plain English from Slack.
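As a rough illustration of what a per-call breakdown involves, the sketch below computes a cost per call and then answers two of the questions above: spend by agent, and calls that paid for TTS without ever connecting. The record fields, the flat per-character TTS rate, and the function names are all assumptions made for the example; they are not Sherlock's data model or any provider's actual pricing.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical per-character TTS rate in USD, for illustration only.
TTS_PRICE_PER_CHAR = 0.00002


@dataclass
class CallRecord:
    call_id: str
    agent: str
    telephony_cost: float  # carrier charge in USD (assumed field)
    tts_chars: int         # characters sent to text-to-speech (assumed field)
    connected: bool        # whether the call ever connected


def call_cost(call):
    """Total cost of one call: telephony charge plus TTS usage."""
    return call.telephony_cost + call.tts_chars * TTS_PRICE_PER_CHAR


def cost_by_agent(calls):
    """Sum per-call costs for each agent."""
    totals = defaultdict(float)
    for call in calls:
        totals[call.agent] += call_cost(call)
    return dict(totals)


def wasted_tts_calls(calls):
    """Calls that used TTS but never connected, heaviest TTS usage first."""
    failed = [c for c in calls if not c.connected and c.tts_chars > 0]
    return sorted(failed, key=lambda c: c.tts_chars, reverse=True)
```

An aggregate dashboard would show only the sum of these values; keeping the per-call records is what makes the second query possible.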

What is duplicate call detection and how does it reduce costs?

Duplicate call detection identifies outbound calls placed to the same number within a short time window that weren't triggered by a user callback. These often result from race conditions in orchestration systems — the call is placed, a timeout triggers a retry, and the original and retry both connect, doubling the cost of that conversation. Sherlock flags these patterns so engineering teams can fix the root cause rather than paying for the redundancy indefinitely.
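The windowing idea can be sketched in a few lines. The example below flags any outbound call placed to the same number within a fixed window of the previous attempt. The `OutboundCall` fields, the 60-second default, and the function name are assumptions for illustration, not Sherlock's implementation, and the sketch leaves out the check for user-initiated callbacks mentioned above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class OutboundCall:
    call_id: str
    to_number: str
    started_at: datetime


def find_duplicates(calls, window=timedelta(seconds=60)):
    """Return (previous_id, duplicate_id) pairs for calls placed to the
    same number within `window` of the previous attempt to that number."""
    last_seen = {}   # to_number -> most recent call to that number
    duplicates = []
    for call in sorted(calls, key=lambda c: c.started_at):
        prev = last_seen.get(call.to_number)
        if prev is not None and call.started_at - prev.started_at <= window:
            duplicates.append((prev.call_id, call.call_id))
        last_seen[call.to_number] = call
    return duplicates
```

Comparing each call with the previous attempt, rather than with the first one, means a chain of three rapid retries is reported as two pairs, which matches the "two or three times in rapid succession" pattern described in this case study.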

How quickly do cost savings appear after implementing Sherlock?

Cost visibility starts on day one: as soon as Sherlock is connected to your provider accounts, you can query your call cost data and identify inefficiencies. Actual savings depend on how quickly your team implements the changes Sherlock surfaces, but most teams see meaningful reductions within the first billing cycle.

Can non-technical teams use Sherlock to audit call costs without engineering support?

Yes. Sherlock is designed for operations, finance, and product teams — not just engineers. You ask questions in plain English from Slack, and Sherlock returns sourced answers with the data behind them. No SQL, no API calls, no dashboard navigation required. Engineering is only involved when implementing fixes, not in the investigation itself.

How do I set up cost alerts in Sherlock for unusual spending patterns?

From the Sherlock Slack app, you define a cost threshold — for example, 'alert me if total daily spend exceeds $500' or 'alert me if any single call costs more than $2' — and Sherlock monitors your call data continuously. Alerts are posted to the Slack channel of your choice as soon as the condition is met, with the call details attached.
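The two example thresholds amount to a simple rule check over call costs. The sketch below shows that logic in isolation, using the $500 daily and $2 per-call limits from the examples above. The input shape (tuples of call ID, day, and cost) and the function name are assumptions for illustration; it does not represent Sherlock's alerting API, and posting to Slack is left out.

```python
from collections import defaultdict

DAILY_LIMIT = 500.0    # "total daily spend exceeds $500"
PER_CALL_LIMIT = 2.0   # "any single call costs more than $2"


def check_alerts(calls, daily_limit=DAILY_LIMIT, per_call_limit=PER_CALL_LIMIT):
    """Return alert messages for an iterable of (call_id, day, cost) tuples."""
    alerts = []
    daily = defaultdict(float)
    for call_id, day, cost in calls:
        daily[day] += cost
        if cost > per_call_limit:
            alerts.append(
                f"call {call_id} cost ${cost:.2f}, above ${per_call_limit:.2f}"
            )
    for day, total in sorted(daily.items()):
        if total > daily_limit:
            alerts.append(
                f"{day} total spend ${total:.2f}, above ${daily_limit:.2f}"
            )
    return alerts
```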

What is prompt length optimization and how does it reduce TTS costs?

TTS (text-to-speech) costs are billed per character or per second of synthesized audio. Prompts used during call setup, confirmation steps, or failed connection scenarios often contain unnecessary filler text that inflates costs without improving the conversation. Sherlock identifies which prompts are being used on failed or short calls, allowing teams to trim them without affecting successful conversations.
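The arithmetic behind prompt trimming is straightforward under per-character billing. The sketch below estimates the monthly saving from shortening a single prompt; the rate and play counts passed in are placeholders to be replaced with real values, and per-second billing would need a duration estimate instead of a character count.

```python
def tts_cost(text, price_per_char):
    """Cost of synthesizing `text` once under per-character billing."""
    return len(text) * price_per_char


def monthly_saving(old_prompt, new_prompt, plays_per_month, price_per_char):
    """Estimated monthly saving from replacing old_prompt with new_prompt."""
    saved_per_play = tts_cost(old_prompt, price_per_char) - tts_cost(
        new_prompt, price_per_char
    )
    return saved_per_play * plays_per_month
```

For example, cutting a 200-character prompt to 50 characters saves 150 characters per play, so the saving scales directly with how often the prompt is played on failed or short calls.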
