vapi-webhook-timeout | Vapi | high

Webhook Timeout

Vapi webhook delivery timed out — your server did not respond within the required time window.

What this error means

This error occurs when Vapi sends a webhook event to your configured server URL and the server fails to respond within Vapi's timeout window (typically 5 seconds for synchronous webhooks). Vapi uses webhooks to deliver real-time call events such as assistant-request (to determine which assistant to use for an inbound call), function-call (to execute a tool call from the AI assistant), end-of-call-report, and others. When a synchronous webhook — particularly assistant-request or function-call — times out, Vapi cannot continue the call flow, which typically results in the call being dropped or the assistant falling back to an error state. Unlike asynchronous webhooks (status updates, transcripts), synchronous webhooks are on the critical path of every call.

Root causes

  - Critical, common: Webhook handler performs synchronous database queries or external API calls that block the response

  - High, common: Server is under high load and cannot process the webhook request within the timeout window

  - High, common: Cold start latency from serverless functions (Lambda, Vercel Edge, Cloud Functions) exceeding the timeout

  - High, occasional: Webhook handler awaiting a response from a slow downstream service (LLM, database, CRM)

  - Medium, occasional: Network latency between Vapi's servers and your webhook server is too high for the timeout budget

  - High, rare: Webhook handler stuck in an infinite loop or blocking synchronous I/O operation

How to fix it

  1. Measure and profile your webhook handler response time

    Add timing instrumentation to your webhook handler to measure end-to-end processing time. Log the duration for every webhook call. Identify which operations are taking the most time — database queries, external API calls, or business logic. Any synchronous webhook handler must consistently respond in under 4 seconds to leave margin before Vapi's 5-second timeout.
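
    The timing step can be sketched as follows, assuming a Node.js handler. `timeHandler`, `BUDGET_MS`, and the log format are hypothetical names chosen for illustration; the 4,000 ms budget simply mirrors the margin suggested above.

    ```javascript
    // Hypothetical helper: runs one async step of a webhook handler and logs its duration.
    const BUDGET_MS = 4000; // assumed budget, leaving margin before the 5-second timeout

    async function timeHandler(label, fn, log = console.log) {
      const start = process.hrtime.bigint(); // monotonic clock, nanoseconds
      try {
        return await fn();
      } finally {
        // Runs whether fn resolved or threw, so failures are timed too
        const durationMs = Number(process.hrtime.bigint() - start) / 1e6;
        log(`${label} took ${durationMs.toFixed(1)} ms`);
        if (durationMs > BUDGET_MS) {
          log(`${label} exceeded the ${BUDGET_MS} ms budget`);
        }
      }
    }
    ```

    Wrapping each suspect operation separately (for example, one call around a database lookup and one around a CRM request) shows which of them is using up the budget.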

  2. Move heavy processing out of the synchronous webhook path

    For Vapi webhooks that must return a response (such as assistant-request or function-call), respond immediately with the minimum required data and defer all non-essential work to a background queue. Never await slow operations such as database writes, email sends, or analytics events inside a synchronous webhook handler.

    // Vapi webhook handler: respond fast, process asynchronously.
    // Assumes an Express `app` with JSON body parsing enabled (app.use(express.json())).
    app.post('/vapi/webhook', async (req, res) => {
      const { message } = req.body;

      if (message.type === 'function-call') {
        const { functionCall } = message;

        // For fast, cacheable lookups: respond inline
        if (functionCall.name === 'get_customer_name') {
          const name = await cacheLayer.get(`customer:${functionCall.parameters.phone}`);
          return res.json({ result: name || 'Unknown' });
        }

        // For slow operations: schedule the work to run after the response is sent
        // and return a placeholder quickly. Errors are caught so a failed
        // background job does not surface as an unhandled rejection.
        setImmediate(async () => {
          try {
            await processSlowOperation(functionCall, message.call?.id);
          } catch (err) {
            console.error('Background processing failed:', err);
          }
        });
        return res.json({ result: 'Processing your request...' });
      }

      // Acknowledge non-critical events immediately
      res.sendStatus(200);
    });

  3. Reduce cold start latency for serverless functions

    If your webhook runs on AWS Lambda, Vercel, or another serverless platform, cold starts can add roughly 500 ms to 3,000 ms of latency, which can consume most of the timeout budget on their own. Use provisioned concurrency (Lambda) or configure minimum instances (Cloud Run) to keep your webhook handler warm. If cold starts remain a recurring issue, move the function to an always-on compute option.

  4. Pre-cache data needed for webhook responses

    Identify the data your webhook needs to construct responses (e.g., assistant configurations, customer records, context data) and pre-populate a fast cache (Redis, Memcached, in-memory) so that webhook handlers can read it in a few milliseconds or less (well under a millisecond for an in-process cache) rather than querying a database.

    // Pre-warm assistant configuration cache at startup.
    // Assumes db.query resolves to an array of rows.
    let assistantCache = new Map();

    async function warmAssistantCache() {
      const assistants = await db.query('SELECT * FROM assistants WHERE active = true');
      // Build a fresh map and swap it in so deactivated assistants drop out
      const next = new Map();
      for (const assistant of assistants) {
        next.set(assistant.phone_number, assistant);
      }
      assistantCache = next;
      console.log(`Cached ${assistants.length} assistants`);
    }

    // Keep serving the previous cache if a refresh fails
    const refreshCache = () =>
      warmAssistantCache().catch((err) => console.error('Cache refresh failed:', err));

    // Warm once at startup, then refresh every 60 seconds
    refreshCache();
    setInterval(refreshCache, 60 * 1000);

    // In webhook handler: uses cache, not DB
    app.post('/vapi/webhook', (req, res) => {
      const { message } = req.body;
      if (message.type === 'assistant-request') {
        const to = message.call.to;
        const assistant = assistantCache.get(to);
        if (!assistant) return res.status(404).json({ error: 'No assistant configured' });
        return res.json({ assistant: buildAssistantConfig(assistant) });
      }
      res.sendStatus(200);
    });

  5. Add a request timeout to all downstream API calls within the webhook

    Set explicit timeouts on every downstream API call made from within your webhook handler (e.g., database queries, CRM lookups, LLM calls). This prevents a single slow external service from causing your entire webhook to miss Vapi's timeout. If a downstream call times out, return a sensible default instead of waiting indefinitely.
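
    One way to apply this is a small wrapper that races the downstream call against a timer and resolves with a default when the timer wins. This is a sketch, not a library API: `withTimeout` is a hypothetical helper, and `crmClient.lookup` in the usage comment is a placeholder for whatever client you actually call.

    ```javascript
    // Hypothetical helper: resolves with `fallback` if `promise` has not settled within `ms`.
    // It does not cancel the underlying request, so prefer the client's own timeout
    // or abort option where one exists and use this as a backstop.
    function withTimeout(promise, ms, fallback) {
      let timer;
      const timeout = new Promise((resolve) => {
        timer = setTimeout(() => resolve(fallback), ms);
      });
      // Whichever settles first wins; the timer is always cleared afterwards
      return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
    }

    // Example usage inside a handler (crmClient is a placeholder):
    //   const customer = await withTimeout(crmClient.lookup(phone), 1500, null);
    //   return res.json({ result: customer ? customer.name : 'Unknown' });
    ```

    A rejection from the downstream call still propagates to the caller, so wrap the await in try/catch if an error should also produce the default.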

  6. Monitor webhook latency and alert on P95 > 3 seconds

    Instrument your webhook endpoint with percentile latency metrics (P50, P95, P99). Alert when P95 latency exceeds 3 seconds — this gives you a 2-second buffer before Vapi's timeout and time to investigate before user impact occurs.
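
    To illustrate the percentile check, here is a minimal nearest-rank calculation over a window of recorded latencies. `percentile`, `shouldAlert`, and `P95_ALERT_MS` are hypothetical names. In production a metrics system would normally compute these values from histograms, so treat this only as a sketch of the arithmetic.

    ```javascript
    // Nearest-rank percentile over an array of latency samples (milliseconds).
    function percentile(samples, p) {
      if (samples.length === 0) return null;
      const sorted = [...samples].sort((a, b) => a - b);
      // Rank is 1-based: the smallest value with at least p% of samples at or below it
      const rank = Math.ceil((p * sorted.length) / 100);
      return sorted[Math.max(rank, 1) - 1];
    }

    const P95_ALERT_MS = 3000; // alert threshold from the step above

    // True when the P95 of the current window exceeds the threshold
    function shouldAlert(samples) {
      const p95 = percentile(samples, 95);
      return p95 !== null && p95 > P95_ALERT_MS;
    }
    ```

    Evaluating `shouldAlert` over a sliding window (for example, the last few minutes of requests) keeps a single slow request from paging anyone while still catching sustained degradation.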

  7. Deploy your webhook server geographically close to Vapi's infrastructure

    Check where Vapi routes traffic from (US-based infrastructure) and deploy your webhook server in the same or a nearby region to minimize network round-trip time. A webhook server in the same AWS region as Vapi's services will have significantly lower network latency than one hosted on a different continent.

  8. Implement webhook idempotency to safely handle Vapi retries

    Vapi may retry webhooks after a timeout. Ensure your webhook handler is idempotent: processing the same event twice should produce the same result without repeating side effects (double charges, duplicate records, etc.). Use the call ID and event type as an idempotency key, and keep the key free of anything that changes between deliveries, such as a timestamp.

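    // In-memory store: only deduplicates within a single process.
    // Use a shared store (e.g., Redis) when running multiple instances.
    const processedEvents = new Set();

    app.post('/vapi/webhook', async (req, res) => {
      const { message } = req.body;
      // The key must be identical across retries, so it cannot include Date.now().
      // Event types that can fire more than once per call need an extra
      // discriminator from the payload, if one is available.
      const eventKey = `${message.call?.id}:${message.type}`;

      // Idempotency check for retried events
      if (processedEvents.has(eventKey)) {
        console.info('Duplicate webhook event, skipping:', eventKey);
        return res.sendStatus(200);
      }
      processedEvents.add(eventKey);
      // TTL cleanup: remove after 5 minutes
      setTimeout(() => processedEvents.delete(eventKey), 300000);

      // Process the event...
      res.sendStatus(200);
    });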

Prevention

Prevent Vapi webhook timeouts by treating the 5-second response window as a strict SLA and engineering your webhook handlers to respond in under 2 seconds under normal load. Pre-cache all data needed for synchronous webhook responses at startup. Use in-process synchronous operations (cache hits, preloaded configuration) for the critical path and defer all I/O-bound work to background queues. Run load tests on your webhook endpoint before go-live to validate latency under production traffic levels. Set up continuous latency monitoring with automated alerts so degradation is detected and resolved before it causes call failures.
