How to Build an AI Receptionist with Vapi (Beginner Tutorial)

Why Build on Vapi Instead of Retell

If you already read our Retell AI tutorial, you know the fastest path to a working receptionist is a platform that bundles everything into one per-minute price. Vapi is the opposite philosophy, and for some agencies it is the better one.

Vapi is modular. Instead of one bundled price, you assemble the agent from parts you choose: the speech-to-text engine, the language model, the voice, and the telephony. You pay each provider separately, and Vapi adds a thin orchestration fee on top — around 0.05 dollars per minute. That sounds cheaper than Retell, and the raw platform fee is. But the honest all-in number is higher than the headline, because the transcriber, the model, and the voice each carry their own cost. We will add those up at the end so you price correctly.

The reason agencies choose Vapi anyway: control. You can swap any component, route calls to custom code, and build genuinely multi-tenant infrastructure through the API. If you have light development capability and you want to own the whole stack, this is the build. If you want it to work in an hour with no assembly, Retell is the gentler start. Our white-label platform guide compares the two side by side for resellers.

We will use a dental office as the running example, the same as the Retell walkthrough, so you can compare the two builds directly.

What You Need Before You Start

You can gather all of this in about 20 minutes:

A Vapi account (free to start, pay as you go after)
A Twilio account for the production phone number (about 1 dollar a month per local number) — Vapi also gives you a free number for testing
An LLM provider key (OpenAI or Anthropic) — unlike Retell, Vapi expects you to bring your own model key on most plans
A Cal.com account for booking (free tier, integrates cleanly through Vapi tools)

That is the stack. The extra key compared to Retell is the LLM provider — that is the trade-off for Vapi letting you pick and swap the model freely.

Step 1: Create the Assistant

Inside Vapi, go to Assistants and create a new one. An assistant in Vapi is the bundle of model, voice, transcriber, and prompt that defines how the agent behaves. You pick four things up front:

Setting	Recommended choice	Why
Model	GPT-4o mini or Claude Haiku	Fast and cheap for high call volume; upgrade only if reasoning is weak
Transcriber	Deepgram Nova-2	The default; lowest latency speech-to-text, about 0.01 per minute
Voice	Cartesia or PlayHT	Natural, low-latency, far cheaper than ElevenLabs for everyday use
First message mode	Assistant speaks first	A receptionist should greet the caller, not wait

Name the assistant something you can find later across many clients, like "Bright Smiles Dental — Reception." When you run 10 clients in one Vapi org, naming discipline saves you.

Step 2: Write the System Prompt

This is where 90 percent of the quality lives, and it is identical in importance whether you build on Vapi or Retell. A good receptionist prompt has five parts: identity, scope, knowledge, behavior rules, and a booking instruction.

Identity. You are the front desk assistant for Bright Smiles Dental in Austin, Texas. You are warm, brief, and efficient. If asked, you confirm you are an AI assistant for the practice.

Scope. You answer questions about hours, location, services, insurance, and new-patient onboarding. You book, reschedule, and cancel appointments. You do not give medical or clinical advice.

Knowledge. Hours: Monday to Friday, 8am to 5pm. New patient exams are 297 dollars or covered by most PPO plans. We do not currently accept Medicaid. Parking is free in the rear lot.

Behavior rules. Keep replies under two sentences. Ask one question at a time. If the caller describes a dental emergency with severe pain or bleeding, say you will connect them to staff and trigger a transfer. Never invent prices or insurance details — if you do not know, offer to take a message.

Booking instruction. When the caller wants an appointment, collect their full name, phone number, and preferred day, then call the booking tool to check availability and confirm.

Keep the whole prompt under 500 words. Long prompts slow the agent down and make it hallucinate. This scaffold is the single biggest time sink when you onboard clients, which is exactly why a ready-made prompt library and setup blueprint pays for itself on the first build.

Step 3: Set the Greeting and Interruption Behavior

Set the first message the assistant speaks. Disclose the business and invite a request in one breath:

Thank you for calling Bright Smiles Dental, this is the front desk assistant — how can I help you today?

That opener sounds professional and satisfies the AI-disclosure expectation in states like California where identifying as an automated assistant matters. For inbound calls it is a soft, friendly disclosure rather than a legal warning.

In Vapi, tune two settings that decide whether the agent sounds human. Set the interruption sensitivity so the agent stops talking the moment the caller jumps in, and set the response delay low so it does not leave awkward pauses. A receptionist that talks over people or pauses for two seconds sounds robotic instantly. Vapi exposes these as the start-speaking and stop-speaking plan settings — leave them near default at first, then tighten after your test calls.

Step 4: Connect Calendar Booking With a Tool

This is the feature that turns a toy into a paid service. A receptionist that only answers questions is worth maybe 99 dollars a month. One that books appointments straight into the client's calendar is worth 297 to 497.

In Vapi this is done with a tool (a function the assistant can call mid-conversation). You have two paths:

Native Cal.com tool. Vapi ships a built-in Cal.com integration. Create an API key in Cal.com, grab the event type ID for "New Patient Exam," and connect it as a tool on the assistant. This is the no-code path and the one most agencies use.
Custom function via server URL. For anything Cal.com cannot do, define a custom tool and point it at a server URL you control (a small endpoint, an n8n flow, or a Make scenario). Vapi sends the function call to your URL, your code books the slot, and returns the result to the agent. This is the developer path that makes Vapi powerful.

Then tell the agent in the prompt when to call the tool — after it has collected name, phone, and preferred time. Add a second tool for checking availability so the agent never offers a slot that is already full. Now when a caller says "can I come in Thursday morning," the agent checks real availability, offers two open slots, and writes the booking. That is the demo that closes clients.

Step 5: Attach a Phone Number and Test

For testing, use the free number Vapi assigns. For production, buy a local Twilio number that matches the client's area code (callers trust local numbers) and import it into Vapi, then assign it to your assistant.

Before you ever go live, call the number yourself and run these scenarios:

A normal booking ("I want a cleaning next week")
A pricing question ("how much is a new patient exam")
An out-of-scope question ("can you tell me if my tooth needs a root canal")
An interruption (start talking over the agent mid-sentence)
An emergency phrase ("my mouth is bleeding and I'm in a lot of pain")

You are listening for three things: latency under one second, no invented facts, and a clean handoff or message when the agent hits its limits. Open the call log in Vapi after each test — every transcript shows you where the prompt is weak. Expect to revise two or three times. The first version is never the deployed version.

Step 6: Go Live and Monitor

Point the client's existing business line to the Twilio number using call forwarding, or set the agent to handle overflow and after-hours only if the client is nervous. After-hours-only is a great low-risk way to land a skeptical first client.

Watch the first 20 to 30 calls in the Vapi call log. If you wired a server URL for booking, watch your endpoint logs too — a silent webhook failure is the most common reason a Vapi booking "works in testing" but drops live calls. Confirm every booking tool call returns a success the agent can read back to the caller.

What This Actually Costs You to Run

Here is the honest all-in number, per client, per month, at roughly 500 minutes of calls. This is where Vapi's modular pricing surprises beginners — the 0.05 per minute is only the platform fee.

Item	Cost
Vapi platform fee (about 0.05 per minute)	25 dollars
Deepgram transcription (about 0.01 per minute)	5 dollars
LLM tokens (GPT-4o mini or Claude Haiku)	4 to 8 dollars
Voice (Cartesia or PlayHT, usage based)	5 to 12 dollars
Twilio local number and minutes	3 to 6 dollars
Your all-in cost	about 42 to 56 dollars

Charge the client 297 dollars a month and your gross margin is still north of 80 percent. The lesson: Vapi's headline rate is lower than Retell's bundled rate, but once you stack the transcriber, model, and voice, the two land in the same neighborhood. Choose Vapi for control and flexibility, not because you expect it to be dramatically cheaper.

Common Mistakes That Sink Beginner Vapi Builds

Forgetting the stacked costs. Pricing a client off the 0.05 platform fee alone will quietly eat your margin. Use the all-in table above.
Unmonitored server URLs. If your booking runs through a custom endpoint, a failed webhook fails silently on the call. Log and alert on every tool call.
Bloated prompts. Over 500 words and the agent slows down and starts inventing answers. Cut ruthlessly.
No "I don't know" path. Always give the agent a fallback (take a message, offer a callback) so it never guesses on prices or insurance.
Premium voice you don't need. Cartesia and PlayHT are excellent and cheap. Do not add an ElevenLabs bill until a client asks for voice cloning.
Outbound calling without consent. Inbound is the safe zone. Outbound triggers TCPA rules — do not freelance there.

Vapi or Retell — Which Should You Resell?

Both end at the same place: a working AI receptionist you charge 297 to 497 a month for. The difference is the path.

Choose Retell if you want the fastest no-code build, a bundled price that is easy to quote, and a clean agency sub-account model. Start with the Retell tutorial.
Choose Vapi if you have light dev capability, want to swap providers freely, and plan to build custom booking or multi-tenant infrastructure through the API.

For a deeper reseller-economics breakdown of both, plus Synthflow and Bland, read the white-label platform guide.

Where to Go From Here

You now have a working AI receptionist on Vapi and the real cost math to price it. The slow part is doing it again for the next client — rewriting prompts, rebuilding the booking tool, and packaging the offer.

If you want the shortcut, the AI Receptionist Agency Launch System gives you the done-for-you version of this build: four guides covering client acquisition, pricing, the sales call, and onboarding, plus the templates you reuse on every deployment. The standalone prompt library and setup blueprint drop straight into the steps above so your second build takes 20 minutes instead of an hour. And the free ROI calculator is the demo you put in front of a prospect to justify the retainer on the first call.

Build one, demo it, charge for it. The platform is ready — the only thing between you and your first paying client is the offer.