A car dealership phone call looks simple from the outside: someone wants to ask about a vehicle, book a test drive, change an appointment, or reach the service desk.
Under the hood, it is a good example of why voice agents need workflow design, not just a nice prompt. The caller may mention a vehicle, a preferred time, a trade-in, finance questions, and location constraints in the same conversation. If the system only “chats”, the best it can produce is a summary for someone else to act on. If it is designed as an agent, it can actually move the booking forward.
Here is the architecture pattern we use for dealership-style test drive calls.
The core call flow
A useful test-drive agent needs to handle a few jobs in order:
- Identify the caller’s intent.
- Capture the vehicle or category they are interested in.
- Check whether the next step is sales, service, finance, or human escalation.
- Offer appointment slots from the dealership calendar.
- Create or update the booking through a tool call.
- Send a clean handoff note to the sales team.
The important part is that the LLM should not be the source of truth. It should interpret the conversation and decide which tool to call. Stock availability, staff calendars, lead ownership, and appointment writes should come from deterministic systems.
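The routing step in that list (sales, service, finance, or human escalation) is a good candidate for plain code rather than prompt text. Here is a minimal sketch, assuming the classifier emits a short intent label. Apart from `book_test_drive`, the intent names and the `Route` enum are made up for illustration.

```python
from enum import Enum
from typing import Optional


class Route(Enum):
    SALES = "sales"
    SERVICE = "service"
    FINANCE = "finance"
    HUMAN = "human"


# Hypothetical intent labels; a real deployment would define its own set.
INTENT_ROUTES = {
    "book_test_drive": Route.SALES,
    "reschedule_test_drive": Route.SALES,
    "vehicle_enquiry": Route.SALES,
    "service_booking": Route.SERVICE,
    "finance_question": Route.FINANCE,
}


def route_for(intent: Optional[str]) -> Route:
    """Map a classified intent to a desk; unknown or missing intents go to a person."""
    return INTENT_ROUTES.get(intent, Route.HUMAN)
```

The default matters more than the table: anything the classifier cannot place falls through to a human instead of being forced into the nearest workflow.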
A practical stack
A typical implementation looks like this:
Phone call
-> telephony provider
-> streaming speech-to-text
-> dialogue state + intent classifier
-> LLM response planner
-> dealership tools: stock, calendar, CRM, SMS
-> text-to-speech
-> caller hears the response
That “dialogue state” layer matters. Without it, every turn becomes a fresh guess. With it, the agent can remember that the caller asked about a used SUV, prefers Saturday, and wants a salesperson to follow up about finance.
A simplified state object might look like this:
{
  "intent": "book_test_drive",
  "vehicle_interest": {
    "make": null,
    "model": null,
    "stock_id": null,
    "notes": "caller described the car from the website"
  },
  "caller": {
    "name": null,
    "phone": null,
    "preferred_contact_method": "sms"
  },
  "appointment": {
    "preferred_day": null,
    "confirmed_slot": null,
    "location": null
  },
  "handoff_required": false
}
The LLM can help fill and update this object, but the booking should only happen after validation.
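One way to enforce "validate before booking" is a small gate that runs on the state object before any write. The sketch below assumes the state is a plain dictionary shaped like the example above. Which fields count as required is our assumption, not a fixed rule.

```python
from typing import Any, Dict, List

# Assumed minimum for a booking write; adjust to the dealership's own policy.
REQUIRED_FIELDS = [
    ("vehicle_interest", "stock_id"),
    ("caller", "name"),
    ("caller", "phone"),
    ("appointment", "confirmed_slot"),
    ("appointment", "location"),
]


def missing_fields(state: Dict[str, Any]) -> List[str]:
    """Return dotted paths of required fields that are still empty."""
    missing = []
    for section, key in REQUIRED_FIELDS:
        if not (state.get(section) or {}).get(key):
            missing.append(f"{section}.{key}")
    return missing


def ready_to_book(state: Dict[str, Any]) -> bool:
    """Allow a booking write only for a complete test-drive request with no handoff flag."""
    return (
        state.get("intent") == "book_test_drive"
        and not state.get("handoff_required", False)
        and not missing_fields(state)
    )
```

The list returned by `missing_fields` doubles as a to-do list for the dialogue: the agent's next question can target the first missing path instead of guessing.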
Tool calls beat long prompts
For this kind of agent, the safest design is a small set of explicit tools:
search_stock(query)
check_test_drive_slots(location, vehicle_id)
create_test_drive_booking(slot, caller, vehicle_id)
send_confirmation(contact, booking_id)
handoff_to_sales(reason, summary)
This keeps the model from inventing inventory or promising times that do not exist. The model’s job is to ask the next useful question and call the right tool when it has enough information.
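To make that concrete, here is a rough sketch of a dispatcher that only runs tools from a fixed registry and rejects calls whose arguments do not match the tool's signature. The tool bodies are stubs with made-up return values. Real versions would call the stock system, calendar, CRM, and SMS provider. The `dispatch` helper and its result shape are assumptions for illustration.

```python
import inspect
from typing import Any, Dict


# Stub tools with hypothetical data, standing in for real dealership systems.
def search_stock(query):
    return [{"stock_id": "S-1", "title": "Used SUV"}] if "suv" in query.lower() else []


def check_test_drive_slots(location, vehicle_id):
    return ["2030-01-05T10:00"]


def create_test_drive_booking(slot, caller, vehicle_id):
    return {"booking_id": "bk_1", "slot": slot}


def send_confirmation(contact, booking_id):
    return {"sent": True}


def handoff_to_sales(reason, summary):
    return {"queued": True}


TOOLS = {
    fn.__name__: fn
    for fn in (
        search_stock,
        check_test_drive_slots,
        create_test_drive_booking,
        send_confirmation,
        handoff_to_sales,
    )
}


def dispatch(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Run a model-requested tool call only if the tool exists and the arguments fit."""
    fn = TOOLS.get(name)
    if fn is None:
        return {"ok": False, "error": f"unknown tool: {name}"}
    try:
        # bind() raises TypeError on missing or unexpected arguments.
        inspect.signature(fn).bind(**arguments)
    except TypeError as exc:
        return {"ok": False, "error": f"bad arguments for {name}: {exc}"}
    return {"ok": True, "result": fn(**arguments)}
```

A rejected call returns a structured error rather than raising, so the dialogue layer can decide whether to ask the caller another question or hand off.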
Escalation is a feature, not a failure
A dealership voice agent should not try to answer everything. Finance, complaints, unusual vehicle history questions, and negotiation are often better handled by a person.
The agent should be confident about routine workflows and conservative about everything else:
- If stock data is unclear, hand off.
- If the caller is frustrated, hand off.
- If the caller asks for a binding finance answer, hand off.
- If the booking write fails, explain clearly and create a callback task.
This is how you avoid the worst version of AI automation: a caller stuck in a polite loop while the system pretends it can help.
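The four rules above are simple enough to live in code where they can be read and tested. A minimal sketch follows, assuming upstream components supply these signals. The parameter names, the `create_callback_task` action, and the reading of "unclear stock" as "not exactly one match" are illustrative assumptions.

```python
from typing import Dict, Optional


def escalation_decision(
    stock_matches: int,
    caller_frustrated: bool,
    wants_binding_finance_answer: bool,
    booking_write_failed: bool,
) -> Optional[Dict[str, str]]:
    """Return the escalation to perform, or None if the agent may continue."""
    if booking_write_failed:
        # The agent still explains the failure to the caller; this records the follow-up.
        return {"action": "create_callback_task", "reason": "booking_write_failed"}
    if caller_frustrated:
        return {"action": "handoff_to_sales", "reason": "caller_frustrated"}
    if wants_binding_finance_answer:
        return {"action": "handoff_to_sales", "reason": "binding_finance_question"}
    if stock_matches != 1:
        # Assumption: zero or several matches both count as unclear stock data.
        return {"action": "handoff_to_sales", "reason": "stock_unclear"}
    return None
```

Running this check on every turn, before the response is planned, keeps the handoff decision out of the model's hands.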
Operational details that matter
The unglamorous pieces are usually what make the system feel reliable:
- Idempotent booking writes so a repeated tool call does not create duplicate appointments.
- Structured call summaries so sales staff can see the caller’s intent quickly.
- Consent-aware recording and transcription based on the market and business policy.
- Tool error handling with a fallback path instead of vague apologies.
- Human transfer rules that are visible to the dealership, not buried in a prompt.
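Idempotent writes are the item on that list most worth sketching. The idea is to derive a stable key from the booking's identifying details and return the existing record when the same key arrives again. The in-memory `BookingStore` below is a stand-in for a real booking system, and the choice of key fields is an assumption.

```python
import hashlib
from typing import Dict


class BookingStore:
    """In-memory stand-in for a booking backend with idempotent creates."""

    def __init__(self) -> None:
        self._by_key: Dict[str, Dict[str, str]] = {}

    @staticmethod
    def idempotency_key(phone: str, vehicle_id: str, slot: str) -> str:
        # Same caller, vehicle, and slot always hash to the same key.
        raw = f"{phone}|{vehicle_id}|{slot}"
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def create(self, phone: str, vehicle_id: str, slot: str) -> Dict[str, str]:
        key = self.idempotency_key(phone, vehicle_id, slot)
        existing = self._by_key.get(key)
        if existing is not None:
            return existing  # repeated tool call: no second appointment
        booking = {
            "booking_id": f"bk_{len(self._by_key) + 1}",
            "phone": phone,
            "vehicle_id": vehicle_id,
            "slot": slot,
        }
        self._by_key[key] = booking
        return booking

    def count(self) -> int:
        return len(self._by_key)
```

If the model retries `create_test_drive_booking` after a timeout, the second call returns the first booking's ID, so the confirmation message can still go out without a duplicate appointment on the calendar.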
The takeaway
A test-drive voice agent is not just “an LLM on the phone”. It is a conversational interface wrapped around dealership systems.
The agent should understand natural speech, but the workflow should still be boringly explicit: validate the vehicle, check the calendar, create the booking, confirm with the caller, and hand off anything risky.
That balance is what turns a demo into a production system.
Full VoiceFleet article: AI voice agent for car dealership test-drive booking













