A restaurant phone agent script is eight sections. Here is the template, read off a real 102-second call.
Most guides about generating a script for a restaurant AI phone agent stop at principles. Be context-aware. Upsell naturally. Handle modifiers. None of that builds a working ticket. This page is the eight-section template, every section anchored to a timestamp on a real PieLine production call.
Direct answer (verified 2026-05-05)
Is there a public script generator that produces working restaurant phone scripts? Not one that hits POS modifier IDs.
Voiceflow, Vapi, and Synthflow templates stop at the conversation layer. The eight-section template below covers the conversation layer plus the four sections (menu-lookup buffer, modifier elicitation, non-menu modifier confirm-back, POS placement) where most roll-your-own projects stall. PieLine ships all eight sections during onboarding from a menu scrape plus a POS modifier map.
The script is in the file, not in the principles
The captions array in src/components/voice-activity-data.ts has 46 rows, each with a speaker, a start time, an end time, and a line of text. The file was generated by Deepgram multichannel transcription from public/audio/dennys-order.mp3. It is a real Denny's order PieLine ran end to end. Total duration 102.36 seconds. Total dollar amount on the ticket $34.11. Pickup quoted at 12:45 AM.
Read the captions in order and a structure falls out. There is a greeting block, a buffer block, a modifier elicitation block, a recap block, an upsell block, a non-menu-modifier block, a final confirm block, and a POS placement block. Eight sections, in that order. Every working restaurant phone script we have seen has those eight, in roughly those proportions. That is the template.
The rest of this page walks each section, with the literal lines and the literal timestamps, so you can write or evaluate a script against something concrete instead of a list of adjectives.
Section timing on the Denny's call
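The timing windows quoted in the section walkthrough further down can be collected in one place. The TypeScript below is a minimal sketch: the `SectionWindow` type, its field names, and the helper functions are illustrative assumptions, not the shape of any PieLine file. Only the start and end values come from this page, and they cover the quoted lines for each section, so caller replies fall in the gaps between windows.

```typescript
// Hypothetical shape for illustration; only the numbers come from this page.
interface SectionWindow {
  section: number;
  name: string;
  startSec: number;
  endSec: number;
}

const dennysSectionWindows: SectionWindow[] = [
  { section: 1, name: "Greeting plus recorded-line disclosure", startSec: 0.0, endSec: 3.44 },
  { section: 2, name: "Order capture plus menu-lookup buffer", startSec: 5.36, endSec: 11.84 },
  { section: 3, name: "Modifier elicitation", startSec: 15.98, endSec: 25.66 },
  { section: 4, name: "Order recap plus add-on prompt", startSec: 37.56, endSec: 43.16 },
  { section: 5, name: "Branded upsell", startSec: 52.52, endSec: 59.44 },
  { section: 6, name: "Non-menu modifier confirm-back", startSec: 65.98, endSec: 75.34 },
  { section: 7, name: "Final recap plus single confirm", startSec: 75.42, endSec: 84.06 },
  { section: 8, name: "POS placement, total, pickup time", startSec: 89.12, endSec: 99.28 },
];

// Length of a window in seconds, rounded to two decimals to hide float noise.
function windowSeconds(w: SectionWindow): number {
  return Math.round((w.endSec - w.startSec) * 100) / 100;
}

// True when every window starts at or after the previous one ends.
function windowsAreOrdered(windows: SectionWindow[]): boolean {
  return windows.every((w, i) => i === 0 || w.startSec >= windows[i - 1].endSec);
}
```

For example, `windowSeconds(dennysSectionWindows[4])` gives the length of the upsell window, 6.92 seconds.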
How the template gets generated for your restaurant
The eight sections stay the same across cuisines. The contents of section 3 (modifier elicitation) and section 6 (non-menu modifier handling) depend on your menu and your POS, which is where most in-house projects burn weeks. PieLine handles this in a four-step onboarding.
- Step 1: Menu scrape. PieLine pulls your live online menu and parses items, prices, and dish-level descriptions. Spiciness, sweetness, ingredients, allergen flags.
- Step 2: POS modifier map. Each dish maps to a POS item ID. Each cuisine modifier (half-and-half toppings, eggs prep, spice level, protein sub) maps to a POS modifier ID.
- Step 3: Cuisine grammar wire. Modifier types declared per dish so section 3 of the script asks for them in the right order. No free-text notes.
- Step 4: Live call refinement. First month is hands-on. PieLine listens to real calls, tunes the upsell line, the buffer wording, and any cuisine-specific phrases the brand wants in the agent voice.
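Steps 2 and 3 amount to a lookup table from dishes and spoken options to POS IDs. A minimal sketch of what such a table could look like follows. Every type name and every ID here is made up for illustration (this page does not show PieLine's actual data model), and "over easy" is an invented option; the bread options are the ones read out on the Denny's call.

```typescript
// All IDs and names below are hypothetical; real IDs come from the POS.
interface ModifierTypeDecl {
  type: string;                     // e.g. "eggs_prep"
  options: Record<string, string>;  // spoken option -> POS modifier ID
}

interface DishDecl {
  posItemId: string;
  modifierTypes: ModifierTypeDecl[]; // array order = elicitation order
}

const sampleModifierMap: Record<string, DishDecl> = {
  "lumberjack slam": {
    posItemId: "item_0001",
    modifierTypes: [
      { type: "eggs_prep", options: { scrambled: "mod_0101", "over easy": "mod_0102" } },
      {
        type: "bread",
        options: { white: "mod_0201", brown: "mod_0202", multigrain: "mod_0203", sourdough: "mod_0204" },
      },
    ],
  },
};

// Resolve a spoken option to a POS modifier ID, or undefined when nothing is mapped.
function resolveModifierId(dish: string, type: string, spoken: string): string | undefined {
  const decl = sampleModifierMap[dish];
  const modType = decl?.modifierTypes.find((m) => m.type === type);
  return modType?.options[spoken.toLowerCase()];
}
```

An `undefined` result is the interesting case: it means the caller asked for something the map does not cover, which is what section 6 of the template handles.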
The eight sections, with literal lines and timestamps
Each section below shows the exact text and the timing window from the Denny's call. Use it as a template against your own scripts. If a section is missing or runs in the wrong order, the ticket will be wrong.
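Checking a script against the template can be done mechanically. The sketch below assumes a script has already been labeled with section identifiers; the identifier strings and the `checkScriptSections` helper are hypothetical names chosen for this example. It treats all eight sections as required, as the walkthrough does.

```typescript
// Section identifiers are illustrative labels for the eight template sections.
const TEMPLATE_SECTIONS = [
  "greeting",
  "order_capture_buffer",
  "modifier_elicitation",
  "recap_addon_prompt",
  "upsell",
  "non_menu_modifier",
  "final_confirm",
  "pos_placement",
] as const;

type SectionId = (typeof TEMPLATE_SECTIONS)[number];

// Returns a list of problems; an empty list means the order matches the template.
function checkScriptSections(sections: SectionId[]): string[] {
  const problems: string[] = [];
  for (const required of TEMPLATE_SECTIONS) {
    if (sections.indexOf(required) === -1) problems.push(`missing: ${required}`);
  }
  let highestSeen = -1;
  for (const s of sections) {
    const idx = TEMPLATE_SECTIONS.indexOf(s);
    if (idx < highestSeen) problems.push(`out of order: ${s}`);
    highestSeen = Math.max(highestSeen, idx);
  }
  return problems;
}
```

A script that fires the upsell before the recap, for instance, would come back with `out of order: recap_addon_prompt`.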
1. Greeting plus recorded-line disclosure
Three short sentences in the first 3.44 seconds. Identifies the brand, sets the recording expectation, and asks one open question.
Literal Denny's opener at 0.00s to 3.44s:
“Hi. This is Denny on a recorded line. What can we get for you?”
Three lines, one question. The recorded-line disclosure is in the opener, not buried later. The open question is “what can we get for you,” not “how can I help you today,” because this is an order line, not a help line.
2. Initial order capture plus menu-lookup buffer
Customer states the order. Agent says one short verbal placeholder while it fetches menu items and modifier defaults from the POS. Reserve a 4-second window.
Captured order at 5.36s to 9.36s:
Customer: “Hi. Yeah. Can I get one lumberjack slim and one Coke?”
Buffer line at 10.96s to 11.84s:
Agent: “One moment, please.”
The next agent word does not land until 15.98s. That is a deliberate 4.14-second silence, measured from the end of the buffer line at 11.84s, while the agent fetches modifier defaults for the lumberjack slam (“how would you like your eggs cooked, what bread”) and writes the Coke onto the ticket as the soft drink modifier. Without the “one moment” line, the caller will start talking over the lookup. With it, the caller waits.
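The gap arithmetic is simple enough to check in code. This sketch uses only the three timestamps quoted above; the function name is an assumption made for the example.

```typescript
// Silence between the end of the buffer line and the next agent word,
// rounded to two decimals to hide floating-point noise.
function silenceAfterBuffer(bufferEndSec: number, nextAgentStartSec: number): number {
  return Math.round((nextAgentStartSec - bufferEndSec) * 100) / 100;
}

// Measured from the END of "One moment, please." (11.84s): 4.14 seconds.
const dennysLookupSilence = silenceAfterBuffer(11.84, 15.98);

// Measured from the START of the buffer line (10.96s): 5.02 seconds.
const dennysLookupFromBufferStart = silenceAfterBuffer(10.96, 15.98);
```

The 4.14-second figure is the silence the caller hears. Counting the buffer line itself, the lookup has about five seconds to finish.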
3. Modifier elicitation in declared order
Agent reads the modifier types attached to each ordered dish and asks for them in a fixed order. Cuisine-specific. Per-dish defaults applied silently when the caller does not specify.
Elicitation block at 15.98s to 25.66s:
“For your lumberjack slam, how would you like your eggs cooked, and what kind of bread would you like? White, brown, multigrain, or sourdough?”
“For your Coke, I'll add a soft drink to your order.”
Two modifier types on the entree (eggs preparation, bread choice), declared in that order. The drink line is interesting: the caller said “Coke,” the agent quietly maps that to the “soft drink” modifier on the POS. That is the cuisine ontology working under the script. A pizza order would elicit half-and-half toppings here. A sushi order would elicit fish swaps and nori orientation. Same section, different declared types.
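A rough sketch of how declared-order elicitation with silent defaults might be expressed follows. The types, the function names, and the "white" bread default are assumptions for illustration; this page does not state what the real defaults are. The two questions and the Coke-to-soft-drink mapping come from the call.

```typescript
interface DeclaredModifier {
  type: string;
  question: string;
  defaultOption?: string; // applied silently when the caller does not specify
}

// Hypothetical declaration for the entree; array order is the asking order.
const slamModifiers: DeclaredModifier[] = [
  { type: "eggs_prep", question: "how would you like your eggs cooked" },
  { type: "bread", question: "what kind of bread would you like", defaultOption: "white" },
];

// Questions still to ask, in declared order, skipping types already answered.
function pendingQuestions(declared: DeclaredModifier[], answered: Record<string, string>): string[] {
  return declared.filter((m) => !(m.type in answered)).map((m) => m.question);
}

// Fill unanswered types with the dish default when one is declared.
function withSilentDefaults(
  declared: DeclaredModifier[],
  answered: Record<string, string>,
): Record<string, string> {
  const result: Record<string, string> = { ...answered };
  for (const m of declared) {
    if (!(m.type in result) && m.defaultOption !== undefined) {
      result[m.type] = m.defaultOption;
    }
  }
  return result;
}

// Spoken names that map onto a canonical POS entry, as with the Coke on this call.
const spokenItemAliases: Record<string, string> = { coke: "soft drink" };

function canonicalItemName(spoken: string): string {
  const key = spoken.trim().toLowerCase();
  return spokenItemAliases[key] ?? key;
}
```

Swapping `slamModifiers` for a pizza or sushi declaration changes the questions without touching the functions, which is the sense in which the section stays the same across cuisines.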
4. Order recap plus open-ended add-on prompt
Agent reads back the line items, then asks one open question. This is the gate before the upsell. If the caller says yes here, the upsell does not fire.
Recap and prompt at 37.56s to 43.16s:
“Great. So that's a lumberjack slam with scrambled eggs and sourdough bread plus a soft drink. Anything else for your order?”
The recap is structured: dish, modifiers in elicitation order, drink. No price yet. The add-on prompt is general (“anything else for your order”), not pushy. Caller says “no, that's it” at 47.41s and gives a name. That is the trigger to advance to section 5.
5. Branded upsell line
Two-sentence dessert or side suggestion in the brand voice, fired after the caller has said the order is complete and provided a name. Single attempt, not a sequence.
Upsell at 52.52s to 59.44s:
“Thanks, Rob. Before I finish up, would you like to add a sweet treat like a slice of New York style cheesecake? It's so good. It might make your Coke jealous.”
Just under seven seconds of upsell (6.92s). The voice carries the brand. One attempt. If the caller says no, the script does not retry with a second item. PieLine reports a 15 to 20 percent lift in average order value from this single section.
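The trigger conditions for this section fit in a single predicate. This is a minimal sketch under the rules stated above (order complete, name given, one attempt only); the `UpsellState` type and its field names are hypothetical.

```typescript
interface UpsellState {
  orderComplete: boolean; // caller has said the order is complete
  callerName?: string;    // set once the caller gives a name
  upsellAttempts: number; // how many upsell lines have already been spoken
}

// Fire only after completion plus a name, and only once per call.
function shouldFireUpsell(state: UpsellState): boolean {
  return state.orderComplete && state.callerName !== undefined && state.upsellAttempts === 0;
}
```

After the line is spoken, incrementing `upsellAttempts` is what keeps a declined cheesecake from turning into a second pitch.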
6. Non-menu modifier acceptance with confirm-back
If the caller asks for a modifier the menu does not list, the agent confirms the literal phrase back with the dish name attached. Does not refuse. Does not silently swallow. Does not ask the kitchen to interpret a free-text note.
Caller invents a modifier at 65.98s to 68.31s:
Customer: “Can you add strawberries, if that's an option?”
Agent confirms back at 71.34s to 75.34s:
Agent: “You got it. One slice of New York style cheesecake with strawberry topping.”
The phrase “strawberry topping” carries the dish name (cheesecake) attached, so when the final recap fires at 75.42s, right after the confirm-back ends, the modifier shows up where the caller expects it. This section is where even the most polished generic voice bots break. The strawberries either silently disappear, or end up as a free-text note the cook has to read.
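The confirm-back can be sketched as a function that attaches the modifier to the dish row and returns the spoken line together. The `OrderRow` shape and `portionLabel` field are assumptions for this example, and the step that turns the caller's "strawberries" into "strawberry topping" is assumed to have happened already, since this page does not describe it.

```typescript
interface OrderRow {
  dish: string;
  portionLabel: string; // e.g. "One slice of"
  modifiers: string[];  // modifiers live on the row, not in a free-text note
}

// Returns an updated row plus the confirm-back line; the input row is left untouched.
function acceptNonMenuModifier(
  row: OrderRow,
  modifierPhrase: string,
): { row: OrderRow; confirmBack: string } {
  const updated: OrderRow = { ...row, modifiers: [...row.modifiers, modifierPhrase] };
  const confirmBack = `You got it. ${row.portionLabel} ${row.dish} with ${modifierPhrase}.`;
  return { row: updated, confirmBack };
}

const cheesecakeRow: OrderRow = {
  dish: "New York style cheesecake",
  portionLabel: "One slice of",
  modifiers: [],
};

const strawberryResult = acceptNonMenuModifier(cheesecakeRow, "strawberry topping");
// strawberryResult.confirmBack reproduces the agent line quoted above.
```

Returning the row and the line from one call is a deliberate choice in this sketch: it makes it hard to speak a confirmation for a modifier that never reached the ticket.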
7. Final recap plus single confirm
Full order read back, all modifiers including the non-menu one, single closed-form confirm. Caller says yes once and the order goes to the POS.
Final recap at 75.42s to 84.06s:
“Can I confirm your order is a lumberjack slam with scrambled eggs and sourdough bread, a soft drink, and a New York style cheesecake with strawberry topping? Is that correct?”
Closed-form yes-or-no. Caller says “yeah, that's right” at 86.81s. The strawberry topping is in the recap because section 6 wired it onto the cheesecake row, not as a kitchen note.
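Because the modifiers sit on the rows, the final recap can be generated from the ticket rather than scripted. A minimal sketch, with hypothetical type and function names, that reproduces the recap sentence from this call:

```typescript
interface RecapRow {
  dish: string;
  modifiers: string[]; // spoken form, in elicitation order
}

// "A", "A and B", or "A, B, and C".
function joinSpoken(parts: string[]): string {
  if (parts.length <= 2) return parts.join(" and ");
  return `${parts.slice(0, -1).join(", ")}, and ${parts[parts.length - 1]}`;
}

// Always uses the article "a"; a real script would need "an" handling too.
function describeRow(row: RecapRow): string {
  const mods = row.modifiers.length > 0 ? ` with ${joinSpoken(row.modifiers)}` : "";
  return `a ${row.dish}${mods}`;
}

// Full read-back followed by one closed-form question.
function finalRecap(rows: RecapRow[]): string {
  return `Can I confirm your order is ${joinSpoken(rows.map(describeRow))}? Is that correct?`;
}

const dennysTicketRows: RecapRow[] = [
  { dish: "lumberjack slam", modifiers: ["scrambled eggs", "sourdough bread"] },
  { dish: "soft drink", modifiers: [] },
  { dish: "New York style cheesecake", modifiers: ["strawberry topping"] },
];

const dennysFinalRecap = finalRecap(dennysTicketRows);
```

If the strawberry topping had been stored as a kitchen note instead of a row modifier, it would simply be absent from `dennysFinalRecap`.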
8. POS placement, total, and pickup time
Agent places the order in the POS, quotes the total the POS calculated (not a guess), and quotes the ready-by time the POS returned (not a hard-coded duration).
Placement and close at 89.12s to 99.28s:
“Placing your order now.”
“Done.”
“Your total is $34.11, and your order will be ready for pickup at 12:45 AM.”
“Thank you for calling Denny's, Rob.”
The $34.11 figure is what the POS computed once the line items and modifiers were applied, not a precomputed estimate. The 12:45 AM pickup time is the POS's ready-by quote based on current kitchen load. Both are dynamic. The script just reads them out.
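The point that the script only reads out what the POS returns can be shown with a stubbed POS call. Everything about the POS interface here is a placeholder: `PosOrderResult`, `PlaceOrderFn`, and the fake implementation are invented for the example, and a real integration would be asynchronous and vendor-specific.

```typescript
// Hypothetical response shape; real POS APIs differ.
interface PosOrderResult {
  totalCents: number; // computed by the POS from line items and modifiers
  readyBy: string;    // ready-by time as returned by the POS
}

type PlaceOrderFn = (posItemIds: string[]) => PosOrderResult;

// The script never computes a total or a time; it formats what comes back.
function closingLines(placeOrder: PlaceOrderFn, posItemIds: string[]): string[] {
  const result = placeOrder(posItemIds);
  const total = (result.totalCents / 100).toFixed(2);
  return [
    "Placing your order now.",
    "Done.",
    `Your total is $${total}, and your order will be ready for pickup at ${result.readyBy}.`,
  ];
}

// Stand-in for a real POS call, returning the values from the Denny's ticket.
const fakePlaceOrder: PlaceOrderFn = () => ({ totalCents: 3411, readyBy: "12:45 AM" });

const dennysClosing = closingLines(fakePlaceOrder, ["item_0001"]);
```

Swapping `fakePlaceOrder` for a stub that returns a different total changes the spoken line with no change to the script, which is the behavior described above.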
What the source file actually looks like
For anyone who wants to verify the timestamps without taking our word for it, here is the row shape in src/components/voice-activity-data.ts and a sample of the actual captions array.
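A sketch of that row shape, reconstructed only from lines quoted on this page. The field names and speaker labels are assumptions based on the description of each row (a speaker, a start time, an end time, a line of text), and the real file may split these lines across more rows.

```typescript
// Field names and speaker labels are assumed, not copied from the source file.
interface CaptionRow {
  speaker: "ai" | "customer";
  start: number; // seconds from the start of the call
  end: number;   // seconds from the start of the call
  text: string;
}

// Four of the lines quoted earlier on this page, with their quoted timestamps.
const sampleCaptions: CaptionRow[] = [
  { speaker: "customer", start: 5.36, end: 9.36, text: "Hi. Yeah. Can I get one lumberjack slim and one Coke?" },
  { speaker: "ai", start: 10.96, end: 11.84, text: "One moment, please." },
  { speaker: "customer", start: 65.98, end: 68.31, text: "Can you add strawberries, if that's an option?" },
  {
    speaker: "ai",
    start: 71.34,
    end: 75.34,
    text: "You got it. One slice of New York style cheesecake with strawberry topping.",
  },
];

// Sanity check: each row ends after it starts and rows do not run backwards.
function captionsAreChronological(rows: CaptionRow[]): boolean {
  return rows.every((r, i) => r.end > r.start && (i === 0 || r.start >= rows[i - 1].end));
}
```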
Where roll-your-own scripts stall
Four of the eight sections are conversation work. Four are integration work. The conversation sections look easy on a Voiceflow flow diagram or a Vapi prompt. The integration sections are where the project gets stuck for two to six weeks.
Roll your own vs PieLine on the eight-section template
Sections 1, 4, 5, 7 are easy. You write the greeting, the recap, the upsell, and the confirm. Sections 2, 3, 6, 8 are where the project stalls.
- Section 2: you have to design and time the menu-lookup buffer yourself
- Section 3: you author 180 dish modifier definitions and elicitation orders by hand
- Section 6: you build a non-menu modifier ontology that maps phrases to POS modifier IDs
- Section 8: you implement POS placement, total quoting, and pickup-time fetching
“The experience was better than speaking to a human. No hold time, no confusion, no rushing.”
Want the eight-section template wired against your menu?
A 20-minute call walks the template against your cuisine, your POS, and a sample order. We ship the script during onboarding, you do not author it.
Common questions about generating restaurant phone scripts
Is there an open-source script generator I can drop into my own voice stack?
Not one that produces tickets your POS will accept. The public templates floating around (Voiceflow blog flows, Vapi starter prompts, Synthflow demos) all stop at the conversation layer. They give you a personality and a set of intents and trust you to wire the rest. The hard part of a restaurant script is not the words; it is the modifier ontology that maps a sentence like 'half pepperoni, half veggie, gluten-free crust' onto three modifier IDs in Toast and a fractional-pricing rule. PieLine builds that mapping during onboarding from a menu scrape and a POS connection. The eight-section template on this page describes the conversation layer those mappings sit underneath.
Where does the 102.36-second number come from?
It is the literal duration field on src/components/voice-activity-data.ts in the PieLine repo, generated by Deepgram multichannel transcription of public/audio/dennys-order.mp3. Forty-six captions, two amplitude envelopes (customer left, AI right), 60Hz sample rate. Every timestamp on this page is read directly off that file, not reconstructed from memory or rounded.
Why is the menu-lookup buffer 4.14 seconds and not faster?
Because the agent finishes saying 'One moment, please' at 11.84s (the line starts at 10.96s) and the next AI word lands at 15.98s. The 4.14s gap is the time the system spends fetching menu items and modifier defaults from the POS. Pretending the agent is faster than that is how you get scripts that feel robotic in production. A real script reserves the lookup window with a verbal placeholder so the caller does not start talking over the agent. That is section 2 of the template.
Does the upsell line on every PieLine script say 'It might make your Coke jealous'?
No. That specific line was tuned for the Denny's voice. The structural piece every script keeps is the upsell trigger position: it fires after the customer says 'that's it' and gives a name, before the final recap. The line itself is rewritten per restaurant from the dish list, the cuisine, and the brand voice. The Denny's line landed cheesecake on the ticket. PieLine reports a 15-20 percent average order value lift from the upsell pass.
What happens when the caller asks for a modifier that is not on the menu, like the strawberries on the cheesecake?
Section 6 of the template is 'non-menu modifier acceptance with explicit confirm-back.' At 65.98s on the Denny's call, the customer says 'Can you add strawberries, if that's an option?' About three seconds after the customer finishes, at 71.34s, the agent confirms 'One slice of New York style cheesecake with strawberry topping' and the topping shows up in the final recap at 75.42s. The script does not silently swallow the modifier and it does not refuse it; it confirms the literal phrase back with the dish name attached, then includes it in the recap. This is the line where most generic voice bots fail.
How long does it take a real script to take a four-line order end to end?
On the Denny's call, the first AI word is at 0.00s and the last AI word is at 99.28s. Order placement (line items in the POS) lands at 89.12s with the total quoted at 92.00s. The whole call is 102.36s. That is one entree with two modifiers, one drink, one dessert with a non-menu modifier, plus an upsell, a name, and a recap. Anything claiming sub-30-second order takes is either dropping the recap or skipping POS placement.
Do I need to write spice levels and protein substitutions into the script myself?
No. Cuisine-specific modifier types (half-and-half pizzas, spice levels, protein subs, custom sushi rolls) are declared on the dish ontology during PieLine onboarding, not authored line by line in the script. The script template stays the same across cuisines; the modifier elicitation in section 3 reads which modifier types apply to the dish the customer ordered and asks for them in a fixed order. Owners review the resulting flow on a real call; they do not author 180 dish definitions.
What does this cost compared to building a script myself on Voiceflow or Vapi?
$350 per month covers up to 1,000 calls on PieLine including the script build, menu scrape, POS mapping, and active call refinement during the first month. Vapi or Synthflow charge per minute of conversation plus the LLM and TTS provider costs underneath, plus your engineering time to wire and maintain the modifier ontology and POS integration. Most restaurants that try to roll their own stop at the conversation layer because the POS-side work is the expensive part. The money-back guarantee on the first month is the price-equivalent of an agency proof-of-concept.
The same Denny's call from different angles
Keep reading
AI phone agent restaurants: a 102 second real call read line by line
The same Denny's call, narrated as one continuous order rather than as a script template. Useful if you want to read the call in real time.
Cuisine grammar: the part of a restaurant phone agent you cannot demo
Why the eight-section template stays constant while section 3 looks different on a pizza shop than on a sushi bar. Five cuisines compared.
POS integration: where modifier IDs come from
The wiring under section 8 of the template. Why a webhook is not a POS integration and how the strawberry topping ends up as a modifier ID, not a note.