Restaurant technology, evaluated by asking for the receipts
Most articles about restaurant technology read like a menu of categories: AI, IoT, kiosks, robotics, loyalty, delivery. They stop at the category name, so a buyer never learns whether the product inside a category actually works. This guide starts from the opposite end. We publish a real 102.36-second AI phone order in this repo, walk you through the seven layers of the stack that produced it, and give you a six-item checklist to run against every restaurant tech vendor you talk to.
Why category roundups are not enough anymore
Open any restaurant technology article from a trade publication in 2026 and you will find the same shape. A headline like “Top 10 restaurant technology trends.” A bulleted list. AI. Internet of Things. Self-order kiosks. Loyalty platforms. Robotic makelines. Kitchen display systems. Third party delivery aggregators. A picture of a stainless steel kitchen. A one paragraph summary each. A closing line about “the future of hospitality.”
The format is not wrong; categories are real. It is incomplete. A restaurant operator who finishes that article still cannot tell whether any specific product in any of those categories actually works in their kitchen, on their POS, at their phone volume, during their peak. The article sends them back out to evaluate vendors using the same glossy spec sheets it was written from.
The fix is not another list. The fix is to change what a restaurant technology buyer is allowed to accept as evidence. A category is a claim. A recording, a transcript, a POS screenshot, and a reference call are receipts. This guide is built on receipts.
The receipt we keep in this repo
Four artifacts. All four live in the aiphoneordering.com source tree and any operator can download them, run them, and reproduce the result. None require a sales conversation.
The stereo WAV puts the customer on the left channel and the AI on the right, which is why we call Deepgram with multichannel=true. Duration and caption count come from the actual API response, and envelope length from the script's own pass over the WAV, not from our marketing page.
The regeneration script, annotated
Shortened from the real scripts/build-voice-activity-data.py in this repo. It posts the WAV to Deepgram's nova-3 multichannel endpoint, pulls the duration and per-channel word timings out of the response, computes an amplitude envelope at 60 Hz per channel, and writes the result to voice-activity-data.ts.
Run this on any call, from any vendor, and you get the same kind of structured object we use to drive the bar animation and live captions on the landing page. If a restaurant technology vendor will not let you point this script at their call recording, that is the entire evaluation.
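A minimal offline sketch of the two post-processing steps described above, not the repo's actual script. It assumes the Deepgram pre-recorded response shape (`metadata.duration` and `results.channels[i].alternatives[0].words`) and 16-bit interleaved stereo PCM samples. The function names and the choice of RMS per window as the "amplitude" are illustrative assumptions.

```python
import math

ENVELOPE_HZ = 60  # envelope rate named in the article


def word_timings(response):
    """Return (duration, per-channel [(word, start, end), ...]) from a
    Deepgram-style multichannel response dict."""
    duration = response["metadata"]["duration"]
    channels = []
    for channel in response["results"]["channels"]:
        words = channel["alternatives"][0]["words"]
        channels.append([(w["word"], w["start"], w["end"]) for w in words])
    return duration, channels


def channel_envelopes(interleaved, sample_rate, channels=2,
                      hz=ENVELOPE_HZ, full_scale=32768):
    """Return one list per channel of RMS amplitudes scaled to 0..1,
    with `hz` values per second of audio.

    `interleaved` is a flat sequence of signed 16-bit samples laid out
    frame by frame (L, R, L, R, ... for stereo).
    """
    frames = len(interleaved) // channels
    window = sample_rate / hz            # frames per envelope step
    steps = math.ceil(frames / window)
    out = [[] for _ in range(channels)]
    for step in range(steps):
        start = int(step * window)
        end = min(frames, int((step + 1) * window))
        for ch in range(channels):
            # Every `channels`-th sample, offset by the channel index.
            chunk = interleaved[start * channels + ch:end * channels:channels]
            if chunk:
                rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
            else:
                rms = 0.0
            out[ch].append(round(rms / full_scale, 4))
    return out
```

With the channel layout stated earlier, index 0 of each result would be the customer (left) and index 1 the AI (right).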
One call, beat by beat
The recording runs 102.36 seconds. Pulled from the captions file, here are the six beats that determine whether the call was worth the subscription.
Timeline of public/audio/dennys-order.mp3
0:00 to 0:06 — greeting and context
AI opens with 'Hi. This is Denny on a recorded line. What can we get for you?' The recorded-line disclosure is not a throwaway; it is the consent pattern Deepgram and every major call-analytics provider require before you store audio for training and transcription.
0:06 to 0:22 — order capture with modifiers
Customer orders 'one lumberjack slim and, one Coke.' AI maps the slip to Lumberjack Slam, asks 'how would you like your eggs cooked, and what kind of bread would you like? White, brown, multigrain, or sourdough?' The modifier tree is not boilerplate; it is tied to the POS item ID during onboarding.
0:22 to 0:48 — confirmation and playback
AI reads the order back: 'a lumberjack slam with scrambled eggs and sourdough bread plus a soft drink. Anything else for your order?' Read-back is how you hit 95%+ accuracy. Every caption after 0:30 is the model demonstrating it listened, not just transcribed.
0:48 to 1:10 — the upsell beat
The AI offers dessert and lands the line 'It might make your Coke jealous.' The customer laughs and adds 'one slice of the cheesecake. Can you add strawberries, if that's an option?' This single beat is where restaurants see 15 to 20 percent average order value lift; the upsell is scripted but the delivery is conversational.
1:10 to 1:33 — final confirmation
AI restates the full order including the dessert and strawberry topping, asks 'Is that correct?' and the customer confirms. This confirmation is the last check before the POS write; once it fires, the order is committed and a floor ticket prints.
1:33 to 1:42 — close and receipt
AI says 'Placing your order now. Done. Your total is $34.11, and your order will be ready for pickup at 12:45AM. Thank you for calling Denny's Rob.' The total and pickup time are returned from the POS, not synthesized. The call ends at 102.36 seconds. Transcript and envelope both stop at the same timestamp.
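The claim that transcript and envelope stop at the same timestamp is something a buyer can check mechanically. The sketch below is one way to do it, assuming a 60 Hz envelope and a one-sample tolerance; the function names and the tolerance are illustrative, not taken from the repo.

```python
import math


def expected_envelope_len(duration_s, hz=60):
    """Number of envelope samples needed to cover the full duration."""
    return math.ceil(duration_s * hz)


def receipts_agree(duration_s, last_caption_end_s, envelope_len, hz=60):
    """True when the envelope spans the audio (within one sample) and
    no caption ends after the audio does."""
    envelope_ok = abs(envelope_len - duration_s * hz) <= 1
    captions_ok = last_caption_end_s <= duration_s
    return envelope_ok and captions_ok


def as_clock(seconds):
    """Format seconds as m:ss, matching the timeline headings."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"
```

For the published duration of 102.36 seconds, ceiling rounding gives 6,142 envelope samples per channel, and `as_clock(102.36)` gives `1:42`, the end of the last beat. The actual sample count in voice-activity-data.ts is not stated on this page, so treat 6,142 as the number to compare against rather than a reported value.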
“The experience was better than speaking to a human. No hold time, no confusion, no rushing. The total was right, the pickup time was right, the upsell landed, and the order printed in the POS before I hung up.”
PieLine customer on aiphoneordering.com/llms.txt, April 2026
The seven layers under the hood
Every caption in that 102.36 seconds flowed through seven systems. A category list flattens all of them into the word “AI.” A receipts based evaluation treats each layer as its own buying decision, with its own failure mode, its own vendor market, and its own verifiable artifact.
Layer 1: Carrier and number routing
Twilio, Telnyx, or your existing VoIP. Owns the dial plan, the overflow rules, and the phone-forward (a 10-minute setup) that kicks off the whole call. Verifiable artifact: a call log with caller ID and destination.
Layer 2: Speech-to-text
Deepgram nova-3 multichannel in our case. Alternatives: Whisper, Rev, Google. Verifiable artifact: a word-level JSON transcript with confidence scores, not a marketing accuracy number.
Layer 3: Reasoning model
The tuned LLM that holds the menu, modifier tree, rules (delivery zones, hours, minimums), and decides what to say next. Verifiable artifact: a side by side diff of two calls where the rule set changed.
Layer 4: Text-to-speech
Synthesized voice returning to the caller. Vendors: ElevenLabs, PlayHT, Cartesia, OpenAI. Verifiable artifact: a playable MP3 plus the seed so you can regenerate the same utterance.
Layer 5: POS integration (the hard one)
Writes the final order, including half-and-half pizza, spice levels, substitutions, and allergy notes, into Clover, Square, Toast, NCR Aloha, Revel, or one of the other 50+ we support. Verifiable artifact: a screenshot of the order inside the POS.
Layer 6: Payment processor
Over the phone credit card capture for phone-paid pickup and delivery, through the POS integration. Verifiable artifact: a test charge, a refund, and the reconciliation record.
Layer 7: Analytics plane
Concurrency, missed calls, average handle time, upsell conversion, popular items. Verifiable artifact: a dashboard export with timestamps that match the call recording timestamps.
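One way to keep the seven layers from collapsing back into the word "AI" during an evaluation is to track them as data. The sketch below restates the layers and artifacts listed above; the `Layer` type and `missing_artifacts` helper are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    number: int
    name: str
    artifact: str


STACK = [
    Layer(1, "Carrier and number routing", "call log with caller ID and destination"),
    Layer(2, "Speech-to-text", "word-level JSON transcript with confidence scores"),
    Layer(3, "Reasoning model", "diff of two calls where the rule set changed"),
    Layer(4, "Text-to-speech", "playable MP3 plus the seed"),
    Layer(5, "POS integration", "screenshot of the order inside the POS"),
    Layer(6, "Payment processor", "test charge, refund, and reconciliation record"),
    Layer(7, "Analytics plane", "dashboard export with matching timestamps"),
]


def missing_artifacts(received):
    """Return the layers for which the vendor has produced no artifact.

    `received` is a set of layer numbers (1 to 7) already covered.
    """
    return [layer for layer in STACK if layer.number not in received]
```

A vendor who covers everything except layer five would show up as `missing_artifacts({1, 2, 3, 4, 6, 7})`, a one-item list naming the POS integration.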
What actually happens during a single call
The same seven layers, drawn as the signals that move between them for one phone order. Everything on the left is inbound from the caller, everything on the right is outbound to the POS. The reasoning model in the center is the one layer every vendor wants to sell you first; in practice it is the easiest one to replace.
How a 102.36-second phone order fans out across the stack
The exact messages that cross the wire
Timing and order of the API calls behind the single order we published. Every arrow is observable: carrier logs, STT response JSON, POS webhook, reconciliation row. If a vendor cannot draw this for their own product, their layer five is probably a Zapier shim.
Single phone order, crossing the seven layers
Real numbers from the recording
These are not marketing rounding. Every figure below comes from the JSON Deepgram returned or from the POS confirmation line in the audio itself.
The six-receipt checklist
Use this on every restaurant technology vendor call. A vendor that produces all six inside 24 hours is real. A vendor that produces zero is selling you a category, not a product.
Six artifacts every restaurant technology vendor should ship
- A playable recording of a completed end to end order in your cuisine, hosted somewhere you can download it
- A word level transcript of that recording from a named speech-to-text model (Deepgram, Whisper, Rev), not edited by hand
- A screenshot of the resulting order inside the POS, with every modifier visible and the total matching the recording
- A measured peak concurrency number, the behavior at ceiling plus one, and the date the measurement was taken
- A printed price (dollars per month or dollars per call) with a first month money back guarantee, not 'contact sales'
- A named reference customer in your cuisine format who will take a 10 minute phone call without filtering through the vendor
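The scoring rule above can be written down as a small function. This is a sketch: the article defines only the two extremes (all six inside 24 hours, and zero), so the "complete but late" and "partial" outcomes and the receipt keys are assumptions added for illustration.

```python
RECEIPTS = (
    "recording",
    "transcript",
    "pos_screenshot",
    "concurrency",
    "price",
    "reference",
)


def verdict(produced, hours_elapsed):
    """Classify a vendor from the set of receipts they produced."""
    missing = [r for r in RECEIPTS if r not in produced]
    if len(missing) == len(RECEIPTS):
        return "category, not a product"
    if not missing and hours_elapsed <= 24:
        return "real"
    if not missing:
        return "complete but late"  # assumption: not defined in the article
    return "partial, missing: " + ", ".join(missing)
```

For example, `verdict({"recording", "price"}, 10)` names the four receipts still outstanding, which is the list to send back to the vendor.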
How PieLine scores on its own checklist
We wrote the checklist this way because we knew we could ship every row. Here it is side by side with how most category leader roundups describe restaurant technology today.
Receipts based evaluation, PieLine vs typical glossy pitch
What any restaurant technology vendor should be able to hand you in 24 hours.
| Feature | Typical category-list vendor | PieLine |
|---|---|---|
| Playable end to end recording in your cuisine | Gated demo video, heavily edited, no hosted audio file. | public/audio/dennys-order.mp3 in this repo, 102.36 seconds, raw. |
| Word level transcript from a named STT model | Marketing accuracy number, no source file, hand edited. | Deepgram nova-3 multichannel, 46 captions, in voice-activity-data.ts. |
| POS order screenshot with modifiers and total | Screenshot of a different POS than yours, no modifiers shown. | Live test call into your own POS, shown before contract signing. |
| Tested peak concurrency and ceiling-plus-one behavior | 'Unlimited concurrent calls' with no measurement date. | 20 simultaneous per location, measured; queues on overflow. |
| Printed price | 'Contact sales for a custom quote.' | $350/mo up to 1,000 calls, $0.50/call beyond, money-back first month. |
| Cuisine-specific reference customer | Logo grid with no phone numbers or cuisines attached. | Mylapore (Indian, 11 locations), Idly Express, Amber India, China Village. |
| Same-day time to first dollar | 6 to 12 week install, POS migration sometimes required. | Same day. Menu scrape, POS map, go live before close of business. |
Behaviors described are as of April 2026. Any vendor can ship more receipts at any time; ask for the date on every artifact they produce.
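A printed price is useful because it can be turned into arithmetic. The sketch below applies the pricing row from the table ($350 per month up to 1,000 calls, $0.50 per call beyond). It assumes the 1,000-call allowance is counted per billing month and ignores the first-month money-back term; the function name is illustrative.

```python
def monthly_cost(calls, base=350.00, included=1000, overage=0.50):
    """Monthly bill in dollars for a given number of calls."""
    extra_calls = max(0, calls - included)
    return base + extra_calls * overage
```

At 1,500 calls in a month this comes to $600, or $0.40 per call. At or below 1,000 calls the bill stays at $350.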
A subset of the POS systems the PieLine layer five writes to today. 50+ integrations ship live; every one is covered by a cuisine specific reference call.
Before and after you ask for the receipts
The side effects of changing what you accept as evidence in a restaurant technology evaluation. Both columns show the same three month window of a multi location operator looking at voice AI.
Buying on categories
evaluation cycle, no artifacts reviewed
Operator reads three roundups, maps six vendors onto a 2x2 grid, asks for a demo, sits through a pre-recorded slideshow, and ends the cycle with a signed contract that still has not landed a real call in the operator's own POS. Time to first dollar: 6 to 12 weeks, sometimes longer.
Buying on receipts
evaluation cycle, six artifacts requested
Operator sends the six-item checklist to each vendor. Most respond with partial artifacts or go dark. One responds in 24 hours with a hosted recording, a transcript, a POS screenshot, a concurrency number, a printed price, and a cuisine reference. That one gets the signature and ships same day.
How to run this evaluation in your next vendor call
Copy-paste ready. Most operators hit a decision in under a week once they stop letting category sheets set the agenda.
Conversation opener for restaurant technology vendor calls
- Ask for a hosted MP3 of one completed end-to-end order in your cuisine, not a demo video
- Ask which STT model produced the transcript and whether you can see the raw JSON with confidence scores
- Ask whether the live demo call will land in your own POS during the call, not in a sandbox
- Ask what the measured peak concurrency is and what happens at ceiling plus one
- Ask for the monthly price in dollars and the first-month money back terms, in writing, before the demo
- Ask for the phone number and name of a reference customer in your cuisine with your POS, and then actually call that person
- If the vendor produces all six inside 24 hours, move to a paid pilot; if they produce zero, end the cycle and move on
Run the receipts checklist against us, live, in fifteen minutes
Bring a month of call counts and your POS (Clover, Square, Toast, NCR Aloha, Revel, or any of the 50+ we ship). We will play a cuisine-matched recording, hand over the raw Deepgram transcript, and place a live order into your own POS before the call ends. If it does not work, there is no pitch deck to sit through after; you keep the recording and go evaluate someone else.
Book a 15 minute demo →
Bring your POS, we will bring the receipts
Fifteen minutes. Your cuisine, your POS, your call volume. We will play the recording, run the transcript, and land a live order in your system while you watch. Same day go live, $350/mo for up to 1,000 calls, money back in month one.
Frequently asked questions
What is restaurant technology, in one concrete definition?
Any software, hardware, or service that reduces the cost, error rate, or staff attention required to convert a hungry person into a completed, paid order. That definition is deliberately narrow. It excludes category-sounding words like 'digital transformation' and 'connected experience' that do not tie to a measurable operational metric. If a vendor cannot point to which number on your P&L their product moves, and in which direction, their product is not restaurant technology yet; it is restaurant marketing material.
Why does this guide publish a 102.36-second call recording?
Because restaurant technology evaluations collapse the moment you ask to hear one working. The file at public/audio/dennys-order.mp3 in the aiphoneordering.com repo is a real AI-handled phone order, recorded in stereo with the customer on the left channel and the AI on the right. We transcribed it with Deepgram's nova-3 multichannel model and shipped the captions, per-channel 60 Hz amplitude envelope, and the regeneration script alongside the audio. The AI answers 'Hi. This is Denny on a recorded line.', takes a Lumberjack Slam with scrambled eggs and sourdough, upsells cheesecake with the line 'It might make your Coke jealous.', and closes at a $34.11 pickup order in 102.36 seconds. Any operator can download the file, run the script, and reproduce the transcript.
What are the real layers of a modern restaurant technology stack?
At a phone-ordering restaurant in 2026, the stack has seven layers. Layer one: the phone carrier that routes inbound calls (Twilio, Telnyx, your existing VoIP). Layer two: the speech-to-text service that transcribes the caller (Deepgram nova-3 multichannel in our case, alternatives include Whisper and Rev). Layer three: the reasoning model that understands the order and decides what to say next (a tuned LLM with a restaurant-specific prompt, menu, modifier tree, and rules). Layer four: the text-to-speech service that speaks back. Layer five: the POS integration that writes the finished order into Clover, Square, Toast, NCR Aloha, or Revel. Layer six: the payment processor when the order is a phone-paid pickup or delivery. Layer seven: the analytics plane that tracks concurrency, missed calls, upsell rate, and average handle time. Every row in a restaurant technology buyer's evaluation should ask a specific question of one of those seven layers.
Which layer is the most common point of failure?
Layer five, the POS integration. Speech-to-text is a commodity. Large language models are commodities. Text-to-speech is a commodity. The specific thing that blows up in production is writing the finished order, with modifiers, into the restaurant's POS. Half-and-half pizzas, spice levels, protein substitutions, allergy notes, and custom sushi rolls are not flat fields in a POS schema; they are nested modifier trees whose shape differs by POS vendor and sometimes by store. PieLine ships 50+ live POS integrations and 95%+ order accuracy because we spent more engineering on that layer than on any of the others. If a restaurant technology vendor will not let you place a live test call that lands in your own POS during the sales cycle, assume that layer is not ready.
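To make "nested modifier trees, not flat fields" concrete, here is a minimal sketch using the order from the published call. The `Modifier` and `OrderItem` types and the `flatten` helper are hypothetical; real POS schemas differ by vendor, which is the point of the paragraph above.

```python
from dataclasses import dataclass, field


@dataclass
class Modifier:
    name: str
    children: list["Modifier"] = field(default_factory=list)


@dataclass
class OrderItem:
    name: str
    modifiers: list[Modifier] = field(default_factory=list)


def flatten(item):
    """Depth-first list of 'Item > Group > Choice' paths, one per leaf.

    This is the kind of lossy projection a flat-field integration has
    to make; the tree itself is what the POS actually expects.
    """
    lines = []

    def walk(mod, prefix):
        path = f"{prefix} > {mod.name}"
        if not mod.children:
            lines.append(path)
        for child in mod.children:
            walk(child, path)

    for mod in item.modifiers:
        walk(mod, item.name)
    return lines


slam = OrderItem("Lumberjack Slam", [
    Modifier("Eggs", [Modifier("Scrambled")]),
    Modifier("Bread", [Modifier("Sourdough")]),
])
```

Calling `flatten(slam)` yields one path per chosen option. An integration that only stores the item name would drop both.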
How many simultaneous calls can a real restaurant technology voice stack handle?
PieLine's tested ceiling is 20 simultaneous calls per location. That is not a marketing number; it is the concurrency the infrastructure is qualified against and the graceful-degradation point beyond which calls queue rather than drop. Several competitors claim 'unlimited concurrent calls.' In practice concurrency is bounded by upstream speech-to-text rate limits, POS API rate limits, and the vendor's own compute. Any operator evaluating restaurant technology voice products should ask for the measured peak, what the behavior is at ceiling plus one, and whether the number was measured on a weekday afternoon or on a holiday rush. We publish ours and we will run a load test on demand.
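The "ceiling plus one" question has a simple shape: what happens to call 21 when 20 are already active? The sketch below models the queue-rather-than-drop behavior described above. It is an illustrative toy, not PieLine's implementation; the `CallGate` class and its method names are assumptions.

```python
from collections import deque


class CallGate:
    """Admit up to `ceiling` concurrent calls; queue the rest in order."""

    def __init__(self, ceiling=20):
        self.ceiling = ceiling
        self.active = set()
        self.waiting = deque()

    def arrive(self, call_id):
        if len(self.active) < self.ceiling:
            self.active.add(call_id)
            return "answered"
        self.waiting.append(call_id)
        return "queued"

    def hang_up(self, call_id):
        """End an active call and promote the longest-waiting caller.

        Returns the promoted call id, or None if nobody was waiting.
        """
        self.active.discard(call_id)
        if self.waiting and len(self.active) < self.ceiling:
            promoted = self.waiting.popleft()
            self.active.add(promoted)
            return promoted
        return None
```

A vendor's answer to the ceiling-plus-one question should map onto one of three outcomes for that extra call: answered, queued, or dropped. This toy only models the first two, and it does not handle a queued caller abandoning the line.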
What does a receipts-based evaluation checklist look like?
Six items, each with a verifiable artifact the vendor must produce. One: a playable recording of a completed end-to-end order in the same cuisine as yours. Two: a word-level transcript of that recording generated by a named speech-to-text model, not edited by hand. Three: a screenshot of the resulting order inside the restaurant's POS, including every modifier. Four: the measured peak concurrency number and the behavior at ceiling plus one. Five: a printed price in dollars per month or dollars per call, not 'contact sales.' Six: a named reference customer in the same cuisine or format that will take a ten-minute phone call. A vendor that produces all six in twenty-four hours is real. A vendor that produces zero is a pitch deck.
How does PieLine score itself against that checklist?
One: public/audio/dennys-order.mp3 is in this repo and plays in any browser. Two: the Deepgram nova-3 multichannel transcript is in src/components/voice-activity-data.ts with 46 captions, 28 from the AI and 18 from the customer, at 60 Hz per-channel envelope sampling. Three: the pickup-order line 'Your total is $34.11, and your order will be ready for pickup at 12:45AM' is audible in the recording and present as a caption. Four: 20 simultaneous calls per location, published. Five: $350 per month for up to 1,000 calls, $0.50 per call beyond, money-back first month. Six: Mylapore (11-location South Indian chain in the Bay Area), Idly Express, Amber India, and China Village are our cuisine references; Mylapore's owner Jay Jayaraman speaks publicly on . We hit six of six on our own checklist, which is why we wrote it this way.
Is restaurant technology different for chains versus independents?
Yes, in the rollout, not in the artifact. A 100-location chain buys the same voice stack as a single Mexican shop, but the chain also buys a rollout playbook: POS version reconciliation across stores, menu-variance handling, multi-brand number routing, and a central analytics view. Independents install in a day. Chains install in a quarter. The receipts checklist above is identical for both buyer types; only the reference call is longer. Mylapore is rolling PieLine across 11 locations and projecting $500 per location per day in recovered phone-order revenue, roughly $2 million per year chain-wide once the rollout completes.
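The chain-wide figure above can be checked with one multiplication. The sketch assumes every location operates 365 days a year, which the article does not state; the function name is illustrative.

```python
def annual_recovered_revenue(per_location_per_day=500, locations=11, days=365):
    """Projected recovered phone-order revenue per year, in dollars."""
    return per_location_per_day * locations * days
```

Under that assumption the projection works out to $2,007,500, consistent with the "roughly $2 million" quoted above.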
What restaurant technology is overhyped in 2026?
Anything that requires large hardware investment before it produces the first dollar and anything marketed with the word 'platform' before it has a single named live customer in your cuisine. Self-order kiosks are legitimate but have a 3 to 6 month payback once you count buildout and menu engineering. Robotic prep equipment works at scale but has a 6-month-plus time-to-first-dollar. Blockchain for supply chain, NFT loyalty, and 'generative AI menu development' without a measurable metric are the obvious overhypes. The boring truth is that most of an independent restaurant's 2026 technology ROI comes from plugging revenue leaks (missed calls, missed reservations, retyped orders, unreconciled delivery aggregator revenue) that the existing categories already address.
Why should PieLine be the first restaurant technology a phone-heavy restaurant installs?
Because the phone channel is the leak that is already happening on the day you sign. Restaurants miss 30 to 40 percent of inbound calls during peak hours, each one is an order someone else captured, and the fix is a 10-minute phone-forward setup with no hardware, no staff training, and no change management. Week one of operation typically captures enough previously-lost orders to exceed the $350 monthly subscription. Every other category on a restaurant technology roadmap (loyalty, kiosks, KDS, robotics) requires staff behavior change or physical installation; the phone channel does not. That is why we rank it first and why Mylapore rolled it out before their other tech investments.
Related restaurant technology deep dives
Restaurant Technology Trends 2026, Ranked By Time to First Dollar
A payback-ordered ranking of the 2026 trend list. Why same day install categories beat everything else on week-one ROI, and where each category actually fits in an operator's roadmap.
AI Phone Ordering with POS Integration for Restaurants
The technical side of layer five. How items get mapped, how modifiers survive the trip, and how a 20-concurrent voice stack writes directly into Clover, Square, Toast, NCR Aloha, and Revel.
Restaurant Tech Stack 2026
A working reference architecture for a multi-location chain in 2026, with vendor choices per layer and the reasons each layer can blow up in production.
Receipts, not a category card
Bring your POS and a month of call counts. We will play the recording, show the transcript, and land a live order in your system while you watch. If anything is missing, you walk with the artifacts and evaluate someone else.
Book a demo