Technology in a restaurant is a stack of replayable rows, with one missing.

Every other guide on this is a vendor catalog. Sort the same stack by a single question, "can you replay the transaction later?", and one layer falls out: the phone line. It is the only piece of restaurant technology that historically writes nothing per call.

Matthew Diakonov, Written with AI

Published April 24, 2026 · 13 min read

See a real stereo call recording, channel by channel

4.9 from 200+ restaurants

Stereo audio: customer on channel 0, AI on channel 1, by design

Per-channel amplitude envelope sampled at 60 Hz, on every call

Word-level per-speaker transcripts via Deepgram nova-3 multichannel

Up to 20 simultaneous calls on a single main line

One question reorders the stack

When something goes wrong, can you replay the transaction?

POS row, kitchen ticket, online cart, payroll punch, GL entry. All replayable.

The phone line traditionally writes nothing. A mono file at best.

PieLine records every call as 16-bit stereo, customer on channel 0, AI on channel 1.

Per-channel amplitude envelope sampled at 60 Hz, transcript via Deepgram nova-3 multichannel.

The phone layer becomes searchable, like the rest of the stack.


The framing every other guide misses

Pull up the first ten articles that come up for this topic. They are vendor catalogs. POS, kiosks, contactless payment, online ordering, kitchen display, loyalty, AI. The categories are not wrong. They are also not very useful, because every category gets its own list and the list is the entire article.

A more useful frame, for an operator who already runs four to twelve of these layers, is to ask one question across all of them: when something goes wrong on a Tuesday afternoon and a customer calls back to dispute it, can you replay the transaction?

For seven of the eight layers a typical full-service kitchen runs, the answer is yes within thirty seconds, because the layer was built around a database. For the eighth, the phone line, the answer has historically been no, because the phone line was built around connecting two humans and never around logging what they said. That asymmetry is the part this page is about.

The eight technology layers in a typical full-service restaurant

A floor count, not a ceiling. Quick-service operations run lighter (six to eight layers), multi-location groups run heavier (twelve to twenty). The shape is what matters: one layer per kind of transaction, one row per event, except the last one.

Point-of-sale

Toast, Square, Clover, NCR Aloha, Revel. Every order writes a row. Replayable.

Kitchen display

Tickets fired, modifications, fire-time, completion stamp. Replayable.

Online ordering

Cart events, line items, payment intent, coupon use. Replayable.

Delivery middleware

DoorDash, Uber, Grubhub orders flow into POS via Otter, Cuboh, Chowly. Replayable.

Inventory and prep

MarginEdge, MarketMan. Counts, par levels, recipe costs. Replayable.

Scheduling and payroll

7shifts, Homebase, Toast Payroll. Punches, breaks, tip pools. Replayable.

Accounting and AP

QuickBooks, Restaurant365. Invoices, vendor checks, GL. Replayable.

Phone line

Carrier dial-tone, optional mono recording, no transcript. Until PieLine, not replayable.

Walk the stack from front-of-house to back, then look at the phone

A short timeline of where each layer lives in the operation, in the order an event typically passes through them. The phone is at the front, but it is the layer that historically does not write a row.

Point-of-sale

Every paid order writes a row with timestamp, line items, modifiers, and payment. Replayable in 30 seconds in any modern POS report.

Kitchen display

Each ticket carries a fire time, item-by-item completion, and a hand-off to the expo line. Replayable.

Online and delivery channels

Cart events from your own site and from DoorDash, Uber Eats, and Grubhub flow into the POS through Otter, Cuboh, or Chowly. Replayable.

Inventory and prep

MarginEdge, MarketMan, or xtraCHEF turn invoices into counts and recipes into food cost. Replayable as a daily run.

Scheduling and payroll

7shifts, Homebase, or Toast Payroll write punches, breaks, and tip pools. Replayable to the minute.

Accounting and AP

QuickBooks Online or Restaurant365 store every invoice, every vendor check, every GL entry. Replayable.

Phone line

Until PieLine, the only layer with no replayable record per transaction. PieLine fills the gap with stereo audio plus per-channel transcript.

This is the layer the rest of this guide is about.

Why the phone line is the asymmetry

The other seven layers were designed by software people who wanted a database row. POS systems wrote the row first and the receipt second. Kitchen display systems wrote the row to drive the screen. Online ordering wrote the row to charge the card. Even payroll wrote the row to cut the check. None of them treat audit-trail data as optional, because the rest of the system cannot run without it.

The phone line was designed by telephony people who wanted to connect two humans. The signal was the call, not the data about the call. Recording was an add-on, mono, often delivered as a file you had to download out-of-band. Transcription was a separate service entirely. There was no concept of a per-speaker channel because the protocol assumed both speakers were the point of the recording, not the parties to a transaction.

That assumption made sense when the phone was a communication channel. It stopped making sense when a meaningful share of a restaurant's revenue and dispute history started arriving over the phone, and the rest of the operation moved to a one-row-per-event model.

How PieLine reshapes the phone layer to look like the others

One main number, two destinations per call, three artifacts written for every call. The artifacts are the part that makes the phone layer behave like a database-shaped layer.

Per-call write path on the phone line

What the audio pipeline actually looks like

This is the build script PieLine ships in the public repo. It runs against any stereo call file and rewrites the on-page voice-activity widget. The shape of the script is the shape of the per-call write path on every live call: stereo input, per-channel envelope, Deepgram nova-3 multichannel transcript.

scripts/build-voice-activity-data.py

What it looks like when you actually run it

A redacted shell trace from regenerating the public Denny's order demo. The 102.36-second figure on the last line is the duration of the recording shipped on the PieLine homepage and is the same number you can read in src/components/voice-activity-data.ts.

rebuild voice-activity widget from a stereo call file

Anchor fact

scripts/build-voice-activity-data.py and src/components/voice-activity-data.ts

The first paragraph of the script's docstring states the pipeline directly: the input WAV is 16-bit stereo with the customer on the left channel and the AI on the right channel, the script samples a per-channel amplitude envelope at 60 Hz, and it posts the audio to api.deepgram.com/v1/listen?model=nova-3&multichannel=true for word-level per-channel timestamps.

The output it writes, src/components/voice-activity-data.ts, holds 102.36 seconds of dual-channel envelope plus speaker-tagged caption segments for the public Denny's order demo. The duration field on line 12 is "duration":102.36 and the envelopes object carries two arrays, customer and ai.

Together those two files are why the phone line, on a PieLine deployment, becomes the eighth replayable layer in the restaurant tech stack instead of the gap.
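The envelope step described in that docstring can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the script from the repo: the function name `channel_envelopes`, the peak-per-window measure, and the output keys are hypothetical, chosen to mirror the `duration` and `envelopes` fields described above.

```python
import array
import sys
import wave

ENVELOPE_RATE_HZ = 60  # envelope points per second, the rate the article states


def channel_envelopes(path, rate_hz=ENVELOPE_RATE_HZ):
    """Read a 16-bit stereo WAV and return its duration and per-channel envelopes.

    Assumption: channel 0 (left) is the customer and channel 1 (right) is the AI.
    Assumption: the envelope is the peak absolute amplitude per window, scaled to 0..1.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected a 16-bit stereo WAV")
        frame_rate = wav.getframerate()
        frame_count = wav.getnframes()
        raw = wav.readframes(frame_count)

    # WAV samples are little-endian signed 16-bit, interleaved L, R, L, R, ...
    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()

    # Audio frames covered by one envelope point (integer division).
    window = max(1, frame_rate // rate_hz)

    def envelope(channel):
        points = []
        for start in range(0, len(channel), window):
            peak = max(abs(s) for s in channel[start:start + window])
            points.append(round(peak / 32768, 4))
        return points

    return {
        "duration": round(frame_count / frame_rate, 2),
        "envelopes": {
            "customer": envelope(samples[0::2]),  # even indices = left channel
            "ai": envelope(samples[1::2]),        # odd indices = right channel
        },
    }
```

When the audio sample rate is not an exact multiple of 60, the integer window yields slightly more than 60 points per second; a production pipeline would likely resample or use fractional windows.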

Mono call recording vs stereo two-channel pipeline

The shape of the artifact is what determines whether the phone layer can be searched, joined to the POS, and replayed alongside a kitchen ticket. Mono cannot. Stereo with one speaker per channel can.

Feature	Typical phone recording	PieLine pipeline
What gets stored per transaction	Mono audio file at best, often nothing	16-bit stereo audio plus per-channel envelope plus per-speaker transcript
Speaker separation	Inferred post-hoc, fails on overlap	Channel 0 = customer, channel 1 = AI, by design
Search by phrase	Listen through every recording	Word-indexed Deepgram nova-3 multichannel transcript
Replay alongside POS	No timestamp join key, manual cross-reference	Same call ID written to POS and to audio store
Concurrency on one main line	1 to 4 simultaneous calls, then busy signal	Up to 20 simultaneous calls on a single number
Visual signal of who is speaking	Mono waveform, both speakers blended	Per-channel amplitude envelope sampled at 60 Hz
Time from call to searchable record	Hours, days, or never	Live during the call, finalised on hangup
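The "same call ID" row in the table is what makes a call joinable to the rest of the stack. A minimal sketch of that join follows, assuming a hypothetical `CallRecord` shape and a hypothetical `call_id` field on POS rows; neither name comes from PieLine or from any POS vendor's schema.

```python
from dataclasses import dataclass, field


@dataclass
class CallRecord:
    """Hypothetical per-call record: the three artifacts plus a join key."""
    call_id: str
    audio_path: str                                  # stereo recording
    envelopes: dict = field(default_factory=dict)    # {"customer": [...], "ai": [...]}
    transcript: list = field(default_factory=list)   # speaker-tagged words


def join_calls_to_orders(calls, pos_rows):
    """Pair each call with the POS rows that carry the same call ID."""
    orders_by_call = {}
    for row in pos_rows:
        call_id = row.get("call_id")  # walk-in orders have no call ID
        if call_id is not None:
            orders_by_call.setdefault(call_id, []).append(row)
    return [(call, orders_by_call.get(call.call_id, [])) for call in calls]


calls = [CallRecord("c-1", "audio/c-1.wav"), CallRecord("c-2", "audio/c-2.wav")]
pos_rows = [
    {"order_id": 101, "call_id": "c-1", "total": 42.50},
    {"order_id": 102, "call_id": None, "total": 18.00},
]
for call, orders in join_calls_to_orders(calls, pos_rows):
    print(call.call_id, [order["order_id"] for order in orders])
```

A call with no matching order (here `c-2`) still appears in the result with an empty list, which is the case for complaints, vendor calls, and other non-order traffic.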

The numbers behind the phone layer

Capacity and audio-pipeline numbers from the PieLine product site and the public Denny's demo recording. None invented for this page.

2 audio channels recorded per call

60 Hz per-channel envelope sample rate

102.36 seconds of public demo recording

20 simultaneous calls on a single main line

90%+

“The experience was better than speaking to a human. No hold time, no confusion, no rushing. 90%+ of our calls are now handled end-to-end by PieLine, and we're projecting $500 in additional revenue per location per day.”

Jay Jayaraman, Owner, Mylapore (11 locations, Bay Area)

How heavy the phone layer actually is

Not theoretical. Three numbers from the PieLine product site that show why the audit-trail gap on the phone line is not a niche problem.

Of restaurant calls go unanswered during peak hours, per the PieLine homepage stats strip

Of missed callers do not call back, they call a competitor

0%+

Order accuracy on PieLine, even on complex modifications, with a stereo audio record of every call

How to verify any of this in the public repo

Six checks. Six minutes. Every claim on this page is grounded in a file you can open in mediar-ai/pieline-phones. The third check is the one most readers skip and is the most diagnostic.

Six-check audit of PieLine's audio pipeline

Open scripts/build-voice-activity-data.py and confirm the docstring mentions 16-bit stereo with customer on L (channel 0) and AI on R (channel 1)
Search the same file for the line SAMPLE_RATE = 60 to confirm the envelope sample rate
Search the same file for &multichannel=true in the Deepgram URL to confirm per-channel transcription
Open src/components/voice-activity-data.ts and confirm the duration field reads 102.36
Confirm two envelope arrays: envelopes.customer and envelopes.ai
Confirm caption segments tagged with speaker = customer or speaker = ai
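The six checks above amount to string searches, so they can be scripted. The sketch below is an assumption-heavy convenience, not a tool from the repo: the function name `audit` is hypothetical, and the exact strings it looks for are taken from the wording on this page, so a formatting difference in the real files would make a check fail even when the underlying claim holds.

```python
def audit(script_text, data_text):
    """Run the six checks as plain substring tests and return check -> bool.

    script_text: contents of scripts/build-voice-activity-data.py
    data_text:   contents of src/components/voice-activity-data.ts
    """
    compact = "".join(data_text.split())  # ignore whitespace differences
    return {
        "1 docstring mentions 16-bit stereo": "16-bit stereo" in script_text,
        "2 SAMPLE_RATE = 60": "SAMPLE_RATE = 60" in script_text,
        "3 &multichannel=true in Deepgram URL": "&multichannel=true" in script_text,
        "4 duration reads 102.36": '"duration":102.36' in compact,
        "5 envelopes for customer and ai": (
            '"envelopes"' in compact and '"customer"' in compact and '"ai"' in compact
        ),
        "6 speaker-tagged captions": '"speaker"' in compact,
    }
```

To use it, read both files from a local clone with `pathlib.Path(...).read_text()` and pass the two strings in. A `False` result means "open the file and look", not "the claim is wrong".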

What this changes for the operator

The usual technology-in-a-restaurant overview ends at a vendor list. The list is fine. It also misses the most operational question, which is whether each layer leaves a record you can replay later.

On a 2026 stack, seven of the eight layers already do. The phone line is the asymmetry. Every call that comes in is a transaction with a customer, sometimes a vendor, sometimes a complaint, and the rest of the operation has moved to a one-row-per-event model that the phone has not. PieLine's bet is that closing that gap is worth more than another category of vendor catalog, because it upgrades the audit trail of an existing layer rather than adding a new one.

The pipeline is small enough to read in one sitting. Stereo input. 60 Hz envelope per channel. Deepgram nova-3 multichannel. Speaker-tagged captions. The result is a phone layer that behaves like the POS, the kitchen display, and the accounting layer already do, and that is the technology question worth asking.

See the stereo recording and per-channel transcript on a real call

On a 15-minute demo we will play the public Denny's order clip with the customer track and the AI track on separate channels, walk the Deepgram nova-3 multichannel transcript line by line, and show the same ingest path running on a live restaurant number.

Frequently asked questions

What does "technology in a restaurant" actually cover in 2026?

Eight layers in a typical full-service operation: point-of-sale, kitchen display, online ordering, third-party delivery middleware, inventory and prep counts, scheduling and payroll, accounting and invoicing, and the phone line. The first seven all ship a database row for every transaction so a manager can pull a report a week later. The eighth, the phone line, traditionally ships nothing. The interesting question is not which categories exist; it is which ones leave a forensic record.

Why is the phone line the only restaurant-tech layer with no forensic record?

Because phone systems were designed to connect two humans, not to log a transaction. POS systems write a row. Kitchen display systems write a row. Online ordering writes a row. Even payroll writes a row. The phone, until very recently, wrote a 60-second mono recording at best, with no speaker separation and no machine-readable transcript. When a customer calls back two days later disputing an order, every other layer has a record and the phone has nothing. PieLine fills that gap by treating every call as a 16-bit stereo recording with the customer on channel 0 and the AI on channel 1, then running it through Deepgram nova-3 multichannel to get word-level per-speaker transcripts.

What can I verify about PieLine's audio pipeline in the public repo?

Open scripts/build-voice-activity-data.py in the mediar-ai/pieline-phones repository. The first paragraph of the docstring states the assumption directly: the input WAV is 16-bit stereo with the customer on the left channel (channel 0) and the AI on the right channel (channel 1). The script then samples a per-channel amplitude envelope at 60 Hz and posts the audio to https://api.deepgram.com/v1/listen?model=nova-3&multichannel=true&punctuate=true&smart_format=true for word-level per-channel timestamps. The output it writes, src/components/voice-activity-data.ts, holds 102.36 seconds of dual-channel envelope plus speaker-tagged caption segments for the public Denny's order demo.
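For readers who want to see what that request looks like, here is a minimal sketch using only the standard library. It builds the request without sending it. The query parameters are the ones quoted above; the `Token` authorization scheme and the `results.channels[].alternatives[].words[]` response shape follow Deepgram's documented pre-recorded API as I understand it. The function names and the channel-to-speaker mapping are assumptions for illustration, not code from the repo.

```python
import urllib.parse
import urllib.request

DEEPGRAM_URL = "https://api.deepgram.com/v1/listen"
PARAMS = {
    "model": "nova-3",
    "multichannel": "true",
    "punctuate": "true",
    "smart_format": "true",
}
SPEAKERS = ("customer", "ai")  # assumption: channel 0 = customer, channel 1 = AI


def build_request(wav_bytes, api_key):
    """Build (but do not send) the POST request for a stereo WAV."""
    url = DEEPGRAM_URL + "?" + urllib.parse.urlencode(PARAMS)
    headers = {"Authorization": f"Token {api_key}", "Content-Type": "audio/wav"}
    return urllib.request.Request(url, data=wav_bytes, headers=headers, method="POST")


def speaker_tagged_words(response):
    """Flatten a multichannel response into one time-ordered, speaker-tagged word list."""
    words = []
    for index, channel in enumerate(response["results"]["channels"]):
        for w in channel["alternatives"][0]["words"]:
            words.append(
                {"speaker": SPEAKERS[index], "word": w["word"],
                 "start": w["start"], "end": w["end"]}
            )
    return sorted(words, key=lambda w: w["start"])
```

Sending the request is one more call, `urllib.request.urlopen(build_request(...))`, followed by `json.load` on the response; it is left out here so the sketch runs without an API key.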

What is the practical difference between a mono call recording and a stereo, dual-channel one?

On a mono recording, both speakers are summed into one waveform. Diarization (who said what) has to be inferred by a second model after the fact, and it makes mistakes when speakers overlap, when one is louder than the other, or when there is background noise from a kitchen. On a stereo recording with one speaker per channel, diarization is free: channel 0 is by definition the customer, channel 1 is by definition the AI, and the per-channel amplitude envelope shows exactly when each one was speaking. For dispute resolution, training data, and analytics, that is the difference between a recording you can search and a recording you have to listen to.
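To make "diarization is free" concrete: with one envelope per speaker, finding who spoke when is a threshold test, and overlap is a comparison of two arrays. The sketch below assumes envelopes are lists of 0..1 values at 60 points per second; the function names and the 0.05 threshold are hypothetical choices for illustration.

```python
def talk_spans(envelope, rate_hz=60, threshold=0.05):
    """Return (start_seconds, end_seconds) spans where one channel is above threshold."""
    spans = []
    start = None
    for i, level in enumerate(envelope):
        if level >= threshold and start is None:
            start = i                                   # speech begins
        elif level < threshold and start is not None:
            spans.append((start / rate_hz, i / rate_hz))  # speech ends
            start = None
    if start is not None:                               # still speaking at end of file
        spans.append((start / rate_hz, len(envelope) / rate_hz))
    return spans


def overlap_seconds(customer, ai, rate_hz=60, threshold=0.05):
    """Total time during which both channels are above threshold at once."""
    both = sum(1 for c, a in zip(customer, ai) if c >= threshold and a >= threshold)
    return both / rate_hz
```

On a mono file neither function is possible without a separate diarization model, because there is only one array to threshold.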

How many distinct technology layers does a typical full-service restaurant run today?

Eight to twelve. Eight is the floor: POS, kitchen display, online ordering, delivery aggregator middleware, inventory, scheduling and payroll, accounting, and the phone line. Twelve is a more realistic number once you add a reservations system, a loyalty layer, a review-management tool, and a paid-marketing pixel. Quick-service operations run leaner (often six to eight). Multi-location groups run heavier (twelve to twenty). The number is not the interesting metric. The interesting metric is how many of those layers leave a record you can replay later.

Does PieLine count as another tech layer or as a replacement for the phone provider?

It sits on top of the existing phone line; it does not replace your carrier. The number stays the same, the customer-dialed experience improves, and the audio recording becomes structured. Orders post live to Toast, Square, Clover, NCR Aloha, or Revel through PieLine's POS integration (defined in src/app/page.tsx around the integrations section, with five live integrations listed and 50+ available). Non-order calls (vendor reps, complaints, complex catering) are handed to a human via smart call transfer with a written conversation summary. From an operator's view, PieLine is the audit-trail upgrade for the phone layer; from a stack-architecture view, it is the missing piece that makes the phone line replayable like the rest of the stack.

Is real-time transcription enough, or do I actually need the audio?

You need both. A transcript without audio loses tone, hesitation, and any moment where the customer is reacting to background sound. Audio without a transcript is unsearchable, so you cannot pull every call where a customer asked about gluten-free options last week. The PieLine pipeline keeps both, with the transcript indexed at the word level and the audio retained as a stereo file so a single timestamp seeks to the matching second on either track. Most other restaurant-tech layers do not have this problem, because they were always machine-readable to begin with.
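The gluten-free example can be sketched as a phrase search over a word-level transcript that returns seek positions for the audio. The word-list shape (`speaker`, `word`, `start`) and the function name `find_phrase` are assumptions for illustration; the real storage format is not described on this page.

```python
def find_phrase(words, phrase):
    """Return (speaker, start_seconds) for each place a speaker says the phrase.

    Matching is per speaker, case-insensitive, and ignores trailing punctuation.
    """
    target = phrase.lower().split()
    if not target:
        return []

    # Group words by speaker so a phrase never matches across two voices.
    by_speaker = {}
    for w in words:
        by_speaker.setdefault(w["speaker"], []).append(w)

    hits = []
    for speaker, seq in by_speaker.items():
        tokens = [w["word"].lower().strip(".,?!") for w in seq]
        for i in range(len(tokens) - len(target) + 1):
            if tokens[i:i + len(target)] == target:
                hits.append((speaker, seq[i]["start"]))
    return sorted(hits, key=lambda hit: hit[1])


words = [
    {"speaker": "customer", "word": "Anything", "start": 12.1},
    {"speaker": "customer", "word": "gluten", "start": 12.6},
    {"speaker": "customer", "word": "free?", "start": 12.9},
    {"speaker": "ai", "word": "Yes,", "start": 13.5},
]
print(find_phrase(words, "gluten free"))
```

Each returned start time is a position to seek to in the stereo file, on either channel, which is the point of keeping audio and transcript together.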

How does PieLine handle peak-hour concurrency on a single number?

Up to 20 simultaneous calls on the same main line, defined in the PieLine public marketing site at src/app/page.tsx around the features section, in a card titled "20 simultaneous calls" with the description "Friday night, game day, holidays, PieLine handles them all at once. Zero hold time, zero missed orders, no matter how hard the rush hits." That capacity is shared between order calls and non-order calls; both paths are recorded and transcribed identically. The 20-line ceiling is what lets the audit-trail story scale: you cannot replay a call you never picked up, and the historical answer-rate problem in restaurant phone systems was that peak service dropped 35 to 43 percent of inbound calls.

What sample rate does PieLine use for the per-channel amplitude envelope, and why?

60 Hz, defined as the SAMPLE_RATE constant in scripts/build-voice-activity-data.py. The number is set to match the typical browser display refresh, so the bar-by-bar visualization in the on-page voice-activity widget renders one envelope sample per frame at 60 fps. It is a deliberate choice for visual fidelity rather than for transcription, which Deepgram performs at the audio's native sample rate. Knowing this distinction matters because the 60 Hz envelope is what makes the per-speaker waveform on the homepage track a real call; it is not a generic loudness animation.

Where does this leave operators evaluating restaurant technology in 2026?

With a sharper question. Instead of asking which categories of tech exist, ask which ones leave a forensic record. Most categories already do, because they were built around databases. The phone line is the outlier. If the phone is the front door of a restaurant, and a meaningful share of revenue and dispute history walks through that door, then the phone layer needs to write a row like everything else. PieLine is the piece that makes that row a real, searchable, two-channel audio recording with a per-speaker transcript, and that is the gap every other technology-in-a-restaurant guide skips.
