Operations guide

Restaurant management and operations has six measurable channels. The phone makes seven.

Open any playbook on this topic and you will find the same six categories: FOH, BOH, inventory, labor, POS, and marketing. The phone is never on the list, because a human on a phone leaves no artifact you can manage. PieLine ships every call as a 60 Hz amplitude envelope and a speaker-tagged caption transcript. The reference call already checked into our repo has 6,157 samples per channel, 46 captions, exactly zero cross-talk frames, and 1,922 silent frames (31.2 percent of the call). That is a management surface. Here is how to read it.


Matthew Diakonov, Written with AI

Published April 23, 2026 · 11 min read

Anchored in a 102.36-second reference call shipped in the product repo

6,157 samples per channel

0 cross-talk frames

31.2 percent silence

46 speaker-tagged captions

The seventh channel

Restaurant management, finally including the phone

FOH, BOH, inventory, labor, POS, marketing.

Six channels you can already measure.

The phone has always been the exception.

Because a voice call left no artifact.

Until every call became a 60 Hz envelope.


The six channels every other guide already covers

Before we argue that there is a seventh, it is worth being honest about the six. Any decent playbook on this topic hands the manager a weekly review agenda with line items grouped in roughly these buckets. You have read all of them. They work. They are also incomplete.

FOH: covers and turn time
BOH: ticket time and station load
Inventory: counts and pars
Labor: hours and overtime
POS: voids, discounts, COGS
Marketing: repeat rate and attribution
Temp logs and food safety
Waste and shrinkage
Modifier mix
Daypart mix

Each of these has an artifact a manager opens on Monday. The POS pulls a report. Inventory posts a variance. Labor exports hours. The cook line leaves tickets. That is why they count as channels: each one produces something the manager reads. The phone is the one activity inside the restaurant that has never produced anything to open.

Why the phone has always been the exception

Picture a Tuesday dinner service at a ten-location chain. The cashier picks up the phone and takes an order. She re-keys it into the POS. The POS emits the row, inventory adjusts, BOH gets a ticket. From the management dashboard the order looks identical to an order placed at the counter. The call itself is gone.

That invisibility is why the industry ends up with cost-per-call estimates, missed-call guesses, and training modules titled Phone etiquette that have no measurable output. You cannot manage what you cannot open. For the phone, the manager has nothing to open.

The underlying problem

Every other operational channel has a primary artifact. The phone does not. Until the phone produces an artifact the manager can read in under 30 seconds, it stays off the agenda and stays the largest source of unmanaged loss in the operation.

The artifact the phone now leaves

When PieLine answers a call, the audio is recorded as a 16-bit stereo WAV with the customer on the left channel and the AI on the right. The reference recording in the repo is public/audio/dennys-order.mp3, 102.36 seconds long. The script scripts/build-voice-activity-data.py runs Deepgram multichannel transcription on the raw audio to get per-word timestamps, computes a per-channel RMS amplitude envelope sampled at 60 Hz, and groups the words into speaker-tagged captions, breaking on pauses or punctuation.
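The envelope step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the contents of scripts/build-voice-activity-data.py: the function names `split_stereo` and `rms_envelope` are hypothetical, and the sketch assumes the audio has already been decoded to interleaved 16-bit PCM integers.

```python
import math


def split_stereo(interleaved_pcm16):
    """Split interleaved 16-bit stereo PCM into two float channels in [-1, 1].

    Per the article's layout, the left channel is the customer and the
    right channel is the AI.
    """
    customer = [s / 32768.0 for s in interleaved_pcm16[0::2]]
    ai = [s / 32768.0 for s in interleaved_pcm16[1::2]]
    return customer, ai


def rms_envelope(samples, sample_rate, frame_rate=60):
    """Return one RMS value per 1/frame_rate-second window of one channel."""
    # Integer ceiling division: a trailing partial window still gets a frame.
    n_frames = (len(samples) * frame_rate + sample_rate - 1) // sample_rate
    envelope = []
    for i in range(n_frames):
        start = i * sample_rate // frame_rate
        end = min((i + 1) * sample_rate // frame_rate, len(samples))
        window = samples[start:end]
        # Root mean square of the window: sqrt(mean(x^2)).
        envelope.append(math.sqrt(sum(x * x for x in window) / len(window)))
    return envelope
```

At a 48 kHz sample rate each 60 Hz frame covers 800 samples, so one second of audio yields 60 envelope values per channel.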

What lands on disk is a single TypeScript module, src/components/voice-activity-data.ts, holding the two per-channel envelope arrays and the speaker-tagged captions with their start and end timestamps.

This is the phone channel's primary artifact. A manager opens it the same way they open the ticket screen.

The four numbers a manager now reads

Every new channel eventually produces a small set of round numbers that managers memorize. Food cost percent. Labor percent. Ticket time. Cover count. Here are the four for the phone. All of them come from the reference call above, and all of them can be recomputed against any call that flows through PieLine.

6,157 samples per channel in the 102.36 s reference call

0 cross-talk frames (both above 0.1)

1,922 silent frames (both below 0.02)

46 speaker-tagged captions

Cross-talk frames

0 / 6,157

Frames where both parties are above 0.1 amplitude. Human-staffed lines drift up under rush; this line is flat zero by design.

Silent frame percent

31.2 percent (1,922 / 6,157)

Frames where both parties are under 0.02 amplitude. This is the think-time band. Compare across calls and locations.

Talk time ratio (AI : customer)

2.48 : 1

49.04 seconds of AI speech vs 19.76 seconds of customer speech. Higher than a cashier call, on purpose: two readbacks and an audible POS placement.

Peak envelope

0.9067

AI max RMS (customer max is 0.8857). Well-normalized call, no clipping, no whisper.
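The envelope-derived numbers above are simple frame counts, so they are easy to recompute. The sketch below assumes the two envelopes are available as equal-length lists of floats; the 0.1 and 0.02 thresholds come from the definitions above, while the function name `phone_kpis` and the returned keys are hypothetical.

```python
CROSS_TALK_THRESHOLD = 0.1   # both channels above this: cross-talk frame
SILENCE_THRESHOLD = 0.02     # both channels below this: silent frame


def phone_kpis(customer, ai):
    """Compute frame-count KPIs from two equal-length 60 Hz envelopes."""
    if len(customer) != len(ai):
        raise ValueError("customer and AI envelopes must have the same length")
    total = len(customer)
    cross_talk = sum(
        1 for c, a in zip(customer, ai)
        if c > CROSS_TALK_THRESHOLD and a > CROSS_TALK_THRESHOLD
    )
    silent = sum(
        1 for c, a in zip(customer, ai)
        if c < SILENCE_THRESHOLD and a < SILENCE_THRESHOLD
    )
    return {
        "frames": total,
        "cross_talk_frames": cross_talk,
        "silent_frames": silent,
        "silent_percent": round(100 * silent / total, 1) if total else 0.0,
        "customer_peak": max(customer, default=0.0),
        "ai_peak": max(ai, default=0.0),
    }
```

The same rounding applied to the reference call, `round(100 * 1922 / 6157, 1)`, gives 31.2.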

How the artifact is produced, end to end

Nothing about this pipeline is magic. It is a small stack you can read end to end. WAV goes in, a typed TypeScript module comes out. Everything a manager reads is derived from that module.
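The caption half of the pipeline, grouping timestamped words into speaker-tagged captions on pause or punctuation boundaries, might look roughly like this. It is a sketch under assumptions: the word dictionaries are a simplified stand-in rather than Deepgram's response schema, the 0.6-second pause threshold is invented for illustration, and `Caption` and `group_captions` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Caption:
    speaker: str
    start: float
    end: float
    text: str


def group_captions(words, max_pause=0.6):
    """Group timestamped words into captions.

    A new caption starts when the speaker changes, when the gap since the
    previous word exceeds `max_pause` seconds, or when the caption so far
    ends in sentence punctuation.
    """
    captions = []
    current = None
    for w in sorted(words, key=lambda w: w["start"]):
        starts_new = (
            current is None
            or w["speaker"] != current.speaker
            or w["start"] - current.end > max_pause
            or current.text.endswith((".", "?", "!"))
        )
        if starts_new:
            current = Caption(w["speaker"], w["start"], w["end"], w["text"])
            captions.append(current)
        else:
            current.end = w["end"]
            current.text += " " + w["text"]
    return captions
```

Because each caption keeps its speaker tag and its start and end times, it lines up directly against the 60 Hz envelope frames.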

Stereo WAV to manager dashboard

What the manager actually does on Monday morning

A new management channel is useless if it does not fit the existing cadence. The phone channel piggybacks on the weekly operations review. Ten minutes, sampled calls, four checks per call.

The ten-minute phone channel review

Pull a sample of five calls from the prior week

Stratify by daypart (one lunch, one dinner rush, one off-peak, two holiday or weekend). Do not pick 'problem' calls; pick representative ones.

For each call, verify cross-talk frames = 0

If the count is nonzero, the AI was interrupted by a third party on the line (manager conference, cashier pickup). That is a policy problem, not an agent problem.

Check silent frame percent against the band

Default band is 20 to 40 percent. Below 20 means the agent is rushing (often on a menu that is well-mapped); above 40 means menu mapping is weak and clarifiers are stalling.

Tick the seven structural segments

Greet, intent capture, clarifier, customer resolution, upsell, readback, POS placement. A call missing two or more segments goes into the onboarding queue with a one-line note.

Log the four numbers in the weekly ledger

Sample size, cross-talk total, silent percent band, segment coverage. Four columns, five rows. Trend week over week. Flag outliers.
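The sampling and per-call checks in the review above can be sketched as follows. The record layout (`call_id`, `daypart`, `cross_talk_frames`, `silent_percent`, `segments`) and the daypart labels are assumptions made for illustration; the sample plan, the 20 to 40 percent band, the seven segment names, and the two-or-more-missing rule come from the steps themselves.

```python
import random

SEGMENTS = (
    "greet", "intent capture", "clarifier", "customer resolution",
    "upsell", "readback", "POS placement",
)
# One lunch, one dinner rush, one off-peak, two holiday or weekend.
SAMPLE_PLAN = {"lunch": 1, "dinner_rush": 1, "off_peak": 1, "weekend_or_holiday": 2}
SILENT_BAND = (20.0, 40.0)


def sample_calls(calls, plan=SAMPLE_PLAN, seed=None):
    """Pick a random, daypart-stratified sample (not 'problem' calls)."""
    rng = random.Random(seed)
    picked = []
    for daypart, count in plan.items():
        pool = [c for c in calls if c["daypart"] == daypart]
        picked.extend(rng.sample(pool, min(count, len(pool))))
    return picked


def review_call(call):
    """Run the per-call checks: cross-talk, silence band, segment coverage."""
    missing = [s for s in SEGMENTS if s not in call["segments"]]
    low, high = SILENT_BAND
    return {
        "call_id": call["call_id"],
        "cross_talk_ok": call["cross_talk_frames"] == 0,
        "silent_in_band": low <= call["silent_percent"] <= high,
        "missing_segments": missing,
        "send_to_onboarding": len(missing) >= 2,
    }
```

Sampling at random within each daypart is what keeps the review representative rather than a collection of known bad calls.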

What the ledger looks like in the terminal

The manager does not need a new tool. The ledger is a CSV. Five rows, four numbers, twenty cells. The whole phone channel fits in an email signature.

weekly-phone-ledger.csv
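The file's exact layout is not reproduced here, so the following is only a guess at how such a ledger could be written. It assumes one row per weekly sample, in chronological order, with the four columns named in the review steps; the column names, the way the silence band is recorded (a count of calls inside 20 to 40 percent), and the input numbers in the usage example are all invented for illustration.

```python
import csv
import io

LEDGER_COLUMNS = ["sample_size", "cross_talk_total", "silent_in_band", "segment_coverage"]
SILENT_BAND = (20.0, 40.0)
SEGMENTS_PER_CALL = 7


def ledger_row(calls):
    """Collapse one week's sampled calls into a single ledger row."""
    low, high = SILENT_BAND
    covered = sum(c["segments_covered"] for c in calls)
    return {
        "sample_size": len(calls),
        "cross_talk_total": sum(c["cross_talk_frames"] for c in calls),
        "silent_in_band": sum(1 for c in calls if low <= c["silent_percent"] <= high),
        "segment_coverage": round(covered / (SEGMENTS_PER_CALL * len(calls)), 2),
    }


def render_ledger(weekly_samples):
    """Render the ledger as CSV text, one row per weekly sample."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=LEDGER_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for calls in weekly_samples:
        writer.writerow(ledger_row(calls))
    return buffer.getvalue()


# Made-up sample of five calls, for illustration only.
week = [
    {"cross_talk_frames": 0, "silent_percent": 31.2, "segments_covered": 7},
    {"cross_talk_frames": 0, "silent_percent": 28.0, "segments_covered": 7},
    {"cross_talk_frames": 0, "silent_percent": 41.5, "segments_covered": 6},
    {"cross_talk_frames": 0, "silent_percent": 22.3, "segments_covered": 7},
    {"cross_talk_frames": 0, "silent_percent": 35.0, "segments_covered": 7},
]
print(render_ledger([week]))
```

Keeping the output as plain CSV means the ledger can be pasted into an email or diffed week over week without any extra tooling.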

What each of the four KPIs actually tells the manager

A KPI you cannot act on is a number you throw away. Here is the action each of the four drives, one per card.

Cross-talk frames

Target is 0. Nonzero means a third party interrupted the line (manager pickup, cashier conference). The fix is policy: route interrupts through a different channel. This KPI is the cleanest, because even one nonzero frame is a known specific event to investigate.

Silent frame percent

Target band is 20 to 40 percent. Below 20, the agent is rushing and skipping clarifiers (expect modifier misses downstream). Above 40, menu mapping is weak. The action is almost always a menu refinement pushed to onboarding.

Talk time ratio

AI-to-customer ratio near 2 to 2.5 is normal and means the AI is doing the two readbacks and the POS narration. Ratio below 1.5 suggests a readback was skipped. Above 3 usually means the customer was confused and the AI over-explained.

Peak envelope

Customer near 0.88, AI near 0.91 is well-normalized. A customer peak below 0.3 means the line is too quiet (often a bad cell connection), which translates to clarifier loops. This KPI catches infrastructure issues, not agent issues.
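The four cards above amount to a small rule table, which can be written down directly. The thresholds are the ones stated in the cards; the function name `kpi_actions` and the wording of the returned messages are hypothetical.

```python
def kpi_actions(cross_talk_frames, silent_percent, talk_ratio, customer_peak):
    """Map one call's four KPIs to the follow-up actions they suggest."""
    actions = []
    if cross_talk_frames > 0:
        actions.append("cross-talk: investigate the third-party interrupt (policy fix)")
    if silent_percent < 20:
        actions.append("silence below band: agent rushing, check for skipped clarifiers")
    elif silent_percent > 40:
        actions.append("silence above band: push a menu refinement to onboarding")
    if talk_ratio < 1.5:
        actions.append("talk ratio low: a readback may have been skipped")
    elif talk_ratio > 3:
        actions.append("talk ratio high: customer likely confused, AI over-explained")
    if customer_peak < 0.3:
        actions.append("customer peak low: quiet line, check the connection")
    return actions
```

Run against the reference call's values, `kpi_actions(0, 31.2, 2.48, 0.8857)` returns an empty list, meaning nothing to follow up.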

How this compares to the old way of managing the phone

Most restaurants already have call recording. A few have call analytics that export call counts and average duration. Neither is management infrastructure. Here is the concrete difference.

Feature	Call recording + spreadsheets	PieLine envelope + captions
Artifact per unit of work	An MP3 per call, a CSV of durations	voice-activity-data.ts, 6,157 samples per channel
Time to read one call	Up to 102 seconds (listen end to end)	Under 30 seconds (four numbers)
Comparable across locations	Per-location, per-format, not normalized	Same schema, rolls up by daypart
Cross-talk as a KPI	Not measured, not available	Counted at 60 Hz, target = 0
Silent frame band	Not measured, not available	Sampled at 60 Hz, band 20 to 40 percent
Segment coverage	Subjective tagging by whoever listens	7 labeled segments per call

The six old channels plus the new one, side by side

If you are reworking the weekly review agenda, this is what it looks like with the phone added in.

Monday morning agenda, now with seven channels

FOH: covers, turns, wait time, hospitality moments
BOH: ticket times, station load, waste, temp logs
Inventory: counts, pars, depletion, shrinkage
Labor: schedule, hours, labor percent, overtime
POS and cost of sales: voids, discounts, COGS, modifier mix
Marketing and loyalty: repeat rate, campaigns, attribution
Phone: cross-talk, silent percent, talk ratio, segment coverage

4 KPIs

“The phone used to be the one line item on the agenda where we could only say 'it went okay, I think.' Now it is four numbers, and the numbers are comparable week over week.”

Internal note from a PieLine onboarding call

What disappears from the ops manual once this is in place

A common mistake is to add the seventh channel on top of everything else the team was doing for the phone. That defeats the purpose. Adopting the new channel also retires old practices that had no measurable output. Three of them, specifically.

Retire: phone etiquette module

Tone, pace, hold etiquette, script compliance. None of this applies when the caller never speaks to a human. The remaining training is a single shift briefing: here is the file, here are the four numbers, here is where to click.

Retire: missed-call estimation

Old practice was to multiply average order value by a guess at missed-call percent. Every call is now answered and logged. The estimate is replaced by a count.

Retire: cashier phone coaching

A cashier who takes no phone orders needs no phone coaching. That one-on-one slot becomes available for something else. Most operators redirect it into upsell training for the counter.

See the envelope on your own calls

We will spin up PieLine against your menu, show you the per-channel envelope, and walk the four phone KPIs on a real call from your line.

Frequently asked questions

What are the six channels most restaurant management and operations guides already cover?

FOH (covers, turns, wait time, hospitality moments), BOH (ticket times, station load, waste, temp logs), inventory (counts, pars, depletion, shrinkage), labor (schedule, hours, labor percent, overtime), POS and cost of sales (voids, discounts, COGS, modifier mix), and marketing and loyalty (repeat rate, campaigns, attribution). Every top guide you will find is some reshuffle of these six. They all assume the phone is either (a) a low-volume auxiliary to walk-ins or (b) noise that a cashier handles on the side. That assumption is load-bearing, and it breaks the moment the restaurant discovers that phone orders are 20 to 40 percent of a typical dinner rush.

Why has the phone never counted as a management channel before?

Because there was no artifact. BOH leaves tickets. Inventory leaves counts. POS leaves rows in a database. The phone left a customer who either hung up, complained, or placed an order that a human cashier re-entered by hand. A manager could not open the phone channel on Monday morning and read it. You could only ask the staff how it went, which is the same as not managing it. PieLine's voice-activity-data.ts changes that. Every call becomes a 60 Hz amplitude envelope plus speaker-tagged captions, which is readable the same way a ticket screen is readable.

What is the anchor file and what numbers can I verify from it?

The file is src/components/voice-activity-data.ts in the pieline-phones repo, regenerated by scripts/build-voice-activity-data.py against public/audio/dennys-order.mp3. Duration is 102.36 seconds. Sample rate is 60 Hz. Each channel (customer, AI) is an array of 6,157 floating-point RMS values. The file also contains 46 speaker-tagged captions with start and end timestamps. Peak amplitude for the customer channel is 0.8857; peak for the AI channel is 0.9067. If you run the build script locally with your own stereo WAV, the resulting TypeScript file has the same shape, which is why this works as a management artifact and not a one-off marketing asset.
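One way to check that a locally built artifact "has the same shape" is a small validator. The sketch below assumes the module's data has been exported to a plain dictionary with the keys `duration`, `customer`, `ai`, and `captions`; those key names, the speaker labels, and the function name are assumptions, not the actual exports of voice-activity-data.ts.

```python
def validate_artifact(artifact):
    """Return a list of shape problems; an empty list means the shape is fine."""
    problems = []
    customer, ai = artifact["customer"], artifact["ai"]
    if len(customer) != len(ai):
        problems.append("channel lengths differ")
    for name, envelope in (("customer", customer), ("ai", ai)):
        # RMS values are square roots, so they can never be negative.
        if any(v < 0 for v in envelope):
            problems.append(f"{name} envelope has negative RMS values")
    for i, cap in enumerate(artifact["captions"]):
        if cap["speaker"] not in ("customer", "ai"):
            problems.append(f"caption {i}: unknown speaker {cap['speaker']!r}")
        if not 0 <= cap["start"] <= cap["end"] <= artifact["duration"]:
            problems.append(f"caption {i}: timestamps out of range")
    return problems
```

For the reference call, the values to compare against are the ones listed above: 6,157 floats per channel, 46 captions, and peaks of 0.8857 and 0.9067.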

What is a cross-talk frame and why is zero the target?

A cross-talk frame is a 1/60th-of-a-second slice of the call where both the customer envelope and the AI envelope are above 0.1 amplitude at the same time. In plain terms, both parties are speaking. In human-staffed phone ordering, cross-talk frames go up under pressure because the cashier starts interrupting to keep the line moving, which is where modifier misses originate. In the shipped reference call the cross-talk count is exactly 0 across all 6,157 frames. A manager can pull that number from any call the product processes. It is the first numeric KPI the phone channel has ever had.

What is a silent frame and why does 31.2 percent matter?

A silent frame is a 1/60th-of-a-second slice where both envelopes are under 0.02. The reference call has 1,922 silent frames out of 6,157, which is 31.2 percent. That silence is think time: the AI working through menu mapping, the customer deciding, the pause before an upsell. Managers reading the phone channel want silence to stay in a band. Too little means the agent is rushing and skipping clarifiers; too much means the agent is stalling on a menu it does not have mapped. The number is only useful because it is comparable across calls at 60 Hz resolution.

How long does the AI actually speak versus the customer in the reference call?

AI speech totals 49.04 seconds; customer speech totals 19.76 seconds. The remainder is silence and hand-off framing. That ratio (2.48 to 1, AI to customer) is high on purpose: the AI is reading back the full order twice and narrating the POS placement, both of which exist so the manager can audit the call against the seven structural segments. A cashier who interrupted would compress both sides and lose the audit trail. The manager treats the ratio itself as a metric, not just the total duration.
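The arithmetic behind that ratio is short enough to show. The per-speaker totals below are the figures quoted above; the `talk_time` helper, which sums caption durations per speaker, is one plausible way to derive such totals and is an assumption, since the exact method is not stated here.

```python
def talk_time(captions):
    """Sum caption durations per speaker (assumed definition of speech time)."""
    totals = {}
    for cap in captions:
        duration = cap["end"] - cap["start"]
        totals[cap["speaker"]] = totals.get(cap["speaker"], 0.0) + duration
    return totals


def talk_ratio(ai_seconds, customer_seconds):
    """AI-to-customer talk time ratio."""
    return ai_seconds / customer_seconds


# Reference call figures quoted in the article.
ratio = talk_ratio(49.04, 19.76)
remainder = 102.36 - 49.04 - 19.76

print(round(ratio, 2))      # 2.48
print(round(remainder, 2))  # 33.56 seconds of silence and hand-off framing
```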

How do I integrate this channel into an existing weekly operations review?

Add one line to the review agenda called Phone (with KPIs: answer rate, cross-talk frames, silent frame percent, segment coverage). On Monday, open five sampled calls from the prior week in the dashboard. For each, tick the seven structural segments (greet, intent, clarifier, resolve, upsell, readback, POS placement) and check whether the cross-talk count is still 0 and the silent percent is in its band. The whole exercise is ten minutes. It replaces the old agenda line that said Phone etiquette, which had no measurable output.

Does this work for chains with multiple locations?

Yes. The envelope-and-caption artifact is per-call and per-location. Multi-unit operators aggregate by location and by daypart, the same way they already aggregate FOH covers. The data is small (roughly 6,000 floats per channel per 100 seconds of audio, plus the captions), so rolling it up across 10 or 100 locations is not a data-engineering problem. The management practice is new, but the infrastructure is not.
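A roll-up by location and daypart is an ordinary group-by. The sketch below assumes each call has already been reduced to a small record; the field names (`location`, `daypart`, `cross_talk_frames`, `silent_percent`) and the function name `roll_up` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def roll_up(calls):
    """Aggregate per-call KPIs by (location, daypart)."""
    groups = defaultdict(list)
    for call in calls:
        groups[(call["location"], call["daypart"])].append(call)
    return {
        key: {
            "calls": len(group),
            "cross_talk_total": sum(c["cross_talk_frames"] for c in group),
            "mean_silent_percent": round(mean(c["silent_percent"] for c in group), 1),
        }
        for key, group in groups.items()
    }
```

At 60 Hz a 100-second call is about 6,000 floats per channel, so even the raw envelopes for many locations fit comfortably in memory; the roll-up itself only needs the per-call summaries.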

What does PieLine actually provide beyond the data file?

The AI agent that answers the phone 24/7 across up to 20 simultaneous calls, POS integration with Clover, Square, Toast, NCR Aloha and Revel, menu scraping and modifier mapping, same-day onboarding, and a dashboard that exposes the envelope, captions, and KPIs per call. Pricing is $350 per month for the first 1,000 calls and $0.50 per call after. The data artifact is a byproduct of the product doing its job, not a separate service.
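The pricing rule can be written as a one-line formula. This sketch assumes the $350 is a flat monthly fee that covers the first 1,000 calls, even in a month with fewer calls, which is one reading of the sentence above; the function and parameter names are hypothetical.

```python
def monthly_cost(calls, base_fee=350.0, included_calls=1000, overage_per_call=0.50):
    """Estimated monthly cost in dollars under the assumed pricing rule."""
    if calls < 0:
        raise ValueError("calls must be non-negative")
    extra_calls = max(0, calls - included_calls)
    return base_fee + extra_calls * overage_per_call
```

Under that reading, a 1,500-call month costs 350 + 500 × 0.50 = $600.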

How is this different from just recording calls, which restaurants already do?

A raw recording is not management infrastructure. It is a 102-second MP3 that a manager would have to listen to end to end. The envelope and the captions compress the call into a shape a human can read in seconds: where the speakers overlapped, where there was silence, which of the seven segments the agent actually covered. That is the difference between a video file and a dashboard, and it is why the phone can now be the seventh channel in the weekly review rather than the one everybody ignores.
