Restaurant operations, rewritten as queueing theory

For the first time, Erlang C applies to your restaurant's phone line

The textbook tools of operations management (Little's Law, M/M/c, offered load, Erlang C) have been used to size telephone systems and call centers for decades; Erlang published his formulas in 1917. They have never been applied to a restaurant's inbound line because the phone was always a c=1 server multiplexed with a cashier. A 20-server AI agent ends that. Here is the math, with real numbers from the reference call shipped in this repo.

Matthew Diakonov, Written with AI

Published April 23, 2026 · 11 min read


4.9 from 200+ restaurants

c=20 concurrent agents per location

102.36 second reference call in the repo

703 calls/hour per-location hard ceiling

c = 20. One restaurant, twenty agents.

The phone line is finally a real queueing system.

Mean service time: 102.36 seconds

Service rate per server: 35.17 calls per hour (cph)

Hard ceiling: 703 calls per hour

Wait probability under 1% up to 6 calls per minute (cpm)

Erlang C, first time on the phone


The queueing model every operations playbook skips

Published operations material (NetSuite, Toast, Indeed, Supy, HigherMe, Planday, eduMe) converges on the same list of levers: food cost percentage, prime cost, labor as percent of sales, table turnover, RevPASH, occupancy, average ticket. Little's Law (L = λW) shows up for the dining room. Queueing math for the phone does not.

The omission has an honest reason. A restaurant phone has historically been a c=1 server whose operator is also ringing up in-store guests, bagging food, handing back change. The service-time distribution is unobservable because the operator is multiplexing. You cannot write a meaningful Erlang curve around an operator who might or might not be on the phone when a call lands.

An AI agent that answers up to 20 calls at a time, each call on its own independent server, removes that obstacle. The phone is now a clean M/M/c queue with an observable mean service time. Every tool in the operations research canon applies. That includes Erlang C, the standard formula for sizing call centers.

The queue, drawn

Arrivals land on a Poisson process at rate λ. Each one is handed to any of c=20 identical agents, each serving at rate μ=35.17 calls/hour. The output is a POS ticket. No round-robin, no shared backend, no head-of-line blocking. That is the M/M/c assumption, drawn literally.

Inbound arrivals (Poisson λ) → 20 agents → POS tickets
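For reference, here is the standard Erlang C expression for the probability that an arriving call has to wait in an M/M/c queue. The symbols a (offered load in erlangs) and ρ (utilization) are notation introduced here for convenience; the rest follows the λ, μ, and c used above.

```latex
a = \frac{\lambda}{\mu}, \qquad \rho = \frac{a}{c}, \qquad
P_{\text{wait}} = C(c, a) =
\frac{\dfrac{a^{c}}{c!}\,\dfrac{c}{c - a}}
     {\displaystyle\sum_{k=0}^{c-1} \frac{a^{k}}{k!} \;+\; \dfrac{a^{c}}{c!}\,\dfrac{c}{c - a}},
\qquad a < c
```

The formula only holds while a < c. At a = c the queue is unstable, which is where the hard ceiling comes from.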

Three numbers that pin the curve

Three observables fully determine the Erlang C curve for any restaurant running PieLine. Two are fixed by the service contract. The third is a single measurement you take from your own call log.

20 Concurrent agents (c)

102.36 s Mean service time (1/μ)

35.17 Service rate per server (cph)

703 Hard ceiling (calls/hour)

The mean service time comes from /src/components/voice-activity-data.ts, specifically the duration field of the exported voiceData object. Open the file; it reads "duration":102.36. That is a real Denny's-style order recorded end to end (caller greeting through POS acknowledgement) and it is what sets the ceiling at 703 calls per hour for a default plan location.
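The arithmetic behind those figures is short enough to check by hand. The sketch below assumes only the 102.36-second duration and c = 20 quoted above; the variable names are illustrative and not taken from the repo.

```python
# Reference call duration in seconds, as quoted from voice-activity-data.ts.
MEAN_SERVICE_SEC = 102.36
AGENTS = 20  # c, concurrent agents per location

service_rate_cph = 3600 / MEAN_SERVICE_SEC   # mu: calls per hour per agent
ceiling_cph = AGENTS * service_rate_cph      # c * mu: calls per hour per location
ceiling_cpm = ceiling_cph / 60               # the same ceiling in calls per minute

print(f"mu = {service_rate_cph:.2f} calls/hour per agent")
print(f"ceiling = {ceiling_cph:.0f} calls/hour ({ceiling_cpm:.2f} calls/minute)")
```

Swapping in your own average call length changes both numbers in proportion: a longer call lowers μ and pulls the ceiling down with it.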

The code, 26 lines

No libraries, no services, no dependencies. You can paste this into a notebook or a terminal and see your restaurant's position on the curve in under a minute. Replace MEAN_SERVICE_SEC with the average length of your last ten phone orders if you want a restaurant-specific curve.

restaurant_ops_erlang_c.py
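A minimal sketch of an Erlang C calculation along these lines follows. It is not necessarily the listing shipped as restaurant_ops_erlang_c.py: MEAN_SERVICE_SEC is the constant named above, while AGENTS and the erlang_c function are names assumed here for illustration.

```python
from math import factorial

MEAN_SERVICE_SEC = 102.36   # replace with your own average call length
AGENTS = 20                 # c, concurrent agents (assumed name)


def erlang_c(arrivals_per_hour: float, agents: int = AGENTS,
             mean_service_sec: float = MEAN_SERVICE_SEC) -> float:
    """Probability that an arriving call has to wait in an M/M/c queue."""
    mu = 3600 / mean_service_sec        # service rate per agent, calls/hour
    a = arrivals_per_hour / mu          # offered load in erlangs
    if a >= agents:                     # at or past the ceiling: queue is unstable
        return 1.0
    top = a ** agents / factorial(agents) * agents / (agents - a)
    bottom = sum(a ** k / factorial(k) for k in range(agents)) + top
    return top / bottom


if __name__ == "__main__":
    for cpm in (2, 6, 8, 10):
        print(f"{cpm:>2} calls/min -> wait probability {erlang_c(cpm * 60):.2%}")
```

The direct formula is fine at c = 20. For much larger server counts the factorials grow quickly, and a recursive form is the usual choice.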

The table, in arrivals per minute

Running the script above with c=20 and 1/μ = 102.36 seconds gives this table. Read it as a capacity decision: where on the curve is your busiest hour of the week, and what does it tell you about the next staffing or plan-tier move?

restaurant_ops_erlang_c.py
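One way to regenerate a table like this is sketched below. It uses the Erlang B recursion rather than the direct formula, which is a swapped-in technique chosen here for numerical stability; wait_probability and the column layout are assumptions, not the original script's output format.

```python
MEAN_SERVICE_SEC = 102.36
AGENTS = 20


def wait_probability(cpm: float, agents: int = AGENTS,
                     mean_service_sec: float = MEAN_SERVICE_SEC) -> float:
    """Erlang C computed from the Erlang B recursion (no factorials)."""
    a = cpm * mean_service_sec / 60     # offered load: calls/min times minutes per call
    if a >= agents:                     # past the ceiling, every caller waits
        return 1.0
    b = 1.0                             # Erlang B with zero servers
    for n in range(1, agents + 1):
        b = a * b / (n + a * b)         # Erlang B with n servers
    rho = a / agents                    # utilization
    return b / (1 - rho * (1 - b))      # convert Erlang B to Erlang C


print(f"{'cpm':>4} {'erlangs':>8} {'util':>6} {'P(wait)':>8}")
for cpm in range(1, 13):
    a = cpm * MEAN_SERVICE_SEC / 60
    print(f"{cpm:>4} {a:>8.2f} {a / AGENTS:>6.0%} {wait_probability(cpm):>8.2%}")
```

Each row is one arrival rate in calls per minute; the last column is the number to read against your busiest hour.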

A few things fall out of this. First, wait probability stays under one percent until around six calls per minute; between zero and that, the queue is not doing anything interesting. Second, the curve is sharply nonlinear: going from eight cpm to ten cpm pushes wait probability from roughly eight percent to roughly forty percent. Third, the ceiling at c=20 is a hard wall near twelve cpm (703 calls per hour is about 11.7 cpm), not a soft drift; once utilization passes ninety percent the line collapses. A sane operations plan sits left of seven cpm at the busy-hour peak and never thinks about the phone again.

Why this was impossible last year

Feature	Cashier + desk phone (c = 1, M/G/1)	PieLine (c = 20, M/M/c)
Server count (c)	1, multiplexed with register duties	20, each an independent agent
Service-time distribution	Unobservable (noisy, confounded with checkout)	Observable, measurable per call
Queue discipline	FIFO, but blocked by in-store interrupts	FIFO, head-of-line blocking impossible
Erlang C applicable	No, assumption failures on c and on service distribution	Yes, textbook M/M/c
Saturation point (calls/hour)	Collapses around 25 to 35 cph	Hard ceiling near 703 cph per location
Capacity planning horizon	Reactive only; lagging metrics	Forward-looking; Erlang C from expected arrival rate
Wait probability at 6 cpm	~100% (single server saturated)	0.48% (ample headroom on the curve)

M/M/c with μ = 35.17 calls/hour/server; the c = 1 column uses the same μ for a fair comparison.
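The single-server column can be checked directly. With c = 1 the Erlang C formula reduces to the utilization itself, so a short worked case (using the same μ, and a = λ/μ as notation assumed here) looks like this:

```latex
C(1, a) = \frac{\dfrac{a}{1 - a}}{1 + \dfrac{a}{1 - a}} = a = \frac{\lambda}{\mu}, \qquad a < 1

\lambda = 30 \text{ calls/hour}: \quad a = \frac{30}{35.17} \approx 0.85
\;\Rightarrow\; P_{\text{wait}} \approx 85\%

\lambda = 360 \text{ calls/hour (6 cpm)}: \quad a = \frac{360}{35.17} \approx 10.2 > 1
\;\Rightarrow\; \text{unstable, every caller waits}
```

That is why the single-server line saturates before it reaches 35 calls per hour, while the same 6 cpm load sits comfortably on the c = 20 curve.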

Wiring Erlang C into your weekly operations review

The point of the math is not to run it once for a landing page. It is to make it a fifteen-minute line on the same Monday review where you already read prime cost and RevPASH. Four steps, in order, take a first-time operator from "I have never seen an Erlang curve" to "I know where we sit."

Pull your busiest-hour arrival rate

Export the PieLine analytics call log, filter to the busiest 60-minute window of last week, and divide count by 60. Example: 360 calls between 5pm and 6pm Friday is 6.0 cpm.

Measure your own mean service time

Average the duration of your last ten completed orders from the same log. The 102.36-second reference call in the repo is a reasonable starting point; your menu complexity will shift this by plus or minus 25 percent.

Plug c, μ, and λ into the 26-line Erlang C script

Every restaurant running PieLine on the default plan uses c = 20. μ is 3600 divided by the mean service time in seconds from step 2, and λ is the arrival rate from step 1. The script prints your wait probability, which is the output you care about.

Make one decision per week

Under 1%: do nothing. Between 1% and 10%: the cheapest next move is menu modifier-graph tuning, which lowers service time and shifts the curve right. Above 10%: raise a plan-tier question on the same review where you already read prime cost.
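The four steps can be sketched as one small function. Everything named here (weekly_check, the sample durations, the 360-call busy hour) is hypothetical and only illustrates the flow; the thresholds are the 1% and 10% cut-offs from step four.

```python
from math import factorial
from statistics import mean

AGENTS = 20  # default-plan concurrency, per the text


def erlang_c(a: float, c: int) -> float:
    """Wait probability for offered load a (erlangs) on c servers."""
    if a >= c:
        return 1.0
    top = a ** c / factorial(c) * c / (c - a)
    return top / (sum(a ** k / factorial(k) for k in range(c)) + top)


def weekly_check(busy_hour_calls: int, durations_sec: list[float]) -> tuple[float, str]:
    """Arrival rate, service time, wait probability, then one decision."""
    arrivals_per_min = busy_hour_calls / 60                  # step 1
    mean_service_sec = mean(durations_sec)                   # step 2
    offered_load = arrivals_per_min * mean_service_sec / 60  # erlangs
    p_wait = erlang_c(offered_load, AGENTS)                  # step 3
    if p_wait < 0.01:                                        # step 4
        decision = "do nothing"
    elif p_wait <= 0.10:
        decision = "tune the menu modifier graph"
    else:
        decision = "raise a plan-tier question"
    return p_wait, decision


# Hypothetical inputs: 360 calls in the busiest hour, ten made-up call lengths.
sample_durations = [95, 110, 102, 88, 120, 99, 105, 97, 112, 96]
p, action = weekly_check(360, sample_durations)
print(f"wait probability {p:.2%}: {action}")
```

Returning both the probability and the decision keeps the number visible on the review, so a week that lands near a threshold is easy to spot.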

What the curve looks like at a real Bay Area chain

Mylapore is an eleven-location South Indian chain in the Bay Area running PieLine across the fleet. The busiest location (Mylapore San Jose) peaks around 240 calls across the 4pm to 9pm Friday dinner window. That is roughly 48 calls per hour, or 0.8 calls per minute, averaged across the window. Even at the hottest fifteen-minute sub-window the arrival rate stays under 2 cpm.

At 2 cpm with c=20 and 1/μ = 102 seconds, offered load is roughly 3.4 erlangs and wait probability is essentially zero. The queue is a non-issue. Where the operations model helps is not in the typical Friday; it is in the weeks where the arrival rate jumps (a promo, a holiday, a Super Bowl Sunday) and you need to know, before the night, whether the curve is about to push past the six cpm mark. Erlang C is forward-looking. It is the first tool a restaurant has ever had that answers that question for the phone.
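The offered-load figures above follow from one division each. A short worked version, using the rounded μ = 35.17 calls per hour and writing a for offered load and ρ for utilization (notation assumed here):

```latex
\text{Peak sub-window: } \lambda = 2 \text{ cpm} = 120 \text{ calls/hour}, \qquad
a = \frac{\lambda}{\mu} = \frac{120}{35.17} \approx 3.4 \text{ erlangs}, \qquad
\rho = \frac{a}{c} = \frac{3.4}{20} \approx 0.17

\text{Window average: } \lambda = \frac{240 \text{ calls}}{5 \text{ hours}} = 48 \text{ calls/hour}, \qquad
a = \frac{48}{35.17} \approx 1.4 \text{ erlangs}
```

At 17 percent utilization the twenty agents are mostly idle, which is why the wait probability is negligible on a typical Friday.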

“The experience was better than speaking to a human. No hold time, no confusion, no rushing.”

Mylapore customer, San Jose

After PieLine rollout

Three things the math actually buys you

Capacity planning that is not a guess. When a promo lands on a Friday and the marketing team projects a 40 percent lift in call volume, Erlang C tells you in advance whether you will clear it at under one percent wait probability, or whether the curve has bent and you need to signal it to the team. "Is the phone ready for Sunday" becomes a number, not a feeling.
A menu-complexity lever you can actually pull. Service time, not arrival rate, is the lever you control on this curve. If your modifier graph is bloated and mean service drifts from 102 seconds to 140, the curve moves visibly left. The operations manager now has a specific reason to push menu engineering for shorter conversations: "this saves 30 seconds per call and moves our wait-probability threshold from six cpm to eight."
A ceiling you can stop worrying about. The 703 calls per hour hard ceiling at c=20 is a number no single restaurant location will hit. Knowing that, explicitly, ends a class of anxiety that normally occupies ops managers on the busiest nights of the year. The phone was a bottleneck before. It is now a cleared queue with six to seven times the headroom of peak.
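To see how far the service-time lever moves the curve, the sketch below searches for the arrival rate at which wait probability reaches one percent, for a few call lengths. The function names, the bisection approach, and the 72-second case are assumptions made for illustration; only c = 20, the 102.36-second reference call, and the 140-second drift come from the text.

```python
def wait_probability(cpm: float, mean_service_sec: float, agents: int = 20) -> float:
    """Erlang C via the Erlang B recursion."""
    a = cpm * mean_service_sec / 60          # offered load in erlangs
    if a >= agents:
        return 1.0
    b = 1.0
    for n in range(1, agents + 1):
        b = a * b / (n + a * b)
    return b / (1 - (a / agents) * (1 - b))


def threshold_cpm(mean_service_sec: float, target: float = 0.01,
                  agents: int = 20) -> float:
    """Highest calls/min that keeps wait probability at or below target."""
    lo, hi = 0.0, agents * 60 / mean_service_sec   # hi is the hard ceiling in cpm
    for _ in range(60):                            # bisection; the curve rises with cpm
        mid = (lo + hi) / 2
        if wait_probability(mid, mean_service_sec, agents) <= target:
            lo = mid
        else:
            hi = mid
    return lo


for sec in (72, 102.36, 140):
    print(f"{sec:>6} s per call -> 1% threshold near {threshold_cpm(sec):.1f} calls/min")
```

Because offered load is arrival rate times service time, the threshold scales inversely with call length: shorter calls push it right, longer calls pull it left.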

Run your own Erlang C against our line

Fifteen-minute walkthrough: we pull your busy-hour arrival rate, measure your service time, and graph where you sit on the curve. No slide deck.

Frequently asked questions

Why does every published guide on restaurant operations skip queueing theory for the phone?

Because the phone was historically a c=1 server shared with a cashier. Mathematically that is an M/G/1 queue whose service-time distribution you cannot observe (the cashier is also ringing up in-store guests, handing back change, bagging food), so the moments are noisy and no honest Erlang math is possible. Guides skip to what is countable: Little's Law on tables, RevPASH, food cost percentage, prime cost. A c=20 AI agent is different: it is a clean M/M/c queue with an observable mean service time, so Erlang C applies for the first time.

What is the mean service time I should use in the math, and how do I verify it for my own restaurant?

PieLine ships a reference call in the repo at /public/audio/dennys-order.mp3, transcribed into /src/components/voice-activity-data.ts, with a duration of 102.36 seconds. That is the full caller-greeting through POS-ticket-ack span for one complete order. Pull ten of your own recent calls, average the lengths, and use that number instead. Anywhere from 80 to 140 seconds is normal for a restaurant. The number does not change the shape of the curve much; it only shifts where the ceiling lands.

How do I translate the Erlang C table into a hire/no-hire decision for my own floor?

Pull the busiest 60-minute window from your POS phone-order log. Divide the call count by 60 to get arrivals per minute. Read down the table: below 6 cpm you are under one percent wait probability with c=20, which is a solved capacity problem and you should be looking at staffing elsewhere. Between 6 and 10 cpm you are still under the ceiling but wait probability is climbing visibly, which is where extending concurrency (not hiring) is the cheapest move. Above 10 cpm you are at the wall; either you are leaving calls on hold now, or you would be if the phone had been serial.

How is this different from measuring after_time_to_answer or missed call rate?

Those are lagging metrics; by the time they move, you have already lost revenue and cannot recover it. Erlang C is a forward-looking capacity model. You feed it the arrival rate you expect (Valentine's Day, Super Bowl, a promo weekend) and it tells you the wait probability the model predicts, before the night happens. Operations planning is forward; lagging KPIs are performance review. Both matter, but they are different tools.

Does the 20-call ceiling ever matter in practice for a normal restaurant?

At a 102-second mean service time, 20 servers saturate at 703 calls per hour. No single restaurant location we have seen crosses that. Mylapore San Jose, a high-volume Bay Area South Indian chain location, peaks around 240 calls in a 4pm-to-9pm Friday dinner window, which is roughly 48 calls per hour. That is about 1.4 erlangs of offered load, where Erlang C with c=20 returns well under one percent wait probability. The ceiling exists for multi-tenant deployment planning and for extreme promo events, not for normal operation.

What happens to the classical operations tools (Little's Law, RevPASH, prime cost) after the phone becomes a c=20 queue?

Nothing. They still own the FOH and BOH workstreams exactly as before. The phone was always sitting outside those models because it was not a workstream you could schedule against. Adding Erlang C for the phone extends the ops manager's toolkit from one queueing model (Little's Law for seats) to two. The tenth row of a weekly operations review becomes, for the first time, a forward-looking phone capacity line.

Are PieLine's AI agents really independent servers in the queueing sense?

Yes. Each concurrent call runs as its own conversational agent with its own menu lookup, its own POS write, and its own modifier graph traversal. There is no round-robin to a single backend and no head-of-line blocking. From the caller's perspective the first ring is picked up regardless of what the other 19 servers are doing. That is the textbook M/M/c assumption and it is not an approximation here.

What does the weekly arrivals-per-minute check actually look like in practice?

Pull the PieLine analytics export, filter to the busiest hour of the busiest day, and divide the call count by 60. If that number is below 6 you do nothing. If it is above 8 you start thinking about whether the menu modifier graph needs tuning (longer service times push the curve left). If it is above 10 you raise it on the weekly review and the ops manager looks at plan tier. The cadence is a 15-minute weekly glance, not a project.

Does PieLine need anything installed for this math to work?

No. PieLine's onboarding scrapes the menu URL, maps each dish to location-scoped POS item IDs, and forwards the restaurant line. There is no hardware install, no IT team requirement, and no on-premise server. The 20-call concurrency ceiling is a property of the service tier you buy, and the per-call service time is whatever your menu complexity and caller patience dictate. Both are observable from day one.

What is the pricing and how does the math work against a human phone hire?

$350 per month for up to 1,000 calls at a location, $0.50 per call beyond that. A dedicated phone employee is $3,000 to $4,000 per month and is by definition a c=1 server who also drops call quality the moment a second line rings. Erlang math on c=1 with the same service time cannot move above about 35 calls per hour before wait probability goes to 100 percent. c=20 at one tenth the cost is not a marginal improvement, it is a different queueing class.

PieLine answers up to 20 concurrent calls per location, with a 703 calls/hour hard ceiling at the reference service time of 102.36 seconds.
