POS modifier mapping

AI phone ordering POS modifier mapping: a walk through a real 102-second call

Every guide on this topic treats modifier mapping as a feature checkbox: “we sync your modifiers from the POS.” That sentence is true and it explains nothing. Below is the actual choreography, traced second by second through a recorded PieLine demo, of how a spoken request becomes a modifier ID a kitchen printer can fire.

Matthew Diakonov, Written with AI

Published May 1, 2026 · 9 min read

Direct answer (verified 2026-05-01)

How does an AI phone agent map modifiers to my POS?

During onboarding the agent ingests every modifier group, modifier ID, item ID, required/optional flag, and price delta from your POS (Toast, Square, Clover, NCR Aloha, Revel, or any of the 50+ supported POS systems). On a call, it does not invent modifiers. It enumerates required groups out loud, slots the caller’s words into existing modifier IDs, posts the cart to the POS’s native order endpoint, and reads back the total the POS returned. Free-form requests like “can you add strawberries?” are resolved against the modifier list. Anything not in the list either gets remapped to the closest real option or surfaced as “not on the menu,” never fabricated. Verified against the file src/components/voice-activity-data.ts in the PieLine repo, which holds the Deepgram captions of the public demo audio at aiphoneordering.com.

The 102-second call, with the modifier moments marked

The transcript below is the recorded PieLine demo at aiphoneordering.com. The Deepgram timestamps come straight from src/components/voice-activity-data.ts in the open repo at mediar-ai/pieline-phones. Lines marked as commands are the moments the agent is actively doing modifier work; lines marked as success are POS round-trips.

dennys-order.mp3 (102.36s), modifier choreography

Four modifier moments stand out. At 15.98 seconds, the agent enumerates a required modifier group (eggs preparation) without prompting from the caller. At 19.18 seconds it enumerates the second required group (bread choice) and recites all four real options configured in the POS. At 23.11 seconds it remaps a top-level “Coke” into the soft-drink combo modifier on the slam, out loud, so the caller hears the remap before it is committed. And at 65.98 seconds, when the caller asks “can you add strawberries, if that’s an option?”, the agent does not invent a price; it resolves “strawberries” against the cheesecake item’s modifier list and confirms the real modifier name back at 71.34 seconds: “New York style cheesecake with strawberry topping.”

Anatomy of a single modifier round-trip

The diagram below is what happens between the caller’s words at 29.39 seconds (“sourdough bread, scrambled”) and the POS having a complete cart it can validate. The voice agent never prices the modifier on its own. It collects, posts, and reads back.

Caller words → POS modifier ID → kitchen ticket

The validate step is what makes this architecture safe. If the agent ever attached a fabricated modifier ID, the POS would reject the cart, the call would stall, and the operator would see the failure within minutes. There is no version of this loop where a hallucinated modifier prints on a kitchen ticket. The POS is a hard gate.
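The hard gate can be pictured as a simple membership check on the POS side. The sketch below is an illustration under stated assumptions, not PieLine's or any POS vendor's code: `CartLine`, `validate_cart`, the item ID, and `mod_bread_brioche` are hypothetical names, and only `mod_eggs_scrambled` and `mod_bread_sourdough` appear in this article.

```python
from dataclasses import dataclass, field

# Hypothetical set of modifier IDs the POS knows for this item.
KNOWN_MODIFIER_IDS = {"mod_eggs_scrambled", "mod_bread_sourdough"}


@dataclass
class CartLine:
    item_id: str
    modifier_ids: list = field(default_factory=list)


def validate_cart(lines, known_ids):
    """Return every modifier ID the POS would not recognise.

    An empty list means the cart passes this check; anything else
    means the cart is rejected before a ticket can print.
    """
    return [m for line in lines for m in line.modifier_ids if m not in known_ids]


good_cart = [CartLine("item_lumberjack_slam", ["mod_eggs_scrambled", "mod_bread_sourdough"])]
bad_cart = [CartLine("item_lumberjack_slam", ["mod_eggs_scrambled", "mod_bread_brioche"])]
```

Here `validate_cart(bad_cart, KNOWN_MODIFIER_IDS)` reports the fabricated ID, which is the point at which a real POS would refuse the order.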

What the agent actually sees during onboarding

Before the demo could exist, someone had to map the Lumberjack Slam to a real menu item ID and pull its modifier groups out of the POS. The structure below is the shape PieLine’s onboarding team produces from the menu APIs of Toast, Square, Clover, NCR Aloha, and Revel. Field names vary across vendors; the semantics do not.

menu/lumberjack_slam.json

Two things in this object are doing the heavy lifting. required: true on each group is what tells the agent it cannot post the cart until both groups have a selection, which is why the agent at 15.98 seconds does not skip ahead and try to fire the order. And the explicit list of modifier IDs (mod_eggs_scrambled, mod_bread_sourdough) is what makes the resolution step deterministic: the agent is not generating an ID, it is picking one from a finite set the operator approved.
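A minimal sketch of that shape, and of the check the required flag drives, might look like the following. Only the two modifier IDs named above, the two group roles, and the four bread options come from this article; the key names, group IDs, item ID, and the other modifier IDs are assumptions, and real field names vary by POS vendor.

```python
# Hypothetical reconstruction of the item shape; not a vendor schema.
LUMBERJACK_SLAM = {
    "item_id": "item_lumberjack_slam",
    "name": "Lumberjack Slam",
    "modifier_groups": [
        {
            "group_id": "grp_eggs",
            "name": "Eggs preparation",
            "required": True,
            # Other egg styles omitted: the article names only scrambled.
            "modifiers": [{"id": "mod_eggs_scrambled", "name": "Scrambled"}],
        },
        {
            "group_id": "grp_bread",
            "name": "Bread choice",
            "required": True,
            "modifiers": [
                {"id": "mod_bread_white", "name": "White"},
                {"id": "mod_bread_brown", "name": "Brown"},
                {"id": "mod_bread_multigrain", "name": "Multigrain"},
                {"id": "mod_bread_sourdough", "name": "Sourdough"},
            ],
        },
    ],
}


def missing_required_groups(item, selected_ids):
    """Names of required groups that have no selected modifier yet."""
    missing = []
    for group in item["modifier_groups"]:
        group_ids = {m["id"] for m in group["modifiers"]}
        if group["required"] and not (group_ids & set(selected_ids)):
            missing.append(group["name"])
    return missing
```

While `missing_required_groups` returns a non-empty list, the agent keeps asking instead of posting the cart.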

Mapped vs. unmapped: what the call sounds like in each case

Most failure stories about voice ordering are stories about an LLM being asked to invent a fact the operator never gave it. A modifier the kitchen does not stock. A price that does not exist in the POS. A combo that has not been a real menu item since 2019. The architectural fix is to forbid the agent from inventing in the first place.

What an unmapped modifier sounds like vs. what a mapped one sounds like

The caller asks for strawberries. The agent does not know whether strawberries are a real cheesecake topping at this restaurant or just something the caller imagined.

Agent says "yes" without checking the POS, then the kitchen ticket prints with no strawberry line and the customer is surprised at pickup.
Or the agent guesses a price ("that will be 99 cents extra") that has no source in the POS, and accounting reconciles a phantom item later.
Or the order posts and is rejected by the POS because the modifier ID does not exist, and the caller hears dead air for 8 to 12 seconds while the agent retries.
All three are versions of the same failure: the LLM was asked to invent something the operator did not give it permission to invent.

Cuisine-specific modifier shapes that already exist in the wild

The Lumberjack Slam example is deliberately simple: two required groups, no nesting, no per-side scoping. Real restaurants run weirder structures. Below are six shapes PieLine has mapped against live POS instances for live customers. Each one resolves to a finite list of modifier IDs the agent picks from at call time, never invents.

Pizza: half-and-half + per-side toppings

A 16 inch pie with pepperoni on the left half, mushroom and olive on the right half, light cheese throughout. The agent stores it as one item with two side-scoped modifier groups, not two pizzas. The kitchen ticket prints LEFT/RIGHT lanes the line cooks already use.
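One way to picture side-scoped groups is as (scope, modifier) pairs on a single item. The sketch below is illustrative only: the `WHOLE` scope label and the function name are assumptions, and a real POS would carry modifier IDs rather than display names.

```python
from collections import defaultdict


def ticket_lanes(selections):
    """Group (scope, modifier_name) pairs into per-scope lanes, keeping order."""
    lanes = defaultdict(list)
    for scope, name in selections:
        lanes[scope].append(name)
    return dict(lanes)


# One pizza, three scopes; not two separate pizzas.
pizza_selections = [
    ("LEFT", "Pepperoni"),
    ("RIGHT", "Mushroom"),
    ("RIGHT", "Olive"),
    ("WHOLE", "Light cheese"),
]
```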

Indian: spice level + jain prep

Spice (mild, medium, hot, Indian hot) is one required modifier group. Jain (no onion, no garlic, no root vegetables) is a single tag that fans out to three modifier IDs across every applicable item, so the kitchen ticket reads correctly without the caller having to list each restriction.
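The fan-out can be sketched as a lookup from one spoken tag to several modifier IDs, filtered by what each item actually offers. The IDs and names below are hypothetical; the article states only that the tag covers no onion, no garlic, and no root vegetables.

```python
# Hypothetical IDs for the three Jain restrictions.
TAG_FANOUT = {"jain": ["mod_no_onion", "mod_no_garlic", "mod_no_root_veg"]}


def expand_tag(tag, item_modifier_ids):
    """Return the tag's modifier IDs that this item offers, in tag order."""
    return [m for m in TAG_FANOUT.get(tag, []) if m in item_modifier_ids]
```

Filtering against the item's own modifier list keeps the expansion inside the set of IDs the POS will accept for that item.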

Chinese: protein swap + extra-spicy oil

Beef Lo Mein with chicken instead of beef is a protein-swap group; the agent debits the beef modifier and credits the chicken modifier. Extra-spicy oil is a kitchen note the POS stores as a free-text field, posted as a special instruction.

Sushi: build-your-own roll

A custom roll is a parent item with three nested groups: protein (1 of N), wrap (1 of 1), add-ons (0 to 4). Each combination has a price delta the agent reads from the POS, not from a guess.

Mexican: no-cilantro + protein sub

An al pastor burrito with chicken instead of pork, no cilantro, extra guac. Three modifiers in three different groups; the agent confirms the protein swap explicitly because it changes the price, confirms “no cilantro” only once because it does not affect the price, and adds guac to the add-on list with the price delta the POS provided.

QSR breakfast: combo upgrades

The caller orders a Lumberjack Slam plus a Coke as two top-level items. The slam has two required modifier groups; the Coke gets remapped at 23.11 seconds of the demo to a soft-drink combo modifier on the slam. The agent says it out loud so the caller hears the remap.

The onboarding work that makes call-time mapping deterministic

The visible part of modifier mapping is what the agent says on the phone. The invisible part is the week before the agent goes live, when PieLine’s team builds the mapping table that turns spoken English into modifier IDs the POS will accept. Three concrete things happen in that week.

First, the public menu is scraped. Every dish, every modifier name as customers see it, every spice level, every protein option. This is the vocabulary the agent has to recognise.

Second, the structured menu is pulled from the POS API. Toast Menus, Square Catalog, Clover Inventory, NCR Aloha menu config, Revel product catalog. This is the structure the agent has to post against. Each modifier from the public menu is paired to a modifier ID from the POS. Conflicts (a menu lists “add cheese” but the POS only has “extra cheese” in a different group, or two modifiers with the same display name) are surfaced and resolved with the operator before any call goes live.
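The two conflict types named above can be found with a simple comparison of names. This is a hedged sketch with made-up data and hypothetical function and key names; it uses exact, case-insensitive matching, whereas real pairing would involve human review.

```python
from collections import Counter


def find_conflicts(public_names, pos_modifiers):
    """Flag pairing problems before go-live.

    pos_modifiers is a list of (modifier_id, display_name) pairs from the POS.
    """
    counts = Counter(name.lower() for _, name in pos_modifiers)
    unmatched = [n for n in public_names if n.lower() not in counts]
    duplicates = sorted(n for n, c in counts.items() if c > 1)
    return {"unmatched": unmatched, "duplicate_display_names": duplicates}


# Made-up example data for illustration.
public_menu = ["add cheese", "sourdough"]
pos_list = [("m1", "extra cheese"), ("m2", "sourdough"), ("m3", "sourdough")]
```

Both lists in the result are items to resolve with the operator, not something the code should settle on its own.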

Third, the agent is monitored on real calls during the first 7 to 14 days. Anything a caller says that the mapping cannot resolve is logged and reviewed. Real callers say things the public menu does not list (“light sauce,” “cut into eight,” “extra hot like last time”); each of those is either mapped to a real POS modifier or given a polite decline in the agent’s vocabulary. After two weeks the open-question queue empties, and the mapping is stable.

Why this matters for the operator

The operator’s real worry about voice AI is not that it will mishear “sourdough.” It is that the AI will silently agree to something the kitchen cannot make, and the customer will arrive at pickup with a complaint. Modifier mapping is the part of the architecture that prevents this. If the agent can only attach modifier IDs the operator already approved, the kitchen never gets a ticket it cannot fulfil. The receipt total comes from the POS. The kitchen ticket comes from the POS. The voice confirmation reads back the total the POS returned. All three artifacts agree because all three are derived from the same record.

That alignment is the difference between an AI phone agent that handles 90% of calls end-to-end (PieLine’s number from Idly Express, a live deployment in Almaden) and a voice bot that creates more reconciliation work than it removes. The line between the two is whether modifiers are mapped to real POS IDs or generated from an LLM’s guess about what a menu probably contains.


Modifier mapping FAQ

How does an AI phone agent map modifiers to my POS?

During onboarding the agent ingests every modifier group, modifier, item, required/optional flag, and price delta from your POS (Toast, Square, Clover, NCR Aloha, Revel). At call time it does not generate modifiers from the LLM's imagination; it resolves the caller's spoken words against that ingested list, attaches the resulting modifier IDs to a cart object, and posts the cart to the POS's native order endpoint. The POS validates and accepts (or rejects) the cart. The agent then reads back the total the POS returned. Anything the caller asks for that is not in the modifier list either gets remapped to the closest real option or surfaced as 'I'm sorry, that is not on the menu' rather than invented.

What is a 'modifier group' and why does it matter for voice ordering?

A modifier group is the POS's bracket around choices that belong together: 'Bread choice' with min_selections=1 and max_selections=1, or 'Toppings' with min=0 and max=5. The required flag is what tells the agent it has to ask before it can post the order. The min/max are what tell the agent whether to keep enumerating after the caller's first answer. On the public PieLine demo at aiphoneordering.com, the Lumberjack Slam has two such required groups, which is why the agent at 15.98 seconds does not just submit the order; it asks 'how would you like your eggs cooked' and at 19.18 seconds 'white, brown, multigrain, or sourdough?'. Both of those are required modifier groups the operator configured in the POS, and the agent surfaces them automatically.
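The role of min and max can be sketched as a three-way decision per group. The class, function, and return labels below are hypothetical; only the field names `min_selections` and `max_selections` and the two example groups come from the answer above.

```python
from dataclasses import dataclass


@dataclass
class ModifierGroup:
    name: str
    min_selections: int
    max_selections: int


def next_step(group, selected_count):
    """Decide what the agent does next for one modifier group."""
    if selected_count < group.min_selections:
        return "must_ask"        # required selection missing; cart cannot post
    if selected_count < group.max_selections:
        return "may_offer_more"  # optional room left, e.g. more toppings
    return "complete"            # group is full; move on


bread = ModifierGroup("Bread choice", min_selections=1, max_selections=1)
toppings = ModifierGroup("Toppings", min_selections=0, max_selections=5)
```

With these rules a bread choice stops after one answer, while a toppings group stays open until the caller is done or the maximum is reached.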

How are 'free-form' caller requests handled, like 'can you add strawberries if that's an option'?

The same way as enumerated requests: they get resolved against the POS modifier list. At 65.98 seconds in the demo audio the caller asks 'can you add strawberries, if that's an option?'. The agent does not invent a price or a new SKU. It checks whether the cheesecake item has a strawberry modifier in any of its modifier groups, finds mod_topping_strawberry, and at 71.34 seconds confirms 'You got it. One slice of New York style cheesecake with strawberry topping.' The word 'topping' is the modifier group's display name from the POS, which is why the kitchen ticket and the spoken confirmation say the same thing. If strawberries had not existed in the POS, the agent would have offered the closest existing topping rather than promise a strawberry the kitchen does not carry.

What happens when the caller says something the modifier list does not contain?

The agent works through three steps, in order. (1) It tries a fuzzy match against the modifier names the operator gave it ('coke' resolves to 'soft drink' if Coke is configured as a soft-drink modifier or item, which is exactly what happens at 23.11 seconds in the demo). (2) If there is no match, it offers the closest real option ('we don't have strawberry, we do have raspberry coulis or chocolate sauce'). (3) If neither works, it declines and moves on without putting an unmapped string into the cart. The architecture forbids the agent from making up modifier IDs, so an unmappable request never reaches the POS.
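The three steps can be sketched with Python's standard `difflib` module. This is an assumption-laden illustration, not PieLine's matcher: the alias table, the 0.75 cutoff, the `mod_combo_soft_drink` ID, and the return labels are hypothetical, and step 2 is simplified to offering every real option on the item rather than ranking them.

```python
import difflib


def resolve(spoken, modifiers, aliases=None, cutoff=0.75):
    """Resolve a spoken request against one item's modifiers.

    modifiers maps display name -> POS modifier ID.
    aliases maps an operator-approved spoken alias -> POS modifier ID.
    """
    word = spoken.strip().lower()
    # Step 1a: operator-approved alias, e.g. "coke" for a soft-drink modifier.
    if aliases and word in aliases:
        return ("mapped", aliases[word])
    # Step 1b: fuzzy match on the display names the POS provided.
    names = {name.lower(): mod_id for name, mod_id in modifiers.items()}
    match = difflib.get_close_matches(word, list(names), n=1, cutoff=cutoff)
    if match:
        return ("mapped", names[match[0]])
    # Step 2: no match, so offer the real options instead.
    if names:
        return ("offer", sorted(modifiers))
    # Step 3: nothing to offer; decline without touching the cart.
    return ("decline", None)
```

Every branch returns either an ID that already exists in the POS list or no ID at all, so nothing unmapped can be attached to the cart.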

Why post the whole cart to the POS instead of letting the AI track quantities and prices internally?

Two reasons. First, the POS is the source of truth for prices, modifier price deltas, taxes, surcharges, delivery fees, and item availability. An LLM that tries to track these in its own state ends up out of sync with the POS within hours; an LLM that posts the cart and reads back the response is correct by construction. Second, the kitchen ticket and the receipt both come from the POS. If the cart was never written to the POS, those two artifacts diverge from what the customer heard, and reconciliation becomes manual work for a manager. PieLine's architecture posts the cart and reads back the total the POS returned. The 2.4 second gap between 'Placing your order now' at 89.12 seconds and 'Done. Your total is $34.11' at 91.52 seconds in the demo is the round trip to the POS.
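The post-then-read-back pattern can be sketched as follows. Everything here is hypothetical scaffolding: `post_cart` stands in for whichever POS order endpoint is in use, the response keys `status` and `total` are assumed, and the apology string is invented for illustration. The one fixed idea is that the spoken total is taken from the POS response and never computed locally.

```python
from decimal import Decimal


def place_order(cart, post_cart):
    """Post the cart, then build the read-back from the POS response only."""
    response = post_cart(cart)
    if response.get("status") != "accepted":
        return "I'm sorry, I couldn't place that order."
    total = Decimal(response["total"])
    return f"Done. Your total is ${total:.2f}"


def fake_pos_accept(cart):
    """Stand-in for a real POS order endpoint that accepts the cart."""
    return {"status": "accepted", "total": "34.11"}


def fake_pos_reject(cart):
    """Stand-in for a POS that rejects the cart."""
    return {"status": "rejected"}


# The round trip quoted above, from the two demo timestamps.
ROUND_TRIP_SECONDS = Decimal("91.52") - Decimal("89.12")
```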

How is the modifier mapping built during onboarding without breaking the restaurant?

PieLine's onboarding team scrapes the public menu, then pulls the structured menu from the POS API (Toast Menus API, Square Catalog API, Clover Inventory API, etc.). Each spoken modifier name from the public menu is paired to a POS modifier ID. Conflicts (a public menu lists 'add cheese' but the POS only has 'extra cheese' under a different modifier group) are flagged and resolved with the operator. Once the mapping is signed off, calls go live. During the first week of monitoring, real call transcripts surface any request the mapping could not resolve; those get added to the mapping or escalated to the operator. After about 7 to 14 days the mapping is stable and the queue empties.

What about cuisine-specific modifiers like half-and-half pizza, jain prep, or sushi build-your-own?

These are stored as nested or scoped modifier groups in the POS, and the agent reads them the same way it reads simple ones. Half-and-half pizza is one item with side-scoped modifier groups (LEFT, RIGHT) and per-side toppings; the kitchen ticket prints two lanes. Jain prep is a single tag in the agent's vocabulary that fans out to three modifier IDs (no onion, no garlic, no root vegetables). A build-your-own sushi roll is a parent item with three nested groups. The mapping work happens once during onboarding; on calls, the agent enumerates only the required groups and infers the rest from the caller's words.

Where can I verify the timestamps and modifier mappings cited on this page?

Open src/components/voice-activity-data.ts in the PieLine repo (mediar-ai/pieline-phones on GitHub). The file contains the Deepgram multichannel transcription of public/audio/dennys-order.mp3 with start and end fields per caption. Search for 'how would you like your eggs cooked' (15.98 seconds), 'white, brown, multigrain, or sourdough' (19.18 seconds), 'can you add strawberries' (65.98 seconds), and 'New York style cheesecake with strawberry topping' (71.34 seconds). The audio file is the same one the live demo plays at aiphoneordering.com, so the modifier choreography described above is reproducible without booking a call.
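The lookup itself is a substring search over captions. The sketch below mirrors it in Python with a hypothetical caption shape; the three start times and phrases are the ones quoted in this article, and the end fields are left out because their values are not quoted here.

```python
# Hypothetical caption shape; the real file is TypeScript and also has end fields.
CAPTIONS = [
    {"start": 15.98, "text": "how would you like your eggs cooked"},
    {"start": 19.18, "text": "white, brown, multigrain, or sourdough"},
    {"start": 65.98, "text": "can you add strawberries"},
]


def find_start(captions, phrase):
    """Return the start time of the first caption containing phrase, else None."""
    needle = phrase.lower()
    for caption in captions:
        if needle in caption["text"].lower():
            return caption["start"]
    return None
```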