Last updated: 30 April 2026
A mid-sized city museum we spoke with last quarter got three quotes from AI audio guide vendors. The numbers spread across roughly an order of magnitude — a low-thousands annual figure, a mid five-figure annual figure, and a "let's talk" that turned into a much larger number once the contract draft arrived. Same museum. Same scope.
The director asked us what was going on. The short answer: each vendor was pricing a different product shape, even though they were all selling "AI audio guides." One was a per-interaction model that billed on actual usage. One was a flat annual SaaS with a tier based on visitor count. One was a legacy audio-guide company that bolted AI onto the top of their old content-production workflow and kept charging like they were still recording voice actors in a studio.
So before anyone can tell you what an AI audio guide costs, you have to know which of those three things you're buying.
A note on numbers and currency. Throughout this piece, ranges are given in euros and reflect publicly listed vendor pricing (where available) plus what we've seen in quotes museums have shared with us. They are illustrative industry ranges, not Musa's published prices — Musa pricing is bespoke per museum. Where a vendor publishes in another currency (e.g. Pathoura in GBP), their listed currency is preserved. If a vendor's quote falls well outside these ranges, that is a signal worth interrogating, not a verdict.
The three pricing shapes
AI audio guide pricing collapses to three structures. Names vary by vendor. The mechanics don't.
Per-interaction (usage-based). You pay a small amount each time a visitor starts a session or triggers AI generation. We typically see this priced in the low tens of cents per session. No upfront fee. No minimum. If nobody uses the guide one month, you pay nothing that month. If adoption spikes during a summer blockbuster, your bill goes up, but only because more visitors are actually using the guide.
Revenue share on paid guides. Visitors pay a small ticket (often €2-€6, in line with the standard self-guided audio-tour price point) through the vendor's platform. The vendor takes a cut and the museum keeps the rest. Splits vary widely: VoiceMap, for example, pays creators a 50% or 65% royalty depending on plan, while other museum-focused platforms keep a smaller, undisclosed fee per sale. A 20-40% vendor share on the museum side is a reasonable industry-typical band to use when modelling. Zero cost to the museum if nobody buys. The vendor earns only when visitors find the guide valuable enough to pay for it.
Flat monthly SaaS. A predictable subscription. Public pricing from one transparent vendor, Audio-Cult, starts at €248 per month for one tour (around €3,000/year), with extra tours from €49 each. Pathoura's smartphone plans start at £30/month for the Flexi 50 tier, with extra languages billed at £10/month and AI narration credits at £18 per 60 minutes — a few hundred pounds a year for a small site. Larger or more bundled plans climb from there into the low thousands per month at the enterprise tier, sometimes with extra fees for languages or analytics. The bill is the same whether fifty visitors use the guide or fifty thousand.
Each of these can be "AI-powered" on the product page. The economics underneath are completely different.
It is also worth noting two outliers that distort the "what does this cost" question entirely. Bloomberg Connects is funded by Bloomberg Philanthropies and is free for participating institutions and visitors. izi.TRAVEL operates a marketplace model where institutions can publish guides without a museum-side licensing fee. Both shift the cost question from "how much do I pay the vendor" to "what control, branding, and integrations am I willing to give up."
What the numbers actually look like
Below are illustrative ranges by museum size, calibrated against publicly listed vendor pricing and the quotes museums have shared with us. They are intended as an order-of-magnitude sanity check, not a price book — and they are not Musa's published rates.
Small museum (under 50,000 annual visitors). Per-interaction: low four figures per year is the realistic ceiling at modest adoption. Revenue share: often net positive — visitor payments minus the vendor cut typically exceed the museum's costs to operate the guide. Flat SaaS: the publicly listed entry plans we've seen put small-museum SaaS in the €2,500-€6,000/year band before any content production add-ons. At this size, flat SaaS often costs more than you'd ever recoup in value. Usage-based almost always wins.
Mid-sized museum (50,000-300,000 annual visitors). Per-interaction: realistically in the €3,000-€18,000/year band, scaling with adoption. Revenue share: highly variable. As an illustration only, a museum charging €3 with ~15% adoption and a 30% vendor cut would net roughly €47,000 a year from a 150,000-visitor pool (22,500 paid sessions, €67,500 gross). Flat SaaS: typically €10,000-€22,000/year at this scale, based on quotes museums have shared. This is the band where vendors fight hardest, and where the pricing model matters most. A mid-sized museum that misjudges adoption on a flat plan can overpay by 2-3x.
Large museum (300,000+ annual visitors). Per-interaction: low to high five figures per year. Revenue share at a major site can run into six figures net to the museum if visitor uptake is strong. Flat SaaS: enterprise tiers commonly land in the €25,000-€90,000/year range based on quotes we've reviewed, sometimes higher with custom integrations. At this scale the flat plan starts to make mathematical sense if your adoption is reliable, because your per-session cost drops below what per-interaction would bill. But you're buying cost predictability at the price of the alignment between what you pay and what visitors actually use.
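The mid-sized revenue-share illustration can be worked through step by step. The sketch below is only arithmetic on the figures quoted in this piece (150,000 visitors, ~15% adoption, a €3 ticket, and the 20-40% vendor-share band); the function name and the returned field names are ours, not any vendor's.

```python
def revenue_share_net(visitors: int, adoption: float, ticket_price: float,
                      vendor_share: float) -> dict:
    """Split gross guide revenue between vendor and museum (illustrative only)."""
    paid_sessions = round(visitors * adoption)      # visitors who buy the guide
    gross = paid_sessions * ticket_price            # total paid by visitors
    vendor_cut = round(gross * vendor_share, 2)     # vendor's share of gross
    return {
        "paid_sessions": paid_sessions,
        "gross": gross,
        "vendor_cut": vendor_cut,
        "museum_net": round(gross - vendor_cut, 2),
    }


# Sweep the vendor share across the 20-40% modelling band.
for share in (0.20, 0.30, 0.40):
    result = revenue_share_net(150_000, 0.15, 3.0, share)
    print(f"vendor share {share:.0%}: museum nets EUR {result['museum_net']:,.0f}")
```

At a 30% vendor share this gives 22,500 paid sessions, €67,500 gross, and €47,250 net to the museum. Adoption is the most sensitive input, so it is worth rerunning with a pessimistic figure before relying on the result.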
The pattern is consistent. At low volumes, flat-fee pricing punishes you. At high volumes with proven adoption, it starts to pay off. In the middle, which is where most museums sit, usage-based pricing keeps the risk on the vendor's side of the table, which is where it belongs when nobody knows yet whether the guide will land.
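One way to locate the crossover is to compute the number of sessions at which a flat plan and a per-interaction plan cost the same. The sketch below assumes a €15,000/year flat fee (inside the mid-sized band above) and €0.20 per session (our reading of "low tens of cents"); both are hypothetical inputs, not quotes from any vendor.

```python
import math


def break_even_sessions(flat_annual_eur: float, price_per_session_eur: float) -> int:
    """Sessions per year at which per-interaction billing equals the flat fee."""
    # Work in whole cents so the division is not skewed by float rounding.
    flat_cents = round(flat_annual_eur * 100)
    price_cents = round(price_per_session_eur * 100)
    return math.ceil(flat_cents / price_cents)


def visitors_needed(sessions: int, adoption_pct: int) -> int:
    """Annual visitors required to produce `sessions` at a given adoption rate."""
    return math.ceil(sessions * 100 / adoption_pct)


def effective_cost_per_session(flat_annual_eur: float, sessions: int) -> float:
    """What each session really costs on a flat plan."""
    return flat_annual_eur / sessions


sessions = break_even_sessions(15_000, 0.20)   # hypothetical flat fee and session price
print("break-even sessions:", sessions)
print("visitors needed at 15% adoption:", visitors_needed(sessions, 15))
print("flat-plan cost per session at 25,000 sessions:",
      round(effective_cost_per_session(15_000, 25_000), 2))
```

Under these assumptions the flat plan only breaks even at 75,000 sessions a year, which at 15% adoption means about 500,000 visitors. Below that, each session on the flat plan costs more than the per-interaction rate.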
When each model actually fits
Flat SaaS is the right choice in a narrow set of cases. You have predictable, high volume. You've run a guide before and know your adoption rate within a few percentage points. Your finance team won't approve a variable line item, full stop. If all three apply, a flat plan is fine. Otherwise it's a bet the vendor has designed to win.
Per-interaction pricing fits everyone else. It scales with your reality. A slow February costs you less. A busy summer costs more, but it also means the guide is doing its job. If a new exhibition flops with visitors, you don't keep paying for a guide nobody opened. The cost curve tracks the value curve.
Revenue share is the right call when you want the vendor to have actual skin in the game. If they earn only when visitors buy the guide, they will help you with onboarding, signage, staff scripts, and anything else that drives adoption. A vendor on a flat plan has already been paid. A vendor on revenue share is still selling, every day, alongside you. That's a different working relationship.
We lean hard toward usage-based or revenue-share models for most situations for exactly this reason. The incentive alignment is worth more than the paper savings of a flat deal that looked cheaper on the spreadsheet.
The hidden costs vendors don't lead with
The headline number on the quote is rarely the number you pay. Here is what gets buried.
Setup and onboarding fees. Some vendors charge a four- or five-figure fee to "set up" the guide. When you ask what that covers, it's often content ingestion (your catalog is uploaded into their system) and a kickoff workshop. With an AI-native platform, this work is automated or takes an hour. The fee is a holdover from when human producers wrote scripts for each stop. Push back on it. If the vendor can't itemize the hours, it's padding.
Content migration. If you have an existing audio guide — old recordings, transcripts, wall text in a CMS — someone has to move that content into the new system. Some vendors do this free. Some charge per stop. For context, Pathoura's Launch Assist add-on is listed at £5 per object per language (covering descriptions, scripts, photos, and narration or translation); outsourced content production for a full tour typically costs €400-€2,000 per tour and language when human writers, voice actors, and engineers are involved. Translate that into a per-stop figure across a 60-stop museum and you can see how the "services" line of a contract balloons.
Language fees. This one still surprises people. A vendor quotes a flat monthly figure, then mentions that price includes three languages, with each additional language carrying its own monthly add-on (around ten euros or pounds per language per month at smaller vendors, where Pathoura, for instance, lists extra languages at £10/month, and more at enterprise vendors). A museum that wants ten languages for international visitors can double its bill before anyone presses play. AI-generated translation does not need separate production runs per language, so this fee is legacy pricing logic applied to a new product.
Integrations and SSO. Ticketing integration, CRM sync, analytics export, SSO for staff accounts. Each one tends to show up either as a one-time fee or a monthly add-on. If you need the guide to talk to your ticketing system for admission bundling, budget for it explicitly before signing.
Device or kiosk hardware. AI audio guides are BYOD almost by definition, but some vendors still push loaner tablets or charging stations for accessibility compliance. Traditional hardware-led systems remain expensive — a 100-device traditional setup with content production lands around £9,000-£10,000 just to launch, before annual maintenance. If hardware is a hard requirement on your side, fine — but it should be optional and clearly priced.
Minimum commitments and auto-renewal. Read the term carefully. A flat SaaS plan with a multi-year minimum and auto-renewal isn't a subscription. It's a capital expenditure disguised as an operating one. If the guide underperforms, you're locked in anyway. Usage-based contracts rarely have this problem because the vendor's incentive is to keep you using, not to trap you.
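To see how these line items stack up, it can help to model a quote as a headline fee plus its add-ons. The sketch below is a hypothetical plan in currency-agnostic units: the 70/month headline fee is invented for illustration, while the three included languages, the 10/month language add-on, the 5 per object per language content fee, the 60 stops, and the ten languages echo figures mentioned above. The class and field names are ours.

```python
from dataclasses import dataclass


@dataclass
class QuoteExtras:
    """Hypothetical line items that sit beside the headline fee."""
    monthly_fee: float                    # headline subscription
    included_languages: int               # languages covered by the headline fee
    fee_per_extra_language: float         # monthly add-on per additional language
    setup_fee: float = 0.0                # one-time onboarding charge
    content_fee_per_object: float = 0.0   # charged per object, per language

    def first_year_cost(self, languages: int, objects: int) -> float:
        extra_languages = max(0, languages - self.included_languages)
        subscription = 12 * (self.monthly_fee
                             + extra_languages * self.fee_per_extra_language)
        content = self.content_fee_per_object * objects * languages
        return subscription + self.setup_fee + content


def per_stop_cost(tour_cost: float, stops: int) -> float:
    """Spread a per-tour, per-language production cost across its stops."""
    return round(tour_cost / stops, 2)


quote = QuoteExtras(monthly_fee=70.0, included_languages=3,
                    fee_per_extra_language=10.0, content_fee_per_object=5.0)
print("headline first year:", 12 * quote.monthly_fee)
print("with 10 languages, 60 objects:", quote.first_year_cost(languages=10, objects=60))
print("per-stop range for a 60-stop tour:",
      per_stop_cost(400, 60), "to", per_stop_cost(2000, 60))
```

With these inputs the seven extra languages alone double the monthly subscription, and the content line adds more than the subscription itself in the first year. Swapping in the figures from a real quote shows which line item dominates.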
Why the zero-capex case keeps winning
Strip away the pricing-page poetry and there's a structural argument underneath all of this.
The old audio guide business required a huge upfront investment: scripts, studios, voice actors, devices. Vendors charged upfront because they spent upfront. That logic is gone. AI generation happens at runtime, per visitor, at a compute cost that's measured in fractions of a cent. The marginal cost of one more session is close to zero. The marginal cost of one more language is close to zero. The marginal cost of updating the whole tour because you rehung a gallery is close to zero.
When the underlying costs look like that, pricing should too. Usage-based or revenue-share reflects the actual shape of what's being delivered. A flat fee plus setup fee plus per-language fee is a pricing structure designed for a product that no longer exists.
The museums that benefit most from AI audio guides are the ones that couldn't afford the old model: small sites, regional collections, heritage locations, community museums. These are the institutions that most need the guide to cost zero until visitors actually use it. That's not a nice-to-have. It's the difference between launching and not launching at all.
We've watched boards approve modest, low-four-figure usage-based deals that would never have cleared approval as five-figure capital projects. The five-year spend can be the same order of magnitude. The political and financial reality of getting it started is completely different.
What to ask every vendor before signing
When you're comparing AI audio guide quotes, don't compare the headline number. Compare the shape.
- What's the monthly cost at zero usage? If it's not zero, you're on a flat plan regardless of what the vendor calls it.
- What's the cost per additional language? If there is one, walk.
- What's the setup fee, and what specifically does it cover in hours?
- What's the minimum contract length? What happens if we cancel at month three?
- If we add a temporary exhibition with 15 new stops, what does that cost?
- What's the cost curve if our adoption doubles? Triples? Halves?
- Are analytics, accessibility features, and staff accounts included, or priced separately?
The answers will sort the vendors faster than any feature comparison. A vendor who gives straight, itemized answers is one you can build a long-term relationship with. A vendor who deflects into "let's schedule a call to discuss your needs" is one whose pricing model probably can't survive direct scrutiny.
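The zero-usage and adoption questions above can be turned into a small comparison once a vendor has answered them. The sketch below reduces each quote to a fixed annual amount plus a per-session amount, which is our simplification. The two example quotes (€15,000 flat, €0.20 per session) and the 22,500-session baseline are hypothetical inputs.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    name: str
    fixed_annual: float   # paid even if nobody opens the guide
    per_session: float    # usage-based component

    def annual_cost(self, sessions: int) -> float:
        return round(self.fixed_annual + self.per_session * sessions, 2)


def cost_curve(quote: Quote, baseline_sessions: int,
               multipliers=(0, 0.5, 1, 2, 3)) -> dict:
    """Annual cost at zero usage, half, expected, double and triple adoption."""
    return {m: quote.annual_cost(int(baseline_sessions * m)) for m in multipliers}


quotes = [
    Quote("usage-based", fixed_annual=0.0, per_session=0.20),
    Quote("flat SaaS", fixed_annual=15_000.0, per_session=0.0),
]
for q in quotes:
    print(q.name, cost_curve(q, baseline_sessions=22_500))
```

The entry for multiplier 0 answers the first question on the list directly: any quote whose curve does not start at zero behaves like a flat plan, whatever it is called.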
Where we land
If you're running procurement for an AI audio guide right now, our honest advice is to start with a usage-based or revenue-share model unless you have a specific reason not to. Put the vendor on the hook for adoption. Keep your downside small. Measure what happens for three to six months. If the guide lands, you can consider moving to a flat plan for cost predictability once your usage is known. If it doesn't land, you walk away having spent a few thousand, not a few hundred thousand.
This is the philosophy behind how we price Musa — usage-aligned, with no per-language fee and no minimum term — because it's the model that actually reflects what AI-generated content costs to deliver. (Specific Musa pricing is bespoke per museum and is shared during the procurement conversation.) Other vendors price this way too. Our argument isn't "pick us." It's "pick the shape that aligns your vendor's incentives with your visitors' experience," and most of the time that rules out the flat-SaaS-with-hidden-fees quote before you even get to feature comparison.
For broader context on how audio guide pricing works across all formats, see our piece on audio guide pricing models. If you want the full multi-year cost picture including hardware and BYOD alternatives, the total cost of ownership breakdown has the five-year math. And if you're specifically weighing revenue-share deals, we wrote about how to structure them in audio guide revenue share models.
The vendor quote on your desk right now probably isn't wrong. It's just one of several valid shapes. Make sure you're choosing the shape on purpose.