Most museums approach audio guides with the same playbook. Hire a producer, script every stop, record the narration, deploy the hardware, move on. Some variation of this has been standard practice for decades.
It also produces mediocre results at most institutions. Not because anyone's acting in bad faith, but because the inherited assumptions about how audio guides should work are wrong in specific, fixable ways.
We've worked with enough museums to see the same mistakes come up repeatedly. Not at one or two places — across the board, regardless of size or budget. Here are the patterns that cause the most damage, and what to do instead.
Trying to cover every object
The instinct is understandable. You have 400 objects in your collection, so you want 400 audio guide stops. Completeness feels like quality.
It's the opposite. A guide with 400 shallow stops is worse than a guide with 25 deep ones. Visitors don't walk through a museum checking items off a list. They spend time with the things that interest them and walk past the rest. A guide that says something interesting about 25 objects gives those visitors something worth stopping for. A guide that recites one generic paragraph about each of 400 objects gives nobody a reason to keep listening.
The math on production supports this too. Writing, recording, and translating 400 stops takes enormous effort and budget. Writing 25 great stops takes a fraction of that, and you can actually afford to make each one good — proper research, thoughtful scripting, interesting angles beyond the wall label.
Start focused. Twenty-five stops, done well. With AI-based systems, expanding later is cheap. Adding a new stop to Musa takes minutes, and it automatically speaks in the voice and style you've already established. The initial 25 create the scaffolding; new content inherits it. But even with traditional guides, a tight selection outperforms an exhaustive one.
Writing catalog entries instead of conversations
Museum audio guides are written by people who know how to write about art. That's the problem. Curators write like curators. The result is wall labels you listen to instead of read — provenance details, art historical context, formal analysis. Technically accurate, tonally dead.
Visitors are standing in a room, probably with someone else, possibly tired, definitely distracted by everything around them. They don't want a lecture. They want someone to tell them something interesting about the thing they're looking at.
The difference is the difference between "This oil-on-canvas work, executed in 1887, represents the artist's transition to a post-Impressionist palette" and "Look at the sky in this painting. See how the blue isn't quite blue? Van Gogh mixed in green and yellow because he was painting what the sky felt like at that moment, not what color it technically was."
Same information. One sounds like a textbook. The other sounds like a person standing next to you.
Write for ears, not eyes. Short sentences. Direct address. Questions that make visitors look more closely. If your script would work as a museum wall panel, it's wrong for audio. Audio is intimate — someone's voice inside your head. Use that.
Ignoring the adoption funnel
This is the one that wastes the most money. Museums spend months and tens of thousands building a guide, then put a small sign near the coat check and hope for the best.
The adoption funnel has four stages: awareness (do visitors know the guide exists?), access (can they get to it easily?), setup (is the onboarding smooth?), and content (is it good enough to keep going?). Most museums pour everything into content and skip the first three stages.
We've seen museums with excellent guides running at 2-3% adoption because visitors literally didn't know the guide was available. No mention on the website. No prominent signage. Front desk staff who'd never tried it themselves. The guide might as well not exist.
The fix is embarrassingly simple. Put it on your website where people actually plan visits. Put a real sign at the entrance — not an A4 printout, a proper banner. Train your front desk to mention it in every greeting. One sentence: "We have a free audio guide, just scan that code." That's five seconds of effort per visitor, and we've watched it double adoption rates at multiple institutions.
The best guide in the world is worthless if only 2% of your visitors find it.
One-size-fits-all content
A retired art history professor and a family with two kids under ten walk into the same gallery. They both press play on stop 7. They hear the same 90-second recording.
The professor finds it shallow. The kids are bored thirty seconds in. The parents feel guilty for subjecting their children to it. Nobody had a good experience, and the guide was technically fine.
This is the fundamental limitation of recorded audio guides. One script, one delivery, one depth level. Museums know their visitors are diverse — they just accept that the guide can only serve one slice of that diversity. Usually the middle: educated adult, some interest in art, no specialist knowledge. Everyone else gets an experience that wasn't designed for them.
AI changes this completely. A system like Musa delivers different content to different visitors based on how they interact. Someone who asks deep questions gets deep answers. Someone clicking through quickly gets shorter, punchier introductions. A child hears different language than an adult. The content adapts because the system responds to the visitor rather than playing a recording.
This isn't a nice-to-have. It's the difference between a guide that works for 30% of your audience and one that works for 80%.
Set-and-forget content
Here's what happens after a traditional audio guide launches. For the first few months, it's current and accurate. Then the museum rearranges a gallery. Adds a temporary exhibition. Deaccessions a piece. Loans something to another institution. New research changes the interpretation of a key work.
The guide doesn't change. Updating it means re-scripting, re-recording, re-translating. That costs real money — often enough to make the museum decide it's not worth it. So the guide stays frozen. A year later, it's describing objects that have moved. Two years later, it's missing your best recent acquisitions. Three years later, visitors are being directed to galleries that no longer exist.
Content rot is the default outcome of any system where updates are expensive. It's invisible at first — the guide still works, it just becomes slightly less accurate and relevant every month until one day it's actively misleading.
The museums that keep their guides current are the ones where updating is cheap and fast. With Musa, updating a stop takes minutes. Adding a new one is just loading the content — the system handles scripting, voice, and translation automatically. No production cycle, no re-recording, no waiting months for a vendor. The guide stays alive because keeping it alive is trivial.
If your current system makes updates expensive, you're not going to update. Plan accordingly.
Not involving front desk staff
Your front desk is the most powerful distribution channel you have. Every visitor passes through it. Most visitors have a brief interaction with the person behind the counter. That interaction is the single best opportunity to drive audio guide adoption.
Yet front desk staff are consistently the last people trained on the audio guide. Often they're not trained at all. They know the guide exists the way they know the fire extinguisher exists — it's there, they've seen it, they couldn't tell you how it works.
This isn't a technology problem. It's a people problem. Front desk staff need three things: they need to have used the guide themselves (even once, for ten minutes), they need a one-sentence pitch that fits naturally into the greeting, and they need to believe it's worth mentioning. The third one comes from the first. People who've experienced the guide recommend it. People who've only heard about it in a staff meeting don't.
Block an hour. Have every visitor-facing employee walk through the guide as a visitor would. Then put the mention into the standard greeting script. That's it. This alone moves the needle more than any signage redesign or website update.
Comparing to human guides instead of no guide
Museums evaluate audio guides by comparing them to their best docents. "The audio guide can't do what Sarah does on her Thursday afternoon tour." No, it can't. Sarah is a person who reads the room, adjusts her pace, tells jokes, and has forty years of knowledge.
But Sarah leads one tour a day for 25 people. Your museum sees 800 visitors on a Saturday. What interpretation do the other 775 get?
Nothing. They get wall labels.
The right comparison isn't audio guide versus your best docent. It's audio guide versus no guide at all. The vast majority of museum visitors experience a collection with zero interpretation beyond what's printed on the wall. A good audio guide doesn't need to replace Sarah — it needs to give the other 97% of visitors something closer to what Sarah provides than what a wall label provides.
This reframing matters for decision-making. When the bar is "better than our docent program," almost any audio guide feels inadequate. When the bar is "better than nothing for most visitors," the value proposition is obvious and the design priorities shift entirely.
Over-investing upfront
The traditional procurement model for audio guides looks like this: write an RFP, evaluate vendors, sign a multi-year contract, spend six to twelve months in content production, launch, and then live with the result for five to ten years.
This front-loads all the risk. You're making the biggest decisions — which vendor, what content, how many languages, which stops — before you have any data on how visitors will actually use the guide. You're spending the most money at the point of maximum ignorance.
Then you're locked in. Switching vendors means writing off the upfront investment. Changing content means another production cycle. The sunk cost fallacy kicks in and museums stick with underperforming guides for years because they can't stomach abandoning the investment.
The alternative is to start small and iterate. Launch with a focused pilot — one gallery, one tour, a few languages. See what visitors actually do. Which stops do they listen to? Where do they drop off? What questions do they ask? Use that data to expand. Add stops that visitors clearly want. Drop the ones nobody uses. Test different content styles.
This only works with systems where expansion is cheap. It's impractical with traditional recorded guides; you can't book the narrator and a studio every time you want three new stops. But with AI-generated guides, adding content is a marginal cost, not a new production cycle. You can launch in weeks instead of months and improve continuously instead of hoping you got it right the first time.
Not collecting or using data
Most traditional audio guides are black boxes. Visitors press play, listen, and return the device. The museum knows how many devices were checked out. That's about it.
This is a waste of a direct channel to your visitors. An audio guide — especially a digital one — can tell you which stops people spend the most time on, where they drop off, what languages they prefer, which galleries get skipped, what questions they ask, and what topics they're curious about that your guide doesn't cover.
That data is gold for museum operations. If everyone skips the stop about Etruscan pottery, maybe the content needs reworking. If 30% of visitors are using the guide in Spanish, maybe your Spanish-language programming is underserving a real audience. If people keep asking about the building's architecture but there's no stop for it, that's a content gap you didn't know existed.
With conversational AI guides, the data goes deeper. Musa can map visitor interests, identify curiosity patterns, and surface what topics generate the most engagement. A museum can see not just that people visited the Impressionist gallery, but that they were particularly interested in color theory and kept asking about the techniques used. That's useful intelligence for exhibition planning, gift shop stocking, and programming decisions.
Collecting this data isn't optional anymore. It's the difference between running your audio guide on gut feeling and running it on evidence.
The common thread
Every mistake on this list shares a root cause: treating the audio guide as a finished product rather than a living system. Build it, ship it, move on.
Audio guides that work — the ones with real adoption, sustained engagement, and operational value — are the ones that get attention after launch. Updated content. Trained staff. Data-driven iteration. Steady improvement.
The museums doing this well aren't the ones with the biggest budgets. They're the ones that chose systems flexible enough to evolve and teams willing to keep improving. That's the real lesson from the field.
If any of these patterns sound familiar, we'd be glad to talk through what we've seen work.