Frequently Asked Questions

Auvi — Live Transcription & Translation


About Auvi

Auvi is made by Creative Machines Limited, a company incorporated in Hong Kong SAR. The company was founded by a British developer based in Hong Kong, on the doorstep of Shenzhen, the world’s electronics capital. Auvi was born from a viral Raspberry Pi project that brought live captions to a family member. Thousands of people asked for a proper app, so we built one.

We believe accessibility should not be expensive, and your conversations should stay yours. That’s why Auvi processes everything on your device with no cloud, no accounts, and no data collected.

Getting Started

What is Auvi?

Auvi is a real-time transcription and translation app for iPhone and iPad. It listens to conversations and displays colour-coded live captions, with each speaker automatically assigned their own colour. All processing happens entirely on your device — no internet connection, no cloud, no accounts required.

People use Auvi for all kinds of reasons: following conversations in noisy environments, captioning meetings and lectures, translating while travelling abroad, keeping a record of important conversations, or as a daily accessibility tool. We built it for everyone.

What devices does Auvi work on?

Auvi requires an iPhone or iPad running iOS 17 or later. An Apple Watch companion app is also available for watchOS 10 or later, which streams live captions to your wrist from your iPhone.

Transcription quality and speed can vary depending on your device. Newer devices with more processing power produce faster, more accurate results. On devices running iOS 26 or later, Auvi can use Apple’s built-in speech recognition framework, which may already be installed on your device and uses zero app memory.

Do I need an internet connection?

Only for the initial setup on older versions of iOS, when the app may need to download speech recognition models. After that, Auvi works completely offline — in airplane mode, underground, in hospital wards, anywhere.

On newer versions of iOS (26 and later), the required models may already be present on your device, so no download may be needed at all — you can start transcribing immediately.

How much storage space does Auvi need?

On older versions of iOS, a one-time model download of approximately 700 MB is required. The app will prompt you before it begins, recommending you connect to Wi-Fi. If the download is interrupted, it resumes automatically.

On iOS 26 and later, Auvi can use speech models already installed on your device, potentially skipping this download entirely. The app itself is small; the storage requirement is almost entirely the speech recognition models.

Is there a free trial?

Yes. Auvi includes a 7-day free trial with full access to every feature, including Plus and Translation. No account is required and no credit card is needed. After the trial, you can purchase Standard (a one-time payment for core features) or subscribe to Plus and/or Translation as optional add-ons.


Features

How does speaker identification work?

Auvi uses on-device speaker identification to distinguish between different voices. Each speaker is automatically assigned a colour — no setup, labelling, or voice enrolment required. Colours appear shortly after the text, as the identification engine processes audio in short chunks behind the scenes.

In longer conversations, Auvi maintains speaker consistency so the same person always appears in the same colour. With Plus, you can create voice profiles that remember speakers across sessions — so your colleagues, family, or friends are recognised automatically next time.

What languages does Auvi support?

Auvi can transcribe speech in over 40 languages. You can select your transcription language in Settings. With the Translation add-on, Auvi can translate captions in real time into 31 languages, all processed on-device. Tap any translated caption to hear it spoken aloud.

How does real-time translation work?

The Translation add-on uses Apple's on-device translation framework to convert captions into a language of your choice as they appear. All 31 supported languages are processed locally — nothing is sent to a server.

You can also tap a translated caption to have it spoken aloud, and use Type-to-Speak to type a reply that's read aloud in either language. The Quick-Reply Phrasebook gives you one-tap access to common phrases for travel and everyday conversations.

You don't need an expensive dedicated device. Unlike standalone translators (Pocketalk, etc.), Auvi runs on the iPhone you already own.

What is scam call detection?

Auvi can analyse the words being spoken and alert you if the conversation matches known patterns used in phone scams. It covers 25 categories of phone scam tactics — such as impersonation of officials, false urgency around bank accounts, and gift card demands — using multi-keyword weighted scoring.

You can set the sensitivity to High, Medium, or Low, and the feature is included with Standard at no extra cost. All analysis is on-device; the text of your conversation is never sent anywhere.
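
For the technically curious, here is a minimal Python sketch of what multi-keyword weighted scoring can look like. It is an illustration only, not Auvi's actual detection engine: the two categories, the keywords, their weights, and the sensitivity thresholds are all invented for the example.

```python
# Illustrative only: categories, keywords, weights, and thresholds are invented.
SCAM_PATTERNS = {
    "gift_card_demand": {"gift card": 3.0, "codes": 1.5, "do not tell anyone": 2.0},
    "bank_urgency": {"bank account": 2.0, "suspended": 2.0, "immediately": 1.0},
}

# A lower threshold means a more sensitive setting (alerts are raised sooner).
THRESHOLDS = {"high": 3.0, "medium": 4.5, "low": 6.0}


def score_transcript(text):
    """Return a score per category: the summed weights of the keywords found."""
    lowered = text.lower()
    return {
        category: sum(w for keyword, w in keywords.items() if keyword in lowered)
        for category, keywords in SCAM_PATTERNS.items()
    }


def matched_categories(text, sensitivity="medium"):
    """Return the categories whose score reaches the chosen sensitivity threshold."""
    threshold = THRESHOLDS[sensitivity]
    return sorted(c for c, s in score_transcript(text).items() if s >= threshold)
```

With these made-up numbers, the sentence "Your bank account has been suspended, buy a gift card immediately" scores 5.0 for bank_urgency and 3.0 for gift_card_demand, so it matches one category at Medium sensitivity and two at High. Requiring several weighted keywords, rather than any single word, is what keeps ordinary conversations from triggering alerts.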

What is tourist scam detection?

Tourist scam detection is a separate, opt-in feature (marked Beta) that covers 10 categories of travel-related scams, such as tea house lures, fake authority fines, taxi overcharges, and distraction theft. It uses the same on-device detection engine as phone scam detection and can be configured independently. It is included with Standard and is particularly useful when travelling abroad.

How does Sound Awareness work?

Sound Awareness listens for important environmental sounds and shows a visual banner when one is detected. Supported sounds include doorbells, smoke alarms, fire alarms, sirens, knocking, and more — 15 sounds across three urgency tiers. Critical sounds (like fire alarms) trigger a stronger haptic alert and a more prominent banner, even if the rest of Sound Awareness is turned off.
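
The tiered behaviour described above can be sketched in a few lines of Python. This is a simplified illustration, not Auvi's code: only the "critical" tier is named in this FAQ, so the other tier names, the sound identifiers, and the alert values are assumptions.

```python
# Hypothetical tier assignments and identifiers, for illustration only.
SOUND_TIERS = {
    "fire_alarm": "critical",
    "smoke_alarm": "critical",
    "siren": "important",
    "doorbell": "informational",
    "knocking": "informational",
}


def alert_for(sound, sound_awareness_enabled):
    """Decide how to alert for a detected sound, or return None for no alert."""
    tier = SOUND_TIERS.get(sound)
    if tier is None:
        return None  # not a sound on the supported list
    if tier == "critical":
        # Critical sounds always alert, even when Sound Awareness is turned off.
        return {"banner": "prominent", "haptic": "strong"}
    if not sound_awareness_enabled:
        return None
    return {"banner": "standard", "haptic": "normal"}
```

In this sketch, `alert_for("fire_alarm", False)` still returns a prominent alert, while `alert_for("doorbell", False)` returns nothing.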

What are name-mention alerts?

You can configure your name (and name variants) in Settings. When someone says your name during a conversation, Auvi highlights the text and delivers a haptic tap to your phone and Apple Watch. This is useful in group conversations where you might be reading captions and could miss that someone is addressing you directly.
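
As a rough illustration of how name matching with variants could work, the Python sketch below looks for any configured name as a whole word, ignoring case. The function name and the matching rules are assumptions made for the example; Auvi's own matching may differ.

```python
import re


def find_name_mentions(caption, names):
    """Return (start, end) spans where a configured name appears as a whole word."""
    if not names:
        return []
    # \b word boundaries stop "Kate" from matching inside "Katelyn".
    pattern = r"\b(?:" + "|".join(re.escape(name) for name in names) + r")\b"
    return [m.span() for m in re.finditer(pattern, caption, flags=re.IGNORECASE)]
```

Each returned span marks the text to highlight, and any non-empty result would be the cue to deliver the haptic tap.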

Can I use Auvi with external microphones?

Yes. Auvi works with any audio input your iPhone or iPad supports: the built-in microphone, AirPods, Bluetooth microphones, USB-C microphones, Lightning accessories, and wired headsets. You can switch audio sources at any time.

For best results in noisy environments or across a table, a lapel microphone or a directional Bluetooth conference microphone can significantly improve accuracy.

What is Picture-in-Picture mode?

Picture-in-Picture (PiP) keeps a floating caption overlay visible over other apps. When you leave Auvi to check something on your phone, captions continue to appear in a small window that you can move around the screen. PiP starts automatically when you background the app during an active transcription session.

Can I use Auvi on my Apple Watch?

Yes. The Apple Watch companion app (watchOS 10 or later) displays live captions streamed from your iPhone directly on your wrist. You can also start and stop transcription from the Watch without taking your phone out. If your name is detected in the conversation, the Watch taps your wrist to alert you.

What is Type-to-Speak?

Type-to-Speak lets you type a message and have it spoken aloud by your device. This is useful when you want to respond verbally but prefer to communicate in writing. A Quick-Reply Phrasebook gives you one-tap access to common phrases, and you can add your own custom phrases.

What are meeting minutes and metadata?

With Plus, Auvi can automatically extract action items, decisions, dates, and participants from saved conversations. Action items can be sent directly to iOS Reminders, and dates to your Calendar. Conversations are categorised and summarised, and you can export professional meeting minutes via email — all processed on-device.

What export formats are available?

With Plus, you can export transcripts as plain text, PDF, or subtitle formats (SRT and VTT). Subtitle exports include timestamps and speaker attribution, making them suitable for captioning video content or for accessibility purposes. You can also copy text directly from captions and share via any iOS share sheet.
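
If you have not seen a subtitle file before, the Python sketch below shows the general shape of an SRT export with timestamps and speaker labels. It is not Auvi's exporter: the segment layout and the "Speaker: text" labelling convention are assumptions for the example.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, speaker, text) tuples."""
    cues = []
    for index, (start, end, speaker, text) in enumerate(segments, start=1):
        # Each cue is: a running number, a time range, then the caption text.
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{speaker}: {text}\n"
        )
    return "\n".join(cues)  # cues are separated by a blank line
```

WebVTT files are very similar, but they begin with a `WEBVTT` header line and use a full stop instead of a comma before the milliseconds.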


Privacy & Security

Does Auvi record my conversations?

No. Auvi does not create audio files, ever. It converts speech to text in real time, and by default that text only exists in memory — it disappears the moment you stop transcription or close the app. If you choose to enable Save Transcripts in Settings (a Plus feature), only the text is stored, locally on your device. No audio is ever written to disk or sent anywhere.

What data does Auvi collect?

None. Auvi's App Store privacy label says “No Data Collected” because that is literally true. There are no analytics, no telemetry, no crash reporters, no usage tracking, and no accounts. Nothing leaves your device. Auvi makes no network requests after the initial model download is complete.

Can my employer control Auvi's features?

Yes. Auvi supports Managed App Configuration through mobile device management (MDM), which allows IT administrators to enforce privacy settings across a fleet of managed devices. Administrators can permanently disable saving, copying, and exporting of transcripts via MDM profiles. When a setting is locked by an administrator, its toggle is hidden in the app. Organisations can deploy Auvi through Apple Business Manager or the Volume Purchase Programme at standard App Store pricing.

Is Auvi HIPAA compliant?

Auvi's privacy architecture is strongly aligned with HIPAA principles: all processing is on-device, no data is transmitted, no data is collected, and saving is disabled by default. However, HIPAA compliance is an organisational and contractual matter, not just a technical one. If you are deploying Auvi in a healthcare setting, we recommend consulting with your compliance team.


Pricing & Subscriptions

How much does Auvi cost?

Auvi starts with a 7-day free trial that includes every feature. After the trial, Standard is a one-time purchase for core transcription features. Plus and Translation are optional auto-renewable subscriptions available monthly or annually, or as a discounted Bundle. Exact prices are shown in the App Store and vary by region.

What's the difference between Standard, Plus, and Translation?

Standard (one-time purchase) — Live transcription in 40+ languages, speaker identification, scam detection, sound awareness, Apple Watch, Picture-in-Picture, name-mention alerts, and all accessibility features.

Plus (subscription) — Everything in Standard plus transcript history, search, meeting minutes with metadata, action items → Reminders, dates → Calendar, export & share, subtitle export (SRT/VTT), copy text, bookmarks, custom vocabularies, domain packs, and voice profiles.

Translation (subscription) — Everything in Standard plus real-time on-device translation in 31 languages, tap-to-hear, Type-to-Speak, Quick-Reply Phrasebook, and Tourist Scam Detection (Beta).

Bundle — Plus + Translation together at a discounted rate.

Can I get a refund?

All purchases are processed through the Apple App Store. Refunds are handled by Apple according to their refund policy. To request a refund, visit reportaproblem.apple.com. You can manage or cancel subscriptions at any time in your Apple ID settings: go to Settings > [your name] > Subscriptions on your device.

Is there a family or education discount?

Auvi does not currently offer a separate family plan or education pricing tier. Organisations looking to deploy Auvi across many devices can do so through Apple Business Manager or the Volume Purchase Programme (VPP) at standard App Store pricing. If you have questions about institutional deployment, please email support@creativemachines.ai.


Troubleshooting

Why is there a delay before captions appear?

Auvi processes audio in short chunks before sending it to the speech recognition engine. The first caption typically appears after around 15 seconds. This is normal — a small audio buffer needs to accumulate before transcription begins. Once captions start appearing, they update continuously.

On newer devices and iOS versions, captions may appear faster due to more efficient processing.

Transcription doesn't seem very accurate

Transcription accuracy depends on several factors:

  • Device and iOS version — newer devices with more processing power produce more accurate results. iOS 26+ uses Apple's latest speech recognition, which can be significantly better.
  • Microphone distance — hold the device or place it within about one metre of the speaker.
  • Background noise — reduce ambient noise where possible, or use an external microphone.
  • Language setting — make sure the transcription language in Settings matches the language being spoken. This makes a significant difference.
  • Vocabulary packs — if you use domain-specific terms (medical, legal, technical), enable the relevant vocabulary pack in Settings.

Can Auvi transcribe phone calls?

iOS does not allow third-party apps to access phone call audio directly. This is a system-level restriction that applies to all apps. However, you can work around this by:

  • Putting the call on speakerphone and placing your device nearby
  • Using a USB telephone recorder connected to your landline, which feeds the call audio into your iPhone as an external audio input
  • For video calls (FaceTime, Zoom, Teams), playing the call audio through the device speaker rather than headphones

The app includes a Tips section with guidance on getting the best results. For recommended accessories, see our Accessories page (coming soon).

The model download is taking a long time

The initial download is approximately 700 MB, so it can take several minutes depending on your internet connection. We recommend using Wi-Fi. If the download is interrupted, reopen the app and it will resume from where it left off.

Note: On iOS 26 and later, this download may not be required at all — the required models may already be present on your device.

My screen keeps turning off during transcription

Auvi automatically prevents your screen from locking while transcription is active. If your screen is still turning off, check that transcription is actually running (the listening indicator should be visible) and that Low Power Mode is not overriding the screen lock prevention. When you stop transcription, normal screen lock behaviour resumes.


Accessibility

What accessibility features does Auvi have?

Auvi was built with accessibility at its core:

  • Five colour schemes: System, Light, Dark, High Contrast, and Yellow-on-Black
  • Adjustable font size from 16pt to 48pt, independent of the iOS system font size
  • Full support for iOS Dynamic Type, including the largest accessibility text sizes
  • Bold text support, following the iOS system preference
  • Colour-blind-friendly speaker palette toggle
  • VoiceOver labels on all controls
  • Haptic feedback for new speech, name mentions, and sound awareness alerts
  • Reduce Motion support for all animations
  • All screen orientations including upside-down portrait
  • Visual-only onboarding that works at very large text sizes

Does Auvi work with VoiceOver?

Yes. All controls have VoiceOver labels and descriptions. New caption segments are announced as they appear, so screen reader users can follow a transcription session. This is important because some users have both hearing loss and visual impairment, and accessibility tools must not be mutually exclusive.

What colour schemes are available?

  • System — follows your iOS Light or Dark Mode setting automatically
  • Light — always light, regardless of system setting
  • Dark — always dark, regardless of system setting
  • High Contrast — maximises contrast for users with low vision
  • Yellow-on-Black — high-visibility yellow text on black

Each scheme has tuned colour variants for the speaker identification colours. The colour-blind-friendly palette toggle works across all schemes.


Didn't find your answer here? We're happy to help.

Email: support@creativemachines.ai

We aim to respond within 2 business days.