## How We Tested

Accuracy claims are meaningless without methodology. We designed a benchmark to compare three of the most widely used speech-to-text engines: OpenAI's Whisper (via Scrybapp), Apple's built-in Dictation, and Google's Speech-to-Text (via Chrome). Here's how we conducted the test.
### Methodology
- Test material: 50 pre-written passages across 10 categories, each read aloud by the same speaker
- Equipment: MacBook Pro M3 with Blue Yeti USB microphone in a quiet room
- Measurement: Word Error Rate (WER), the percentage of reference words transcribed incorrectly. The tables below report accuracy, i.e. 100% minus WER
- Models: Whisper Medium (via Scrybapp), Apple Dictation (macOS Sequoia), Google Speech-to-Text (latest Chrome)
- Each test was run three times with results averaged to account for variance
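To make the metric concrete, here is a minimal sketch of how WER is typically computed, using word-level Levenshtein distance. The function name and sample sentences are illustrative, not part of our actual test harness:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count,
    computed via word-level Levenshtein distance."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # d[i][j] = edit distance between the first i ref words and first j hyp words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

wer = word_error_rate("the patient presents with acute dyspnea",
                      "the patient presents with a cute dyspnea")
# "acute" misheard as "a cute" counts one substitution plus one insertion:
# WER = 2/6, i.e. about 66.7% accuracy on this short sample
```

Accuracy as reported in the tables below is simply `(1 - WER) * 100`.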
## The Results
| Category | Whisper AI (Scrybapp) | Apple Dictation | Google STT |
|---|---|---|---|
| Casual conversation | 97.3% | 94.1% | 96.2% |
| Business English | 96.8% | 93.5% | 95.7% |
| Technical (software) | 94.5% | 86.2% | 91.8% |
| Medical terminology | 92.1% | 78.5% | 88.4% |
| Legal language | 93.8% | 81.7% | 89.6% |
| Academic/scientific | 93.2% | 82.9% | 90.1% |
| Accented English (Indian) | 93.4% | 82.1% | 91.5% |
| Accented English (British) | 95.6% | 88.3% | 93.8% |
| Noisy environment | 88.7% | 76.4% | 84.2% |
| Fast speech (180+ WPM) | 91.2% | 79.8% | 87.6% |
| **Overall Average** | **93.7%** | **84.4%** | **90.9%** |
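The Overall Average row can be reproduced directly from the category scores. A quick sanity check, with the scores hard-coded from the table above:

```python
# Category accuracy scores, in table order, for each engine
scores = {
    "Whisper AI (Scrybapp)": [97.3, 96.8, 94.5, 92.1, 93.8, 93.2, 93.4, 95.6, 88.7, 91.2],
    "Apple Dictation":       [94.1, 93.5, 86.2, 78.5, 81.7, 82.9, 82.1, 88.3, 76.4, 79.8],
    "Google STT":            [96.2, 95.7, 91.8, 88.4, 89.6, 90.1, 91.5, 93.8, 84.2, 87.6],
}
averages = {name: sum(s) / len(s) for name, s in scores.items()}
# Each mean agrees with the Overall Average row to within rounding.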
## Analysis: What the Numbers Mean

### Whisper AI: The Accuracy Leader
Whisper AI, running locally via Scrybapp, achieved the highest accuracy in every category. The advantage is most dramatic in specialized content: medical terminology (+13.6 percentage points over Apple), technical software terms (+8.3 points), and accented speech (+11.3 points for Indian English).
This superiority comes from Whisper's training data: 680,000 hours of diverse, multilingual audio that includes technical content, accented speech, and a wide range of speaking styles. Apple's dictation model was trained on a narrower, more standardized dataset.
### Apple Dictation: Decent for Basics
Apple Dictation performs reasonably well for casual, everyday English conversation (94.1%). But accuracy drops sharply with specialized vocabulary, accents, and challenging conditions. The 78.5% accuracy on medical terms means roughly one in five medical words is transcribed incorrectly — that's not just inconvenient, it's potentially dangerous in a clinical context.
For a deeper comparison, see our Apple Dictation vs third-party apps article.
### Google Speech-to-Text: Strong but Cloud-Dependent
Google's offering comes in second place overall. Its accuracy is respectable across categories, benefiting from Google's massive training infrastructure. However, it requires an internet connection, only works in Chrome, and sends all audio to Google's servers — a significant privacy trade-off.
## Factors That Affect Accuracy

### Microphone Quality
All three engines perform better with a quality microphone. Our tests used a Blue Yeti USB microphone, a good mid-range option. Built-in laptop microphones reduce accuracy by roughly 2-4 percentage points across the board due to environmental noise pickup and lower audio quality.
### Background Noise
This is where Whisper's training advantage really shows. In our noisy environment test (simulating a coffee shop), Whisper maintained 88.7% accuracy while Apple Dictation dropped to 76.4%. Whisper's training on diverse, real-world audio gives it significantly better noise handling.
### Speaking Speed

Faster speech challenges all engines, but Whisper degrades most gracefully. At 180+ words per minute, it still achieved 91.2% accuracy. Apple Dictation struggled significantly at this speed, dropping below 80%.
### Model Size (Whisper Only)
The results above use Whisper's Medium model. Here's how different model sizes compare:
| Model | Overall Accuracy | Processing Speed |
|---|---|---|
| Tiny | 87.2% | ~10x real-time |
| Base | 89.8% | ~7x real-time |
| Small | 92.1% | ~4x real-time |
| Medium | 93.7% | ~2x real-time |
| Large | 94.5% | ~1x real-time |
Even Whisper's Tiny model outperforms Apple Dictation's overall accuracy, though with a larger margin of error on specialized content.
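If you want to reason about this trade-off programmatically, a tiny helper can pick the fastest model that clears an accuracy floor, using the figures from the table above. The helper is hypothetical, not part of Scrybapp's or Whisper's API:

```python
# (model, overall accuracy %, approx. speed vs. real-time) from the table above,
# ordered fastest to slowest
MODELS = [
    ("tiny",   87.2, 10),
    ("base",   89.8, 7),
    ("small",  92.1, 4),
    ("medium", 93.7, 2),
    ("large",  94.5, 1),
]

def pick_model(min_accuracy: float) -> str:
    """Return the fastest Whisper model meeting the accuracy target,
    falling back to the most accurate model if none does."""
    for name, acc, _speed in MODELS:
        if acc >= min_accuracy:
            return name
    return MODELS[-1][0]
```

For example, `pick_model(92.0)` returns `"small"` (the fastest model at or above 92%), while `pick_model(94.0)` returns `"large"`.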
## The Privacy Dimension
Accuracy isn't the only factor. Consider the privacy implications of each approach:
- Whisper (Scrybapp) — 100% local, zero data transmitted. Privacy details
- Apple Dictation — Mostly local for English, but some data may be sent to Apple
- Google STT — All audio sent to Google servers for processing
For professionals handling sensitive content — healthcare, legal, financial — the privacy advantage of local processing is as important as accuracy.
## Recommendations by Use Case

### General Everyday Use
All three engines work well for casual dictation. Apple Dictation is free and built-in, making it a fine starting point. But for the best experience, Scrybapp provides noticeably cleaner results.
### Professional/Business Use

Whisper's 9.3-percentage-point overall accuracy advantage over Apple Dictation compounds over time. In a 1,000-word document, that's roughly 90 fewer errors to fix. Choose Scrybapp for professional work.
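The compounding effect is simple arithmetic. This hypothetical helper turns an accuracy gap, in percentage points from the overall averages above, into an expected error count:

```python
def extra_errors(words: int, acc_a: float, acc_b: float) -> float:
    """Expected additional transcription errors from the less accurate
    engine (accuracy acc_b %) vs. the more accurate one (acc_a %)."""
    return words * (acc_a - acc_b) / 100.0

# Whisper (93.7%) vs. Apple Dictation (84.4%) over a 1,000-word document:
extra_errors(1000, 93.7, 84.4)  # ~93 additional errors to fix
# Whisper vs. Google STT (90.9%) over the same document:
extra_errors(1000, 93.7, 90.9)  # ~28 additional errors
```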
### Specialized Vocabulary
For medical, legal, technical, or scientific content, Whisper is the only viable option among the three. Apple Dictation's accuracy on specialized terms is too low for professional use.
### Multilingual Dictation
Whisper supports 99+ languages with consistent accuracy. Read our multilingual guide for details.
## Try It Yourself
Numbers only tell part of the story. The best way to evaluate accuracy is to try each engine with your own voice, your own vocabulary, and your own use case. Download Scrybapp for a free 3-minute trial and compare it to Apple Dictation yourself. The difference is immediately noticeable.