
Transcription Speech-to-Text

🔥 Tier 2
Category: Tech
Salary Impact
Complexity: Medium
Used in: All careers

Speech-to-text, also called automatic speech recognition (ASR), converts spoken audio into written text. Modern ASR systems such as OpenAI Whisper, Google Cloud Speech-to-Text, and Amazon Transcribe can exceed 95% word accuracy on clean audio and support many languages, accents, and dialects; accuracy typically drops on noisy recordings or overlapping speech. Applications range from accessibility (captions for deaf and hard-of-hearing users) and content creation (podcast transcripts, video subtitles) to voice interfaces (Alexa, Siri). ASR combines audio signal processing, acoustic modeling, and language modeling to predict which words were spoken.
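
The accuracy figure above is usually reported through word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of reference words. The sketch below is one minimal way to compute it in plain Python, assuming that "accuracy" means 1 minus WER and that lowercasing plus whitespace splitting is enough normalization. The function name `word_error_rate` and the sample sentences are illustrative, not taken from any particular ASR toolkit.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference word count."""
    # Assumption: simple normalization (lowercase, split on whitespace).
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level Levenshtein distance, keeping only one previous row.
    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: two substituted words out of nine.
reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over a lazy dog"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.3f}, word accuracy: {1 - wer:.3f}")
# prints "WER: 0.222, word accuracy: 0.778"
```

Under this definition, a system at ">95% accuracy" gets roughly one word in twenty wrong. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference, so the two numbers are not always simple complements in practice.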