
ElevenLabs is Launching Its Own Speech-to-Text Model
ElevenLabs, an AI startup that recently secured $180 million in funding, has launched its own speech-to-text model. Called “Scribe,” the model is pitched on transcription accuracy and broad language support.
At launch, Scribe supports more than 99 languages, including English (with a claimed accuracy rate of 97%), French, German, Hindi, Indonesian, Japanese, Kannada, Malayalam, Polish, Portuguese, Spanish, and Vietnamese. The space already has established players, including Gladia, Speechmatics, AssemblyAI, Deepgram, and OpenAI’s Whisper models, and ElevenLabs is aiming to set itself apart through wider language coverage.
The company had already built a speech-to-text component for its AI conversational agent platform, which launched last year. However, this is the first time ElevenLabs is releasing a standalone speech-to-text model. CEO Mati Staniszewski has previously said the goal is to significantly improve such models by using in-house teams to annotate data and provide rapid feedback.
Scribe’s features include speaker diarization (identifying who is speaking), timestamps for accurate subtitles, and auto-tagging of sound events such as audience laughter. The company will also give customers a straightforward way to transcribe video content directly and add captions in its studio.
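To illustrate how diarization labels and timestamps can become subtitles, here is a minimal Python sketch that turns timestamped, speaker-labeled segments into SRT captions. The `Segment` structure and its field names are assumptions made for illustration; they are not Scribe’s actual response format, which the article does not describe.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical shape of one transcript segment; NOT Scribe's real schema.
    speaker: str   # diarization label, e.g. "speaker_0"
    start: float   # start time in seconds
    end: float     # end time in seconds
    text: str      # transcribed words for this span


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the HH:MM:SS,mmm form used by SRT files."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues, prefixing each with its speaker."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"[{seg.speaker}] {seg.text}"
        )
    return "\n\n".join(cues) + "\n"
```

Calling `to_srt` on a list of such segments yields text that can be saved as a `.srt` file and loaded by most video players.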
The current version works only with pre-recorded audio, so Scribe is not yet suited to live meeting transcription or voice note-taking. ElevenLabs says a low-latency, real-time version is forthcoming.
As a standalone offering, Scribe is priced at $0.40 per hour of transcribed audio. Some rivals in the space currently offer lower rates, along with some added features.
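As a rough guide to what that rate means in practice, the following Python sketch estimates transcription cost from audio length. It assumes simple linear, per-second proration of the $0.40 hourly rate; the article does not say how ElevenLabs actually rounds or bills, and the function name is illustrative.

```python
def estimate_cost(audio_seconds: float, rate_per_hour: float = 0.40) -> float:
    """Estimate transcription cost in dollars.

    Assumes linear proration of the hourly rate (an assumption, not
    ElevenLabs' documented billing behavior).
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return audio_seconds / 3600 * rate_per_hour
```

Under this assumption, a 90-minute recording would cost about $0.60, and 100 hours of audio about $40.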
Source: https://techcrunch.com/2025/02/26/elevenlabs-is-launching-its-own-speech-to-text-model/