Sarvam AI Unveils Bulbul-v2: India-Focused Text-to-Speech Model Supporting 11 Languages

May 8, 2025


Bengaluru-based AI startup Sarvam AI has released Bulbul-v2, a text-to-speech (TTS) model built specifically for India. The model supports 11 Indian languages and is designed to deliver speech with authentic regional accents that the company describes as sounding “just like India.”

In a recent LinkedIn post, Sarvam AI said that Bulbul-v2 generates lifelike, expressive audio and avoids the flat or robotic tone common in many TTS systems. The company also highlighted high-speed processing and customizable voice options, and said the model is particularly suited to brands and enterprises looking to localize content at scale.

According to Sarvam, Bulbul-v2 represents a leap forward for speech AI in India, setting new standards for naturalness, responsiveness, and affordability. As part of its broader mission to democratize access to AI in India, the startup is offering low-latency API access at what it calls India-friendly pricing, helping expand the technology’s reach across industries.

Notably, Sarvam AI is the first Indian startup selected by the central government to develop India’s sovereign large language model (LLM) under the national IndiaAI initiative, which aims to build indigenous capabilities in artificial intelligence.

What is Bulbul-v2?

Bulbul-v2 is Sarvam AI’s flagship TTS model, engineered to mirror India’s linguistic diversity and speech patterns. It supports real-time synthesis, multi-language inputs, and code-mixed text, making it adept at handling natural conversations across different Indian languages. The model also includes multiple voice personas, giving users creative flexibility.

Key capabilities include:

Realistic voice prosody (rhythm, tone, and intonation)
Voice customization (adjust pitch, speed, and volume)
Language-aware text processing, including smart handling of numbers, dates, and mixed-language sentences
Sample rate options ranging from 8 kHz to 24 kHz for adaptable audio quality
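
To make the customization options above concrete, here is a minimal Python sketch of how a developer might bundle them into one validated settings object before sending a request. It is an illustration only: the class name, field names, default values, and scales are all hypothetical and are not taken from Sarvam AI's API documentation. The only figures drawn from the announcement are the 8 kHz to 24 kHz sample rate bounds.

```python
from dataclasses import dataclass, asdict

# Sample rate bounds as reported in the announcement (8 kHz to 24 kHz).
MIN_SAMPLE_RATE_HZ = 8_000
MAX_SAMPLE_RATE_HZ = 24_000


@dataclass
class TTSSettings:
    """Hypothetical settings bundle; not Sarvam AI's actual request schema."""

    text: str
    language: str                  # illustrative language code, e.g. "hi"
    pitch: float = 0.0             # assumed scale: 0.0 means the voice's default
    speed: float = 1.0             # assumed scale: 1.0 means normal pace
    volume: float = 1.0            # assumed scale: 1.0 means default loudness
    sample_rate_hz: int = 24_000   # must fall inside the reported range
    preprocess: bool = True        # ask for number/date/mixed-text handling

    def validate(self) -> None:
        """Reject settings that fall outside the ranges assumed above."""
        if not self.text.strip():
            raise ValueError("text must not be empty")
        if not MIN_SAMPLE_RATE_HZ <= self.sample_rate_hz <= MAX_SAMPLE_RATE_HZ:
            raise ValueError(
                f"sample_rate_hz must be between {MIN_SAMPLE_RATE_HZ} "
                f"and {MAX_SAMPLE_RATE_HZ}"
            )
        if self.speed <= 0:
            raise ValueError("speed must be positive")
        if self.volume < 0:
            raise ValueError("volume must not be negative")

    def to_payload(self) -> dict:
        """Validate, then return a plain dict ready for JSON encoding."""
        self.validate()
        return asdict(self)


if __name__ == "__main__":
    # Code-mixed Hindi/English input at telephony-grade audio quality.
    settings = TTSSettings(
        text="Aapka order 3 baje tak deliver ho jayega",
        language="hi",
        sample_rate_hz=8_000,
    )
    print(settings.to_payload())
```

Validating on the client side like this would catch an out-of-range sample rate before any network call is made. The real parameter names and accepted values should be checked against Sarvam AI's own API reference.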

What Can Bulbul-v2 Do?

The model can convert text into natural speech in real time using preset or custom configurations. Users have fine-grained control over audio output, allowing them to tailor speech style to specific use cases such as customer service, storytelling, or content localization.

The integrated text preprocessing system intelligently normalizes text inputs to enhance clarity and pronunciation, especially for numerical or hybrid linguistic inputs.

Released as a follow-up to Bulbul-v1, which launched in August 2024 with six voice presets, Bulbul-v2 pushes the boundaries with more nuanced voice personalities and greater scalability.

Given its speed, affordability, and Indian linguistic orientation, Bulbul-v2 is being positioned as a competitive alternative to global TTS models, especially for developers, educators, and businesses aiming for localized engagement.
