India's Sarvam AI Releases Open-Source 30B and 105B Parameter Models
Indian AI lab Sarvam AI releases open-source 30B and 105B parameter LLMs with 22-language support, speech, and vision — challenging closed-source AI dominance.
TL;DR: Bangalore-based Sarvam AI has released two open-source large language models — a 30B and a 105B parameter model — trained with deep coverage of all 22 scheduled Indian languages, alongside complementary speech-to-text, text-to-speech, and vision models. The release is the largest open-source AI effort to come out of India to date and represents a direct challenge to the closed-source dominance of OpenAI, Google, and Anthropic in one of the world's largest and most linguistically complex AI markets. Sarvam, which has raised $41M in total funding, is backed by Peak XV Partners and Lightspeed, and operates with a government mandate through India's national AI mission. The models are released under a permissive open-source license and available to developers immediately.
Sarvam AI was founded in 2023 by Vivek Raghavan and Pratyush Kumar, two researchers with deep roots in Indian language technology. Raghavan previously led technology strategy at Aadhaar — India's billion-person biometric identity infrastructure — and at the National Payments Corporation of India, which built UPI. Kumar is a machine learning researcher whose prior work focused on low-resource language processing. Together they represent an unusual combination of AI research credibility and experience shipping infrastructure at national scale.
The company's founding thesis is straightforward and underserved: the dominant AI models in the world were trained primarily on English-language internet text. A country like India, where over 1.4 billion people speak 22 officially recognized languages and where the majority of the population is not fluent in English, is functionally excluded from the value of those models. Sarvam was built to close that gap.
What distinguishes this release from prior India-origin AI efforts is scale and completeness. This is not a fine-tuned version of an existing Western model. Sarvam has trained its own foundation models — 30B and 105B parameters — from the ground up with Indian language data as a first-class training objective, not an afterthought. The accompanying speech and vision models turn this into a multimodal platform, not just a text capability.
The timing matters. India's government has committed significant resources to national AI infrastructure through the India AI Mission. Sarvam is one of the key private-sector counterparts in that effort. This release is both a commercial milestone and a signal about the direction of India's AI industrial policy: build sovereign capability, make it open, and challenge the assumption that frontier AI comes only from Silicon Valley or China.
Sarvam has released two model sizes with distinct positioning. The 30B parameter model is designed for deployment flexibility — it can run on high-end consumer hardware, fits within practical GPU memory constraints for most enterprise on-premise setups, and is the model most developers will reach for in production. The 105B parameter model is the research and performance flagship: larger, more capable across complex reasoning and generation tasks, but requiring data center-class hardware for inference.
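A back-of-envelope calculation makes the hardware split concrete. The sketch below estimates memory for the model weights only, ignoring KV cache and activation overhead, so real deployments need additional headroom:

```python
def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone (no KV cache or activations)."""
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9  # decimal gigabytes

for name, params in [("30B", 30), ("105B", 105)]:
    for precision, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
        print(f"{name} @ {precision}: ~{weight_memory_gb(params, bits):.1f} GB")
```

At 4-bit quantization the 30B model's weights drop to roughly 15 GB, within reach of a single high-end consumer GPU, while the 105B model stays above 50 GB even when quantized. That gap is the practical basis for the consumer-hardware versus data-center split described above.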
Both models are transformer-based decoder architectures in the style of Llama and Mistral, which allows the broader open-source ecosystem — LoRA fine-tuning frameworks, quantization tools, inference servers like vLLM and llama.cpp — to work with Sarvam's models without significant adaptation. This was a deliberate choice. Building on a familiar architecture means the investment required to integrate Sarvam's models into existing ML pipelines is low.
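Because the architecture follows the Llama/Mistral pattern, loading should go through the standard `transformers` path. This is a sketch under assumptions: the model id below is hypothetical, so check Sarvam's Hugging Face organization for the published names.

```python
def load_sarvam(model_id: str = "sarvamai/sarvam-30b"):
    """Load a Sarvam checkpoint through the standard transformers path.

    The default model id is an assumption for illustration, not the
    confirmed Hugging Face name of the released checkpoint.
    """
    # Deferred import so the sketch can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",   # shard across whatever GPUs are available
        torch_dtype="auto",  # keep the dtype stored in the checkpoint
    )
    return tokenizer, model
```

Quantized variants for llama.cpp or vLLM would follow the same conventions as any other Llama-style checkpoint, which is exactly the ecosystem payoff of the familiar architecture.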
The training data composition is where Sarvam diverges from Western open-source models in meaningful ways: the corpus was assembled with explicit attention to Indian language sources, mixed-script and code-switched text, and idiomatic usage across regional languages.
The result is a model that does not merely translate between English and Indian languages but has genuine fluency in Indian language contexts: understanding idioms, handling mixed-script inputs, and generating outputs that read naturally to native speakers rather than having the stilted quality of translated content.
India's linguistic diversity is frequently cited but rarely quantified in ways that make the engineering challenge legible. The 22 languages in India's Eighth Schedule span four language families — Indo-Aryan, Dravidian, Austroasiatic, and Sino-Tibetan — with distinct grammars, scripts, and phonological structures that share almost nothing. Hindi and Tamil are as linguistically distant as English and Arabic. A model that handles Hindi well has no particular advantage when it comes to handling Malayalam.
| Language Family | Key Languages | Script | Approximate Speakers |
|---|---|---|---|
| Indo-Aryan | Hindi, Bengali, Marathi, Gujarati, Punjabi, Odia, Assamese, Nepali | Devanagari + others | ~800M |
| Dravidian | Tamil, Telugu, Kannada, Malayalam | Distinct scripts per language | ~250M |
| Austroasiatic | Santali | Ol Chiki | ~8M |
| Sino-Tibetan | Bodo, Manipuri | Varied | ~3M |
Prior models — including Llama 3, Mistral, and GPT-4 — have varying coverage of Indian languages that skews heavily toward Hindi and, to a lesser extent, Bengali and Tamil. Coverage of Odia, Assamese, Bodo, or Santali in those models is minimal to nonexistent. For AI applications that need to serve rural India or non-Hindi-belt populations, existing models are effectively unusable.
Sarvam's coverage of all 22 languages is not equal in depth — Hindi and other high-resource Indic languages will have stronger performance than low-resource languages like Santali or Bodo. But the deliberate inclusion of low-resource languages in the training data and evaluation framework is the operative distinction. It means the model architecture, the tokenizer design, and the evaluation regime were built to handle the full breadth of India's linguistic landscape, not just the highest-traffic languages.
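The tokenizer point is easy to see at the Unicode level: the scheduled languages occupy disjoint script blocks, so a vocabulary built mostly from Latin-script text typically falls back to byte-level pieces for them, inflating token counts and shortening effective context. A small standard-library check of the codepoint ranges involved:

```python
# Sample words in three scheduled languages, each in its own script block.
SAMPLES = {
    "Hindi (Devanagari)": "नमस्ते",
    "Bengali": "বাংলা",
    "Tamil": "வணக்கம்",
}

def codepoint_range(word: str) -> tuple[int, int]:
    """Lowest and highest Unicode codepoints used in the word."""
    points = [ord(ch) for ch in word]
    return min(points), max(points)

for lang, word in SAMPLES.items():
    lo, hi = codepoint_range(word)
    print(f"{lang}: U+{lo:04X}..U+{hi:04X}")
```

Devanagari, Bengali, and Tamil sit in separate Unicode blocks (roughly U+0900-097F, U+0980-09FF, and U+0B80-0BFF), so a tokenizer must allocate vocabulary to each script deliberately; coverage of one buys nothing for the others.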
The practical implication: developers building for Tier 2 and Tier 3 Indian cities — where the majority of India's population lives and where English penetration is low — now have a foundation model that can serve those markets rather than a model that technically supports the language but produces outputs a native speaker would recognize as machine-generated.
Alongside the LLMs, Sarvam has released dedicated speech-to-text (STT) and text-to-speech (TTS) models covering Indian languages. These are not voice wrappers around the LLM — they are independently trained models for audio processing and generation.
The speech-to-text model is trained on Indian-accented speech data across multiple languages, handling the acoustic features that make Indian English and Indian language speech difficult for Western STT systems: retroflex consonants, aspirated stops, tonal variation in Dravidian languages, and code-switching mid-sentence. Whisper and other Western STT systems have well-documented accuracy degradation on Indian-accented speech. Sarvam's STT model is specifically trained to close that gap.
The text-to-speech model supports natural-sounding synthesis in Indian languages — a harder problem than it might appear. Good TTS in Indian languages requires correct handling of schwa deletion rules, regional prosody, and natural intonation patterns, each of which varies across languages and scripts.
The combined STT + LLM + TTS stack creates a voice AI pipeline that can operate end-to-end in Indian languages without routing through English as an intermediate representation. This matters enormously for conversational AI applications in India: customer support, healthcare consultations, agricultural advisory services, and government information access — all use cases where voice is the primary interface for users who may be literate in their regional language but not in English.
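Structurally, the stack composes into a three-stage loop. The sketch below shows the generic shape, not Sarvam's actual API: the three stages are injected as plain callables standing in for the STT, LLM, and TTS models, and every stage consumes and produces the user's own language with no English pivot.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class VoicePipeline:
    """Chains STT -> LLM -> TTS without an English intermediate:
    each stage works directly in the user's language."""
    stt: Callable[[bytes], str]   # audio -> regional-language transcript
    llm: Callable[[str], str]     # transcript -> regional-language reply
    tts: Callable[[str], bytes]   # reply -> synthesized audio

    def respond(self, audio: bytes) -> bytes:
        transcript = self.stt(audio)
        reply = self.llm(transcript)
        return self.tts(reply)
```

In a real deployment the three callables would wrap Sarvam's STT, LLM, and TTS models respectively; the structure makes the key property explicit, since no stage ever needs an English representation of the conversation.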
Sarvam's release also includes a vision model capable of processing images alongside text. The vision model supports document understanding — a high-priority capability for Indian enterprise and government use cases where large volumes of documents exist only in printed or handwritten form in regional language scripts.
Key capabilities of the vision component include OCR for printed and handwritten text in regional language scripts and structured document understanding for the forms and records common in Indian government and business contexts.
The vision model does not match the capabilities of GPT-4o or Gemini 1.5 Pro on general visual reasoning benchmarks. What it provides is deep coverage of Indian language document understanding that those models lack, particularly for handwritten scripts and domain-specific document types common in Indian government and business contexts.
The open-source LLM landscape in early 2026 is dominated by Meta's Llama series and Mistral's model family, with significant contributions from Qwen (Alibaba), Gemma (Google), and DeepSeek. Where does Sarvam fit?
| Model | Parameters | Indian Language Coverage | Open-Source | Speech | Vision |
|---|---|---|---|---|---|
| Sarvam 30B | 30B | All 22 scheduled languages | Yes | Yes (separate model) | Yes (separate model) |
| Sarvam 105B | 105B | All 22 scheduled languages | Yes | Yes (separate model) | Yes (separate model) |
| Llama 3.3 70B | 70B | Hindi, some Bengali/Tamil | Yes | No | No (base model) |
| Mistral Large | ~120B | Hindi, limited others | Partial (weights available) | No | No (base model) |
| Qwen 2.5 72B | 72B | Hindi, Bengali, some others | Yes | No | Via separate model |
| Gemma 3 27B | 27B | Limited Indic coverage | Yes | No | Via separate model |
On raw English-language benchmarks — MMLU, GPQA, HumanEval — the 30B Sarvam model will not outperform a 70B Llama 3.3. Parameter count is not the only variable, but the compute dedicated to Indian language training comes at some cost to English benchmark performance. This is a deliberate trade-off, not a failure.
The correct comparison is not Sarvam 30B versus Llama 3.3 70B on English benchmarks. It is Sarvam 30B versus Llama 3.3 70B on Telugu question answering, Marathi document summarization, Hindi-English code-switched customer support dialogues, or Kannada voice assistant tasks. On those dimensions, Sarvam's training focus gives it a structural advantage that no amount of post-training fine-tuning on a Western base model can fully replicate.
The 105B model narrows the gap on English tasks while maintaining the Indic language advantage — it is the model for organizations that need both strong general capability and deep Indic language support.
Sarvam's release does not exist in a vacuum. It is part of a broader acceleration in India's AI ecosystem that has been building since 2023 and received substantial policy support through the India AI Mission, a government initiative approved in March 2024 with a budget of approximately $1.2 billion (INR 10,372 crore) over five years.
The India AI Mission's components most directly relevant to Sarvam's work are its shared GPU compute infrastructure, its curated Indic language datasets platform, and its grants for AI applications in priority sectors.
Sarvam is one of the anchor companies in this ecosystem. The government compute allocation reduces the training cost barrier that has historically made it impractical for Indian AI labs to train models at this scale. The datasets platform provides access to curated Indic language data that complements Sarvam's own data collection efforts.
India's AI ecosystem has also benefited from a wave of diaspora returnees — researchers and engineers who trained at top Western universities and companies and have returned to build in India. This talent pool, combined with government support and a massive domestic market underserved by existing AI products, creates conditions for AI development that did not exist five years ago.
The broader geopolitical context: India is deliberately positioning itself as an AI power that is neither in the US orbit nor the Chinese orbit. Open-source, sovereign AI capability is a strategic asset — a way to ensure that India's digital infrastructure is not dependent on technology decisions made by foreign corporations or subject to export controls from foreign governments.
The release of Sarvam's models is a challenge to the closed-source AI providers along specific dimensions — not across the board, but in ways that matter for the Indian market specifically.
Cost. OpenAI, Anthropic, and Google charge per-token API pricing that, at scale, represents significant cost for Indian enterprises and government deployments. A model you can run on your own infrastructure — especially with government-subsidized compute through the India AI Mission — removes that ongoing dependency. For a government ministry processing millions of citizen documents, the economics of open-source deployment versus closed-source API access are not close.
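The trade-off is simple arithmetic. Every number below is an illustrative assumption, not a quoted price, but the shape of the comparison holds: per-token API costs scale with volume, while self-hosted costs are largely fixed.

```python
def monthly_token_cost(docs_per_month: int, tokens_per_doc: int,
                       usd_per_million_tokens: float) -> float:
    """Monthly spend on a per-token API for a document-processing workload."""
    total_tokens = docs_per_month * tokens_per_doc
    return total_tokens / 1e6 * usd_per_million_tokens

# Illustrative assumptions only: 5M documents a month,
# ~2,000 tokens per document, $5 per million tokens.
api_cost = monthly_token_cost(5_000_000, 2_000, 5.0)
print(f"~${api_cost:,.0f} per month via a per-token API")
```

Under these assumed numbers the API bill is about $50,000 a month, recurring indefinitely, while a self-hosted deployment (especially on subsidized India AI Mission compute) is dominated by a fixed hardware and operations budget. The crossover favors self-hosting as volume grows.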
Data sovereignty. Sending sensitive government documents, healthcare records, or financial information to a US-headquartered AI company's servers raises compliance and sovereignty concerns for Indian organizations operating under data localization requirements. Open-source models deployed domestically eliminate this concern.
Language capability. As documented above, closed-source models from Western providers have structural gaps in Indian language coverage that cannot be papered over with fine-tuning. For applications that need to work in Tamil, Telugu, or Bengali at production quality, no closed-source option currently matches a model specifically trained for those languages.
Customizability. Open-source models can be fine-tuned on domain-specific data without permission from a vendor. An Indian healthcare startup can fine-tune Sarvam's models on medical dialogues in regional languages. A legal tech company can train on Indian case law. Closed-source models do not permit this level of customization.
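Fine-tuning a Llama-style open model typically goes through PEFT's LoRA adapters. The sketch below assumes the checkpoint uses standard Llama-style attention layer names (`q_proj`, `v_proj`); that assumption should be confirmed against the released weights before training.

```python
def make_lora_trainable(model, rank: int = 16):
    """Wrap a loaded causal-LM checkpoint with LoRA adapters via PEFT.

    The target_modules names are typical for Llama-style attention
    layers and are an assumption here, not confirmed for Sarvam's
    released checkpoints.
    """
    # Deferred import: peft is an optional dependency of this sketch.
    from peft import LoraConfig, get_peft_model

    config = LoraConfig(
        r=rank,                              # adapter rank
        lora_alpha=2 * rank,                 # common scaling heuristic
        target_modules=["q_proj", "v_proj"], # assumed Llama-style names
        task_type="CAUSAL_LM",
    )
    return get_peft_model(model, config)
```

With adapters attached, only a small fraction of parameters train, so domain fine-tuning (medical dialogues, case law, support transcripts in regional languages) fits on far more modest hardware than full fine-tuning would require.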
The closed-source providers are not standing still. Google's Gemini models have stronger Indic language coverage than earlier versions, and both Google and Meta have announced research efforts focused on Indian languages. But Sarvam has a structural advantage: its entire mission is Indian language AI, which means its training data curation, evaluation frameworks, and engineering trade-offs are all optimized for this specific problem, whereas Indian language support is one feature among many for a global provider.
The vertical use cases where Sarvam's models have the most immediate enterprise adoption potential are those where Indian language capability is the primary barrier to deployment, not a secondary consideration.
Government services digitization. India's government operates at a scale — delivering services to 1.4 billion people in 22 languages across 28 states — where AI-assisted document processing, citizen-facing chatbots, and automated form handling could generate enormous efficiency gains. Existing AI infrastructure is inadequate for regional language use cases. Sarvam's models, deployed on the India AI Mission compute infrastructure, fit directly into this use case.
Financial services. India's banking and insurance sector has hundreds of millions of customers in rural areas who transact in regional languages. KYC document processing, customer support, loan application handling, and fraud detection all require Indian language capability at scale. The combination of OCR for regional language documents and LLM-powered document understanding is directly applicable.
Healthcare. Rural healthcare in India is constrained by the unavailability of trained professionals in Tier 2 and Tier 3 cities. AI-assisted diagnostics, patient intake, prescription reading, and health advisory services in regional languages can extend healthcare reach. The STT and TTS components make voice-based healthcare AI feasible in areas with low digital literacy.
Agriculture. India's 140 million farm households are primary targets for AI-powered advisory services covering crop management, weather response, market pricing, and government scheme information. All of these services need to be delivered in regional languages over voice interfaces. Sarvam's combined LLM + STT + TTS stack is purpose-built for this application.
Education. Personalized tutoring systems, curriculum content generation, and assessment tools in regional languages represent a large addressable market. Existing EdTech platforms in India have struggled to serve non-English speakers at scale.
Sarvam's models are released under a permissive open-source license that allows commercial use, modification, and redistribution. This is a deliberate contrast to some open-source releases that include use restrictions or commercial licensing requirements. Sarvam's position is that maximum permissiveness maximizes adoption and ecosystem development — the same logic that made Llama's license evolution toward permissiveness significant for its own adoption.
The models are available on Hugging Face, which is the standard distribution channel for open-source LLMs and where the existing tooling ecosystem — transformers, PEFT, quantization tools — is natively integrated. Researchers and developers can download, fine-tune, and deploy the models without contacting Sarvam or obtaining additional permissions.
Sarvam has also published model cards with detailed documentation of training data composition, evaluation results on Indic language benchmarks, and known limitations. This level of documentation transparency is above average for open-source releases and reflects Sarvam's research orientation — both founders have academic backgrounds and an expectation that the community will build on their work.
The API access option is also available for developers who prefer managed inference over self-hosting. Sarvam operates a cloud inference API with per-token pricing that competes with Western providers on cost for Indian language tasks.
Sarvam AI has raised $41 million in total funding across two known rounds:
| Round | Amount | Key Investors | Year |
|---|---|---|---|
| Seed | ~$11M | Peak XV Partners, Lightspeed | 2023 |
| Series A | ~$30M | Peak XV Partners, Lightspeed (follow-on) | 2024 |
The investors are India's premier venture firms — Peak XV Partners (formerly Sequoia India) and Lightspeed India both have strong track records in Indian enterprise software. The $41M total is modest relative to the scale of training runs that produced the 105B parameter model, which is a reflection of how much government compute support through the India AI Mission reduced the capital requirement for training at this scale.
What comes next for Sarvam is shaped by two parallel tracks. The first is commercial: the company needs to convert model releases into revenue through enterprise API contracts, government deployments, and the broader developer ecosystem building on its open-source models. The second is research: Sarvam has published work on Indic language benchmarks and has stated intent to continue releasing model updates and potentially larger models as India's government compute infrastructure scales.
The longer-term question is whether Sarvam can sustain a competitive position as Western labs increase their own Indic language investment. Meta has significant resources to dedicate to multilingual Llama capabilities. Google has deep India expertise across its consumer products. The window in which Sarvam has a structural advantage in Indian language AI is real but not permanent — execution velocity and ecosystem lock-in will determine how durable the position is.
What is already certain: the release of open-source 30B and 105B models with genuine Indian language depth changes the baseline for what Indian AI developers, enterprises, and government agencies can build. Before this release, building production-quality Indian language AI required either expensive closed-source APIs with known gaps or significant in-house research investment. After it, the infrastructure is available to anyone.
Sarvam AI is a Bangalore-based AI research company founded in 2023 by Vivek Raghavan and Pratyush Kumar. Raghavan previously led technology at India Stack infrastructure including Aadhaar and UPI. Kumar is an ML researcher focused on low-resource language processing. The company's mission is to build AI models optimized for Indian languages and use cases.
The Sarvam 30B and 105B are open-source large language models trained with deep coverage of all 22 officially scheduled Indian languages. The 30B model is designed for deployment flexibility and can run on high-end consumer or enterprise on-premise hardware. The 105B model is the performance flagship, requiring data center-class GPU hardware but delivering stronger results on complex reasoning and generation tasks. Both are released under permissive open-source licenses.
Sarvam's models are specifically trained on all 22 scheduled Indian languages, including low-resource languages like Santali, Bodo, and Odia that have minimal to no coverage in Llama, Mistral, or GPT-4. For high-quality Indian language generation and understanding — particularly for code-switching, regional idioms, and domain-specific content in regional languages — Sarvam has a structural advantage that cannot be replicated by fine-tuning a Western base model.
Alongside the LLMs, Sarvam has released dedicated speech-to-text and text-to-speech models covering Indian languages. These models handle Indian-accented speech, regional language prosody, schwa deletion rules, and natural intonation patterns that Western STT and TTS systems handle poorly. The combined LLM + STT + TTS stack enables end-to-end voice AI applications in Indian languages without routing through English.
The India AI Mission is a government initiative approved in 2024 with approximately $1.2 billion (INR 10,372 crore) in funding over five years. It provides shared GPU compute infrastructure, curated Indian language datasets, and grants for AI applications in priority sectors. Sarvam is one of the key private-sector partners in this initiative. Government-subsidized compute significantly reduced the capital required to train models at the 105B parameter scale.
The permissive open-source license allows commercial use, modification, and fine-tuning without requiring permission from Sarvam. Enterprises can fine-tune the models on domain-specific data — medical dialogues, legal documents, customer support transcripts — in regional languages. This is a key advantage over closed-source providers, where fine-tuning is limited to what the vendor permits and offered at additional cost.
Sarvam has raised $41 million in total funding, with Peak XV Partners (formerly Sequoia India) and Lightspeed India as lead investors across seed and Series A rounds. The relatively modest funding for models at this scale reflects the contribution of government compute infrastructure through the India AI Mission, which reduced training capital requirements significantly.
The highest-impact sectors are those where Indian language capability is the primary deployment barrier: government services digitization, rural financial services and banking, healthcare in Tier 2 and 3 cities, agricultural advisory services, and regional language education technology. In all of these verticals, the combination of LLM capability, voice AI, and document understanding in Indian languages enables applications that existing AI infrastructure cannot adequately support.