
Made-in-India Sarvam AI outperforms ChatGPT and Gemini in document reading, Indian languages, and AI voice tech

Sarvam AI, a Made-in-India AI startup, outperforms ChatGPT and Google Gemini in OCR, Indian-language document reading, text-to-speech, and AI voice generation.

Published By: Deepti Ratnam | Published: Feb 09, 2026, 01:36 PM (IST)


Sarvam AI, a Made-in-India AI startup, has drawn attention for performing better than international AI models, including Google Gemini and ChatGPT, in optical character recognition (OCR), Indian-language document reading, text-to-speech, and AI voice generation. The Bengaluru-based company's newly launched models suggest that systems optimized for Indian languages and data can deliver better results in those areas than general-purpose global systems.

What Is Sarvam AI

Sarvam AI is an Indian AI firm established in August 2023 by Dr. Vivek Raghavan and Dr. Pratyush Kumar. The startup was created to address a gap in the AI ecosystem: most of today's AI models struggle with Indian languages, regional accents, and the real-world document formats used in the country. These are the problems Sarvam AI is working on.

Sarvam Vision and OCR Accuracy

Sarvam Vision, the company's AI document-reading model, has performed well in optical character recognition. OCR refers to an AI system's ability to read text from scanned documents and images, including charts and tables. On standard OCR benchmarks, Sarvam Vision has achieved higher accuracy scores than other models, such as Gemini and ChatGPT. Its advantage is most visible in documents written in Indian languages, where global models often fail to process the text correctly.

Indian-Language Document Reading

Sarvam Vision is built to handle real-world Indian documents, including government forms, scanned pages, pages containing multiple languages, and poor-quality prints. The model can interpret charts, handwritten text, and complex tables. This makes it applicable in areas such as governance, education, banking, and digitization projects within India.

Bulbul V3 and AI Voice Generation

Bulbul V3, a text-to-speech model, is another significant Sarvam AI product. It offers 35 voices across the 22 planned Indian languages. The model is trained on a variety of accents, tones, and speaking styles. Bulbul V3 can produce natural-sounding AI voices and performs well even with historical or flawed text sources.

Sarvam AI presents itself as an independent AI platform. This means it focuses on data in India, local infrastructure, and compliance with national regulations. The aim is to reduce reliance on foreign AI systems and give India greater control over how its information is processed and stored.


Sarvam AI has published a number of open-source models, including systems focused on Indic languages and speech. These models enable developers to build applications that better understand Indian users, their languages, and their styles of communication.