Revolutionizing AI: Sarvam M Shines in Mathematics and Indian Languages, Outclassing Competitors

Sarvam-M: India’s Ambitious Large Language Model

Indian AI startup Sarvam has introduced its flagship large language model (LLM), Sarvam-M, a 24-billion-parameter hybrid open-weights model built on Mistral Small. Positioned as a versatile, locally relevant alternative in the competitive global LLM landscape, Sarvam-M has drawn attention for its strong performance in Indian languages, mathematics, and programming, although it has also met with scepticism from parts of the tech community.

Understanding 24 Billion Parameters

Parameters are the internal numerical settings a language model uses to process and generate text. They can be thought of as dials and switches that are calibrated during training to sharpen the model’s grasp of grammar, context, facts, and reasoning. Parameter count matters: more parameters generally allow more refined understanding and outputs, although the quality of the training data and methods is just as important. With 24 billion parameters, Sarvam-M is a mid-to-large-scale model, considerably larger than open models like Mistral 7B but smaller than leading systems such as OpenAI’s GPT-4 or Google’s Gemini 1.5 Pro.
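
To make that scale concrete, here is a rough back-of-the-envelope sketch (not taken from Sarvam’s documentation) of how much memory 24 billion parameters occupy at common numeric precisions; serving a model in practice needs additional memory for activations and the key-value cache.

```python
# Back-of-the-envelope sketch: memory needed just to store 24B weights
# at common numeric precisions. Illustrative only, not from Sarvam's docs.
PARAM_COUNT = 24_000_000_000  # Sarvam-M's reported parameter count

BYTES_PER_PARAM = {
    "fp32 (full precision)": 4.0,
    "fp16/bf16 (typical inference)": 2.0,
    "int8 (8-bit quantisation)": 1.0,
    "int4 (4-bit quantisation)": 0.5,
}

for precision, nbytes in BYTES_PER_PARAM.items():
    gib = PARAM_COUNT * nbytes / 1024**3
    print(f"{precision:>30}: ~{gib:,.0f} GiB to hold the weights alone")
```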

Comparing Sarvam-M with Leading Models

Here’s a snapshot of Sarvam-M’s position relative to other prominent models:

Model | Parameters | Strengths
Sarvam-M | 24B | Indian languages, maths, programming
OpenAI GPT-4 | 1.8T (estimated) | General reasoning, coding, multilingual
Gemini 1.5 Pro | 200B+ | Multimodal capabilities, advanced reasoning and coding performance
Llama 3 70B | 70B | Reasoning, coding, and multilingual tasks
Anthropic Claude 3.7 Sonnet | 2T (estimated) | High-quality summarisation, reasoning, and content generation

Sarvam-M ranks below the largest proprietary models in size but excels in specific areas, particularly mathematics and reasoning in Indian languages. However, it trails on English-focused benchmarks such as MMLU by roughly one percentage point, indicating room for improvement in broader linguistic generalisation.

The Development Process of Sarvam-M

The creation of Sarvam-M involved a three-phase training approach: supervised fine-tuning (SFT) on curated prompts and responses, reinforcement learning with verifiable rewards (RLVR) to strengthen maths, coding, and instruction following, and inference optimisation to make the model faster and cheaper to serve.
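
As a rough illustration of the first phase, the sketch below shows a single supervised fine-tuning step with PyTorch and Hugging Face transformers: next-token prediction on a prompt/response pair. The base checkpoint ID and the toy Hindi example are assumptions for illustration only; Sarvam has not published this exact code, and real SFT runs over a full curated dataset across many GPUs.

```python
# Minimal, generic sketch of one supervised fine-tuning (SFT) step.
# The base checkpoint ID below is an assumption, not confirmed by Sarvam.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-Small-24B-Base-2501"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# A toy Hindi maths prompt/response pair standing in for a curated SFT dataset.
text = "प्रश्न: 12 × 15 = ?\nउत्तर: 180" + tokenizer.eos_token
batch = tokenizer(text, return_tensors="pt")

# Causal LMs in transformers return the next-token loss when labels are given.
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```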

Significance of Sarvam-M in the AI Landscape

Sarvam-M supports ten Indian languages and can answer competitive exam questions in Hindi, positioning it as a valuable tool for local education and translation initiatives. The model showed an 86% improvement on a test combining mathematics with romanised Indian languages, demonstrating strong multilingual reasoning capability.

While there have been questions regarding whether Sarvam-M is “good enough” to compete on a global scale, its launch has notably elevated the profile of Indian contributions in the AI domain. The model is now available to the public through Sarvam’s API and on Hugging Face, allowing developers to create, test, and contribute further advancements.
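
For developers who want to try the open weights, here is a minimal sketch using Hugging Face transformers. The repository ID "sarvamai/sarvam-m", the presence of a chat template, and the sample Hindi prompt are assumptions made for illustration; consult Sarvam’s model card for exact usage.

```python
# Minimal sketch of running the open weights locally with transformers.
# The repo ID and chat-template usage are assumptions, not verified claims.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sarvamai/sarvam-m"  # assumed Hugging Face repository ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# A sample competitive-exam style question in Hindi ("What is the HCF of 48 and 36?").
messages = [{"role": "user", "content": "गणित: 48 और 36 का महत्तम समापवर्तक क्या है?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```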

Although it may not yet match the most sophisticated LLMs, Sarvam-M represents a meaningful step towards democratising AI development in India, particularly for users who need support beyond English.
