Update README.md
README.md
language:
- ta
- te
base_model:
- mistralai/Mistral-Small-3.1-24B-Base-2503
---

# Model Information

`sarvam-m` is a multilingual, hybrid-reasoning, text-only language model built on Mistral-Small. This post-trained version delivers exceptional improvements over the base model:

- +20% average improvement on Indian language benchmarks
- +21.6% enhancement on math benchmarks
- +17.6% boost on programming benchmarks

Performance gains are even more impressive at the intersection of Indian languages and mathematics, with an outstanding +86% improvement in romanized Indian language GSM-8K benchmarks.
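The percentages above are relative gains over the base model. A quick sketch of that computation (the scores below are hypothetical, chosen only to illustrate how a +86% figure arises):

```python
def relative_gain(base_score: float, new_score: float) -> float:
    """Percentage improvement of new_score over base_score."""
    return (new_score - base_score) / base_score * 100

# hypothetical illustration: a base score of 25.0 rising to 46.5
print(round(relative_gain(25.0, 46.5), 1))  # 86.0
```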

Learn more about Sarvam-M in our detailed [blog post](https://www.sarvam.ai/blogs/sarvam-m).

# Key Features

- **Hybrid Thinking Mode**: A single versatile model supporting both "think" and "non-think" modes. Use the think mode for complex logical reasoning, mathematical problems, and coding tasks, or switch to non-think mode for efficient, general-purpose conversation.

- **Advanced Indic Skills**: Specifically post-trained on Indian languages alongside English, embodying a character that authentically reflects and emphasizes Indian cultural values.

- **Superior Reasoning Capabilities**: Outperforms most similarly-sized models on coding and math benchmarks, demonstrating exceptional reasoning abilities.

- **Seamless Chatting Experience**: Full support for both Indic scripts and romanized versions of Indian languages, providing a smooth and accessible multilingual conversation experience.
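In think mode, hybrid-reasoning models typically emit their reasoning inside delimiter tags before the final answer. As a minimal sketch, assuming `<think>...</think>` delimiters (an assumption of this sketch, not something this page confirms), the two parts can be separated with plain string handling:

```python
def split_thinking(output_text: str) -> tuple[str, str]:
    """Separate reasoning from the final answer in a think-mode response.

    Assumes the reasoning is wrapped in <think>...</think> tags; in
    non-think mode the reasoning part is simply empty.
    """
    if "</think>" in output_text:
        reasoning, _, answer = output_text.partition("</think>")
        reasoning = reasoning.replace("<think>", "").strip()
        return reasoning, answer.strip()
    return "", output_text.strip()

reasoning, answer = split_thinking("<think>2 + 2 is 4.</think>The answer is 4.")
print(reasoning)  # 2 + 2 is 4.
print(answer)     # The answer is 4.
```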

# Quickstart

The following code snippet demonstrates how to use `sarvam-m` with Transformers.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sarvamai/sarvam-m"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)

# ... (model loading, generation, and output parsing elided in this diff view) ...

print("reasoning content:", reasoning_content)
print("content:", content)
```

# vLLM Deployment

For easy deployment, we can use `vllm>=0.8.5` and create an OpenAI-compatible API endpoint with `vllm serve sarvamai/sarvam-m`.

For more control, we can use vLLM from Python, which lets us explicitly enable or disable thinking mode.
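As a sketch of that per-request toggle, assuming the server forwards `chat_template_kwargs` (with an `enable_thinking` flag) from the request's `extra_body` to the chat template — both names are assumptions here, borrowed from how other hybrid-reasoning models are commonly served on vLLM:

```python
# Sketch only: `chat_template_kwargs` and `enable_thinking` are assumptions,
# not confirmed by this page. The dict mirrors the kwargs one would pass to
# an OpenAI-compatible chat.completions.create call.
def build_request(messages: list[dict], thinking: bool) -> dict:
    return {
        "model": "sarvamai/sarvam-m",
        "messages": messages,
        # forwarded to the chat template by the server
        "extra_body": {"chat_template_kwargs": {"enable_thinking": thinking}},
    }

req = build_request([{"role": "user", "content": "Solve 17 * 23."}], thinking=True)
print(req["extra_body"]["chat_template_kwargs"]["enable_thinking"])  # True
```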

```python
from openai import OpenAI

# ... (client setup, request, and response parsing elided in this diff view) ...

messages.append(
    {"role": "assistant", "content": output_text}
)
```
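The `messages.append(...)` call above is what enables multi-turn conversation: each assistant reply is appended to the history before the next user turn, so every request carries the full context. A minimal, model-free sketch of that accumulation:

```python
messages = [{"role": "user", "content": "Hi!"}]

def add_assistant_turn(messages: list[dict], output_text: str) -> None:
    # append the model's reply so the next request sees the full history
    messages.append({"role": "assistant", "content": output_text})

add_assistant_turn(messages, "Hello! How can I help?")
messages.append({"role": "user", "content": "Tell me a joke."})

print([m["role"] for m in messages])  # ['user', 'assistant', 'user']
```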