AdvRahul committed on
Commit 4bd52ef · verified · 1 parent: 0d79f16

Update README.md

Files changed (1):
  1. README.md +90 -55
README.md CHANGED
@@ -1,65 +1,100 @@
  ---
- library_name: transformers
  license: apache-2.0
  language:
  - en
- - bn
  - hi
- - kn
- - gu
- - mr
- - ml
- - or
- - pa
  - ta
  - te
- base_model: sarvamai/sarvam-m
- base_model_relation: finetune
  tags:
- - llama-cpp
- - gguf-my-repo
  ---

- # AdvRahul/sarvam-m-Q5_K_M-GGUF
- This model was converted to GGUF format from [`sarvamai/sarvam-m`](https://huggingface.co/sarvamai/sarvam-m) using llama.cpp, via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
- Refer to the [original model card](https://huggingface.co/sarvamai/sarvam-m) for more details on the model.
-
- ## Use with llama.cpp
- Install llama.cpp through brew (works on macOS and Linux):
-
- ```bash
- brew install llama.cpp
- ```
- Invoke the llama.cpp server or the CLI.
-
- ### CLI:
- ```bash
- llama-cli --hf-repo AdvRahul/sarvam-m-Q5_K_M-GGUF --hf-file sarvam-m-q5_k_m.gguf -p "The meaning to life and the universe is"
- ```
-
- ### Server:
- ```bash
- llama-server --hf-repo AdvRahul/sarvam-m-Q5_K_M-GGUF --hf-file sarvam-m-q5_k_m.gguf -c 2048
- ```
-
- Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.
-
- Step 1: Clone llama.cpp from GitHub.
- ```
- git clone https://github.com/ggerganov/llama.cpp
- ```
-
- Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, along with any hardware-specific flags (e.g., `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).
- ```
- cd llama.cpp && LLAMA_CURL=1 make
- ```
-
- Step 3: Run inference through the main binary.
- ```
- ./llama-cli --hf-repo AdvRahul/sarvam-m-Q5_K_M-GGUF --hf-file sarvam-m-q5_k_m.gguf -p "The meaning to life and the universe is"
- ```
- or
- ```
- ./llama-server --hf-repo AdvRahul/sarvam-m-Q5_K_M-GGUF --hf-file sarvam-m-q5_k_m.gguf -c 2048
- ```
  ---
  license: apache-2.0
  language:
  - en
  - hi
  - ta
  - te
+ - kn
+ - ml
+ - bn
+ - mr
+ - gu
+ pipeline_tag: text-generation
  tags:
+ - Axion
+ - Indic
+ library_name: transformers
  ---

+ <div align="center" style="line-height: 1;">
+   <a href="https://huggingface.co/AdvRahul" target="_blank" style="margin: 2px;">
+     <img alt="Chat" src="https://img.shields.io/badge/🤖_Chat-Axion-blue" style="display: inline-block; vertical-align: middle;"/>
+   </a>
+   <a href="https://huggingface.co/AdvRahul" target="_blank" style="margin: 2px;">
+     <img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-AdvRahul-ffc107?color=ffc107&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
+   </a>
+   <a href="https://github.com/AdvRahul" target="_blank" style="margin: 2px;">
+     <img alt="GitHub" src="https://img.shields.io/badge/GitHub-Axion-000?logo=github&color=0000FF" style="display: inline-block; vertical-align: middle;"/>
+   </a>
+   <a href="https://x.com/yourhandle" target="_blank" style="margin: 2px;">
+     <img alt="X" src="https://img.shields.io/badge/X-Axion-6080F0?logo=x&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
+   </a>
+ </div>
+
+ <div align="center" style="line-height: 1;">
+   <a href="#license" style="margin: 2px;">
+     <img alt="License" src="https://img.shields.io/badge/License-Apache2.0-A5de54" style="display: inline-block; vertical-align: middle;"/>
+   </a>
+ </div>
+
+ # Axion-Pro-Indic-24B
+
+ ## Model Information
+
+ **Axion-Pro-Indic-24B** is a multilingual, hybrid-reasoning, text-only language model built on Mistral-Small.
+ This post-trained version delivers significant improvements over the base model:
+
+ - **+20%** average improvement on Indian language benchmarks
+ - **+21.6%** enhancement on math benchmarks
+ - **+17.6%** boost on programming benchmarks
+ - **+86%** improvement on romanized Indian language GSM-8K benchmarks (the intersection of Indian languages and mathematics)
+
+ ### Key Features
+
+ - **Hybrid Thinking Mode**: Supports both "think" and "non-think" modes; see the Quickstart below for toggling between them.
+ - **Advanced Indic Skills**: Post-trained on Indian languages alongside English, reflecting Indian cultural values.
+ - **Superior Reasoning Capabilities**: Outperforms similarly sized models on coding and math benchmarks.
+ - **Seamless Multilingual Experience**: Full support for Indic scripts and romanized text.
+
+ ---
+
+ ## Quickstart
+
+ ### With Transformers
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "AdvRahul/Axion-Pro-Indic-24B"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name, torch_dtype="auto", device_map="auto"
+ )
+
+ prompt = "Who are you and what is your purpose on this planet?"
+
+ messages = [{"role": "user", "content": prompt}]
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     enable_thinking=True,  # default True; set False for no-think mode
+ )
+
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+ generated_ids = model.generate(**model_inputs, max_new_tokens=8192)
+ # Keep only the newly generated tokens, dropping the echoed prompt.
+ output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
+ output_text = tokenizer.decode(output_ids)
+
+ # In think mode the reasoning trace ends with a </think> tag; split it
+ # from the final answer and drop the trailing end-of-sequence token.
+ if "</think>" in output_text:
+     reasoning_content = output_text.split("</think>")[0].rstrip("\n")
+     content = output_text.split("</think>")[-1].lstrip("\n").removesuffix("</s>")
+ else:
+     reasoning_content = ""
+     content = output_text.removesuffix("</s>")
+
+ print("reasoning content:", reasoning_content)
+ print("content:", content)
+ ```
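+
+ To get a direct answer without the reasoning trace, render the template with `enable_thinking=False`. Below is a minimal sketch reusing `tokenizer` and `model` from the example above; the Hindi prompt is only an illustration of the Indic support described earlier:
+
+ ```python
+ # No-think mode: the model answers directly, with no </think> block to parse.
+ # Illustrative prompt (Hindi): "What is the capital of India?"
+ messages = [{"role": "user", "content": "भारत की राजधानी क्या है?"}]
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     enable_thinking=False,  # the no-think switch documented above
+ )
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+ generated_ids = model.generate(**model_inputs, max_new_tokens=1024)
+ output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
+ print(tokenizer.decode(output_ids, skip_special_tokens=True))
+ ```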