Commit 53324d8 by ravirajoshi (1 parent: c040672)

Update README.md

README.md CHANGED
@@ -12,14 +12,14 @@ library_name: nemo

 # Model Overview

-Nemotron-4-Mini-Hindi-4B-Base is a base model pre-trained on Hindi and English corpus. The Nemotron-Mini-4B-Base (Minitron-4B) is subject to continuous pre-training using Hindi and English data (400B tokens) exclusively to create a strong base model for Hindi, English, and Hinglish. We make extensive use of synthetic data during the continuous pre-training stage. The base small language model (SLM) is optimized through distillation, pruning, and quantization for speed and on-device deployment.
+Nemotron-4-Mini-Hindi-4B-Base is a base model pre-trained on Hindi and English corpus. The Nemotron-Mini-4B-Base (Minitron-4B) is subject to continuous pre-training using Hindi and English data (400B tokens) exclusively to create a strong base model for Hindi, English, and Hinglish. We make extensive use of synthetic data during the continuous pre-training stage. The base small language model (SLM) is optimized through distillation, pruning, and quantization for speed and on-device deployment.
 Please refer to our [arXiv paper](https://arxiv.org/abs/2410.14815) for more details.

 This model is for research and development only.

 **Model Developer:** NVIDIA

-**Model Dates:** Nemotron-4-Mini-Hindi-4B-Base was trained between June 2024 and
+**Model Dates:** Nemotron-4-Mini-Hindi-4B-Base was trained between June 2024 and Sept 2024.

 ## License

@@ -93,7 +93,7 @@ print(output_text)

 ## Evaluation Results

-*Zero-shot performance.* Evaluated using select datasets from the [
+*Zero-shot performance.* Evaluated using select Hindi datasets from the [Airavata Evaluation Framework](https://github.com/AI4Bharat/IndicInstruct) with additions:

 | MMLU | ARC-C | ARC-E | HellaSwag | BoolQ |
 | :------------- | :------------- | :------------- | :------------- | :------------- |