Improve model card metadata and add paper link
Hi! I'm Niels from the Hugging Face community team.
I've opened this PR to improve the metadata and structure of your model card. Specifically, I've added:
- `pipeline_tag: text-generation` for better discoverability on the Hub.
- `library_name: transformers` as the model uses the Llama architecture and is compatible with the `transformers` library, enabling the "Use in Transformers" button and code snippets.
- A direct link to the paper page on Hugging Face at the start of the model description, without replacing your existing arXiv citation.
Additionally, I've clarified the model architecture by adding "(Llama-based)" and updated the BibTeX citation to match the official paper title and use the `bibtex` code block type.
These changes will make your model more accessible and easier to use for the community.
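With `library_name: transformers` set, the Hub can show loading snippets for the model. As a rough illustration, a minimal sketch of such usage is below — note the repo id is a placeholder I made up, not the verified Hub id, and the import is deferred so the helper can be defined without `transformers` installed:

```python
# Placeholder repo id -- substitute the model's actual Hub id before use.
REPO_ID = "gyan-ai/Biswabangla-335M-io"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Complete a Bengali prompt with the text-generation pipeline.

    The import is lazy so this module loads even where transformers
    is not installed; the model weights are downloaded on first call.
    """
    from transformers import pipeline

    generator = pipeline("text-generation", model=REPO_ID)
    out = generator(prompt, max_new_tokens=max_new_tokens)
    return out[0]["generated_text"]


if __name__ == "__main__":
    # Example call (requires network access and the real Hub id):
    print(generate("বাংলা ভাষার"))
```

This mirrors the kind of snippet the "Use in Transformers" button surfaces once `library_name` is present in the metadata.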
````diff
@@ -1,12 +1,17 @@
 ---
-license: cc-by-nc-sa-4.0
 language:
 - bn
+license: cc-by-nc-sa-4.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
+
 ## Description
 
 **Biswabangla-335M-io** is a 335 million parameters open source instruction-tuned Generative pretrained Language Model for Bangla/Bengali.
 
+This model was presented in the paper [Paramanu: Compact and Competitive Monolingual Language Models for Low-Resource Morphologically Rich Indian Languages](https://huggingface.co/papers/2401.18034).
+
 Biswabangla is a monolingual Bangla/Bengali Generative Language model. The tokenizer of Biswabangla also works for Assamese language.
 
 This is a pretrained model from scratch at a context size of 4096. Furthermore instruction-tuned on 1 million Bengali input-output pairs across various Bengali NLP tasks.
@@ -21,7 +26,7 @@ If you use our model, please cite our paper [Niyogi and Bhattacharya, 2024](http
 The architecture of Biswabangla is different than the language models, mentioned in [Niyogi and Bhattacharya, 2024](https://arxiv.org/abs/2401.18034)
 
 ### Model Architecture
-Transformer Decoder Only Auto Regressive Model
+Transformer Decoder Only Auto Regressive Model (Llama-based)
 
 ### Limitations
 The model was trained on data that contains toxic language, unsafe content, and societal biases originally crawled from the internet.
@@ -33,9 +38,9 @@ Gyan AI Research does own the output generated from the model.
 
 ### Citations
 
-```
+```bibtex
 @misc{niyogi2024paramanufamilynovelefficient,
-      title={Paramanu:
+      title={Paramanu: Compact and Competitive Monolingual Language Models for Low-Resource Morphologically Rich Indian Languages},
       author={Mitodru Niyogi and Arnab Bhattacharya},
       year={2024},
       eprint={2401.18034},
@@ -43,3 +48,4 @@ Gyan AI Research does own the output generated from the model.
       primaryClass={cs.CL},
       url={https://arxiv.org/abs/2401.18034},
 }
+```
````