PyTorch
Oriya
llama

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Description

GyanAI's Oriya-87M is an 87-million-parameter, pretrained, open-source, generative, decoder-only, autoregressive language model for Odia/Oriya.

It is a monolingual Oriya/Odia generative decoder language model.

The model was pretrained from scratch with a context size of 1024.
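To make the context size concrete, the following is a minimal sketch of how a token stream could be cut into windows of at most 1024 positions for next-token-prediction training or fine-tuning. It is an illustration under stated assumptions, not the authors' data pipeline: the function name `make_training_windows` and the integer token IDs are hypothetical, and a real setup would obtain the IDs from the model's own tokenizer.

```python
from typing import List, Tuple

CONTEXT_SIZE = 1024  # context size stated in this model card


def make_training_windows(
    token_ids: List[int], context_size: int = CONTEXT_SIZE
) -> List[Tuple[List[int], List[int]]]:
    """Split a token stream into (input, target) pairs for next-token prediction.

    Each input holds at most `context_size` tokens; the target is the same
    span shifted one position to the right.
    """
    windows = []
    # Step by context_size so inputs do not overlap. The last token of the
    # stream can only serve as a target, hence len(token_ids) - 1.
    for start in range(0, len(token_ids) - 1, context_size):
        chunk = token_ids[start : start + context_size + 1]
        windows.append((chunk[:-1], chunk[1:]))
    return windows


# Hypothetical token IDs standing in for a tokenized Odia corpus.
stream = list(range(2500))
windows = make_training_windows(stream)
```

With 2500 tokens this yields three windows: two full ones with 1024 input positions each and a shorter final one. Whether short trailing windows are padded, dropped, or packed with other documents is a design choice left open here.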

This model is neither chat-tuned nor fine-tuned.

We recommend fine-tuning or chat-tuning this pretrained model on Oriya/Odia chat or instruction datasets. Please use PyTorch for fine-tuning or instruction-tuning.

Use of this model for commercial purposes is strictly prohibited.

If you use our model, please cite our paper (Niyogi and Bhattacharya, 2024).

Model Architecture

Autoregressive Transformer decoder model
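As a rough illustration of what "autoregressive" means in practice, the sketch below generates tokens one at a time, feeding each new token back in as context. It assumes greedy decoding and uses a toy scoring function in place of the real network; `generate_greedy`, `toy_scores`, and `VOCAB_SIZE` are hypothetical names introduced for this example and are not part of the released model.

```python
from typing import Callable, List, Sequence


def generate_greedy(
    next_token_scores: Callable[[Sequence[int]], List[float]],
    prompt: List[int],
    max_new_tokens: int,
    context_size: int = 1024,
) -> List[int]:
    """Extend `prompt` one token at a time, always picking the top-scoring token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        # The model only sees the most recent `context_size` tokens.
        context = tokens[-context_size:]
        scores = next_token_scores(context)
        # Greedy choice: index of the highest score.
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
    return tokens


VOCAB_SIZE = 5  # toy vocabulary, an assumption for this example


def toy_scores(context: Sequence[int]) -> List[float]:
    """Stand-in for a decoder: prefers (last token + 1) modulo VOCAB_SIZE."""
    preferred = (context[-1] + 1) % VOCAB_SIZE
    return [1.0 if i == preferred else 0.0 for i in range(VOCAB_SIZE)]


output = generate_greedy(toy_scores, prompt=[3], max_new_tokens=4)
print(output)  # [3, 4, 0, 1, 2]
```

Sampling strategies such as temperature or top-k would replace only the greedy selection step; the feed-back loop stays the same.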

Limitations

The model was trained on data originally crawled from the internet, which contains toxic language, unsafe content, and societal biases. The model may therefore amplify those biases and return toxic responses, especially when given toxic prompts. It may also generate answers that are inaccurate, omit key information, or include irrelevant or redundant text, and it may produce socially unacceptable or undesirable text even if the prompt itself does not include anything explicitly offensive.

Gyan AI Research does own the output generated from the model.

Citations

@misc{niyogi2024paramanufamilynovelefficient,
      title={Paramanu: A Family of Novel Efficient Generative Foundation Language Models for Indian Languages}, 
      author={Mitodru Niyogi and Arnab Bhattacharya},
      year={2024},
      eprint={2401.18034},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2401.18034}, 
}