---
library_name: transformers
license: gemma
datasets:
- ai4bharat/sangraha
language:
- mr
- en
pipeline_tag: text-generation
---

# Model Card for Shivneri Marathi LLM




## Model Details
Shivneri Marathi LLM is being built with the goal of bringing the benefits of generative AI to India's non-English-speaking population, especially Marathi speakers.
Marathi has the third-largest number of native speakers in India, after Hindi and Bengali; almost 83 million people speak the language.
This is a preliminary version of our Marathi LLM (Large Language Model)!
Built on the Gemma 7B base model, Shivneri LLM can generate creative and informative text in both Marathi and English. This is just the beginning: we are continually improving Shivneri, and more exciting features are on the horizon!

### Model Description


This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

- **Developed by:** Amit Ghadge
- **Shared by:** Amit Ghadge
- **Model type:** Decoder-only large language model (LLM) with a transformer architecture
- **Language(s) (NLP):** Marathi, English
- **License:** Gemma
- **Finetuned from model:** Gemma-7B

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/amitagh/shivneri-llm
- **Blog post:** https://medium.com/@amitagh/shivneri-marathi-llm-e823f0a045d8
- **Demo:** Coming soon

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
This is a very preliminary version; please use it with caution. We suggest waiting for further updates and the final model before relying on its output.
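For experimentation, the model can be loaded with the standard 🤗 transformers text-generation API. A minimal sketch follows; note that the Hub id `amitagh/shivneri-llm` and the decoding settings are assumptions, not a published configuration — substitute the actual checkpoint id once released.

```python
MODEL_ID = "amitagh/shivneri-llm"  # hypothetical Hub id -- replace with the real one

def generation_kwargs(max_new_tokens: int = 128, temperature: float = 0.7) -> dict:
    """Decoding settings passed to the pipeline; sampling is enabled
    whenever temperature is positive."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": temperature > 0.0,
        "temperature": temperature,
    }

def generate(prompt: str, **kwargs) -> str:
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=MODEL_ID, device_map="auto")
    return pipe(prompt, **generation_kwargs(**kwargs))[0]["generated_text"]

# Example (requires the checkpoint to be downloadable):
#   generate("महाराष्ट्राची राजधानी")   # Marathi: "The capital of Maharashtra"
```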


## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

Continually pretrained with LoRA on the ai4bharat/sangraha dataset.

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
Continual pretraining with LoRA (low-rank adaptation) adapters on the Gemma-7B base model.
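A hedged sketch of this setup using the 🤗 PEFT library is shown below. The exact hyperparameters are not published, so the rank, scaling factor, dropout, and target modules here are illustrative assumptions, not the authors' configuration.

```python
def lora_config_dict(r: int = 16, alpha: int = 32) -> dict:
    """Assumed LoRA hyperparameters for continual pretraining of a
    causal LM; the real training run may have used different values."""
    return {
        "r": r,                      # adapter rank
        "lora_alpha": alpha,         # scaling factor
        "lora_dropout": 0.05,
        "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
        "task_type": "CAUSAL_LM",
    }

def build_lora_model(base_id: str = "google/gemma-7b"):
    # Imported lazily so the config helper works without these packages.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained(base_id)
    # Wraps the frozen base model with trainable low-rank adapters.
    return get_peft_model(base, LoraConfig(**lora_config_dict()))
```

The wrapped model can then be trained on Sangraha text with a standard causal-LM objective; only the adapter weights are updated.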




### Model Architecture and Objective

Decoder-only large language model (LLM) with a transformer architecture

### Compute Infrastructure

NVIDIA A100 80 GB GPU

## Meet the Developers

Get to know the creators behind this innovative model and follow their contributions to the field:

- [Amit Ghadge](https://www.linkedin.com/in/amit-ghadge-a162a115/)


## Citation

If you use this model in your research, please cite:

```bibtex
@misc{amitghadge2024ShivneriLLMv01,
      title={Shivneri-LLM: Your Bilingual Marathi and English Text Generation LLM},
      author={Amit Ghadge},
      year={2024},
      url={https://medium.com/@amitagh/shivneri-marathi-llm-e823f0a045d8}
}
```

We hope this model serves as a valuable tool in your NLP toolkit and look forward to seeing the advancements it will enable in the understanding and generation of the Marathi language.