Gujju LLaMA 7B Instruct v0.1

We are pleased to announce the release of the Gujju LLaMA 7B instruct model, a step forward for Gujarati language processing. The model can be used as-is or fine-tuned further for your specific NLP tasks.

Related Models

Model	Type	Data	Base Model	# Params	Download Links
Gujarati LLaMA 7B Base	Base model	10GB	LLaMA 2 7B	7B	HF Hub
Gujarati LLaMA 7B Instruct	Instruction tuned model	300k instructions	Gujarati LLaMA 7B Base	7B	HF Hub

Model description

We have extended the Llama-2 vocabulary with 17,000 Gujarati tokens. This builds on the foundation of the original Llama-2 and improves the model's ability to understand and process Gujarati text.

Model type: Llama-2 7B parameter model fine-tuned on the Gujju-Alpaca dataset, a subset of the Gujju-Dolly dataset, and a subset of the Gujju-Orca dataset.
Language(s): Gujarati and English
License: Llama 2 Community License
Finetuned from model: sampoorna42/gujju-llama-base-v1.0
Training Precision: float16
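
A minimal sketch of loading the model in float16 and generating a completion, assuming the Hugging Face `transformers` and `torch` packages are installed and that the checkpoint is published under the repository ID `sampoorna42/Gujju-Llama-Instruct-v0.1`. The function name `generate` and the `max_new_tokens` default are illustrative choices, not settings documented by the authors. The `prompt` argument is expected to already follow the Prompting Format section.

```python
def generate(prompt: str,
             model_id: str = "sampoorna42/Gujju-Llama-Instruct-v0.1",
             max_new_tokens: int = 256) -> str:
    """Generate a completion for an already-formatted prompt (illustrative helper)."""
    # Imported lazily so that defining this function does not require
    # the heavy dependencies to be installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # float16 matches the training precision listed above.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

    # Assumption: use a GPU when one is available; float16 on CPU may be slow.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate(prompt)` downloads the weights on first use, so it needs network access and enough memory for a 7B-parameter model in half precision.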

Prompting Format

Prompt Template Without Input

{system_prompt}

### Instruction:
{instruction or query}

### Response:
{response}

Prompt Template With Input

{system_prompt}

### Instruction:
{instruction or query}

### Input:
{input}

### Response:
{response}
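
The two templates above can be filled in programmatically. The following is a minimal sketch, assuming that `{response}` is the part the model generates, so the prompt sent at inference time stops right after the `### Response:` line. The helper name `build_prompt` and its parameter names are hypothetical and are not part of the released model.

```python
from typing import Optional


def build_prompt(system_prompt: str,
                 instruction: str,
                 input_text: Optional[str] = None) -> str:
    """Fill in the prompt template, leaving the response for the model to write."""
    parts = [system_prompt, f"### Instruction:\n{instruction}"]
    # The "### Input:" block is only included when extra context is supplied.
    if input_text:
        parts.append(f"### Input:\n{input_text}")
    # Assumption: the prompt ends at the response header so the model continues from there.
    parts.append("### Response:\n")
    return "\n\n".join(parts)
```

For example, `build_prompt(system_prompt, "Summarize the text.", input_text=article)` produces the "With Input" layout, while omitting `input_text` produces the "Without Input" layout. The card does not specify a system prompt, so any text passed as `system_prompt` is the caller's choice.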

Usage Note

These models have strong linguistic capabilities, but they have not been specifically optimized to avoid potentially harmful or offensive content. To mitigate this risk, we advise users to:

Exercise discretion: Carefully consider potential implications before utilizing outputs.
Supervise closely: Monitor outputs, especially in public or sensitive settings.
Be aware of limitations: Remember these models are under development and may not generate perfect results in all situations.

Meet the researchers

LM Evaluation Harness Results

Metric	Value
Avg.	42.53
AI2 Reasoning Challenge (25-shot)	42.06
HellaSwag (10-shot)	71.59
MMLU (5-shot)	37.44
TruthfulQA (0-shot)	33.68
Winogrande (5-shot)	64.17
GSM8k (5-shot)	0.0

This model is your gateway to unlocking the potential of Gujarati language! Let's join forces to push the boundaries of comprehension and expression together!
