---
library_name: peft
base_model: allenai/tulu-13b
---
Model Details
Original Model: allenai/tulu-13b
Fine-Tuned For: Azerbaijani language understanding and generation
Dataset Used: Azerbaijani translation of the Stanford Alpaca dataset
Fine-Tuning Method: Self-instruct method
This model is part of the ["project/Barbarossa"](https://github.com/Alas-Development-Center/project-barbarossa) initiative, which aims to enhance natural language processing capabilities for the Azerbaijani language. It was fine-tuned on the Azerbaijani translation of the Stanford Alpaca dataset using the self-instruct method, with the goal of improving the model's understanding and generation of Azerbaijani text.
__Our primary objective with this model is to offer insights into the feasibility and outcomes of fine-tuning large language models (LLMs) for the Azerbaijani language. The fine-tuning process was undertaken with limited resources, providing valuable learnings rather than creating a model ready for production use. Therefore, we recommend treating this model as a reference or a guide to understanding the potential and challenges involved in fine-tuning LLMs for specific languages. It serves as a foundational step towards further research and development rather than a direct solution for production environments.__
This project is a proud product of the [Alas Development Center (ADC)](https://az.linkedin.com/company/alas-development-center?trk=ppro_cprof). We are thrilled to offer these fine-tuned large language models to the public, free of charge.
How to use?
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Hub repository that hosts the fine-tuned model
model_path = "alasdevcenter/az-tulu"
model = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# max_length counts the prompt tokens plus the generated tokens
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)

# Instruction in Azerbaijani ("Nature conservation")
instruction = "Təbiətin qorunması "
# Alpaca-style prompt: preamble, instruction, then an empty response slot
formatted_prompt = f"""Aşağıda daha çox kontekst təmin edən təlimat var. Sorğunu adekvat şəkildə tamamlayan cavab yazın.
### Təlimat:
{instruction}
### Cavab:
"""
result = pipe(formatted_prompt)
print(result[0]['generated_text'])
```
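By default, the `text-generation` pipeline returns the prompt together with the completion, so the printed text starts with the instruction template. The sketch below shows one possible way to build the same prompt and keep only the part after `### Cavab:`. The names `PROMPT_TEMPLATE`, `RESPONSE_MARKER`, `build_prompt`, and `extract_response` are hypothetical helpers introduced for illustration; they are not part of this repository or of `transformers`.

```python
# Hypothetical helpers (not part of the model repo): they reproduce the
# prompt layout from the usage example and trim the echoed prompt.
PROMPT_TEMPLATE = (
    "Aşağıda daha çox kontekst təmin edən təlimat var. "
    "Sorğunu adekvat şəkildə tamamlayan cavab yazın.\n"
    "### Təlimat:\n"
    "{instruction}\n"
    "### Cavab:\n"
)
RESPONSE_MARKER = "### Cavab:"


def build_prompt(instruction: str) -> str:
    """Insert the instruction into the Alpaca-style Azerbaijani template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)


def extract_response(generated_text: str) -> str:
    """Return the text after the response marker, or the whole text if the marker is absent."""
    _, found, tail = generated_text.partition(RESPONSE_MARKER)
    return tail.strip() if found else generated_text.strip()
```

With the pipeline from the example above, this could be used as `extract_response(pipe(build_prompt(instruction))[0]["generated_text"])`.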