Model Card for ardaorcun/finetuned_cosmos2603

This model is a fine-tuned version of YTU's Cosmos GPT2 Language Model. The training code is available here: Fine Tuning Cosmos by LoRA and QLoRA

Training Details

The model was fine-tuned using LoRA and QLoRA techniques. Training parameters are defined below.

LoRA configs:

  • r=16
  • lora_alpha=32
  • target_modules=c_proj, c_fc, gate_proj, c_attn
  • lora_dropout=0.05
  • bias="lora_only"
  • fan_in_fan_out=True
  • max_seq_length=512
  • use_rslora=True
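
With use_rslora=True, the low-rank update is scaled by lora_alpha / sqrt(r) instead of the standard lora_alpha / r. The sketch below works through that arithmetic for the values listed above. The helper name lora_scaling is hypothetical and not part of any library.

```python
import math


def lora_scaling(r: int, lora_alpha: int, use_rslora: bool) -> float:
    """Scale factor applied to the low-rank update (hypothetical helper)."""
    if use_rslora:
        # Rank-stabilized LoRA divides by the square root of the rank.
        return lora_alpha / math.sqrt(r)
    # Standard LoRA divides by the rank itself.
    return lora_alpha / r


standard = lora_scaling(r=16, lora_alpha=32, use_rslora=False)
rank_stabilized = lora_scaling(r=16, lora_alpha=32, use_rslora=True)
print(standard, rank_stabilized)  # 2.0 8.0
```

With r=16 and lora_alpha=32, the rank-stabilized variant applies a four times larger scale than standard LoRA would.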

Train Parameters:

  • train_epochs=5
  • optim="paged_lion_8bit"
  • learning_rate=2e-4
  • warmup_ratio=0.03
  • max_grad_norm=0.3
  • lr_scheduler_type="linear"
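
To illustrate how learning_rate, warmup_ratio and the linear scheduler interact, here is a minimal sketch of a linear warmup followed by a linear decay to zero. The total step count is not stated in this card, so TOTAL_STEPS is an assumed value, and linear_lr is a hypothetical helper rather than the trainer's own implementation.

```python
import math


def linear_lr(step: int, total_steps: int, base_lr: float = 2e-4,
              warmup_ratio: float = 0.03) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay from base_lr down to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 1000  # assumption for illustration; not taken from the card
for step in (0, 15, 30, 500, 1000):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions only the first 3% of steps are spent warming up, and the rate then falls steadily for the rest of training.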

Training Data

For training, I used Merve's Turkish Instructions Dataset.

Instruction template:

def format_instruction(sample):
    return f"""Sen cevap vermeyi seven yardımcı bir dil modelisin.
        ### Input:
        {sample["talimat"]}
        
        ### Context:
        {sample[" giriş"]}

        ### Response:
        {sample[" çıktı"]}
    """

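The training template above indexes two columns whose names start with a space (" giriş" and " çıktı"). Assuming that is how the dataset's columns are actually named, one option is to trim the keys once so the rest of the code can use plain names. The helper strip_column_names and the example row are both made up for illustration.

```python
def strip_column_names(sample: dict) -> dict:
    """Return a copy of the row with whitespace trimmed from every key."""
    return {key.strip(): value for key, value in sample.items()}


# Made-up row for illustration only; it is not taken from the dataset.
row = {"talimat": "Bir selamlama yaz.", " giriş": "", " çıktı": "Merhaba!"}

clean = strip_column_names(row)
print(sorted(clean))
```

After this step a template can read sample["giriş"] and sample["çıktı"] without the leading space.
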
Generate Output:

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "ardaorcun/finetuned_cosmos2603"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load the weights in 8-bit (requires bitsandbytes) and let them be placed automatically.
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             device_map='auto',
                                             load_in_8bit=True)

# Sampling settings, forwarded to the pipeline below.
sampling_params = dict(do_sample=True, temperature=0.3, top_k=50, top_p=0.9)

# The model is already placed on a device, so device_map is not repeated here.
pipe = pipeline("text-generation",
                model=model,
                tokenizer=tokenizer,
                max_new_tokens=512,
                return_full_text=True,
                repetition_penalty=1.1,
                **sampling_params
               )

DEFAULT_SYSTEM_PROMPT = "Sen cevap vermeyi seven yardımcı bir dil modelisin.\n"

def format_instruction(sample):
    return f"""{DEFAULT_SYSTEM_PROMPT}
### Input:
{sample["talimat"]}

### Context:
{sample["giriş"]}

### Response:
{sample["çıktı"]}"""

Create Answer:

prompt = "your_prompt"  # the instruction
girdi = "your_entry"    # optional context ("girdi" is Turkish for "input")
instruction = f"""Sen cevap vermeyi seven yardımcı bir dil modelisin.\n### Input:\n{prompt}\n\n### Context:\n{girdi}\n\n### Response:"""
# Reuse the `pipe` built above so its generation settings stay in effect.
result = pipe(instruction)
# The pipeline returns the prompt followed by the completion, so slice the prompt off.
print(result[0]['generated_text'][len(instruction):])
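
Slicing by len(instruction) relies on the pipeline echoing the prompt back unchanged. An alternative sketch is to split on the "### Response:" marker from the template. The helper extract_response is hypothetical and assumes the marker first appears at the end of the prompt.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(generated_text: str) -> str:
    """Return the text after the first response marker, or all of it if absent."""
    _, found, tail = generated_text.partition(RESPONSE_MARKER)
    return tail.strip() if found else generated_text.strip()


# Made-up generated text for illustration only.
example = "### Input:\nSelam ver.\n\n### Response:\n Merhaba! "
print(extract_response(example))
```

With the pipeline above, this would be called as extract_response(result[0]['generated_text']).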

Model size: 774M params (Safetensors, tensor type F32)
