Loss is 0.00000 (Also model not answering after training)

#17
by banank1989 - opened

I am trying to fine-tune the LLM (OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5) on my own data.

My code:

# Only the imports this script actually uses
from transformers import DataCollatorForLanguageModeling
from transformers import Trainer, TrainingArguments
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5", padding_side="left")
model = AutoModelForCausalLM.from_pretrained("OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5", 
                                             load_in_8bit=True,
                                             device_map="auto")

from datasets import load_dataset

# Load the dataset
dataset = load_dataset('parquet', data_files='data/dataset.parquet')

# Tokenize and format the dataset
def tokenize_function(examples):
    return tokenizer(examples['TEXT'], truncation=True, max_length=128, padding='max_length')


tokenized_dataset = dataset.map(tokenize_function, batched=True)
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=100,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=4,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=4
)



data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=False,
)

# Create the Trainer and train
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset['train'],
    data_collator=data_collator,
)

trainer.train()

# Save the trained model
trainer.save_model("model")  # replace with the path where you want to save the model
tokenizer.save_pretrained("model")

The issue is that the training loss is reported as 0.000000, which suggests something is wrong with my training setup. Also, when I load the trained model, it does not generate any answers at all (which should not be the case). In addition, the original downloaded model takes 23GB on disk, but my saved model is only 9.6GB.

My raw data is in CSV, which I converted to Parquet. My dataset has 3 columns (TEXT, source, metadata) and contains only 12 rows. I also tried another dataset from Hugging Face and got the same result.
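
For reference, the numbers above (12 rows, per_device_train_batch_size=2, num_train_epochs=100, warmup_steps=500) imply a very short schedule. The sketch below works out the step count, assuming a single device and no gradient accumulation (neither is stated in the post); the helper name total_optimizer_steps is made up for illustration. Under those assumptions, the learning-rate warmup would cover most of the run.

```python
import math


def total_optimizer_steps(num_rows, per_device_batch_size, num_epochs,
                          num_devices=1, grad_accum_steps=1):
    """Rough count of optimizer steps for a fixed-size dataset.

    num_devices and grad_accum_steps default to 1; both are assumptions,
    since the post does not say how many GPUs were used.
    """
    effective_batch = per_device_batch_size * num_devices * grad_accum_steps
    # The last partial batch still counts as a step (no drop_last).
    steps_per_epoch = math.ceil(num_rows / effective_batch)
    return steps_per_epoch * num_epochs


# Values taken from the post: 12 rows, batch size 2, 100 epochs.
total_steps = total_optimizer_steps(num_rows=12, per_device_batch_size=2,
                                    num_epochs=100)
warmup_steps = 500  # from the TrainingArguments in the post
warmup_fraction = warmup_steps / total_steps

print(total_steps, round(warmup_fraction, 2))  # prints: 600 0.83
```

This only describes the schedule; it does not by itself explain a loss of exactly 0.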

Hi @banank1989, were you able to fine-tune the model?

Yes, using LoRA.

banank1989 changed discussion status to closed

Hi @banank1989, could you please share a code snippet or resource material showing how to fine-tune using LoRA? It would be a great help. Thanks.
