OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 · Loss is 0.00000 (Also model not answering after training)

banank1989

May 29, 2023

I am trying to fine-tune the LLM (OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5) on my own data.

My code:

import torch
from transformers import DataCollatorForLanguageModeling  # LineByLineTextDataset was imported but never used
from transformers import Trainer, TrainingArguments
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5", padding_side="left")
model = AutoModelForCausalLM.from_pretrained("OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5", 
                                             load_in_8bit=True,
                                             device_map="auto")

from datasets import load_dataset

# Load the dataset
dataset = load_dataset('parquet', data_files='data/dataset.parquet')

# Tokenize and format the dataset
def tokenize_function(examples):
    return tokenizer(examples['TEXT'], truncation=True, max_length=128, padding='max_length')


tokenized_dataset = dataset.map(tokenize_function, batched=True)
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=100,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=4,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=4
)

data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=False,
)

# Create the Trainer and train
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset['train'],
    data_collator=data_collator,
)

trainer.train()

# Save the trained model
trainer.save_model("model")  # replace with the path where you want to save the model
tokenizer.save_pretrained("model")

The issue is that the training loss is reported as 0.000000, which suggests something is wrong with my training setup. Also, when I load the trained model, it does not generate any answers at all, which should not be the case. In addition, the originally downloaded model takes 23GB on disk, but my saved model is only 9.6GB.

My raw data is a CSV file that I converted to Parquet. The dataset has 3 columns (TEXT, source, metadata) and contains only 12 rows. I also tried another dataset from Hugging Face and got the same result.
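As a rough plausibility check on the two disk sizes reported above, the raw weight size can be estimated as parameter count times bytes per parameter. The sketch below assumes about 12 billion parameters, a figure taken only from the "12b" in the model name; the helper name is made up for illustration. Two bytes per weight (fp16) lands close to the 23GB download. A saved model well under half that size suggests the weights were written at reduced precision, which would fit loading with load_in_8bit=True, although 9.6GB does not match one byte per weight exactly, so this is only an estimate.

```python
def checkpoint_size_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate size of the raw weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: parameter count inferred from "12b" in the model name, not measured.
ASSUMED_PARAMS = 12e9

for label, nbytes in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    size = checkpoint_size_gb(ASSUMED_PARAMS, nbytes)
    print(f"{label}: ~{size:.0f} GB")
```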

nikhiljais

Jun 7, 2023

Hi @banank1989, were you able to fine-tune the model?

banank1989

Jun 13, 2023

Yes, using LoRA.
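The poster did not share the working code, so the following is only a minimal sketch of what LoRA fine-tuning on an 8-bit model might look like with the peft library. The rank, alpha, and dropout values are hypothetical starting points, and the target module name "query_key_value" is an assumption about how the attention projection is named in GPT-NeoX/Pythia models. Older peft releases name the preparation helper prepare_model_for_int8_training instead.

```python
MODEL_NAME = "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5"

# Hypothetical LoRA settings; these are not taken from the thread.
LORA_KWARGS = dict(
    r=8,                                 # rank of the low-rank update matrices
    lora_alpha=16,                       # scaling applied to the update
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["query_key_value"],  # assumed module name for GPT-NeoX attention
)


def build_lora_model(model_name: str = MODEL_NAME):
    """Load the base model in 8-bit and wrap it with trainable LoRA adapters."""
    # Imported inside the function because these libraries are heavy
    # and the 8-bit path needs a GPU with bitsandbytes installed.
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained(
        model_name, load_in_8bit=True, device_map="auto"
    )
    # Freezes the quantized base weights and casts a few layers for stable training.
    base = prepare_model_for_kbit_training(base)
    model = get_peft_model(base, LoraConfig(**LORA_KWARGS))
    model.print_trainable_parameters()  # only the adapter weights should be trainable
    return model
```

The returned model can be passed to a Trainer in place of the plain 8-bit model from the original post. Calling model.save_pretrained("lora-adapter") afterwards writes only the adapter weights, which are then loaded on top of the base model at inference time.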

banank1989 changed discussion status to closed Jun 13, 2023

nikhiljais

Jun 13, 2023

Hi @banank1989, could you please share the code snippet or resource material that shows how to fine-tune using LoRA? It would be a great help. Thanks.