LLM Finetuning

With AutoTrain, you can easily finetune large language models (LLMs) on your own data!

AutoTrain supports the following types of LLM finetuning:

Data Preparation

LLM finetuning accepts data in CSV format.

For SFT / Generic Trainer, the data should be in the following format:

text
human: hello \n bot: hi nice to meet you
human: how are you \n bot: I am fine
human: What is your name? \n bot: My name is Mary
human: Which is the best programming language? \n bot: Python

For SFT/Generic training, your dataset must have a text column

For Reward Trainer, the data should be in the following format:

text	rejected_text
human: hello \n bot: hi nice to meet you	human: hello \n bot: leave me alone
human: how are you \n bot: I am fine	human: how are you \n bot: I am not fine
human: What is your name? \n bot: My name is Mary	human: What is your name? \n bot: Whats it to you?
human: Which is the best programming language? \n bot: Python	human: Which is the best programming language? \n bot: Javascript

For Reward Trainer, your dataset must have a text column (aka chosen text) and a rejected_text column.

For DPO/ORPO Trainer, the data should be in the following format:

prompt	text	rejected_text
hello	hi nice to meet you	leave me alone
how are you	I am fine	I am not fine
What is your name?	My name is Mary	Whats it to you?
What is your name?	My name is Mary	I dont have a name
Which is the best programming language?	Python	Javascript
Which is the best programming language?	Python	C++
Which is the best programming language?	Java	C++

For DPO/ORPO Trainer, your dataset must have a prompt column, a text column (aka chosen text) and a rejected_text column.

For all tasks, you can use both CSV and JSONL files!