bart-large-instructiongen-w-inputs

Use this text2text model to find out what LLM instruction (and inputs if relevant) might have generated <arbitrary input text>!

This model is a fine-tuned version of facebook/bart-large on the pszemraj/fleece2instructions-inputs-alpaca-cleaned dataset. It achieves the following results on the evaluation set:

Loss: 0.9302
Rouge1: 64.2236
Rouge2: 41.5632
Rougel: 60.5935
Rougelsum: 62.1285
Gen Len: 25.8938

Intended uses & limitations

This model is intended to generate instructions from arbitrary text. You can then pair these instructions with your own data to fine-tune an LLM on instructions for a specific domain. It is primarily meant to enable low-resource domain adaptation, rather than "I want to generate even better prompts for the FLAN-V2 dataset!".
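A minimal inference sketch follows. It assumes the checkpoint is published on the Hugging Face Hub as pszemraj/bart-large-instructiongen-w-inputs and that transformers and PyTorch are installed. The helper names `load_model` and `generate_instruction`, and the generation settings (`num_beams=4`, `max_new_tokens=64`), are illustrative assumptions rather than values documented for this model.

```python
MODEL_ID = "pszemraj/bart-large-instructiongen-w-inputs"  # assumed Hub ID


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and seq2seq model (downloads weights on first call)."""
    # Imported lazily so the module can be imported without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def generate_instruction(text, tokenizer, model, max_new_tokens=64,
                         skip_special_tokens=True):
    """Guess the instruction (and inputs) that might have produced `text`."""
    batch = tokenizer(text, return_tensors="pt", truncation=True)
    # Beam search settings are an assumption, not taken from the model card.
    output_ids = model.generate(**batch, max_new_tokens=max_new_tokens, num_beams=4)
    # If the <instruction>/<inputs> markers turn out to be registered as
    # special tokens, pass skip_special_tokens=False to keep them.
    return tokenizer.decode(output_ids[0], skip_special_tokens=skip_special_tokens)
```

Typical use would be `tokenizer, model = load_model()` followed by `generate_instruction(some_text, tokenizer, model)`; the first call needs network access to fetch the weights.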

The fleece2instructions-inputs-alpaca-cleaned dataset, obtained from the alpaca-lora repo under the ODC-BY license, has been converted to a text2text format for use with language models. In this dataset, the original 'inputs' and 'instructions' columns are combined into a single 'instructions_inputs' column. To clearly separate the two types of content, each piece of text is prefixed with either an <instruction> or <inputs> token. These tokens not only facilitate model comprehension, but also allow for easy regex separation of model outputs during inference.

As such, users can expect the output of this model to be similarly structured with <instruction> and <inputs> tokens.
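The regex separation mentioned above could look roughly like the sketch below. It assumes the decoded output contains the literal markers `<instruction>` and `<inputs>`; the helper name `split_instruction_inputs` is hypothetical and not part of the model or dataset.

```python
import re

# Assumption: markers appear literally as "<instruction>" and "<inputs>".
# Each segment runs until the next marker or the end of the string.
_SEGMENT = re.compile(
    r"<(instruction|inputs)>(.*?)(?=<(?:instruction|inputs)>|\Z)",
    re.DOTALL,
)


def split_instruction_inputs(text: str) -> dict:
    """Return {'instruction': ..., 'inputs': ...}; a missing part maps to ''."""
    parts = {"instruction": "", "inputs": ""}
    for tag, body in _SEGMENT.findall(text):
        if not parts[tag]:  # keep the first occurrence of each marker
            parts[tag] = body.strip()
    return parts


sample = "<instruction> Classify the sentiment of this review. <inputs> The food was great."
print(split_instruction_inputs(sample))
```

Outputs that carry only an instruction simply yield an empty `inputs` string, so downstream code can treat both cases uniformly.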

Training and evaluation data

Refer to the pszemraj/fleece2instructions-inputs-alpaca-cleaned dataset.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 6e-05
train_batch_size: 16
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.03
num_epochs: 3.0
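To make the interaction between `lr_scheduler_type: cosine` and `lr_scheduler_warmup_ratio: 0.03` concrete, here is a small sketch of the learning-rate curve. It assumes linear warmup followed by a half-cosine decay to zero, which is the common shape for this scheduler type but is not confirmed by the card. The total of 4083 steps is the final step in the training results table, and `lr_at` is a hypothetical helper name.

```python
import math

PEAK_LR = 6e-5       # learning_rate from the hyperparameter list
WARMUP_RATIO = 0.03  # lr_scheduler_warmup_ratio
TOTAL_STEPS = 4083   # last step reported in the training results


def lr_at(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`, assuming linear warmup then cosine decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for s in (0, 123, 1361, 2722, 4083):
    print(s, lr_at(s))
```

Under these assumptions warmup covers the first 123 steps, so the rate peaks well before the end of the first epoch (step 1361) and decays for the rest of training.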

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
1.0145	1.0	1361	1.0460	62.8374	39.8538	59.2593	60.8095	25.2752
0.8796	2.0	2722	0.9289	63.7086	41.1315	60.1588	61.7145	25.7215
0.6943	3.0	4083	0.9302	64.2236	41.5632	60.5935	62.1285	25.8938

pszemraj/bart-large-instructiongen-w-inputs