Model Card

This is a laser fine tuning of Aloobun's great 1.5b param reyna mini model.

Model Description

This model is quite conversational - even a bit more so after laser tuning even using Peft. The function calling is mediocre, but will be improved in future versions.

Uses

As Aloobun's model is well performing and impressive on it's own, I decided to add some function calling while practicing the LaserRMT technique.

Direct Use

Chat
Conversational
Text Generation
Function Calling

Bias, Risks, and Limitations

This model will take over your house, borrow your car, talk badly to your family, and generally make everything incrementally worse. If you use it for nefarious purposes.

Recommendations

Use at your own risk. It's a great small model, owing to the base model before tuning.

Training Details

Training Data

"eval/loss": 2.1797242164611816,
"_timestamp": 1708624900.2239263,
"_runtime": 20945.370138406754,
"train/train_loss": 2.515587423102269,
"train/global_step": 918,
"train/train_steps_per_second": 0.044,
"train/loss": 2.2062,
"train/learning_rate": 0,
"train/train_samples_per_second": 1.403,
"train/train_runtime": 20945.6359,
"eval/steps_per_second": 4.867,
"eval/samples_per_second": 4.867,
"_step": 923,
"train/epoch": 2.98,
"eval/runtime": 41.0972,
"train/grad_norm": 0.2638521194458008,
"train/total_flos": 141790931224363000

Training Procedure

LaserRMT was used to refine the weights, using the 16 highest scored weights specifically by noise-to-ratio analysis.

This technique avoids training unnecessarily low-performng weights that can turn to garbage. By pruning these weights, the model size is decreased slightly.

Axolotl was used for training and dataset tokenization.

Preprocessing

Dataset was formatted in ShareGpt format for the purposes of using with Axolotl, in conversational format.

Training Hyperparameters

lora_r: 64
lora_alpha: 16
lora_dropout: 0.05
gradient_accumulation_steps: 4
micro_batch_size: 1
num_epochs: 3
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.00025

Evaluation

Groups	Version	Filter	n-shot	Metric	Value		Stderr
Open LLM Leaderboard	N/A	none	5	rouge2_acc	0.1920	±	0.0176
		none	5	bleu_max	15.2292	±	0.6714
		flexible-extract	5	exact_match	0.0220	±	0.0066
- truthfulqa_mc1	2	none	0	acc	0.2440	±	0.0192
- truthfulqa_mc2	2	none	0	acc	0.4430	±	0.0195
- winogrande	1	none	5	acc	0.5120	±	0.0224
- arc_challenge	1	none	25	acc	0.1760	±	0.0170
		none	25	acc_norm	0.2320	±	0.0189
- gsm8k	3	strict-match	5	exact_match	0.0060	±	0.0035
		flexible-extract	5	exact_match	0.0220	±	0.0066
- hellaswag	1	none	10	acc	0.3520	±	0.0214
		none	10	acc_norm	0.4040	±	0.0220
		none	5	rouge2_diff	-3.3178	±	0.9477
		none	5	rougeL_acc	0.3860	±	0.0218
		none	5	acc_norm	0.3180	±	0.0145
		none	5	rouge1_diff	-1.5564	±	1.0223
		none	5	bleu_diff	-0.6500	±	0.6421
		none	5	rouge2_max	16.4873	±	1.0172
		none	5	rougeL_diff	-0.7765	±	1.0034
		strict-match	5	exact_match	0.0060	±	0.0035
		none	5	bleu_acc	0.4360	±	0.0222
		none	5	rougeL_max	33.8798	±	0.9367
		none	5	rouge1_max	36.3550	±	0.9462
		none	5	rouge1_acc	0.3700	±	0.0216
		none	5	acc	0.2664	±	0.0036
- mmlu	N/A	none	0	acc	0.2533	±	0.0039
- humanities	N/A	none	5	acc	0.2408	±	0.0075
- other	N/A	none	5	acc	0.2443	±	0.0080
- social_sciences	N/A	none	5	acc	0.2538	±	0.0081
- stem	N/A	none	5	acc	0.2740	±	0.0079
- truthfulqa	N/A	none	0	rouge2_acc	0.1920	±	0.0176
		none	0	rougeL_diff	-0.7765	±	1.0034
		none	0	bleu_max	15.2292	±	0.6714
		none	0	rouge2_diff	-3.3178	±	0.9477
		none	0	rougeL_acc	0.3860	±	0.0218
		none	0	bleu_diff	-0.6500	±	0.6421
		none	0	rouge2_max	16.4873	±	1.0172
		none	0	rouge1_diff	-1.5564	±	1.0223
		none	0	acc	0.3435	±	0.0137
		none	0	bleu_acc	0.4360	±	0.0222
		none	0	rougeL_max	33.8798	±	0.9367
		none	0	rouge1_max	36.3550	±	0.9462
		none	0	rouge1_acc	0.3700	±	0.0216

jtatman
/

Reyna-Mini-1.8B-v0.2-function-call-laser