
This is a model from blockchainlab test 2.4 - alnrg2arg/blockchainlabs_7B_merged_test2_4.

The project aims to build a small LLM for on-device use.

The overall pipeline for this iteration is:

1. Merge models to make a base model (7B).
2. Prune the model to reduce the parameter count (50% sparsity).
3. Use DPO for the recovery phase after pruning.
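The card does not say which pruning method is used in step 2, so the following is only an illustrative sketch of what 50% sparsity means, assuming simple magnitude pruning on a flat list of weights. The function name `magnitude_prune` and the toy weights are hypothetical, not taken from the project.

```python
def magnitude_prune(weights, sparsity=0.5):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude.

    Illustrative only: the actual pruning method used by the project is not
    stated in the model card.
    """
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


# Toy example: half of the six weights are set to zero.
weights = [0.8, -0.05, 0.3, -0.9, 0.01, 0.4]
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.0, -0.9, 0.0, 0.4]
```

Zeroing weights this way usually costs some accuracy, which is why the pipeline follows pruning with a recovery phase.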

This model is not pruned; it is intended as a baseline for comparison with the pruned model.

These are the code and parameters I chose for DPO training of this model.

import torch
from transformers import TrainingArguments, AutoModelForCausalLM
from trl import DPOTrainer

# `model`, `tokenizer`, and `dataset` are assumed to be loaded beforehand.

dpo_trainer = DPOTrainer(
    model = model,
    ref_model = None,
    args = TrainingArguments(
        per_device_train_batch_size = 8,
        gradient_accumulation_steps = 8,
        warmup_ratio = 0.1,
        num_train_epochs = 3,
        learning_rate = 5e-6,
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.0,
        lr_scheduler_type = "linear",
        seed = 42,
        output_dir = "output_DPO",
    ),
    beta = 0.1,
    train_dataset = dataset,
    # eval_dataset = raw_datasets["test"],
    tokenizer = tokenizer,
    max_length = 1024,
    max_prompt_length = 512,
)

The code and parameters are borrowed from https://colab.research.google.com/drive/1SKrKGV-BZoU4kv5q3g0jtE_OhRgPtrrQ?usp=sharing
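To make the role of `beta = 0.1` more concrete, here is a minimal sketch of the standard per-example DPO loss written with plain Python floats. It is not the TRL implementation. The function name `dpo_loss` and the example log-probabilities are made up for illustration.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * margin)).

    Inputs are summed log-probabilities of the chosen and rejected responses
    under the policy being trained and under the frozen reference model.
    """
    # How much more the policy prefers "chosen" over "rejected"
    # than the reference model does.
    margin = ((policy_chosen_logp - policy_rejected_logp)
              - (ref_chosen_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) == log(1 + exp(-z)), computed in a numerically stable way.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# Hypothetical log-probabilities: the policy already prefers the chosen
# response slightly more than the reference does, so the loss is below log(2).
loss = dpo_loss(-10.0, -12.0, -11.0, -11.0, beta=0.1)
print(loss)
```

When the policy and the reference agree exactly, the margin is zero and the loss equals log(2). A larger `beta` makes the loss more sensitive to how far the policy moves away from the reference.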

Benchmark Scores

| Task / Group | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| arc_challenge | 1 | none | 0 | acc | 0.6894 | ± 0.0135 |
| arc_challenge | 1 | none | 0 | acc_norm | 0.6860 | ± 0.0136 |
| hellaswag | 1 | none | 0 | acc | 0.7092 | ± 0.0045 |
| hellaswag | 1 | none | 0 | acc_norm | 0.8736 | ± 0.0033 |
| truthfulqa_mc2 | 2 | none | 0 | acc | 0.7126 | ± 0.015 |
| mmlu | N/A | none | 0 | acc | 0.6225 | ± 0.1292 |
| - humanities | N/A | none | 0 | acc | 0.5745 | ± 0.1286 |
| - other | N/A | none | 0 | acc | 0.6952 | ± 0.1095 |
| - social_sciences | N/A | none | 0 | acc | 0.7280 | ± 0.0735 |
| - stem | N/A | none | 0 | acc | 0.5195 | ± 0.1313 |
| winogrande | 1 | none | 0 | acc | 0.824 | ± 0.0107 |
| gsm8k | 2 | get-answer | 5 | exact_match | 0.7263 | ± 0.0123 |

Average = 74.08
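The card does not state which metrics go into the average. As a quick check, the reported value is reproduced by taking the mean of six scores, assuming acc_norm for arc_challenge and hellaswag and the single listed metric for the other four tasks. The dictionary below is just a convenient way to hold those values from the table.

```python
# Scores from the benchmark table; the metric chosen per task is an assumption
# that happens to reproduce the reported average.
scores = {
    "arc_challenge (acc_norm)": 0.6860,
    "hellaswag (acc_norm)": 0.8736,
    "truthfulqa_mc2 (acc)": 0.7126,
    "mmlu (acc)": 0.6225,
    "winogrande (acc)": 0.824,
    "gsm8k (exact_match)": 0.7263,
}

# Mean over the six tasks, expressed as a percentage.
average = 100 * sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 74.08
```

Using plain acc instead of acc_norm for arc_challenge alone would give about 74.14, so the acc_norm reading fits the reported number better.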

Model size: 7.24B params (Safetensors)
Tensor type: BF16

Dataset used to train alnrg2arg/test3_sft_16bit_dpo2