
This model is based on the merged model from blockchainlabs test 2.4: alnrg2arg/blockchainlabs_7B_merged_test2_4.

The project aims to build a small LLM for on-device use.

The overall pipeline for this iteration is:

1. Merge models to make a base model (7B).
2. Prune the model to reduce the parameter count (50% sparsity).
3. Use DPO for the recovery phase after pruning.
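To make "50% sparsity" in step 2 concrete, here is a minimal sketch of unstructured magnitude pruning on a toy weight list. This card does not say which pruning method the project uses, so magnitude pruning is only an illustrative stand-in, and `magnitude_prune` is a hypothetical helper, not part of the actual pipeline.

```python
def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of the weights (unstructured).

    Illustrative only: the pruning method actually used for this project
    is not stated in the card.
    """
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, -0.9, 0.01, 0.4, -0.2, 0.6]
pruned = magnitude_prune(weights, sparsity=0.5)
achieved_sparsity = pruned.count(0.0) / len(pruned)

print(pruned)             # [0.8, 0.0, 0.0, -0.9, 0.0, 0.4, 0.0, 0.6]
print(achieved_sparsity)  # 0.5
```

Half of the weights become zero, which is why a recovery phase (step 3) is needed afterwards to regain quality.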

This model is not pruned; it is intended as a baseline for comparison with the pruned model.

This is the code and the parameters I chose for this model (DPO).

import torch
from transformers import TrainingArguments, AutoModelForCausalLM
from trl import DPOTrainer

# `model`, `tokenizer`, and `dataset` are assumed to be loaded beforehand.
dpo_trainer = DPOTrainer(
    model = model,
    ref_model = None,
    args = TrainingArguments(
        per_device_train_batch_size = 8,
        gradient_accumulation_steps = 8,  # 8 * 8 = 64 samples per optimizer step per device
        warmup_ratio = 0.1,
        num_train_epochs = 3,
        learning_rate = 5e-6,
        # Use bf16 when the GPU supports it, otherwise fall back to fp16.
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.0,
        lr_scheduler_type = "linear",
        seed = 42,
        output_dir = "output_DPO",
    ),
    beta = 0.1,
    train_dataset = dataset,
    # eval_dataset = raw_datasets["test"],
    tokenizer = tokenizer,
    max_length = 1024,
    max_prompt_length = 512,
)

The code and parameters are borrowed from https://colab.research.google.com/drive/1SKrKGV-BZoU4kv5q3g0jtE_OhRgPtrrQ?usp=sharing
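For readers unfamiliar with what `beta = 0.1` controls, the following is a minimal sketch of the standard per-example DPO loss in plain Python. It is not the TRL implementation; the function name `dpo_loss` and the log-probability values are made up for illustration.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (sequence log-probabilities)."""
    # How much more likely each response is under the policy than the reference.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # beta scales how strongly deviations from the reference are rewarded.
    logits = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    return math.log1p(math.exp(-logits))


# Hypothetical values: at the start the policy equals the reference model.
loss_start = dpo_loss(-12.0, -15.0, -12.0, -15.0)
# Later the policy prefers the chosen answer more than the reference does.
loss_later = dpo_loss(-10.0, -18.0, -12.0, -15.0)

print(loss_start, loss_later)
```

When the policy and the reference agree, the loss is log 2; it falls as the policy raises the chosen response relative to the rejected one. A smaller `beta` makes the same shift in log-probabilities move the loss less.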

Benchmark scores

| Task / Group | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| arc_challenge | 1 | none | 25 | acc | 0.6971 | ± 0.0134 |
| arc_challenge | 1 | none | 25 | acc_norm | 0.7142 | ± 0.0132 |
| hellaswag | 1 | none | 10 | acc | 0.7008 | ± 0.0046 |
| hellaswag | 1 | none | 10 | acc_norm | 0.8726 | ± 0.0033 |
| mmlu | N/A | none | 0 | acc | 0.6265 | ± 0.1232 |
| - humanities | N/A | none | 5 | acc | 0.5864 | ± 0.1135 |
| - other | N/A | none | 5 | acc | 0.6930 | ± 0.1085 |
| - social_sciences | N/A | none | 5 | acc | 0.7270 | ± 0.0820 |
| - stem | N/A | none | 5 | acc | 0.5230 | ± 0.1264 |
| winogrande | 1 | none | 5 | acc | 0.8414 | ± 0.0103 |
| gsm8k | 2 | get-answer | 5 | exact_match | 0.7263 | ± 0.0123 |
| truthfulqa_mc2 | 2 | none | 0 | acc | 0.6794 | ± 0.0153 |

Average: 74.34
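The card does not say which metrics go into the average. Assuming it is the mean of one headline metric per benchmark (acc_norm for arc_challenge and hellaswag, acc for mmlu, winogrande and truthfulqa_mc2, exact_match for gsm8k), the reported figure can be reproduced as follows:

```python
# Headline metric per benchmark, taken from the scores above.
# Which metrics enter the average is an assumption, not stated in the card.
scores = {
    "arc_challenge (acc_norm)": 0.7142,
    "hellaswag (acc_norm)": 0.8726,
    "mmlu (acc)": 0.6265,
    "truthfulqa_mc2 (acc)": 0.6794,
    "winogrande (acc)": 0.8414,
    "gsm8k (exact_match)": 0.7263,
}

average = 100 * sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 74.34
```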

Model size: 7.35B params
Datasets used to train alnrg2arg/blockchainlabs_7B_merged_test2_4_sft_4bit_DPO_orca2_truthy