# NebuIA-10.7B-DPO

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the SLERP (spherical linear interpolation) merge method, with Weyaxi/SauerkrautLM-UNA-SOLAR-Instruct as the base model.
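For intuition, SLERP interpolates along the arc between two weight tensors on the hypersphere rather than averaging them linearly, which better preserves the geometry of each parent's weights. Below is a minimal PyTorch sketch of the underlying operation; it is an illustration under simplified assumptions, not mergekit's actual implementation.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between weight tensors a and b at fraction t."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    # Angle between the two weight vectors, measured on the unit hypersphere.
    a_n = a_flat / (a_flat.norm() + eps)
    b_n = b_flat / (b_flat.norm() + eps)
    omega = torch.arccos((a_n * b_n).sum().clamp(-1.0, 1.0))
    if omega.abs() < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return (1 - t) * a + t * b
    so = torch.sin(omega)
    # Standard SLERP formula: sin-weighted combination along the great circle.
    out = (torch.sin((1 - t) * omega) / so) * a_flat + (torch.sin(t * omega) / so) * b_flat
    return out.reshape(a.shape).to(a.dtype)
```

In the configuration below, the `t` lists define a gradient over layer depth: mergekit interpolates the anchor values across the 48 layers, with mirrored schedules for self-attention and MLP weights and a flat `t = 0.5` for all remaining tensors.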

### Models Merged

The following models were included in the merge:

- [Weyaxi/SauerkrautLM-UNA-SOLAR-Instruct](https://huggingface.co/Weyaxi/SauerkrautLM-UNA-SOLAR-Instruct)
- [kodonho/SolarM-SakuraSolar-SLERP](https://huggingface.co/kodonho/SolarM-SakuraSolar-SLERP)

### Configuration

The following YAML configuration was used to produce this model:


```yaml
slices:
  - sources:
      - model: Weyaxi/SauerkrautLM-UNA-SOLAR-Instruct
        layer_range: [0, 48]
      - model: kodonho/SolarM-SakuraSolar-SLERP
        layer_range: [0, 48]
merge_method: slerp
base_model: Weyaxi/SauerkrautLM-UNA-SOLAR-Instruct
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
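A hedged example of loading the merged model with `transformers`. The repo id below is a placeholder, since this card does not state the final hub path, and the prompt template assumes the SOLAR instruct format used by the parent models.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# NOTE: placeholder repo id -- replace with this model's actual Hugging Face path.
model_id = "your-org/NebuIA-10.7B-DPO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge was produced in bfloat16 (see dtype above)
    device_map="auto",
)

# Assumed SOLAR-style instruct template; adjust if your checkpoint differs.
prompt = "### User:\nExplain model merging in one sentence.\n\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```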
## Evaluation Results

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|------------------|--------:|--------:|-----------:|---------:|--------:|
| NebuIA-10.7B-DPO | 48.38 | 74.87 | 72.57 | 45.74 | 60.39 |

### AGIEval

| Task | Version | Metric | Value | Stderr |
|------|--------:|--------|------:|-------:|
| agieval_aqua_rat | 0 | acc | 27.56 | ± 2.81 |
| | | acc_norm | 27.95 | ± 2.82 |
| agieval_logiqa_en | 0 | acc | 42.40 | ± 1.94 |
| | | acc_norm | 42.86 | ± 1.94 |
| agieval_lsat_ar | 0 | acc | 27.39 | ± 2.95 |
| | | acc_norm | 25.22 | ± 2.87 |
| agieval_lsat_lr | 0 | acc | 54.31 | ± 2.21 |
| | | acc_norm | 55.10 | ± 2.20 |
| agieval_lsat_rc | 0 | acc | 69.89 | ± 2.80 |
| | | acc_norm | 69.14 | ± 2.82 |
| agieval_sat_en | 0 | acc | 79.61 | ± 2.81 |
| | | acc_norm | 80.10 | ± 2.79 |
| agieval_sat_en_without_passage | 0 | acc | 48.06 | ± 3.49 |
| | | acc_norm | 47.57 | ± 3.49 |
| agieval_sat_math | 0 | acc | 42.73 | ± 3.34 |
| | | acc_norm | 39.09 | ± 3.30 |

Average: 48.38%

### GPT4All

| Task | Version | Metric | Value | Stderr |
|------|--------:|--------|------:|-------:|
| arc_challenge | 0 | acc | 60.67 | ± 1.43 |
| | | acc_norm | 63.74 | ± 1.40 |
| arc_easy | 0 | acc | 83.08 | ± 0.77 |
| | | acc_norm | 81.23 | ± 0.80 |
| boolq | 1 | acc | 88.44 | ± 0.56 |
| hellaswag | 0 | acc | 69.28 | ± 0.46 |
| | | acc_norm | 86.71 | ± 0.34 |
| openbookqa | 0 | acc | 37.60 | ± 2.17 |
| | | acc_norm | 48.00 | ± 2.24 |
| piqa | 0 | acc | 80.25 | ± 0.93 |
| | | acc_norm | 80.20 | ± 0.93 |
| winogrande | 0 | acc | 75.77 | ± 1.20 |

Average: 74.87%

### TruthfulQA

| Task | Version | Metric | Value | Stderr |
|------|--------:|--------|------:|-------:|
| truthfulqa_mc | 1 | mc1 | 57.89 | ± 1.73 |
| | | mc2 | 72.57 | ± 1.49 |

Average: 72.57%

### Bigbench

| Task | Version | Metric | Value | Stderr |
|------|--------:|--------|------:|-------:|
| bigbench_causal_judgement | 0 | multiple_choice_grade | 58.95 | ± 3.58 |
| bigbench_date_understanding | 0 | multiple_choice_grade | 63.41 | ± 2.51 |
| bigbench_disambiguation_qa | 0 | multiple_choice_grade | 37.60 | ± 3.02 |
| bigbench_geometric_shapes | 0 | multiple_choice_grade | 28.97 | ± 2.40 |
| | | exact_str_match | 0.00 | ± 0.00 |
| bigbench_logical_deduction_five_objects | 0 | multiple_choice_grade | 28.20 | ± 2.01 |
| bigbench_logical_deduction_seven_objects | 0 | multiple_choice_grade | 21.86 | ± 1.56 |
| bigbench_logical_deduction_three_objects | 0 | multiple_choice_grade | 47.00 | ± 2.89 |
| bigbench_movie_recommendation | 0 | multiple_choice_grade | 44.00 | ± 2.22 |
| bigbench_navigate | 0 | multiple_choice_grade | 63.90 | ± 1.52 |
| bigbench_reasoning_about_colored_objects | 0 | multiple_choice_grade | 58.15 | ± 1.10 |
| bigbench_ruin_names | 0 | multiple_choice_grade | 41.96 | ± 2.33 |
| bigbench_salient_translation_error_detection | 0 | multiple_choice_grade | 38.48 | ± 1.54 |
| bigbench_snarks | 0 | multiple_choice_grade | 65.75 | ± 3.54 |
| bigbench_sports_understanding | 0 | multiple_choice_grade | 72.31 | ± 1.43 |
| bigbench_temporal_sequences | 0 | multiple_choice_grade | 63.10 | ± 1.53 |
| bigbench_tracking_shuffled_objects_five_objects | 0 | multiple_choice_grade | 24.64 | ± 1.22 |
| bigbench_tracking_shuffled_objects_seven_objects | 0 | multiple_choice_grade | 18.00 | ± 0.92 |
| bigbench_tracking_shuffled_objects_three_objects | 0 | multiple_choice_grade | 47.00 | ± 2.89 |

Average: 45.74%

Average score: 60.39%
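These four suites (AGIEval, GPT4All, TruthfulQA, Bigbench) follow the Nous-style benchmark layout and are typically produced with EleutherAI's lm-evaluation-harness. A minimal sketch of re-scoring a couple of the GPT4All-suite tasks with the harness's Python API follows; task names and versions in the current registry may differ from the (older) harness run reported above, and the repo id is again a placeholder.

```python
import lm_eval

# Evaluate two of the GPT4All-suite tasks zero-shot. Task names follow the
# current lm-evaluation-harness registry and may not match the exact harness
# version used for the tables above; replace the placeholder repo id.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=your-org/NebuIA-10.7B-DPO,dtype=bfloat16",
    tasks=["arc_challenge", "hellaswag"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"])
```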
