
# NebuIA-10.7B-slerp

NebuIA-10.7B-slerp is a merge of the following models using LazyMergekit:

* DopeorNope/SOLARC-M-10.7B
* Nous-Hermes-2-SOLAR-10.7B

## 🧩 Configuration

```yaml
slices:
  - sources:
      - model: DopeorNope/SOLARC-M-10.7B
        layer_range: [0, 48]
      - model: Nous-Hermes-2-SOLAR-10.7B
        layer_range: [0, 48]
merge_method: slerp
base_model: DopeorNope/SOLARC-M-10.7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
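
The `value` lists are interpolation-factor gradients over layer depth: with `t = 0` taking the base model's weights and `t = 1` the other model's, self-attention tensors drift from SOLARC-M toward Nous-Hermes-2 as depth increases, MLP tensors do the opposite, and all remaining tensors use a flat 0.5. For anyone curious what the slerp method actually computes, below is a minimal sketch assuming flattened weight tensors; it is not mergekit's implementation, and `t_for` is a hypothetical helper showing one way the gradient lists could be mapped to a per-layer `t`.

```python
# Minimal SLERP sketch (illustrative only, not mergekit's code).
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors."""
    v0_n = v0 / (np.linalg.norm(v0) + eps)
    v1_n = v1 / (np.linalg.norm(v1) + eps)
    dot = float(np.clip(np.sum(v0_n * v1_n), -1.0, 1.0))
    if abs(dot) > 0.9995:
        # Nearly colinear tensors: fall back to plain linear interpolation.
        return (1 - t) * v0 + t * v1
    omega = np.arccos(dot)          # angle between the two (normalized) tensors
    so = np.sin(omega)
    return (np.sin((1 - t) * omega) / so) * v0 + (np.sin(t * omega) / so) * v1

def t_for(layer_idx: int, n_layers: int, anchors: list) -> float:
    """Piecewise-linear interpolation of an anchor list (e.g. [0, 0.5, 0.3, 0.7, 1]) over depth."""
    pos = layer_idx / max(n_layers - 1, 1) * (len(anchors) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(anchors) - 1)
    return anchors[lo] + (pos - lo) * (anchors[hi] - anchors[lo])

# e.g. self_attn weights halfway through the 48 layers blend with t ≈ 0.32:
# t_for(24, 48, [0, 0.5, 0.3, 0.7, 1])
```

Mergekit applies this kind of interpolation tensor by tensor when building the merged checkpoint.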

## 💻 Usage

```python
# Install dependencies (notebook syntax; drop the leading "!" in a shell).
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "xellDart13/NebuIA-10.7B-slerp"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Build the prompt with the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model into a text-generation pipeline.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Generate a response.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
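
The snippet above loads the full half-precision weights, which need on the order of 20 GB of GPU memory for a 10.7B-parameter model. If that is too much, the model can also be loaded in 4-bit with bitsandbytes; the sketch below is a suggested variant, not part of the original card, and the quantization settings are illustrative defaults.

```python
# Optional: 4-bit quantized loading for smaller GPUs (also: pip install bitsandbytes).
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_id = "xellDart13/NebuIA-10.7B-slerp"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model_4bit = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```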

## Evaluation

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---:|---:|---:|---:|---:|
| NebuIA-10.7B-slerp | 47.17 | 75.18 | 65.21 | 46.29 | 58.46 |
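
The Average column is the unweighted mean of the four benchmark averages; a quick check of the reported 58.46:

```python
# Sanity check: the overall score is the plain mean of the four benchmark averages.
scores = {"AGIEval": 47.17, "GPT4All": 75.18, "TruthfulQA": 65.21, "Bigbench": 46.29}
print(sum(scores.values()) / len(scores))  # ≈ 58.4625, reported as 58.46
```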

### AGIEval

| Task | Version | Metric | Value | Stderr |
|---|---:|---|---:|---:|
| agieval_aqua_rat | 0 | acc | 28.35 | ± 2.83 |
| | | acc_norm | 27.56 | ± 2.81 |
| agieval_logiqa_en | 0 | acc | 44.24 | ± 1.95 |
| | | acc_norm | 43.63 | ± 1.95 |
| agieval_lsat_ar | 0 | acc | 24.35 | ± 2.84 |
| | | acc_norm | 20.43 | ± 2.66 |
| agieval_lsat_lr | 0 | acc | 57.65 | ± 2.19 |
| | | acc_norm | 57.06 | ± 2.19 |
| agieval_lsat_rc | 0 | acc | 69.52 | ± 2.81 |
| | | acc_norm | 68.40 | ± 2.84 |
| agieval_sat_en | 0 | acc | 77.18 | ± 2.93 |
| | | acc_norm | 76.70 | ± 2.95 |
| agieval_sat_en_without_passage | 0 | acc | 48.54 | ± 3.49 |
| | | acc_norm | 48.54 | ± 3.49 |
| agieval_sat_math | 0 | acc | 42.73 | ± 3.34 |
| | | acc_norm | 35.00 | ± 3.22 |

Average: 47.17%

### GPT4All

| Task | Version | Metric | Value | Stderr |
|---|---:|---|---:|---:|
| arc_challenge | 0 | acc | 60.32 | ± 1.43 |
| | | acc_norm | 62.88 | ± 1.41 |
| arc_easy | 0 | acc | 84.76 | ± 0.74 |
| | | acc_norm | 83.80 | ± 0.76 |
| boolq | 1 | acc | 88.41 | ± 0.56 |
| hellaswag | 0 | acc | 66.66 | ± 0.47 |
| | | acc_norm | 84.98 | ± 0.36 |
| openbookqa | 0 | acc | 34.60 | ± 2.13 |
| | | acc_norm | 45.20 | ± 2.23 |
| piqa | 0 | acc | 82.10 | ± 0.89 |
| | | acc_norm | 83.30 | ± 0.87 |
| winogrande | 0 | acc | 77.66 | ± 1.17 |

Average: 75.18%

### TruthfulQA

| Task | Version | Metric | Value | Stderr |
|---|---:|---|---:|---:|
| truthfulqa_mc | 1 | mc1 | 49.57 | ± 1.75 |
| | | mc2 | 65.21 | ± 1.54 |

Average: 65.21%

### Bigbench

| Task | Version | Metric | Value | Stderr |
|---|---:|---|---:|---:|
| bigbench_causal_judgement | 0 | multiple_choice_grade | 62.11 | ± 3.53 |
| bigbench_date_understanding | 0 | multiple_choice_grade | 66.94 | ± 2.45 |
| bigbench_disambiguation_qa | 0 | multiple_choice_grade | 34.88 | ± 2.97 |
| bigbench_geometric_shapes | 0 | multiple_choice_grade | 35.93 | ± 2.54 |
| | | exact_str_match | 0.00 | ± 0.00 |
| bigbench_logical_deduction_five_objects | 0 | multiple_choice_grade | 31.20 | ± 2.07 |
| bigbench_logical_deduction_seven_objects | 0 | multiple_choice_grade | 23.14 | ± 1.60 |
| bigbench_logical_deduction_three_objects | 0 | multiple_choice_grade | 46.67 | ± 2.89 |
| bigbench_movie_recommendation | 0 | multiple_choice_grade | 40.00 | ± 2.19 |
| bigbench_navigate | 0 | multiple_choice_grade | 53.90 | ± 1.58 |
| bigbench_reasoning_about_colored_objects | 0 | multiple_choice_grade | 60.30 | ± 1.09 |
| bigbench_ruin_names | 0 | multiple_choice_grade | 46.65 | ± 2.36 |
| bigbench_salient_translation_error_detection | 0 | multiple_choice_grade | 41.78 | ± 1.56 |
| bigbench_snarks | 0 | multiple_choice_grade | 67.40 | ± 3.49 |
| bigbench_sports_understanding | 0 | multiple_choice_grade | 74.04 | ± 1.40 |
| bigbench_temporal_sequences | 0 | multiple_choice_grade | 59.40 | ± 1.55 |
| bigbench_tracking_shuffled_objects_five_objects | 0 | multiple_choice_grade | 24.64 | ± 1.22 |
| bigbench_tracking_shuffled_objects_seven_objects | 0 | multiple_choice_grade | 17.60 | ± 0.91 |
| bigbench_tracking_shuffled_objects_three_objects | 0 | multiple_choice_grade | 46.67 | ± 2.89 |

Average: 46.29%

Average score: 58.46%

Elapsed time: 03:31:52
