
# Phi-2-psy

Phi-2-psy is a SLERP merge of the following models:

* [rhysjones/phi-2-orange](https://huggingface.co/rhysjones/phi-2-orange)
* [cognitivecomputations/dolphin-2_6-phi-2](https://huggingface.co/cognitivecomputations/dolphin-2_6-phi-2)

πŸ† Evaluation

The evaluation was performed using LLM AutoEval on the Nous suite.

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---:|---:|---:|---:|---:|
| **phi-2-psy** | **34.4** | **71.4** | **48.2** | **38.1** | **48.02** |
| phixtral-2x2_8 | 34.1 | 70.4 | 48.8 | 37.8 | 47.78 |
| dolphin-2_6-phi-2 | 33.1 | 69.9 | 47.4 | 37.2 | 46.89 |
| phi-2-orange | 33.4 | 71.3 | 49.9 | 37.3 | 47.97 |
| phi-2 | 28.0 | 70.8 | 44.4 | 35.2 | 44.61 |

## 🧩 Configuration

```yaml
slices:
  - sources:
      - model: rhysjones/phi-2-orange
        layer_range: [0, 32]
      - model: cognitivecomputations/dolphin-2_6-phi-2
        layer_range: [0, 32]
merge_method: slerp
base_model: rhysjones/phi-2-orange
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
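For intuition: `merge_method: slerp` blends each pair of weight tensors along the great-circle arc between them rather than along a straight line, and the `t` lists above are interpolated across the layer stack, so one parent dominates at `t=0` and the other at `t=1`, with self-attention and MLP weights blended on opposite schedules. Below is a minimal sketch of spherical linear interpolation on a single tensor pair; it illustrates the idea only and is not the mergekit implementation (the near-parallel fallback to plain linear interpolation is an assumption):

```python
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation: t=0 returns v0, t=1 returns v1."""
    # Compute the angle between the two tensors, treated as flat unit vectors.
    u0 = v0.flatten() / (v0.norm() + eps)
    u1 = v1.flatten() / (v1.norm() + eps)
    dot = torch.clamp(u0 @ u1, -1.0, 1.0)
    omega = torch.arccos(dot)
    if omega.abs() < 1e-4:
        # Nearly parallel tensors: fall back to ordinary linear interpolation.
        return (1 - t) * v0 + t * v1
    sin_omega = torch.sin(omega)
    # Weights follow the arc, keeping the interpolant's "direction" well-behaved.
    return (torch.sin((1 - t) * omega) / sin_omega) * v0 \
         + (torch.sin(t * omega) / sin_omega) * v1

# e.g. a 50/50 blend of two weight tensors of the same shape:
# merged = slerp(0.5, w_orange, w_dolphin)
```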

## 💻 Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.set_default_device("cuda")

# trust_remote_code is required because the phi-2 architecture ships custom modeling code
model = AutoModelForCausalLM.from_pretrained("vince62s/phi-2-psy", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("vince62s/phi-2-psy", trust_remote_code=True)

# Prompt the model to complete a Python function
inputs = tokenizer('''def print_prime(n):
    """
    Print all primes between 1 and n
    """''', return_tensors="pt", return_attention_mask=False)

outputs = model.generate(**inputs, max_length=200)
text = tokenizer.batch_decode(outputs)[0]
print(text)
```
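If the model does not fit in GPU memory, it can also be loaded with 4-bit quantization. A sketch, assuming the `bitsandbytes` and `accelerate` packages are installed; the quantization settings shown are illustrative defaults, not values tested with this model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Quantize weights to 4-bit on load; compute in bfloat16 (illustrative choice)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "vince62s/phi-2-psy",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("vince62s/phi-2-psy", trust_remote_code=True)
```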

## Open LLM Leaderboard Evaluation Results

Detailed results can be found here.

| Metric | Value |
|---|---:|
| Avg. | 62.80 |
| AI2 Reasoning Challenge (25-Shot) | 60.84 |
| HellaSwag (10-Shot) | 75.52 |
| MMLU (5-Shot) | 57.57 |
| TruthfulQA (0-shot) | 48.22 |
| Winogrande (5-shot) | 75.45 |
| GSM8k (5-shot) | 59.21 |