
This model is the result of applying a LASER intervention to Mihaiii/Pallas-0.5-LASER-0.1.

Configs used:

  • lnum: 58
  • lnames: attn (meaning: ["self_attn.k_proj.weight", "self_attn.q_proj.weight", "self_attn.v_proj.weight", "self_attn.o_proj.weight"])
  • rate: 9.0
  • dataset: bigbench (subset: causal_judgement)
  • intervention type: rank-reduction
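
As a rough illustration of what a rank-reduction intervention does to one weight matrix, here is a minimal NumPy sketch. It is not the code used to produce this model. It assumes the convention I understand the reference LASER code to use, where `rate` runs from 0 to 10 and the fraction of singular values kept is `1 - rate / 10`, so `rate: 9.0` keeps about 10% of the maximum rank. The helper names `keep_fraction` and `low_rank_approx` and the 64x64 toy matrix are hypothetical.

```python
import numpy as np

# Matrices targeted by lnames "attn", as listed in the configs above.
ATTN_MATRICES = (
    "self_attn.k_proj.weight",
    "self_attn.q_proj.weight",
    "self_attn.v_proj.weight",
    "self_attn.o_proj.weight",
)


def keep_fraction(rate: float) -> float:
    """Assumed mapping from a 0-10 rate to the fraction of rank kept."""
    return 1.0 - rate / 10.0


def low_rank_approx(weight: np.ndarray, rate: float) -> np.ndarray:
    """Return the truncated-SVD approximation of `weight` for a given rate."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    max_rank = s.shape[0]
    # Keep at least one singular value so the result is never all zeros.
    k = max(1, int(round(keep_fraction(rate) * max_rank)))
    # Scale the first k left singular vectors by their singular values,
    # then project back with the first k right singular vectors.
    return (u[:, :k] * s[:k]) @ vt[:k, :]


# Toy stand-in for one attention projection (hypothetical 64x64 size).
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64))
w_reduced = low_rank_approx(w, rate=9.0)
print(w_reduced.shape, np.linalg.matrix_rank(w_reduced))
```

In the actual intervention, the same kind of replacement would be applied to each of the four attention matrices of the layer selected by `lnum`, and the modified weights saved as the new model.
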

| Name | Validation acc (higher is better) | Validation logloss (lower is better) | Test acc (higher is better) | Test logloss (lower is better) |
|---|---|---|---|---|
| Pallas-0.5 | 55.263 | 1.650 | 60.526 | 1.463 |
| Pallas-0.5-LASER-0.1 | 55.263 | 1.639 | 61.184 | 1.451 |
| Pallas-0.5-LASER-0.2 | 55.263 | 1.646 | 61.184 | 1.458 |
| Pallas-0.5-LASER-0.3 | 55.263 | 1.575 | 61.842 | 1.382 |
| Pallas-0.5-LASER-0.4 | 55.263 | 1.525 | 61.842 | 1.326 |
| Pallas-0.5-LASER-0.5 | 55.263 | 1.484 | 61.842 | 1.297 |
| Pallas-0.5-LASER-0.6 | 55.263 | 1.455 | 61.184 | 1.283 |

To replicate on a single A100, you can use my branch (the original code runs out of memory on 34B models).

Model size: 34.4B params (Safetensors). Tensor type: BF16.