Mihaiii/Pallas-0.5-LASER-0.2

This model is the result of applying a LASER intervention to Mihaiii/Pallas-0.5-LASER-0.1.

Configs used:

lnum: 58
lnames: attn (meaning: ["self_attn.k_proj.weight", "self_attn.q_proj.weight", "self_attn.v_proj.weight", "self_attn.o_proj.weight"])
rate: 9.0
dataset: bigbench (subset: causal_judgement)
intervention type: rank-reduction
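
To illustrate what a rank-reduction intervention does to a single weight matrix, here is a minimal NumPy sketch. It is not the LASER code used for this model. The function name `low_rank_approx` is hypothetical, and the mapping from `rate` to the kept rank (keep a fraction of 1 - rate / 10 of the singular values, so rate 9.0 keeps roughly 10%) is an assumption about how the setting is interpreted.

```python
import numpy as np


def low_rank_approx(weight: np.ndarray, rate: float) -> np.ndarray:
    """Return a rank-reduced copy of `weight` via truncated SVD.

    Assumption: `rate` lies in [0, 10) and the fraction of singular
    values kept is 1 - rate / 10 (so rate=9.0 keeps about 10%).
    """
    if not 0.0 <= rate < 10.0:
        raise ValueError("rate must be in [0, 10)")
    # Thin SVD: weight = u @ diag(s) @ vt, singular values sorted descending.
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    max_rank = s.shape[0]
    keep = max(1, int(round(max_rank * (1.0 - rate / 10.0))))
    # Rebuild the matrix from the `keep` largest singular components only.
    return (u[:, :keep] * s[:keep]) @ vt[:keep, :]


# Toy stand-in for one attention projection matrix (not real model weights).
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 32))
w_reduced = low_rank_approx(w, rate=9.0)
w_unchanged = low_rank_approx(w, rate=0.0)
```

In the configuration above, an operation of this kind would be applied to the four attention projection matrices of layer 58. The reduced matrix keeps the original shape, so it can replace the original weight in place.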

Name	Validation acc (higher is better)	Validation logloss (lower is better)	Test acc (higher is better)	Test logloss (lower is better)
Pallas-0.5	55.263	1.650	60.526	1.463
Pallas-0.5-LASER-0.1	55.263	1.639	61.184	1.451
Pallas-0.5-LASER-0.2	55.263	1.646	61.184	1.458
Pallas-0.5-LASER-0.3	55.263	1.575	61.842	1.382
Pallas-0.5-LASER-0.4	55.263	1.525	61.842	1.326
Pallas-0.5-LASER-0.5	55.263	1.484	61.842	1.297
Pallas-0.5-LASER-0.6	55.263	1.455	61.184	1.283
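
For readers unfamiliar with the two metrics in the table, the sketch below computes accuracy (as a percentage) and mean log loss from per-example class probabilities. It uses the standard definitions and is an assumption about how the numbers were obtained, since the card does not describe the evaluation code. The function name `accuracy_and_logloss` and the toy inputs are hypothetical.

```python
import math


def accuracy_and_logloss(probs, labels):
    """Compute accuracy (%) and mean negative log-likelihood.

    probs:  list of per-class probability lists, one per example.
    labels: list of correct class indices, one per example.
    """
    correct = 0
    total_nll = 0.0
    for p, y in zip(probs, labels):
        # Predicted class is the index with the highest probability.
        pred = max(range(len(p)), key=p.__getitem__)
        correct += int(pred == y)
        # Clamp to avoid log(0) when the true class gets zero probability.
        total_nll += -math.log(max(p[y], 1e-12))
    n = len(labels)
    return 100.0 * correct / n, total_nll / n


# Toy binary example (made-up probabilities, not model outputs).
acc, logloss = accuracy_and_logloss(
    [[0.9, 0.1], [0.4, 0.6], [0.7, 0.3], [0.2, 0.8]],
    [0, 1, 1, 1],
)
```

Under these definitions accuracy only changes when a prediction flips, while log loss also reflects how confident each prediction is, which is one way the validation accuracy column can stay constant while log loss varies.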

To replicate this on a single A100, you can use my branch (the original code runs out of memory on 34B models).


Safetensors: model size 34.4B params, tensor type BF16
