
jamba-H1024_L12-v0.07-fineweb-1M-med


mid-training checkpoint

  • arch: jamba (see model card for kernels/use); a loading sketch follows this list
  • tokenizer: the claude3 vocabulary packaged as an HF GPT2 tokenizer
  • the model has only seen sequences up to 2048 tokens of context thus far
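A minimal loading-and-generation sketch, mirroring the eval settings below (`trust_remote_code=True`, float32 weights). The prompt and the generation settings are illustrative assumptions, not tuned recommendations:

```python
# Sketch: load the checkpoint and sample a short continuation.
# trust_remote_code=True is required because the jamba arch ships custom code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pszemraj/jamba-H1024_L12-v0.07-fineweb-1M-med"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.float32,  # the eval below ran with dtype=float
)

prompt = "The quick brown fox"  # hypothetical prompt
inputs = tokenizer(prompt, return_tensors="pt")
# keep prompt + output under 2048 tokens; that's all the model has seen so far
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```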

numbers

Results for this checkpoint, from lm-evaluation-harness (a sketch for reproducing the run follows the table):

`hf (pretrained=pszemraj/jamba-H1024_L12-v0.07-fineweb-1M-med,trust_remote_code=True,dtype=float), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8`

| Tasks          | Version | Filter | n-shot | Metric     |    Value |   | Stderr |
|----------------|--------:|--------|-------:|------------|---------:|---|-------:|
| winogrande     |       1 | none   |      0 | acc        |   0.4972 | ± | 0.0141 |
| piqa           |       1 | none   |      0 | acc        |   0.6072 | ± | 0.0114 |
|                |         | none   |      0 | acc_norm   |   0.6034 | ± | 0.0114 |
| openbookqa     |       1 | none   |      0 | acc        |   0.1660 | ± | 0.0167 |
|                |         | none   |      0 | acc_norm   |   0.2800 | ± | 0.0201 |
| lambada_openai |       1 | none   |      0 | perplexity | 157.6757 | ± | 6.8536 |
|                |         | none   |      0 | acc        |   0.2127 | ± | 0.0057 |
| boolq          |       2 | none   |      0 | acc        |   0.6235 | ± | 0.0085 |
| arc_easy       |       1 | none   |      0 | acc        |   0.3944 | ± | 0.0100 |
|                |         | none   |      0 | acc_norm   |   0.3531 | ± | 0.0098 |
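A sketch for reproducing the run above via the lm-evaluation-harness Python API, assuming a recent (v0.4+) `lm-eval` install; the model args, task list, and batch size are taken from the header line and table above:

```python
# Sketch: re-run the eval above with lm-evaluation-harness (pip install lm-eval).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=pszemraj/jamba-H1024_L12-v0.07-fineweb-1M-med,"
        "trust_remote_code=True,dtype=float"
    ),
    tasks=["winogrande", "piqa", "openbookqa", "lambada_openai", "boolq", "arc_easy"],
    num_fewshot=0,  # matches the 0-shot results in the table
    batch_size=8,
)
print(results["results"])  # per-task metrics, as tabulated above
```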
Model size: 888M params (Safetensors; tensor types F32 · BF16)
