---
license: apache-2.0
datasets:
  - BEE-spoke-data/fineweb-1M_en-med
language:
  - en
tags:
  - jamba
  - claude3 tokenizer
---

# jamba-H1024_L12-v0.07-fineweb-1M-med


This is a mid-training checkpoint.

- arch: jamba (see the model card for required kernels and usage)
- tokenizer: the claude3 tokenizer, wrapped as a Hugging Face GPT-2 tokenizer
- trained only on sequences up to 2048 tokens of context so far
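A minimal usage sketch, assuming the standard `transformers` auto-class path; `trust_remote_code=True` matches the eval invocation below, and the prompt text and generation settings here are illustrative, not from the original card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pszemraj/jamba-H1024_L12-v0.07-fineweb-1M-med"

# trust_remote_code is required: the Jamba architecture ships as custom code
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# keep prompts well inside the 2048-token context this checkpoint has seen
inputs = tokenizer("The open sea", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```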

## numbers

Zero-shot evaluation results for this checkpoint:

```
hf (pretrained=pszemraj/jamba-H1024_L12-v0.07-fineweb-1M-med,trust_remote_code=True,dtype=float), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8
```

| Tasks          | Version | Filter | n-shot | Metric     |    Value |   Stderr |
|----------------|--------:|--------|-------:|------------|---------:|---------:|
| winogrande     |       1 | none   |      0 | acc        |   0.4972 | ± 0.0141 |
| piqa           |       1 | none   |      0 | acc        |   0.6072 | ± 0.0114 |
|                |         | none   |      0 | acc_norm   |   0.6034 | ± 0.0114 |
| openbookqa     |       1 | none   |      0 | acc        |   0.1660 | ± 0.0167 |
|                |         | none   |      0 | acc_norm   |   0.2800 | ± 0.0201 |
| lambada_openai |       1 | none   |      0 | perplexity | 157.6757 | ± 6.8536 |
|                |         | none   |      0 | acc        |   0.2127 | ± 0.0057 |
| boolq          |       2 | none   |      0 | acc        |   0.6235 | ± 0.0085 |
| arc_easy       |       1 | none   |      0 | acc        |   0.3944 | ± 0.0100 |
|                |         | none   |      0 | acc_norm   |   0.3531 | ± 0.0098 |
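As a rough sanity check on the lambada_openai row, perplexity is the exponential of the mean cross-entropy, so the reported value can be converted back to per-token loss; a small sketch (the conversion itself is standard, only the rounding in the comment is mine):

```python
import math

# lambada_openai perplexity reported in the table above
ppl = 157.6757

# perplexity = exp(mean cross-entropy), so invert with a natural log
nats_per_token = math.log(ppl)
bits_per_token = nats_per_token / math.log(2)

print(f"{nats_per_token:.2f} nats/token, {bits_per_token:.2f} bits/token")
# → 5.06 nats/token, 7.30 bits/token
```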