jamba-H1024_L12-v0.07-fineweb-1M-med

A mid-training checkpoint.

- arch: jamba (see the model card for kernels/usage notes)
- tokenizer: the claude3 tokenizer, packaged as a Hugging Face GPT2 tokenizer
- the model has only seen context lengths up to 2048 tokens so far
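Given the notes above, a minimal loading sketch with the `transformers` library (this is an assumption about usage, not an official snippet; the custom jamba code path requires `trust_remote_code=True`, and inputs are capped at the 2048-token context the checkpoint has seen):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pszemraj/jamba-H1024_L12-v0.07-fineweb-1M-med"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# The checkpoint has only trained on sequences up to 2048 tokens, so truncate.
prompt = "The quick brown fox"
inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=2048)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```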
Evaluation numbers for this checkpoint:
`hf (pretrained=pszemraj/jamba-H1024_L12-v0.07-fineweb-1M-med,trust_remote_code=True,dtype=float), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8`
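The results below should be reproducible with EleutherAI's lm-evaluation-harness; a hedged sketch of the command implied by the config line above (exact harness version and defaults may differ):

```shell
pip install lm-eval
lm_eval --model hf \
  --model_args pretrained=pszemraj/jamba-H1024_L12-v0.07-fineweb-1M-med,trust_remote_code=True,dtype=float \
  --tasks winogrande,piqa,openbookqa,lambada_openai,boolq,arc_easy \
  --batch_size 8
```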
| Tasks | Version | Filter | n-shot | Metric | Value | ± | Stderr |
|---|---|---|---|---|---|---|---|
| winogrande | 1 | none | 0 | acc | 0.4972 | ± | 0.0141 |
| piqa | 1 | none | 0 | acc | 0.6072 | ± | 0.0114 |
| | | none | 0 | acc_norm | 0.6034 | ± | 0.0114 |
| openbookqa | 1 | none | 0 | acc | 0.1660 | ± | 0.0167 |
| | | none | 0 | acc_norm | 0.2800 | ± | 0.0201 |
| lambada_openai | 1 | none | 0 | perplexity | 157.6757 | ± | 6.8536 |
| | | none | 0 | acc | 0.2127 | ± | 0.0057 |
| boolq | 2 | none | 0 | acc | 0.6235 | ± | 0.0085 |
| arc_easy | 1 | none | 0 | acc | 0.3944 | ± | 0.0100 |
| | | none | 0 | acc_norm | 0.3531 | ± | 0.0098 |