Update README.md
README.md
CHANGED
@@ -9,6 +9,9 @@ metrics:
 
 # jamba-H1024_L12-v0.13-KIx2
 
+<a href="https://colab.research.google.com/gist/pszemraj/62d037d0d93656ef2101d7e29e3b7220/jamba-test-sandbox.ipynb">
+<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
+</a>
 
 This is a pretraining experiment on the `jamba` arch as a "smol MoE". Details:
 
@@ -29,8 +32,8 @@ if I pretrain it further, other versions will be in new repos with incremented v
 Quick eval for: pszemraj/jamba-H1024_L12-v0.13-KIx2
 
 
-bootstrapping for stddev: perplexity
 hf (pretrained=pszemraj/jamba-H1024_L12-v0.13-KIx2,trust_remote_code=True,dtype=float), gen_kwargs: (None), limit: 0.9999, num_fewshot: None, batch_size: 8
+
 | Tasks |Version|Filter|n-shot| Metric | Value | |Stderr|
 |--------------|------:|------|-----:|----------|-------:|---|-----:|
 |winogrande | 1|none | 0|acc | 0.5067|± |0.0141|
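
A note on reading the `winogrande` row: the reported accuracy is close to the 0.5 chance level of a two-choice task, so it can help to check how far it sits from chance relative to the reported stderr. The sketch below is only an illustration and not part of the README or of the eval harness. The helper names are hypothetical, and the example count `n=1267` is an assumption about the evaluation split size; the source states only `limit: 0.9999`.

```python
import math


def z_score_vs_chance(acc: float, stderr: float, chance: float = 0.5) -> float:
    """Distance of an accuracy from chance, in units of its standard error."""
    return (acc - chance) / stderr


def confidence_interval(acc: float, stderr: float, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation interval (z=1.96 gives roughly 95%)."""
    return acc - z * stderr, acc + z * stderr


def binomial_stderr(acc: float, n: int) -> float:
    """Standard error of a proportion over n independent examples."""
    return math.sqrt(acc * (1.0 - acc) / n)


# Values taken from the table in the diff above.
ACC, STDERR = 0.5067, 0.0141

z = z_score_vs_chance(ACC, STDERR)
low, high = confidence_interval(ACC, STDERR)
print(f"z vs chance: {z:.2f}")
print(f"approx. 95% interval: [{low:.4f}, {high:.4f}]")

# ASSUMPTION: n=1267 is a guess at the number of evaluated examples,
# used only to see whether a plain binomial stderr lands near 0.0141.
print(f"binomial stderr at n=1267: {binomial_stderr(ACC, 1267):.4f}")
```

Under these assumptions the interval straddles 0.5, which is consistent with describing this checkpoint as an early pretraining experiment rather than a tuned model.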