pszemraj committed on
Commit 9af4293
1 Parent(s): d913048

Update README.md

Files changed (1):
1. README.md +22 -14
README.md CHANGED
@@ -1,19 +1,17 @@
  ---
  license: apache-2.0
- base_model: pszemraj/mega-ar-350m-v0.12-napierone_epub
  tags:
  - generated_from_trainer
  metrics:
  - accuracy
- model-index:
- - name: mega-ar-350m-v0.12-napierone_epub-UltraTextbooks-2.1-fw_mix-vN
-   results: []
+ language:
+ - en
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # mega-ar-350m-v0.12-napierone_epub-UltraTextbooks-2.1-fw_mix-vN
+ # mega-ar-350m-v0.13
+
+ ## Model description

  This model is a fine-tuned version of [pszemraj/mega-ar-350m-v0.12-napierone_epub](https://huggingface.co/pszemraj/mega-ar-350m-v0.12-napierone_epub) on the BEE-spoke-data/UltraTextbooks-2.1-fw_mix dataset.
  It achieves the following results on the evaluation set:
@@ -21,17 +19,27 @@ It achieves the following results on the evaluation set:
  - Accuracy: 0.5885
  - Num Input Tokens Seen: 3468165120

- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
+ ## Quick eval
+
+ Quick eval for: pszemraj/mega-ar-350m-v0.13
+
+ hf (pretrained=pszemraj/mega-ar-350m-v0.13,trust_remote_code=True,dtype=float), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8
+
+ |     Tasks    |Version|Filter|n-shot|  Metric  | Value |   |Stderr|
+ |--------------|------:|------|-----:|----------|------:|---|-----:|
+ |arc_easy      |      1|none  |     0|acc       | 0.4491|±  |0.0102|
+ |              |       |none  |     0|acc_norm  | 0.4061|±  |0.0101|
+ |boolq         |      2|none  |     0|acc       | 0.5367|±  |0.0087|
+ |lambada_openai|      1|none  |     0|perplexity|55.3308|±  |2.3100|
+ |              |       |none  |     0|acc       | 0.3113|±  |0.0065|
+ |openbookqa    |      1|none  |     0|acc       | 0.1760|±  |0.0170|
+ |              |       |none  |     0|acc_norm  | 0.2680|±  |0.0198|
+ |piqa          |      1|none  |     0|acc       | 0.6366|±  |0.0112|
+ |              |       |none  |     0|acc_norm  | 0.6213|±  |0.0113|
+ |winogrande    |      1|none  |     0|acc       | 0.5036|±  |0.0141|

  ## Training procedure

@@ -85,4 +93,4 @@ The following hyperparameters were used during training:
  - Transformers 4.40.2
  - Pytorch 2.2.0+cu121
  - Datasets 2.19.1
- - Tokenizers 0.19.1
+ - Tokenizers 0.19.1
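
For reference, a minimal sketch of loading the renamed checkpoint with `transformers`. It assumes the model loads through the standard auto classes with `trust_remote_code=True`, matching the eval command in the updated card; the prompt and generation settings below are illustrative and not part of the commit.

```python
# Minimal sketch: load the fine-tuned checkpoint named in the updated card.
# trust_remote_code=True mirrors the lm-eval command in the card (assumption:
# this MEGA checkpoint is loaded via remote code rather than a built-in class).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pszemraj/mega-ar-350m-v0.13"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Illustrative prompt; greedy decoding keeps the example deterministic.
inputs = tokenizer("The study of calculus begins with", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```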
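The `hf (pretrained=…)` header and results table added in this commit are lm-evaluation-harness output. A hedged sketch of reproducing that quick eval through the harness's Python entry point follows, assuming the v0.4-style `lm_eval.simple_evaluate` API; the model args, task list, and batch size are read off the output header, and num_fewshot is left at the task default (0-shot here).

```python
# Sketch: re-run the "Quick eval" from the card with EleutherAI's
# lm-evaluation-harness (assumes the v0.4-style simple_evaluate API).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=pszemraj/mega-ar-350m-v0.13,trust_remote_code=True,dtype=float",
    tasks=["arc_easy", "boolq", "lambada_openai", "openbookqa", "piqa", "winogrande"],
    batch_size=8,
)
# results["results"] holds per-task metrics, as tabulated in the card.
print(results["results"])
```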