crumb committed
Commit 30bdecb
Parent(s): 755a695

Update README.md

Files changed (1):
1. README.md (+5 -11)
README.md CHANGED
@@ -11,17 +11,14 @@ should probably proofread and complete it, then remove this comment. -->
 
 # gpt-fake-lang-17m
 
-This model is a pre-trained GPT2 (with 17m parameters) on a synthetic dataset (1gb of documents created in 4 fake languages, each with a formal and informal writing style).
+This model is a GPT-J (17M parameters) pre-trained for 1 epoch on a synthetic dataset (1 GB of documents created in 4 fake languages, each with a formal and an informal writing style).
+
 It achieves the following results on the evaluation set:
 - Loss: 3.5592
 
-## Model description
-
-More information needed
-
 ## Intended uses & limitations
 
-More information needed
+This model is intended as a base model for fine-tuning on any language/task, to probe the effectiveness both of pre-training on an algorithmically generated corpus and of extremely small models. It can only generate text resembling its training data (which will be uploaded as a Hugging Face dataset soon).
 
 ## Training and evaluation data
 
@@ -33,12 +30,9 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.001
-- train_batch_size: 4
-- eval_batch_size: 4
+- batch_size: 64
 - seed: 42
-- gradient_accumulation_steps: 16
-- total_train_batch_size: 64
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- optimizer: Adam
 - lr_scheduler_type: linear
 - num_epochs: 1
 
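
Beyond the diff, a quick sanity check on the reported number: if the eval loss is mean cross-entropy in nats (the `transformers` convention), it corresponds to a perplexity of about e^3.5592 ≈ 35.1 on the synthetic evaluation set. Below is a minimal sketch of loading the checkpoint and sampling from it, assuming the model is published under the committer's namespace as `crumb/gpt-fake-lang-17m` (an assumed Hub id; check the actual repository) and exposes the standard `transformers` causal-LM interface.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the Hub id mirrors the model name; adjust if it differs.
model_id = "crumb/gpt-fake-lang-17m"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Pre-training only covered the 4 fake languages, so any prompt is
# continued in fake-language text rather than English.
inputs = tokenizer("lorem ipsum", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```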
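
For the fine-tuning use case the card describes, here is a sketch of reusing the listed hyperparameters with `transformers.Trainer`. The corpus (`texts`) is a hypothetical placeholder; the card's `batch_size: 64` is mapped to `per_device_train_batch_size`, which is equivalent only on a single device (the deleted hyperparameters suggest 64 was originally reached as train_batch_size 4 × gradient_accumulation_steps 16); and `Trainer`'s default AdamW stands in for the card's Adam.

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "crumb/gpt-fake-lang-17m"  # assumed Hub id, as above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# GPT-2-style tokenizers ship without a pad token; reuse EOS for padding.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Hypothetical downstream corpus; replace with your own documents.
texts = ["example document one", "example document two"]
train_ds = [tokenizer(t) for t in texts]  # a list works as a map-style dataset

# Hyperparameters copied from the card: lr 0.001, batch size 64,
# linear LR schedule, 1 epoch, seed 42.
args = TrainingArguments(
    output_dir="gpt-fake-lang-17m-finetuned",
    learning_rate=1e-3,
    per_device_train_batch_size=64,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```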