sofial committed
Commit
8c6f40a
1 Parent(s): 02e5a08

Update README.md

Files changed (1): README.md (+9, −0)
@@ -45,6 +45,15 @@ Prompt sentences are tokenized and packed together to form 1024 token sequences,
Since the model is trained to predict the next token, labels are simply the input sequence shifted by one token.
Given the training format, no extra care is needed to account for different sequences: the model does not need to know which sentence a token belongs to.
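The packing-and-shifting scheme above can be sketched in plain Python. The token IDs, block size, and helper names here are illustrative (the README uses 1024-token blocks), and `-100` is the ignore index PyTorch losses conventionally skip:

```python
# Minimal sketch: pack tokenized sentences into fixed-length blocks, then
# derive next-token labels by shifting the inputs left by one position.

BLOCK_SIZE = 8  # the README packs to 1024 tokens; shortened for illustration

def pack(token_streams, block_size):
    """Concatenate tokenized sentences and split into fixed-size blocks,
    dropping the trailing remainder that does not fill a block. Sentence
    boundaries are ignored: a sentence may span two blocks."""
    flat = [tok for sent in token_streams for tok in sent]
    n_blocks = len(flat) // block_size
    return [flat[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]

def shift_labels(block):
    """Labels are the inputs shifted by one token; the final position has no
    target, marked with -100 (the ignore index used by PyTorch losses)."""
    return block[1:] + [-100]

sentences = [[1, 2, 3], [4, 5], [6, 7, 8, 9], [10, 11, 12, 13, 14]]
inputs = pack(sentences, BLOCK_SIZE)[0]
labels = shift_labels(inputs)
print(inputs)  # [1, 2, 3, 4, 5, 6, 7, 8]
print(labels)  # [2, 3, 4, 5, 6, 7, 8, -100]
```

Note how tokens from different sentences share one block: the loss at each position depends only on the preceding tokens, so no sentence-membership information is needed.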
### Hyperparameters
- epochs:
- optimiser: AdamW (beta1: 0.9, beta2: 0.999, eps: 1e-6, weight decay: 0.0, learning rate: 5e-6)
- learning rate schedule: warmup (min: 1e-7, max: 5e-6, warmup proportion: 0.005995)
- batch size: 128
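The warmup schedule above can be sketched as a pure function of the training step. Only the min/max rates and warmup proportion come from this README; the total step count and the post-warmup behaviour (holding at the maximum rather than decaying) are assumptions:

```python
# Sketch of the warmup schedule: the learning rate ramps linearly from the
# minimum (1e-7) to the maximum (5e-6) over the first ~0.6% of training
# steps, then holds (post-warmup behaviour assumed).

MIN_LR, MAX_LR = 1e-7, 5e-6
WARMUP_PROPORTION = 0.005995

def learning_rate(step, total_steps):
    warmup_steps = max(1, int(total_steps * WARMUP_PROPORTION))
    if step < warmup_steps:
        return MIN_LR + (step / warmup_steps) * (MAX_LR - MIN_LR)
    return MAX_LR

print(learning_rate(0, 100_000))       # 1e-07 (start of warmup)
print(learning_rate(50_000, 100_000))  # 5e-06 (warmup finished)
```

With the AdamW settings listed above, this would plug into an optimiser as `torch.optim.AdamW(params, lr=5e-6, betas=(0.9, 0.999), eps=1e-6, weight_decay=0.0)`, with the schedule applied per step.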
## Performance
The resulting model matches SOTA performance, reaching 82.5% accuracy.
## How to use
The model can be easily loaded using `AutoModelForCausalLM`. You can use the pipeline API for text generation.
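A minimal sketch of both loading paths mentioned above, using the standard transformers API. The model id `"your-org/your-model"` is a placeholder; substitute the id of this repository:

```python
# Load the model and tokenizer, then generate text via the pipeline API.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "your-org/your-model"  # placeholder; use this repository's model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
result = generator("Once upon a time", max_new_tokens=20)
print(result[0]["generated_text"])
```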