Commit e1278b8 (1 parent: 26c238e), committed by Muennighoff

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -91,8 +91,8 @@ print(tokenizer.decode(outputs[0]))
  ## Model
 
  - **Architecture:** GPT-2 model with multi-query attention and Fill-in-the-Middle objective
- - **Steps:** 250k pretraining & TODO instruction tuning
- - **Pretraining tokens:** 1 trillion pretraining & TODO instruction tuning
+ - **Steps:** 250k pretraining & 30 instruction tuning
+ - **Pretraining tokens:** 1 trillion pretraining & 2M instruction tuning
  - **Precision:** bfloat16
 
  ## Hardware
@@ -101,8 +101,8 @@ print(tokenizer.decode(outputs[0]))
  - **GPUs:** 512 Tesla A100
  - **Training time:** 24 days
  - **Instruction tuning:**
- - **GPUs:** TODO Tesla A100
- - **Training time:** TODO days
+ - **GPUs:** 8 Tesla A100
+ - **Training time:** 4 hours
 
  ## Software
 
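
The figures filled in by this commit can be sanity-checked with some quick arithmetic. The sketch below is not part of the README. It assumes that every GPU was busy for the full wall-clock time listed, that "2M" means 2 million tokens, and that tokens were spread evenly across steps. The helper name `gpu_hours` is made up for illustration.

```python
# Back-of-the-envelope figures derived from the numbers in the updated README.
# Assumption: all GPUs run for the entire listed duration (no idle time).

def gpu_hours(num_gpus: int, hours: float) -> float:
    """Total GPU-hours for num_gpus running for the given number of hours."""
    return num_gpus * hours

# Pretraining: 512 GPUs for 24 days. Instruction tuning: 8 GPUs for 4 hours.
pretraining_gpu_hours = gpu_hours(512, 24 * 24)
instruction_gpu_hours = gpu_hours(8, 4)

# Average tokens per step, assuming tokens are spread evenly over steps.
pretraining_tokens_per_step = 1_000_000_000_000 / 250_000
# Assumption: "2M" in the README means 2 million tokens.
instruction_tokens_per_step = 2_000_000 / 30

print(f"Pretraining GPU-hours:        {pretraining_gpu_hours:,.0f}")
print(f"Instruction tuning GPU-hours: {instruction_gpu_hours:,.0f}")
print(f"Pretraining tokens/step:      {pretraining_tokens_per_step:,.0f}")
print(f"Instruction tokens/step:      {instruction_tokens_per_step:,.0f}")
```

Under these assumptions, instruction tuning accounts for a very small share of the total compute compared with pretraining.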