naxalpha committed on
Commit
d230351
1 Parent(s): 2a1bf08

update the readme

.ipynb_checkpoints/README-checkpoint.md CHANGED
@@ -60,8 +60,9 @@ Here are the details of the training:
  - Gradient Norm Clipping: `1.0`
  - Hardware: `RTX 3090` on [vast.ai](vast.ai)
  - Training Cost: `~20$`
- - Training Time: `~2 days`
- - Number of steps: `434,000`
- - Tokens seen: `444 million`
+ - Training Time: `~3 days`
+ - Number of steps: `557,000`
+ - Tokens seen: `570 million`
+ - Final loss: `~3.9`
 
  Training code is available in this repo. [Link to the training script](https://huggingface.co/naxalpha/gated-state-space/blob/main/app.py).
README.md CHANGED
@@ -60,8 +60,9 @@ Here are the details of the training:
  - Gradient Norm Clipping: `1.0`
  - Hardware: `RTX 3090` on [vast.ai](vast.ai)
  - Training Cost: `~20$`
- - Training Time: `~2 days`
- - Number of steps: `434,000`
- - Tokens seen: `444 million`
+ - Training Time: `~3 days`
+ - Number of steps: `557,000`
+ - Tokens seen: `570 million`
+ - Final loss: `~3.9`
 
  Training code is available in this repo. [Link to the training script](https://huggingface.co/naxalpha/gated-state-space/blob/main/app.py).