broskicodes committed on
Commit
e9bb729
1 Parent(s): 2a434e9

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -16,7 +16,7 @@ The goal is to experiment with creating small language models that can perform h
  ## Model Details
  The model has 4M parameters (Safetensors seems to have inflated this to 13M, I will look into why in the future). This model has not been fine-tuned for instructions. It will simply spew out text when asked. I will be working on an instruct model in the coming days.
 
- The model is a decoder only transformer model with 4 decoder layers and 2 attention heads. The model was trained on only ~50MB of text and can already produce semi-coherent stories.
+ The model is a decoder only transformer model with 4 decoder layers and 2 attention heads. The model was trained for 3 epochs on only ~50MB of text and can already produce semi-coherent stories.
 
  The code used to train the model can be found on my [github](https://github.com/broskicodes/slms). For now, this is also the only way to train and obtain the tokenizer necessary for encoding and decoding text. Check it out if you are interested.
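
The architecture in the changed line (decoder-only transformer, 4 decoder layers, 2 attention heads, ~4M parameters) can be sanity-checked with a back-of-the-envelope parameter count. This is a minimal sketch: the vocabulary size (8192), model width (224), and 4x feed-forward expansion are assumptions chosen to land near the stated 4M, not values taken from the repository.

```python
# Hypothetical parameter count for a GPT-style decoder-only transformer.
# Only n_layers=4 comes from the README; vocab_size, d_model, and the 4x
# MLP width are assumed. Biases on linear layers are omitted for simplicity,
# and the LM head is assumed tied to the token embedding.

def decoder_param_count(vocab_size: int, d_model: int, n_layers: int) -> int:
    """Approximate weight count for a decoder-only transformer."""
    embeddings = vocab_size * d_model      # token embedding (tied LM head)
    attn = 4 * d_model * d_model           # Q, K, V, and output projections
    mlp = 2 * (4 * d_model) * d_model      # up- and down-projection, 4x width
    layer_norms = 2 * (2 * d_model)        # two LayerNorms (scale + bias)
    per_layer = attn + mlp + layer_norms
    final_norm = 2 * d_model
    return embeddings + n_layers * per_layer + final_norm

total = decoder_param_count(vocab_size=8192, d_model=224, n_layers=4)
print(f"{total:,}")  # 4,247,488 -- roughly the 4M the README states
```

Note that the number of attention heads (2) does not affect the count: heads split `d_model` among themselves, so the projection matrices have the same total size regardless.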