pszemraj committed
Commit 9c9c090
Parent: c9ed0ac

Update README.md

Files changed (1): README.md (+8, -0)
README.md CHANGED
@@ -71,6 +71,14 @@ A small 101M param (total) decoder model. This is the first version of the model
- GQA (24 heads, 8 key-value), context length 1024
- train-from-scratch

+
+ ## Features
+
+ Some cool anecdotes about this model:
+
+ - this model was pretrained on **one GPU** for 5 compute-days. You can DIY pretrain too!
+ - 0% of the training data (to our knowledge) comes from OpenAI synthetic generation
+

## Notes

**This checkpoint** is the 'raw' pre-trained model and has not been tuned to a more specific task. **It should be fine-tuned** before use in most cases.
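
The `GQA (24 heads, 8 key-value)` line maps directly onto a `transformers` `LlamaConfig`. A minimal sketch of that attention layout follows; the head counts and context length come from the card above, while every other dimension is an illustrative placeholder, not this model's actual configuration:

```python
from transformers import LlamaConfig

# Head counts and context length are from the model card; hidden_size and
# num_hidden_layers are placeholders, not the real dimensions of this model.
config = LlamaConfig(
    num_attention_heads=24,        # 24 query heads
    num_key_value_heads=8,         # 8 shared key/value heads -> GQA
    max_position_embeddings=1024,  # context length 1024
    hidden_size=768,               # placeholder
    num_hidden_layers=6,           # placeholder
)

# Each key/value head is shared across 24 / 8 = 3 query heads, so the
# KV cache is one third the size of full multi-head attention.
assert config.num_attention_heads % config.num_key_value_heads == 0
```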
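
Since the card stresses that this raw checkpoint should be fine-tuned before use, here is a minimal loading sketch; the repo id below is a placeholder, not taken from this commit, so substitute the checkpoint's actual Hub id:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder Hub id; replace with the actual repository for this checkpoint.
model_id = "your-org/smol_llama-101M"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# From here, fine-tune with any standard causal-LM training loop
# (e.g. the transformers Trainer or TRL's SFTTrainer) before deployment.
```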