eryk-mazus committed
Commit 2b84ea6
1 Parent(s): bd8ead3

Update README.md

Files changed (1):
  1. README.md +2 -0
README.md CHANGED
@@ -23,6 +23,8 @@ widget:
 
 The training took 425 GPU hours on a single 8 x RTX 4090 machine with DeepSpeed ZeRO-2.
 
+Context size: 2,048 tokens.
+
 ## Notes
 
 This base model was initially developed as a foundation for **instruction tuning, which is currently underway**. Nevertheless, I'm sharing it with the community now, because I recognize the potential value in its blend of relatively strong performance and an efficient bilingual tokenizer.
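The added line documents a 2,048-token context window. As a practical note, prompts plus generated tokens must fit inside that window. Below is a minimal, hypothetical helper (the name `truncate_prompt` is illustrative, not part of the model or its repo) showing one common way to budget the context: keep the most recent prompt tokens and reserve room for generation.

```python
CONTEXT_SIZE = 2048  # context window stated in the README

def truncate_prompt(prompt_ids, max_new_tokens, context_size=CONTEXT_SIZE):
    """Keep the most recent prompt tokens so that
    len(prompt) + max_new_tokens <= context_size."""
    budget = context_size - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens exceeds the context window")
    # Left-truncate: drop the oldest tokens, keep the newest `budget` tokens.
    return prompt_ids[-budget:]
```

With a 3,000-token prompt and 256 tokens reserved for generation, only the last 1,792 token ids survive; shorter prompts pass through unchanged.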