GeoV/GeoV-9b

@@ -8,7 +8,7 @@ license: bigscience-openrail-m
 ---
-[GeoV](https://huggingface.co/docs/transformers/model_doc/geov)-9B is a 9 billion parameter autoregressive language model.
 The GeoV model was designed by Georges Harik and uses
 [Rotary Positional Embeddings with Relative distances (RoPER)](http://research.labml.ai/RoPER.html)
@@ -36,9 +36,12 @@ RoPER has given better performance in some algorithmic tasks, and seems comparab
 | n<sub>heads</sub>      | 40          |
 | d<sub>head</sub>       | 128         |
 | n<sub>vocab</sub>      | 65500       |
-| Sequence Length        | 2049        |
 </figure>
 ## Generation

 ---
+[GeoV](https://huggingface.co/docs/transformers/model_doc/geov)-9B is a 9 billion parameter causal language model.
 The GeoV model was designed by Georges Harik and uses
 [Rotary Positional Embeddings with Relative distances (RoPER)](http://research.labml.ai/RoPER.html)
 | n<sub>heads</sub>      | 40          |
 | d<sub>head</sub>       | 128         |
 | n<sub>vocab</sub>      | 65500       |
+| Sequence Length        | 2048        |
 </figure>
+The released weights were trained on ~70 billion tokens.
+We plan to continue training up to 300 billion tokens and update the weights every 20 billion tokens.
+This training run is monolingual and uses the c4en and English Wikipedia datasets.
 ## Generation