oepen committed on
Commit
48e5e27
1 Parent(s): 02f80e8
Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -22,11 +22,11 @@ datasets:
 
  NorBLOOM-7b-scratch is a large Norwegian language model pretrained from scratch on a total of 260 billion subword tokens (using six repetitions of open Norwegian texts).
 
- This model is a part of the NORA-LLM family developed in collaboration between [the Language Technology Group at the University of Oslo](https://huggingface.co/ltg), [the High Performance Language Technologies (HPLT) project](https://hplt-project.org/), [the National Library of Norway](https://huggingface.co/NbAiLab), and [the University of Turku](https://huggingface.co/TurkuNLP).
+ This model is a part of the NORA.LLM family developed in collaboration between [the Language Technology Group at the University of Oslo](https://huggingface.co/ltg), [the High Performance Language Technologies (HPLT) project](https://hplt-project.org/), [the National Library of Norway](https://huggingface.co/NbAiLab), and [the University of Turku](https://huggingface.co/TurkuNLP).
  All the models are pre-trained on the same dataset and with the same tokenizer.
  NorBLOOM-7b-scratch has around 7 billion parameters and is based on [the BLOOM architecture](https://arxiv.org/abs/2211.05100).
 
- The NORA-LLM language model family includes (as of now):
+ The NORA.LLM language model family includes (as of now):
  - [**NorMistral-7b-warm**](https://huggingface.co/norallm/normistral-7b-warm) -- an LLM initialized from [Mistral-7b-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) and continuously pretrained on Norwegian data;
  - [**NorMistral-7b-scratch**](https://huggingface.co/norallm/normistral-7b-scratch) -- a Mistral-based LLM pretrained from scratch on Norwegian data;
  - [**NorBLOOM-7b-scratch**](https://huggingface.co/norallm/NorBLOOM-7b-scratch) -- a BLOOM-based LLM pretrained from scratch on Norwegian data.