FalconLLM committed on
Commit 64851df • 1 Parent(s): 7edc447

Update for paper release
Files changed (1): README.md (+12, -4)

@@ -11,7 +11,7 @@ license: apache-2.0
 
 **Falcon-RW-1B is a 1B parameters causal decoder-only model built by [TII](https://www.tii.ae) and trained on 350B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb). It is made available under the Apache 2.0 license.**
 
-*Paper coming soon 😊.*
+See the 📓 [paper on arXiv](https://arxiv.org/abs/2306.01116) for more details.
 
 RefinedWeb is a high-quality web dataset built by leveraging stringent filtering and large-scale deduplication. Falcon-RW-1B, trained on RefinedWeb only, matches or outperforms comparable models trained on curated data.
 
@@ -63,7 +63,7 @@ for seq in sequences:
 
 ### Model Source
 
-- **Paper:** *coming soon*.
+- **Paper:** [https://arxiv.org/abs/2306.01116](https://arxiv.org/abs/2306.01116).
 
 ## Uses
 
@@ -147,7 +147,7 @@ Training happened in early December 2022 and took about six days.
 
 ## Evaluation
 
-*Paper coming soon.*
+See the 📓 [paper on arXiv](https://arxiv.org/abs/2306.01116) for in-depth evaluation.
 
 
 ## Technical Specifications
@@ -179,7 +179,15 @@ Falcon-RW-1B was trained a custom distributed training codebase, Gigatron. It us
 
 ## Citation
 
-*Paper coming soon 😊.*
+@article{refinedweb,
+  title={The {R}efined{W}eb dataset for {F}alcon {LLM}: outperforming curated corpora with web data, and web data only},
+  author={Guilherme Penedo and Quentin Malartic and Daniel Hesslow and Ruxandra Cojocaru and Alessandro Cappelli and Hamza Alobeidli and Baptiste Pannier and Ebtesam Almazrouei and Julien Launay},
+  journal={arXiv preprint arXiv:2306.01116},
+  eprint={2306.01116},
+  eprinttype={arXiv},
+  url={https://arxiv.org/abs/2306.01116},
+  year={2023}
+}
 
 
 ## Contact