FalconLLM committed
Commit c76827a
• 1 parent: 17dc123

Update for paper release

Files changed (1): README.md (+14 −4)
README.md CHANGED
@@ -11,7 +11,7 @@ license: apache-2.0
 
 **Falcon-RW-7B is a 7B parameters causal decoder-only model built by [TII](https://www.tii.ae) and trained on 350B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb). It is made available under the Apache 2.0 license.**
 
-*Paper coming soon 😊.*
+See the 📓 [paper on arXiv](https://arxiv.org/abs/2306.01116) for more details.
 
 RefinedWeb is a high-quality web dataset built by leveraging stringent filtering and large-scale deduplication. Falcon-RW-7B, trained on RefinedWeb only, matches or outperforms comparable models trained on curated data.
 
@@ -62,7 +62,7 @@ for seq in sequences:
 
 ### Model Source
 
-- **Paper:** *coming soon*.
+- **Paper:** [https://arxiv.org/abs/2306.01116](https://arxiv.org/abs/2306.01116).
 
 ## Uses
 
@@ -146,7 +146,7 @@ Training happened in early January 2023 and took about five days.
 
 ## Evaluation
 
-*Paper coming soon.*
+See the 📓 [paper on arXiv](https://arxiv.org/abs/2306.01116) for in-depth evaluation results.
 
 
 ## Technical Specifications
@@ -178,7 +178,17 @@ Falcon-RW-7B was trained a custom distributed training codebase, Gigatron. It us
 
 ## Citation
 
-*Paper coming soon 😊.*
+```
+@article{refinedweb,
+  title={The {R}efined{W}eb dataset for {F}alcon {LLM}: outperforming curated corpora with web data, and web data only},
+  author={Guilherme Penedo and Quentin Malartic and Daniel Hesslow and Ruxandra Cojocaru and Alessandro Cappelli and Hamza Alobeidli and Baptiste Pannier and Ebtesam Almazrouei and Julien Launay},
+  journal={arXiv preprint arXiv:2306.01116},
+  eprint={2306.01116},
+  eprinttype={arXiv},
+  url={https://arxiv.org/abs/2306.01116},
+  year={2023}
+}
+```
 
 
 ## Contact