emozilla committed on
Commit e610773
1 Parent(s): bb88106

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -9,12 +9,15 @@ datasets:
  - emozilla/yarn-train-tokenized-8k-llama
  ---
 
- # Model Card: Nous-Yarn-Llama-2-70b-32k
+ # Model Card: Yarn-Llama-2-70b-32k
 
  [Preprint (arXiv)](https://arxiv.org/abs/2309.00071)
  [GitHub](https://github.com/jquesnelle/yarn)
  ![yarn](https://raw.githubusercontent.com/jquesnelle/yarn/70b/data/proofpile-long-small-32k-70b.csv.png)
 
+ The authors would like to thank [LAION AI](https://laion.ai/) for their support of compute for this model.
+ It was trained on the [JUWELS](https://www.fz-juelich.de/en/ias/jsc/systems/supercomputers/juwels) supercomputer.
+
  ## Model Description
 
  Nous-Yarn-Llama-2-70b-32k is a state-of-the-art language model for long context, further pretrained on long context data for 400 steps using the YaRN extension method.
@@ -55,6 +58,3 @@ Short context benchmarks showing that quality degradation is minimal:
  - [@theemozilla](https://twitter.com/theemozilla): Methods, paper, model training, and evals
  - [@EnricoShippole](https://twitter.com/EnricoShippole): Model training
  - [honglu2875](https://github.com/honglu2875): Paper and evals
-
- The authors would like to thank LAION AI for their support of compute for this model.
- It was trained on the [JUWELS](https://www.fz-juelich.de/en/ias/jsc/systems/supercomputers/juwels) supercomputer.