jacobfulano committed on
Commit d37bf0a (1 parent: af1b522)

Update README.md

Files changed (1): README.md (+2 −2)
README.md CHANGED
@@ -11,7 +11,7 @@ tags:
 MPT-7B-StoryWriter-65k+ is a model designed to read and write fictional stories with super long context lengths.
 It was built by finetuning MPT-7B with a context length of 65k tokens on a filtered fiction subset of the [books3 dataset](https://huggingface.co/datasets/the_pile_books3).
 At inference time, thanks to [ALiBi](https://arxiv.org/abs/2108.12409), MPT-7B-StoryWriter-65k+ can extrapolate even beyond 65k tokens.
-We demonstrate generations as long as 84k tokens on a single A100-80GB GPU in our [blogpost](www.mosaicml.com/blog/mpt-7b).
+We demonstrate generations as long as 84k tokens on a single node of 8 A100-80GB GPUs in our [blogpost](www.mosaicml.com/blog/mpt-7b).
 * License: _Apache-2.0_ (commercial use permitted)

 This model was trained by [MosaicML](https://www.mosaicml.com) and follows a modified decoder-only transformer architecture.
@@ -26,7 +26,7 @@ Apache-2.0 (commercial use permitted)

 ## Documentation

-* [Blog post: Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs](www.mosaicml.com/blog/mpt-7b)
+* [Blog post: Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs](https://www.mosaicml.com/blog/mpt-7b)
 * [Codebase (mosaicml/llm-foundry repo)](https://github.com/mosaicml/llm-foundry/)
 * Questions: Feel free to contact us via the [MosaicML Community Slack](https://join.slack.com/t/mosaicml-community/shared_invite/zt-w0tiddn9-WGTlRpfjcO9J5jyrMub1dg)!
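
The changed line above describes using ALiBi at inference time to extrapolate beyond the 65k-token finetuning context. Below is a minimal, hypothetical sketch of how that extended window might be requested when loading the model with Hugging Face `transformers`; the `mosaicml/mpt-7b-storywriter` model id and the MPT config's `max_seq_len` field are assumptions based on the model card, not part of this commit.

```python
# Sketch only: assumes the MPT remote code exposes a `max_seq_len` config field
# and that the model is published as `mosaicml/mpt-7b-storywriter`.
import transformers

name = "mosaicml/mpt-7b-storywriter"

# Load the model's remote configuration and raise the maximum sequence length
# so ALiBi can extrapolate past the 65k tokens used during finetuning.
config = transformers.AutoConfig.from_pretrained(name, trust_remote_code=True)
config.max_seq_len = 83968  # ~84k tokens, matching the generation length cited in the README

model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    config=config,
    trust_remote_code=True,
)
```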