Tags: Text Generation · Transformers · PyTorch · mosaic_gpt · custom_code
Commit a2204f6 (parent: e4e26cc), committed by jfrankle

Update README.md

Files changed (1):
  README.md (+2 −2)
README.md CHANGED

@@ -4,9 +4,9 @@ datasets:
 - togethercomputer/RedPajama-Data-1T
 ---
 
-# MPT-1b-RedPajama-200b
+# MPT-1b-RedPajama-200b-dolly
 
-MPT-1b-RedPajama-200b is a 1.3 billion parameter decoder-only transformer pre-trained on the [RedPajama dataset](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T) and subsequently fine-tuned on the [Databricks Dolly](https://github.com/databrickslabs/dolly/tree/master/data) instruction dataset.
+MPT-1b-RedPajama-200b-dolly is a 1.3 billion parameter decoder-only transformer pre-trained on the [RedPajama dataset](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T) and subsequently fine-tuned on the [Databricks Dolly](https://github.com/databrickslabs/dolly/tree/master/data) instruction dataset.
 The model was pre-trained for 200B tokens by sampling from the subsets of the RedPajama dataset in the same proportions as were used by the [Llama series of models](https://arxiv.org/abs/2302.13971).
 This model was trained by [MosaicML](https://www.mosaicml.com) and follows a modified decoder-only transformer architecture.
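
Given the Transformers, PyTorch, and custom_code tags on this card, a minimal sketch of loading the renamed model might look like the following. The repo id mosaicml/mpt-1b-redpajama-200b-dolly is an assumption inferred from the new title, and the use of the repo's own tokenizer files is also assumed; neither is stated in this commit.

```python
# Minimal sketch, not taken from this commit: loading the renamed model.
# Assumptions: the repo id "mosaicml/mpt-1b-redpajama-200b-dolly" and that
# the repo ships its own tokenizer files. The custom_code tag indicates the
# mosaic_gpt architecture is defined in the repo itself, so
# trust_remote_code=True is required for AutoModelForCausalLM to load it.
import transformers

repo_id = "mosaicml/mpt-1b-redpajama-200b-dolly"  # assumed from the new title

model = transformers.AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,
)
tokenizer = transformers.AutoTokenizer.from_pretrained(repo_id)

# Text Generation tag: produce a short instruction-style completion.
inputs = tokenizer("Write a haiku about databases.\n", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```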