sam-mosaic committed
Commit fc67f07
1 parent: 848c6bf

Update README.md

Files changed (1): README.md (+2 -3)
README.md CHANGED
@@ -24,7 +24,7 @@ tags:
 inference: false
 ---
 
-# MPT-30B-Chat
+# MPT-7B-Chat
 
 MPT-7B-8k-Chat is a chatbot-like model for dialogue generation.
 It was built by finetuning [MPT-7B-8k](https://huggingface.co/mosaicml/mpt-7b-8k) on the [ShareGPT-Vicuna](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered), [Camel-AI](https://huggingface.co/camel-ai),
@@ -166,8 +166,7 @@ The model was trained on the following data mix:
 
 ### Training Configuration
 
-**TODO FILL IN THESE DETAILS**
-This model was trained on **NUMBER** H100s for about **NUMBER** hours using the [MosaicML Platform](https://www.mosaicml.com/platform).
+This model was trained on 192 H100s for about 48 minutes using the [MosaicML Platform](https://www.mosaicml.com/platform).
 The model was trained with sharded data parallelism using [FSDP](https://pytorch.org/docs/stable/fsdp.html) and used the AdamW optimizer.
 
 ## Limitations and Biases
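
The training note in the diff pairs FSDP sharded data parallelism with the AdamW optimizer. As a rough illustration of that setup (a minimal sketch of my own, not MosaicML's training code; the toy model, hyperparameters, and loop are placeholder assumptions), a PyTorch skeleton looks like:

```python
"""Sketch of FSDP sharded data parallelism + AdamW, per the README's
training note. Illustrative only; launch with one process per GPU,
e.g. `torchrun --nproc_per_node=8 sketch.py`."""
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def main():
    # torchrun sets the rendezvous env vars consumed here.
    torch.distributed.init_process_group(backend="nccl")
    rank = torch.distributed.get_rank()
    torch.cuda.set_device(rank % torch.cuda.device_count())

    # Toy stand-in for the real transformer (placeholder assumption).
    # FSDP shards its parameters, gradients, and optimizer state across ranks.
    model = nn.Sequential(
        nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512)
    ).cuda()
    model = FSDP(model)

    # AdamW is constructed after wrapping so it tracks the sharded parameters;
    # lr and weight_decay here are placeholders, not the commit's values.
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5, weight_decay=1e-6)

    for _ in range(10):  # placeholder loop standing in for the finetuning data
        x = torch.randn(8, 512, device="cuda")
        loss = model(x).pow(2).mean()  # dummy objective for the sketch
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    torch.distributed.destroy_process_group()


if __name__ == "__main__":
    main()
```

Sharding the parameters, gradients, and optimizer state across ranks is what makes a multi-node run like the 192-H100 finetune described above practical, since no single GPU has to hold a full replica of the optimizer state.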