Text Generation
Transformers
PyTorch
mpt
Composer
MosaicML
llm-foundry
custom_code
text-generation-inference
jacobfulano committed
Commit bc8d900
1 Parent(s): c66b9f7

Create README.md

Files changed (1)
  1. README.md +62 -0
README.md ADDED
---
license: cc-by-nc-sa-4.0
datasets:
- jeffwan/sharegpt_vicuna
- Hello-SimpleAI/HC3
- tatsu-lab/alpaca
- Anthropic/hh-rlhf
- victor123/evol_instruct_70k
tags:
- Composer
- MosaicML
- llm-foundry
---

# MPT-7B-Chat

MPT-7B-Chat is a chatbot-like model for dialogue generation.
It was built by finetuning [MPT-7B](https://huggingface.co/mosaicml/mpt-7b) on the [ShareGPT-Vicuna](https://huggingface.co/datasets/jeffwan/sharegpt_vicuna), [HC3](https://huggingface.co/datasets/Hello-SimpleAI/HC3),
[Alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca), [HH-RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf), and [Evol-Instruct](https://huggingface.co/datasets/victor123/evol_instruct_70k) datasets.
* License: _CC-By-NC-SA-4.0_ (non-commercial use only)
* [Demo on Hugging Face Spaces](https://huggingface.co/spaces/mosaicml/mpt-7b-chat)


This model was trained by [MosaicML](https://www.mosaicml.com) and follows a modified decoder-only transformer architecture.

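The card itself does not show how to run the model, so the following is a minimal sketch (not from the original card) of loading the checkpoint with Hugging Face Transformers. It assumes a recent `transformers`/`torch` install and a CUDA GPU that can hold the 7B weights in bfloat16; `trust_remote_code=True` is needed because the MPT architecture ships as custom code in the repository.

```python
import torch
import transformers

# Assumption: the standard Transformers loading path works for this repo; the MPT
# model class is pulled in via trust_remote_code since it is not built into the library.
name = 'mosaicml/mpt-7b-chat'
tokenizer = transformers.AutoTokenizer.from_pretrained(name)
model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    torch_dtype=torch.bfloat16,  # assumption: a GPU with enough memory for bf16 7B weights
    trust_remote_code=True,
).to('cuda')

# Arbitrary example prompt; sampling settings are a user choice, not prescribed by the card.
inputs = tokenizer('What is a quoll?', return_tensors='pt').to('cuda')
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=100, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
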
## Model Date

May 5, 2023

## Model License

CC-By-NC-SA-4.0 (non-commercial use only)

## Documentation

* [Blog post: Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs](https://www.mosaicml.com/blog/mpt-7b)
* [Codebase (mosaicml/llm-foundry repo)](https://github.com/mosaicml/llm-foundry/)
* Questions: Feel free to contact us via the [MosaicML Community Slack](https://join.slack.com/t/mosaicml-community/shared_invite/zt-w0tiddn9-WGTlRpfjcO9J5jyrMub1dg)!

### Example Dialogue

**SamIAm85**:
>I want you to come up with a tweet based on this summary of the article:
>"Introducing MPT-7B, the latest entry in our MosaicML Foundation Series.
>MPT-7B is a transformer trained from scratch on 1T tokens of text and code.
>It is open source, available for commercial use, and it matches the quality of LLaMA-7B.
>MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k.
>Starting today, you can train, finetune, and deploy your own private MPT models, either starting from one of our checkpoints or training from scratch.
>For inspiration, we are also releasing three finetuned models in addition to the base MPT-7B: MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+, the last of which uses a context length of 65k tokens!"

**MPT-7B-Instruct**:
>Here's my suggestion for a tweet:
>"Exciting news! @MosaicML has launched #MPT_7B - a new standard for open-source, commercially usable LLMs. With training on 17 tokens of text & code, it matches the quality of LLaMA-7B. Now you can train, fine-tune, and deploy your own private MPT models via their platform. Check out the full blog post here: https://bit.ly/3j6UJzZ"

**SamIAm85**:
>Is there anything you want to say to MosaicML about this exciting news?

**MPT-7B-Instruct**:
> As an AI language model myself, I am very impressed with the advancements made by MosaicML in developing such high-quality open-source LLMs.
> This new version of MPT-7B is truly impressive and I look forward to seeing what innovative applications developers will create using these powerful tools.
> Thank you for your hard work and dedication to advancing AI research and development.

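The prompt format used at chat time is not spelled out in this card. Purely as an illustration, the sketch below assembles a short exchange like the one above and generates a reply, assuming a ChatML-style template with `<|im_start|>`/`<|im_end|>` markers; that template, the role names, and the sampling settings are assumptions that should be verified against the tokenizer configuration in this repository.

```python
import torch
import transformers

# Illustrative sketch only. The ChatML-style template below is an assumption,
# not something documented in this card; check the repo's tokenizer config.
name = 'mosaicml/mpt-7b-chat'
tokenizer = transformers.AutoTokenizer.from_pretrained(name)
model = transformers.AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.bfloat16, trust_remote_code=True
).to('cuda')

def render_chat(turns):
    """Render (role, text) pairs in a ChatML-like format, leaving the assistant turn open."""
    prompt = ''
    for role, text in turns:
        prompt += f'<|im_start|>{role}\n{text}<|im_end|>\n'
    return prompt + '<|im_start|>assistant\n'

prompt = render_chat([
    ('system', 'You are a helpful assistant.'),
    ('user', 'Write a one-sentence summary of MPT-7B.'),
])
inputs = tokenizer(prompt, return_tensors='pt').to('cuda')
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True))
```
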
## Acknowledgements

This model was finetuned by Sam Havens.