jacobfulano committed on
Commit c271818
1 Parent(s): 66aedd3

Update README.md

Files changed (1)
  1. README.md +21 -20
README.md CHANGED
@@ -32,28 +32,14 @@ Apache-2.0 (commercial use permitted)
  * [Codebase (mosaicml/llm-foundry repo)](https://github.com/mosaicml/llm-foundry/)
  * Questions: Feel free to contact us via the [MosaicML Community Slack](https://join.slack.com/t/mosaicml-community/shared_invite/zt-w0tiddn9-WGTlRpfjcO9J5jyrMub1dg)!
 
- ### Example Dialogue
+ ### Example Question/Instruction
 
- **SamIAm85**:
- >I want you to come up with a tweet based on this summary of the article:
- >"Introducing MPT-7B, the latest entry in our MosaicML Foundation Series.
- >MPT-7B is a transformer trained from scratch on 1T tokens of text and code.
- >It is open source, available for commercial use, and it matches the quality of LLaMA-7B.
- >MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k.
- >Starting today, you can train, finetune, and deploy your own private MPT models, either starting from one of our checkpoints or training from scratch.
- >For inspiration, we are also releasing three finetuned models in addition to the base MPT-7B: MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+, the last of which uses a context length of 65k tokens!
-
- **MPT-7B-Instruct**:
- >Here's my suggestion for a tweet:
- >"Exciting news! @MosaicML has launched #MPT_7B - a new standard for open-source, commercially usable LLMs. With training on 17 tokens of text & code, it matches the quality of LLaMA-7B. Now you can train, fine-tune, and deploy your own private MPT models via their platform. Check out the full blog post here: https://bit.ly/3j6UJzZ"
-
- **SamIAm85**:
- >Is there anything you want to say to MosaicML about this exciting news?
-
- **MPT-7B-Instruct**:
- > As an AI language model myself, I am very impressed with the advancements made by MosaicML in developing such high-quality open-source LLMs.
- > This new version of MPT-7B is truly impressive and I look forward to seeing what innovative applications developers will create using these powerful tools.
- > Thank you for your hard work and dedication to advancing AI research and development.
+ **Longboi24**
+ > What is a quoll?
+
+ **MPT-7B-Instruct**
+ >A Quoll (pronounced “cool”) is one of Australia’s native carnivorous marsupial mammals, which are also known as macropods or wallabies in other parts around Asia and South America
 
  ## How to Use
 
@@ -73,6 +59,21 @@ model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-7b-instr
  model.to(device='cuda:0', dtype=torch.bfloat16)
  ```
 
+ Although the model was trained with a sequence length of 2048, ALiBi enables users to increase the maximum sequence length during finetuning and/or inference. For example:
+
+ ```python
+ config = transformers.AutoConfig.from_pretrained('mosaicml/mpt-7b', trust_remote_code=True)
+ config.update({"max_seq_len": 4096})
+ model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-7b', config=config, trust_remote_code=True)
+ ```
+
+ This model was trained with the [EleutherAI/gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) tokenizer.
+
+ ```python
+ from transformers import AutoTokenizer
+ tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
+ ```
+
  ## Model Description
 
  The architecture is a modification of a standard decoder-only transformer.
 
@@ -104,4 +105,4 @@ This model was finetuned on 440 A100-40GBs for about half a day using the [Mosai
 
  ## Acknowledgements
 
- This model was finetuned by Sam Havens
+ This model was finetuned by Sam Havens and the MosaicML NLP team
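For context, here is a minimal sketch of how the snippets added in this commit fit together end to end, assuming the `mosaicml/mpt-7b-instruct` checkpoint from this repository and following the loading pattern shown in the "How to Use" hunk above. The prompt text and generation parameters are illustrative assumptions, not values taken from the model card, and the instruct model may expect a specific prompt template in practice.

```python
# Illustrative sketch (not part of the commit above): load the tokenizer and
# finetuned model as shown in the diff, then generate a short completion.
import torch
import transformers
from transformers import AutoTokenizer

# Tokenizer used for MPT training, as noted in the added README text
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

# trust_remote_code is required because the MPT architecture is defined in the
# model repository rather than in the transformers library itself
model = transformers.AutoModelForCausalLM.from_pretrained(
    'mosaicml/mpt-7b-instruct',
    trust_remote_code=True,
)
model.to(device='cuda:0', dtype=torch.bfloat16)
model.eval()

# Assumed plain-text instruction; a dedicated prompt template may work better
prompt = "What is a quoll?"
inputs = tokenizer(prompt, return_tensors="pt").to('cuda:0')

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,                     # assumed generation budget
        do_sample=True,
        temperature=0.8,                        # assumed sampling settings
        pad_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because MPT uses ALiBi rather than learned position embeddings, the same generate call also works after raising `max_seq_len` as in the config-override snippet above, with memory and latency growing with the context length.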