eluzhnica committed on
Commit: 6e454c1
1 Parent(s): a4a6419

Add clarifications/disclaimer

Files changed (1): README.md (+11 -1)
README.md CHANGED
@@ -16,6 +16,16 @@ inference: false
 
 # MPT-30B
 
+This is MPT-30B with added support for finetuning via peft (tested with qlora). It is not finetuned further; the weights are identical to the original MPT-30B.
+
+I have not traced through the whole Hugging Face stack to verify that everything works correctly, but it does finetune with qlora and the outputs are reasonable.
+Inspired by the implementations at https://huggingface.co/cekal/mpt-7b-peft-compatible/commits/main
+and https://huggingface.co/mosaicml/mpt-7b/discussions/42.
+
+The original description from the MosaicML team follows below:
+
+
+
 MPT-30B is a decoder-style transformer pretrained from scratch on 1T tokens of English text and code.
 This model was trained by [MosaicML](https://www.mosaicml.com).
 
@@ -242,4 +252,4 @@ for open-source foundation models},
 note = {Accessed: 2023-06-22},
 urldate = {2023-06-22}
 }
-```
+```
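
For context on what the added disclaimer describes, here is a minimal, untested sketch of a QLoRA-style finetuning setup with peft against this checkpoint. It assumes recent releases of transformers, peft, and bitsandbytes; the repo id below is a placeholder, and the target module names (`Wqkv`, `out_proj`) are an assumption based on MPT's custom attention code, not something this commit specifies.

```python
# Minimal QLoRA-style setup sketch (assumptions: repo id is a placeholder,
# target module names are inferred from MPT's attention implementation).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "eluzhnica/mpt-30b-peft-compatible"  # placeholder; substitute the actual repo id

# Quantize the frozen base weights to 4-bit (the "q" in qlora).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    trust_remote_code=True,  # MPT ships custom modeling code
    device_map="auto",
)

# Cast norms/embeddings to a stable dtype and set up gradient checkpointing hooks.
model = prepare_model_for_kbit_training(model)

# Attach low-rank adapters; only these small matrices receive gradients.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["Wqkv", "out_proj"],  # MPT attention projections (assumption)
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Keeping the 30B base weights frozen in 4-bit while training only the low-rank adapters is what makes finetuning a model of this size feasible on a single large GPU.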