pszemraj committed
Commit 674d0fc
1 Parent(s): 2b33b78

Update README.md

Files changed (1)
  1. README.md +10 -4
README.md CHANGED
@@ -20,7 +20,7 @@ inference: false
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
  </a>
 
- This model is a fine-tuned version of [facebook/opt-2.7b](https://huggingface.co/facebook/opt-2.7b) on about 80k whatsapp/text messages (mine). Please use responsibly :)
+ This model is a fine-tuned version of [facebook/opt-2.7b](https://huggingface.co/facebook/opt-2.7b) on about 80k WhatsApp/text messages (mine). Please use responsibly :)
 
  Test it out on Google Colab by clicking the button above.
 
@@ -32,17 +32,23 @@ Test it out on Google Colab by clicking the button above.
  - Seems to do a lot better than GPT-Neo with similar training parameters
  - you can create your own digital clone and deploy it leveraging [this repository I am working on](https://github.com/pszemraj/ai-msgbot).
 
+ ### sharded checkpoint
+
+ As this model file is 10+ GB, it can impose some constraints with lower RAM runtimes and/or download speeds. To help with this issue, a sharded checkpoint of this model is available [here](https://huggingface.co/pszemraj/opt-peter-2.7B-sharded).
+
+ The `pszemraj/opt-peter-2.7B-sharded` model can be used as a drop-in replacement for this one for all use cases.
+
  ## Intended uses & limitations
 
- > The base model has a custom license which propogates to this one. Most importantly, it cannot be used commercially. Read more here: [facebook/opt-2.7b](https://huggingface.co/facebook/opt-2.7b)
+ > The base model has a custom license that propagates to this one. **Most importantly, it cannot be used commercially**. Read more here: [facebook/opt-2.7b](https://huggingface.co/facebook/opt-2.7b)
 
- - the model is probably too large to use via API here. Use in Python with GPU RAM / CPU RAM > 12 gb, Colab notebook linked above.
+ - the model is probably too large to use via API here. Use in Python with GPU RAM / CPU RAM > 12 GB, Colab notebook linked above.
  - alternatively, you can message [a bot on telegram](http://t.me/GPTPeter_bot) where I test LLMs for dialogue generation
  - **any statements or claims made by this model do not reflect actual claims/statements by me.** Keep in mind it is a _fine-tuned_ version of the model on my data, so things from pre-training are also present in outputs.
 
  ## Training and evaluation data
 
- WhatsApp & iMessage parsed using [ai-msgbot](https://github.com/pszemraj/ai-msgbot) and then fed as a text dataset to the HF trainer.
+ WhatsApp & iMessage data were parsed using [ai-msgbot](https://github.com/pszemraj/ai-msgbot) and then fed as a text dataset to the HF trainer.
 
  ## Training procedure
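
The new "sharded checkpoint" section and the RAM note under "Intended uses & limitations" amount to: load `pszemraj/opt-peter-2.7B-sharded` instead of this repo when memory is tight. A minimal sketch of what that might look like with `transformers` (the model ID comes from the diff; the dtype, device handling, prompt, and sampling parameters are illustrative assumptions, not values from the commit):

```python
# Sketch: load the sharded checkpoint as a drop-in replacement.
# low_cpu_mem_usage=True loads one shard at a time, which is the main
# practical benefit of the sharded repo on low-RAM runtimes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pszemraj/opt-peter-2.7B-sharded"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    low_cpu_mem_usage=True,      # stream shards instead of materializing the full state dict
    torch_dtype=torch.float16,   # roughly halves memory vs. fp32
)
model.to("cuda" if torch.cuda.is_available() else "cpu")

# Illustrative prompt and sampling settings, not values from the README.
prompt = "How was your weekend?\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95, temperature=0.7)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```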
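The "Training and evaluation data" line describes the pipeline only at a high level: parse the chats with ai-msgbot, then feed the result to the HF trainer as a text dataset. A hedged sketch of that second step under common assumptions follows; the file name, sequence length, and hyperparameters are placeholders, not values from this commit, and the actual ai-msgbot output format may differ.

```python
# Sketch only: fine-tune a causal LM on a plain-text chat export.
# "messages.txt", max_length, and all hyperparameters are assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-2.7b")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-2.7b")

# One message/turn per line, as a parser such as ai-msgbot might emit (assumed).
dataset = load_dataset("text", data_files={"train": "messages.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="opt-peter-2.7b",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        fp16=True,
    ),
    train_dataset=tokenized["train"],
    # mlm=False gives standard causal-LM labels (inputs shifted by one).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```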