pszemraj committed
Commit 674d0fc
1 Parent(s): 2b33b78

Update README.md

Files changed (1)
  1. README.md +10 -4
README.md CHANGED
@@ -20,7 +20,7 @@ inference: false
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
  </a>
 
- This model is a fine-tuned version of [facebook/opt-2.7b](https://huggingface.co/facebook/opt-2.7b) on about 80k whatsapp/text messages (mine). Please use responsibly :)
+ This model is a fine-tuned version of [facebook/opt-2.7b](https://huggingface.co/facebook/opt-2.7b) on about 80k WhatsApp/text messages (mine). Please use responsibly :)
 
  Test it out on Google Colab by clicking the button above.
 
@@ -32,17 +32,23 @@ Test it out on Google Colab by clicking the button above.
  - Seems to do a lot better than GPT-Neo with similar training parameters
  - you can create your own digital clone and deploy it leveraging [this repository I am working on](https://github.com/pszemraj/ai-msgbot).
 
+ ### sharded checkpoint
+
+ As this model file is 10+ GB, it can impose some constraints with lower RAM runtimes and/or download speeds. To help with this issue, a sharded checkpoint of this model is available [here](https://huggingface.co/pszemraj/opt-peter-2.7B-sharded).
+
+ The `pszemraj/opt-peter-2.7B-sharded` model can be used as a drop-in replacement for this one for all use cases.
+
  ## Intended uses & limitations
 
- > The base model has a custom license which propogates to this one. Most importantly, it cannot be used commercially. Read more here: [facebook/opt-2.7b](https://huggingface.co/facebook/opt-2.7b)
+ > The base model has a custom license that propagates to this one. **Most importantly, it cannot be used commercially**. Read more here: [facebook/opt-2.7b](https://huggingface.co/facebook/opt-2.7b)
 
- - the model is probably too large to use via API here. Use in Python with GPU RAM / CPU RAM > 12 gb, Colab notebook linked above.
+ - the model is probably too large to use via API here. Use in Python with GPU RAM / CPU RAM > 12 GB, Colab notebook linked above.
  - alternatively, you can message [a bot on telegram](http://t.me/GPTPeter_bot) where I test LLMs for dialogue generation
  - **any statements or claims made by this model do not reflect actual claims/statements by me.** Keep in mind it is a _fine-tuned_ version of the model on my data, so things from pre-training are also present in outputs.
 
  ## Training and evaluation data
 
- WhatsApp & iMessage parsed using [ai-msgbot](https://github.com/pszemraj/ai-msgbot) and then fed as a text dataset to the HF trainer.
+ WhatsApp & iMessage data were parsed using [ai-msgbot](https://github.com/pszemraj/ai-msgbot) and then fed as a text dataset to the HF trainer.
 
  ## Training procedure
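
The new "sharded checkpoint" section and the RAM note under "Intended uses & limitations" amount to: load `pszemraj/opt-peter-2.7B-sharded` instead of this repo when memory is tight. A minimal sketch of what that might look like with `transformers` (the model ID comes from the diff; the dtype, device handling, prompt, and sampling parameters are illustrative assumptions, not values from the commit):

```python
# Sketch: load the sharded checkpoint as a drop-in replacement.
# low_cpu_mem_usage=True loads one shard at a time, which is the main
# practical benefit of the sharded repo on low-RAM runtimes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pszemraj/opt-peter-2.7B-sharded"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    low_cpu_mem_usage=True,      # stream shards instead of materializing the full state dict
    torch_dtype=torch.float16,   # roughly halves memory vs. fp32
)
model.to("cuda" if torch.cuda.is_available() else "cpu")

# Illustrative prompt and sampling settings, not values from the README.
prompt = "How was your weekend?\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95, temperature=0.7)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```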
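The "Training and evaluation data" line describes the pipeline only at a high level: parse the chats with ai-msgbot, then feed the result to the HF trainer as a text dataset. A hedged sketch of that second step under common assumptions follows; the file name, sequence length, and hyperparameters are placeholders, not values from this commit, and the actual ai-msgbot output format may differ.

```python
# Sketch only: fine-tune a causal LM on a plain-text chat export.
# "messages.txt", max_length, and all hyperparameters are assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-2.7b")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-2.7b")

# One message/turn per line, as a parser such as ai-msgbot might emit (assumed).
dataset = load_dataset("text", data_files={"train": "messages.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="opt-peter-2.7b",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        fp16=True,
    ),
    train_dataset=tokenized["train"],
    # mlm=False gives standard causal-LM labels (inputs shifted by one).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```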