	---
	tags:
	- text-generation
	- 8bit
	- 8-bit
	- quantization
	- compression
	- chatbot
	- dialogue
	- conversation
	datasets:
	- daily_dialog
	inference: False
	license: apache-2.0
	---

	# ethzanalytics/gpt-j-8bit-daily_dialogues

	<a href="https://colab.research.google.com/gist/pszemraj/e49c60aafe04acc52fcfdd1baefe12e4/-ai-msgbot-gpt-j-6b-8bit-with-hub.ipynb">
	<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
	</a>


	This version of `hivemind/gpt-j-6B-8bit` was fine-tuned for one epoch on a parsed version of the [daily dialogues](https://huggingface.co/datasets/daily_dialog) dataset. It can be used as a chatbot.


	It is designed to be used with [ai-msgbot](https://github.com/pszemraj/ai-msgbot), which takes advantage of the prompt engineering applied during fine-tuning.
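
To illustrate what prompt engineering for a dialogue model can look like, here is a minimal sketch of a speaker-labelled prompt builder. The labels `person alpha` and `person beta`, the helper names, and the exact line layout are assumptions made for illustration. They are not confirmed by this card, so check ai-msgbot for the format the model was actually trained on.

```python
from typing import List, Tuple

# Hypothetical speaker labels; the labels used during fine-tuning may differ.
SPEAKER = "person alpha"
RESPONDER = "person beta"


def build_prompt(history: List[Tuple[str, str]], new_message: str,
                 speaker: str = SPEAKER, responder: str = RESPONDER) -> str:
    """Turn prior (speaker_msg, responder_msg) exchanges plus a new message into one prompt."""
    lines: List[str] = []
    for said, replied in history:
        lines += [f"{speaker}:", said, "", f"{responder}:", replied, ""]
    # End on the responder label so the model continues with the reply text.
    lines += [f"{speaker}:", new_message, "", f"{responder}:"]
    return "\n".join(lines) + "\n"


def extract_reply(generated: str, prompt: str, speaker: str = SPEAKER) -> str:
    """Keep only the first responder turn from the text generated after the prompt."""
    continuation = generated[len(prompt):] if generated.startswith(prompt) else generated
    # Cut where the model starts writing the next speaker turn on its own.
    return continuation.split(f"{speaker}:")[0].strip()
```

Ending the prompt on the responder label nudges the model to write a single reply, and `extract_reply` trims anything it generates past that turn.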

	## Usage

	_NOTE: this model must be loaded via the special patching technique outlined in the `hivemind/gpt-j-6B-8bit` model card (as with all 8-bit models)._

	The notebook linked above already contains working examples of how to load the model correctly. A `.py` export of that notebook is also in the repo for reference: [link here](https://huggingface.co/ethzanalytics/gpt-j-8bit-daily_dialogues/blob/main/_ai_msgbot_gpt_j_6b_8bit_with_hub.py)
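
To read the reference script locally, a minimal sketch using only the Python standard library could look like this. The repo ID and filename come from the link above. The helper names are hypothetical, and the sketch assumes the file is still on the `main` branch and uses the Hub's `resolve` download URL rather than the `blob` page URL.

```python
import urllib.request
from urllib.parse import quote

REPO_ID = "ethzanalytics/gpt-j-8bit-daily_dialogues"
SCRIPT_NAME = "_ai_msgbot_gpt_j_6b_8bit_with_hub.py"


def hub_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the raw-download URL for a file in a Hugging Face Hub repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{quote(filename)}"


def download_reference_script(dest: str = SCRIPT_NAME) -> str:
    """Download the exported notebook script to `dest` (requires network access)."""
    urllib.request.urlretrieve(hub_file_url(REPO_ID, SCRIPT_NAME), dest)
    return dest
```

Calling `download_reference_script()` saves the script in the current directory, where the loading code can be inspected before running anything.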


	## Training

	For details, see [this wandb report](https://wandb.ai/pszemraj/conversational-6B-train-vanilla/reports/Training-6B-GPT-J-8bit-for-Dialogue--VmlldzoyNTg3MzE0), which covers both the daily-dialogues version and the WoW version.


	---