
	---
	license: apache-2.0
	tags:
	- generated_from_trainer
	datasets:
	- pszemraj/HC3-textgen-qa
	metrics:
	- accuracy
	inference: False
	---


	# pythia-6.9b-deduped for general QA

	This model is a fine-tuned version of [EleutherAI/pythia-6.9b-deduped](https://huggingface.co/EleutherAI/pythia-6.9b-deduped) on the pszemraj/HC3-textgen-qa dataset.
	It achieves the following results on the evaluation set:
	- Loss: 1.2372
	- Accuracy: 0.6769

	## Model description

	Text generation model trained on the HC3 text data: human questions paired with ChatGPT answers.

	![example](https://i.imgur.com/iMqPDXU.png)


	### Usage

	Install the packages needed for inference (_8-bit loading is optional if you have a big boi GPU_):
	```bash
	pip install -U -q transformers bitsandbytes accelerate
	```
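As a rough guide to why 8-bit loading helps, here is a back-of-the-envelope sketch of the memory needed for the weights alone. It assumes the nominal 6.9B parameter count from the model name and ignores activations, the KV cache, and any quantization overhead, so treat the numbers as lower bounds rather than measurements.

```python
# Nominal parameter count, taken from the model name (an assumption, not a measured value).
N_PARAMS = 6.9e9

# Bytes used per parameter for each storage format.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes), weights only."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gb(N_PARAMS, dtype):.1f} GB")
```

Under these assumptions, 8-bit weights take about half the memory of fp16 weights, which is the main reason to install `bitsandbytes` on smaller GPUs.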

	Basic inference example:

	```python
	import pprint as pp

	from transformers import AutoTokenizer, AutoModelForCausalLM

	tokenizer = AutoTokenizer.from_pretrained("pszemraj/pythia-6.9b-HC3")

	# 8-bit loading requires bitsandbytes and a CUDA GPU
	model = AutoModelForCausalLM.from_pretrained(
	    "pszemraj/pythia-6.9b-HC3", load_in_8bit=True, device_map="auto"
	)  # shards are ~4GB each

	# the prompt is the question followed by the <answer> marker
	prompt = "I was wondering how much wood a woodchuck could chuck? <answer>"
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
	outputs = model.generate(**inputs, max_new_tokens=300)  # default generation config (+ 300 tokens)
	result = tokenizer.batch_decode(outputs, skip_special_tokens=True)

	pp.pprint(result[0])
	```
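The example prompt above ends with an `<answer>` marker. Below is a minimal sketch of two helpers for building such prompts and pulling the answer out of the decoded text. The names `build_prompt` and `extract_answer` are hypothetical, and the sketch assumes `<answer>` is a plain-text marker that survives decoding; if it does not appear in the output, the helper returns the whole string unchanged.

```python
ANSWER_TAG = "<answer>"  # marker copied from the example prompt


def build_prompt(question: str) -> str:
    """Append the answer marker to a question, separated by a single space."""
    return f"{question.strip()} {ANSWER_TAG}"


def extract_answer(generated: str) -> str:
    """Return the text after the first answer marker.

    Falls back to the full (stripped) string when the marker is missing.
    """
    _, sep, answer = generated.partition(ANSWER_TAG)
    return answer.strip() if sep else generated.strip()


print(build_prompt("I was wondering how much wood a woodchuck could chuck?"))
```

With the inference example, `extract_answer(result[0])` would then return only the generated answer, provided the marker is present in the decoded text.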

	The default `GenerationConfig` uses contrastive search with `top_k=4` and `penalty_alpha=0.6`. For more information on inference and the parameters to use, see [the transformers docs](https://huggingface.co/docs/transformers/generation_strategies#decoding-strategies).

	## Intended uses & limitations

	- Intended use: research/exploration comparing RLHF tuning with "guided"/specific tuning on "quality" datasets/responses, i.e. _"what the human would want as an answer anyway"_.
	- This model is not trained/fine-tuned with RLHF, so it will not be as helpful/generalizable/safe as ChatGPT.

	## Training and evaluation data

	```yaml
	model-index:
	- name: pythia-6.9b-hc3-qa-assistant
	  results:
	  - task:
	      name: Causal Language Modeling
	      type: text-generation
	    dataset:
	      name: pszemraj/HC3-textgen-qa
	    metrics:
	    - name: Accuracy
	      type: accuracy
	      value: 0.6768941789814655
	```


	## Training procedure

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|
	| 1.2598        | 0.99  | 79   | 1.3291          | 0.6496   |
	| 0.7446        | 1.99  | 158  | 1.2372          | 0.6769   |
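To make the validation losses easier to read, they can be converted to perplexity. This sketch assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language modeling with the Trainer but is not stated in this card.

```python
import math

# Validation loss at each logged step, copied from the table above.
eval_loss_by_step = {79: 1.3291, 158: 1.2372}

# Perplexity is exp(mean cross-entropy), assuming the loss is in nats per token.
perplexity_by_step = {
    step: math.exp(loss) for step, loss in eval_loss_by_step.items()
}

for step, ppl in perplexity_by_step.items():
    print(f"step {step}: perplexity ~ {ppl:.2f}")
```

Under that assumption, the perplexity falls from roughly 3.78 after the first epoch to roughly 3.45 after the second.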