---
license: apache-2.0
---
# Pythia 6.9B Based Reward Model
- base model: [andreaskoepf/pythia-6.9b-gpt4all-pretrain](https://huggingface.co/andreaskoepf/pythia-6.9b-gpt4all-pretrain)
- wandb: https://wandb.ai/open-assistant/reward-model/runs/5xld9wmd
- checkpoint: 3500 steps
Compute was generously provided by [Stability AI](https://stability.ai/).
### How to use
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Install the Open-Assistant `model_training` module first
# (e.g. run `pip install -e .` in the `model/` directory of the Open-Assistant repository).
import model_training.models.reward_model  # noqa: F401 (registers reward model for AutoModel loading)

model_name = "OpenAssistant/oasst-rm-2-pythia-6.9b-epoch-1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Each turn is wrapped as <|prompter|>...<|endoftext|> or <|assistant|>...<|endoftext|>.
input_text = "<|prompter|>Hi how are you?<|endoftext|><|assistant|>Hi, I am Open-Assistant a large open-source language model trained by LAION AI. How can I help you today?<|endoftext|>"
inputs = tokenizer(input_text, return_tensors="pt")

# The first logits entry holds the reward score for the conversation.
score = model(**inputs).logits[0].cpu().detach()
print(score)
```
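To score several candidate replies, it can help to build the input string programmatically. The sketch below reproduces the `<|prompter|>` / `<|assistant|>` / `<|endoftext|>` layout used in the example above. The helper name `format_dialogue` is hypothetical and not part of the Open-Assistant code base, and it assumes that turns strictly alternate, starting with the prompter.

```python
# Special tokens as they appear in the usage example above.
PROMPTER = "<|prompter|>"
ASSISTANT = "<|assistant|>"
EOS = "<|endoftext|>"


def format_dialogue(turns):
    """Join alternating turns into one string (hypothetical helper).

    Assumes even-indexed turns come from the prompter and
    odd-indexed turns from the assistant.
    """
    parts = []
    for i, text in enumerate(turns):
        role = PROMPTER if i % 2 == 0 else ASSISTANT
        parts.append(f"{role}{text}{EOS}")
    return "".join(parts)


# Build one input string per candidate reply to the same prompt.
candidates = ["I am fine, thanks!", "Go away."]
texts = [format_dialogue(["Hi how are you?", reply]) for reply in candidates]
for text in texts:
    print(text)
```

Each resulting string can be passed to the tokenizer and model exactly like `input_text` above, giving one score per candidate.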
### Datasets
```
datasets:
  - oasst_export:
      lang: "en,es,de,fr"
      input_file_path: 2023-03-27_oasst_research_ready_synth.jsonl.gz
      val_split: 0.1
  - anthropic_rlhf:
      fraction: 0.1
      max_val_set: 1000
  - shp:
      max_val_set: 1000
  - hellaswag:
      fraction: 0.5
      max_val_set: 1000
  - webgpt:
      val_split: 0.05
      max_val_set: 1000
  - hf_summary_pairs:
      fraction: 0.1
      max_val_set: 250
```