- wandb: https://wandb.ai/open-assistant/rlhf/runs/ldxshxkt
- checkpoint: 2500
- reward model: andreaskoepf/oasst-rm-2-pythia-1.4b-10000
- base model: andreaskoepf/oasst-sft-4-pythia-12b-epoch-3.5
- sampling report