Quantizations of https://huggingface.co/UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3
Inference Clients/UIs
From original readme
This model was developed using Self-Play Preference Optimization at iteration 3, based on the meta-llama/Meta-Llama-3-8B-Instruct architecture as starting point. We utilized the prompt sets from the openbmb/UltraFeedback dataset, splited to 3 parts for 3 iterations by snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset. All responses used are synthetic.
Model Description
- Model type: A 8B parameter GPT-like model fine-tuned on synthetic datasets.
- Language(s) (NLP): Primarily English
- License: Apache-2.0
- Finetuned from model: meta-llama/Meta-Llama-3-8B-Instruct
- Downloads last month
- 672
Inference Providers
NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API:
The model authors have turned it off explicitly.