
Quantizations of https://huggingface.co/UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3

Inference Clients/UIs


From original readme

This model was developed using Self-Play Preference Optimization (SPPO) at iteration 3, with the meta-llama/Meta-Llama-3-8B-Instruct architecture as the starting point. We used the prompt sets from the openbmb/UltraFeedback dataset, split into 3 parts for the 3 iterations by snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset. All responses used are synthetic.

Model Description

  • Model type: An 8B-parameter GPT-like model fine-tuned on synthetic datasets.
  • Language(s) (NLP): Primarily English
  • License: Apache-2.0
  • Finetuned from model: meta-llama/Meta-Llama-3-8B-Instruct
GGUF

  • Model size: 8.03B params
  • Architecture: llama

Quantization levels: 1-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
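
To help choose among the bit widths listed above, the following is a rough sketch of how nominal bits per weight translate into storage for the 8.03B parameters reported on this card. It assumes every weight is stored at exactly the nominal width, which real GGUF files do not do (they add per-block scales and keep some tensors at higher precision), so treat the results as lower bounds rather than actual file sizes. The function and constant names are illustrative only.

```python
# Rough lower-bound size estimate per quantization level.
# Assumption: every parameter is stored at exactly the nominal bit width;
# real GGUF files are somewhat larger because of per-block scales and
# tensors kept at higher precision.

PARAMS = 8.03e9  # parameter count reported on this model card
BIT_WIDTHS = [1, 2, 3, 4, 5, 6, 8]  # quantization levels listed on this card


def approx_size_gb(params: float, bits_per_weight: float) -> float:
    """Return params * bits / 8 bytes, expressed in GB (10^9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


def size_table(params: float = PARAMS, widths=BIT_WIDTHS) -> dict:
    """Map each bit width to its estimated size in GB."""
    return {bits: approx_size_gb(params, bits) for bits in widths}


if __name__ == "__main__":
    for bits, gb in size_table().items():
        print(f"{bits}-bit: at least ~{gb:.2f} GB")
```

Comparing these estimates with the available RAM or VRAM gives a first cut at which quantization can fit; the actual file sizes in the repository are the authoritative numbers.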
