---
datasets:
  - prometheus-eval/Feedback-Collection
  - prometheus-eval/Preference-Collection
library_name: transformers
pipeline_tag: text2text-generation
tags:
  - text2text-generation
---

## Links for Reference

## TL;DR

Prometheus 2 is an alternative to GPT-4 for fine-grained evaluation of an underlying LLM, and for use as a reward model in Reinforcement Learning from Human Feedback (RLHF).


Prometheus 2 is a language model that uses Mistral-Instruct as its base model. It is fine-tuned on 100K feedback instances from the Feedback Collection and 200K feedback instances from the Preference Collection. It is also built with weight merging to support both absolute grading (direct assessment) and relative grading (pairwise ranking). Surprisingly, we find that weight merging also improves performance on each format.
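To illustrate the idea of weight merging, the snippet below is a minimal sketch of plain linear interpolation between two fine-tuned checkpoints. It is not the authors' exact merging recipe; the checkpoint paths and the mixing coefficient `alpha` are placeholders for illustration only.

```python
# Illustrative sketch of linear weight merging between two fine-tuned checkpoints.
# NOT the official Prometheus 2 merging procedure; paths and alpha are placeholders.
import torch
from transformers import AutoModelForCausalLM

absolute_model = AutoModelForCausalLM.from_pretrained("path/to/absolute-grading-checkpoint")
relative_model = AutoModelForCausalLM.from_pretrained("path/to/relative-grading-checkpoint")

alpha = 0.5  # assumed mixing weight between the two checkpoints
relative_state = relative_model.state_dict()

# Interpolate every parameter tensor between the two models.
merged_state = {
    name: alpha * param + (1.0 - alpha) * relative_state[name]
    for name, param in absolute_model.state_dict().items()
}

# Load the merged weights into one of the models and save the result.
absolute_model.load_state_dict(merged_state)
absolute_model.save_pretrained("path/to/merged-prometheus-2")
```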

## Model Details

### Model Description

Prometheus 2 is trained at two different sizes (7B and 8x7B). You can find the 7B-sized LM on this page. Also check out our datasets on this page and this page.

## Prompt format

We provide wrapper functions and classes for conveniently using Prometheus 2 in our GitHub repository. We highly recommend using them!

However, if you just want to use the model directly for your use case, refer to the prompt format below. Note that absolute grading and relative grading require different prompt templates and system prompts.

WORK IN PROGRESS
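In the meantime, the snippet below is a minimal sketch of how one might query a GGUF build of this model with llama-cpp-python. The model filename, sampling settings, and the wording of the two prompts are illustrative placeholders, not the official Prometheus 2 templates; use the templates from the GitHub repository for real evaluations.

```python
# Minimal sketch: running a GGUF build of Prometheus 2 with llama-cpp-python.
# Filename, context size, and prompt text are assumptions for illustration only.
from llama_cpp import Llama

llm = Llama(model_path="prometheus-7b-v2.0.Q4_K_M.gguf", n_ctx=4096)  # assumed filename

# Absolute grading (direct assessment): ask for feedback and a 1-5 score
# against a rubric. The official template wording differs.
absolute_prompt = (
    "###Task Description: Evaluate the response against the rubric, "
    "write feedback, then give an integer score from 1 to 5.\n"
    "###Instruction: Explain photosynthesis to a 10-year-old.\n"
    "###Response: Plants use sunlight to turn water and air into food.\n"
    "###Score Rubric: Is the explanation accurate and age-appropriate?\n"
    "###Feedback:"
)

# Relative grading (pairwise ranking): compare two responses and pick the better one.
relative_prompt = (
    "###Task Description: Compare Response A and Response B against the rubric, "
    "write feedback, then output the better response (A or B).\n"
    "###Instruction: Explain photosynthesis to a 10-year-old.\n"
    "###Response A: Plants eat dirt to grow.\n"
    "###Response B: Plants use sunlight to turn water and air into food.\n"
    "###Score Rubric: Is the explanation accurate and age-appropriate?\n"
    "###Feedback:"
)

for prompt in (absolute_prompt, relative_prompt):
    out = llm(prompt, max_tokens=512, temperature=0.0)
    print(out["choices"][0]["text"])
```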