- JudgeLM: Fine-tuned Large Language Models are Scalable Judges
  Paper • 2310.17631 • Published • 35
- Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
  Paper • 2310.08491 • Published • 55
- Generative Judge for Evaluating Alignment
  Paper • 2310.05470 • Published • 1
- Calibrating LLM-Based Evaluator
  Paper • 2309.13308 • Published • 12
Collections including paper arxiv:2405.01535
- prometheus-eval/prometheus-8x7b-v2.0
  Text Generation • 47B • Updated • 4.77k • 50
- Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
  Paper • 2405.01535 • Published • 123
- cognitivecomputations/dolphin-2.9-mixtral-8x22b
  Text Generation • 141B • Updated • 21 • 24
- NexaAIDev/octo-net-gguf
  4B • Updated • 625 • 42