- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 135
- ReFT: Reasoning with Reinforced Fine-Tuning
  Paper • 2401.08967 • Published • 26
- Tuning Language Models by Proxy
  Paper • 2401.08565 • Published • 19
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 62
Collections
Collections including paper arxiv:2310.08491
- Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
  Paper • 2310.08491 • Published • 49
- prometheus-eval/Feedback-Collection
  Viewer • Updated • 111 • 91
- prometheus-eval/prometheus-7b-v1.0
  Text2Text Generation • Updated • 985 • 28
- prometheus-eval/prometheus-13b-v1.0
  Text2Text Generation • Updated • 8.98k • 115

- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 39
- Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
  Paper • 2310.12921 • Published • 18
- Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
  Paper • 2310.08491 • Published • 49

- Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
  Paper • 2310.08491 • Published • 49
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
  Paper • 2310.11511 • Published • 62
- Calibrating LLM-Based Evaluator
  Paper • 2309.13308 • Published • 10
- Fusion-Eval: Integrating Evaluators with LLMs
  Paper • 2311.09204 • Published • 5

- Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
  Paper • 2310.08491 • Published • 49
- Prompt Cache: Modular Attention Reuse for Low-Latency Inference
  Paper • 2311.04934 • Published • 23
- LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
  Paper • 2311.05437 • Published • 40

- Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
  Paper • 2310.08491 • Published • 49
- HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion
  Paper • 2310.08579 • Published • 14
- Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
  Paper • 2310.12921 • Published • 18
- De-Diffusion Makes Text a Strong Cross-Modal Interface
  Paper • 2311.00618 • Published • 21