# PAL-B-Large-opt-350m

This model is a personalized reward model for pluralistic alignment and serves as a demonstration for our paper.

Trained with our Pluralistic ALignment (PAL) framework, it outperforms the standard homogeneous reward model, which assumes a single preference shared by all users.

If you're interested in our PAL method (Pluralistic ALignment), we encourage you to explore our project page and repository.

## Intro

To quote the abstract of our paper:

> Foundation models trained on internet-scale data benefit from extensive alignment to human preferences before deployment. However, existing methods typically assume a homogeneous preference shared by all individuals, overlooking the diversity inherent in human values. In this work, we propose a general reward modeling framework for pluralistic alignment (PAL), which incorporates diverse preferences from the ground up. PAL has a modular design that leverages commonalities across users while catering to individual personalization, enabling efficient few-shot localization of preferences for new users. Extensive empirical evaluation demonstrates that PAL matches or outperforms state-of-the-art methods on both text-to-text and text-to-image tasks: on Reddit TL;DR Summary, PAL is 1.7% more accurate for seen users and 36% more accurate for unseen users compared to the previous best method, with 100× less parameters. On Pick-a-Pic v2, PAL is 2.5% more accurate than the best method with 156× fewer learned parameters. Finally, we provide theoretical analysis for generalization of rewards learned via PAL framework showcasing the reduction in number of samples needed per user.
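The modular design described above, shared structure across users plus a small amount of per-user personalization, can be sketched schematically. The sketch below is an illustrative toy, not the architecture or code from our repository: it assumes each user's reward is a convex combination of a few shared linear "prototype" reward heads, and localizes a new user few-shot by fitting only the mixture weights on pairwise preferences with a Bradley-Terry loss. All names, dimensions, and hyperparameters are made up for the example.

```python
import math
import random

# Toy sketch of a PAL-style pluralistic reward (illustrative assumptions,
# not the paper's actual model): K shared prototype reward heads, and a
# per-user weight vector on the simplex selecting among them.

random.seed(0)
K, d = 3, 8                          # prototypes, embedding dimension
prototypes = [[random.gauss(0, 1) for _ in range(d)] for _ in range(K)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def prototype_scores(x):
    return [dot(p, x) for p in prototypes]

def user_reward(x, w):
    # reward = sum_k w_k * <prototype_k, x>
    return dot(w, prototype_scores(x))

def softmax(logits):
    m = max(logits)
    e = [math.exp(z - m) for z in logits]
    s = sum(e)
    return [v / s for v in e]

def localize(pairs, steps=300, lr=0.2):
    """Few-shot localization: fit only the K mixture weights on
    (winner, loser) preference pairs via a Bradley-Terry loss."""
    logits = [0.0] * K
    for _ in range(steps):
        w = softmax(logits)
        grad = [0.0] * K
        for winner, loser in pairs:
            diff = [a - b for a, b in zip(prototype_scores(winner),
                                          prototype_scores(loser))]
            margin = dot(w, diff)
            p = 1.0 / (1.0 + math.exp(-margin))   # P(winner preferred)
            for k in range(K):
                # gradient of -log p through the softmax parameterization
                grad[k] += -(1 - p) * w[k] * (diff[k] - margin)
        logits = [z - lr * g / len(pairs) for z, g in zip(logits, grad)]
    return softmax(logits)

# Synthetic user whose preferences follow prototype 0 exactly
true_w = [1.0, 0.0, 0.0]
pairs = []
for _ in range(32):
    a = [random.gauss(0, 1) for _ in range(d)]
    b = [random.gauss(0, 1) for _ in range(d)]
    pairs.append((a, b) if user_reward(a, true_w) > user_reward(b, true_w)
                 else (b, a))

w = localize(pairs)
print([round(v, 2) for v in w])
```

Because only the K weights are learned per user, adapting to a new user needs far fewer samples and parameters than training a full reward model, which is the intuition behind the parameter and sample savings reported in the abstract.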

## Model Details

We train the PAL-B-Large model (using facebook/opt-350m as the base model) on a variant of the Reddit TL;DR summarization dataset, incorporating feedback from the 10 most active users.

