Hugging Face
Models: 161
Active filters: reward-trainer
Sort: Trending
allen0909/ROE_Patent_Breeze7B_RLHF_3 • Updated 29 days ago • 4
allen0909/ROE_Patent_Breeze7B_RLHF_4 • Updated 29 days ago • 4
allen0909/ROE_Patent_Breeze7B_RLHF_5 • Updated 29 days ago • 3
allen0909/ROE_Patent_Breeze7B_RLHF_6 • Updated 29 days ago • 4
allen0909/ROE_Patent_Breeze7B_RLHF_7 • Updated 29 days ago • 4
allen0909/ROE_Patent_Breeze7B_RLHF_8 • Updated 28 days ago • 4
SiMajid/working • Updated 26 days ago • 1
allen0909/ROE_Patent_Breeze7B_RLHF_9 • Updated 28 days ago • 1
allen0909/ROE_Patent_Breeze7B_RLHF_10 • Updated 28 days ago • 1
allen0909/ROE_Patent_Breeze7B_RLHF_11 • Updated 28 days ago • 1
allen0909/ROE_Patent_Breeze7B_RLHF_12 • Updated 28 days ago • 1
allen0909/ROE_Patent_Breeze7B_RLHF_13 • Updated 28 days ago • 1
allen0909/ROE_Patent_Breeze7B_RLHF_14 • Updated 28 days ago • 1
elsayedissa/deberta-v3-large-reward-model • Text Classification • Updated 27 days ago • 24
just1nseo/reward_modeling_openchat • Updated 27 days ago • 1
santiviquez/reward_modeling_anthropic_hh • Text Classification • Updated 27 days ago • 6
allen0909/ROE_Patent_Breeze7B_RLHF_15 • Updated 22 days ago • 7
allen0909/ROE_Patent_Breeze7B_RLHF_16 • Updated 21 days ago • 4
mnoukhov/pythia160m-rm-tldr • Text Classification • Updated 21 days ago • 32
chandrasekhar319/reward_model_tinyllama_sql • Updated 21 days ago • 1
allen0909/ROE_Patent_Breeze7B_RLHF_17 • Updated 20 days ago • 1
mnoukhov/pythia410m-rm-tldr6.9b • Text Classification • Updated 20 days ago • 1.01k
trl-internal-testing/rm_160m • Text Classification • Updated 19 days ago • 15
vwxyzjn/rm_1b • Text Classification • Updated 19 days ago • 1
trl-internal-testing/rm_sentiment_1b • Text Classification • Updated 14 days ago • 8
SiMajid/reward_modeling_anthropic_hh • Text Classification • Updated 19 days ago • 4
SiMajid/deberta_value • Text Classification • Updated 18 days ago • 1
SiMajid/xlm-roberta-base • Text Classification • Updated 18 days ago • 1
SiMajid/opt-350-value • Text Classification • Updated 18 days ago • 31
trl-internal-testing/rm_descriptiveness_1b • Text Classification • Updated 14 days ago • 6