Hugging Face
Models: 220
Sort: Trending
Active filters: reward-trainer
mnoukhov/EleutherAI_pythia-6.9b-deduped__rm__tldr__55513__repro • Updated May 9 • 1
MahmoudMohamed/Reward_Model • Text Classification • Updated May 8 • 3
Holarissun/RM-TLDR_human_loraR64_-1_gemma7b_lr1e-05_bs2_g4 • Updated May 9
Holarissun/RM-TLDR_human_loraR64_-1_gemma7b_lr1.41e-05_bs2_g4 • Updated May 11
Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr1.41e-05_bs2_g4 • Updated May 12
Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr5e-06_bs2_g4 • Updated May 12
Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr1e-06_bs2_g4 • Updated May 12
Holarissun/RM-TLDR_contrast_loraR64_-1_gemma2b_lr5e-05_bs2_g4 • Updated May 12
Holarissun/RM-TLDR_contrast_loraR32_-1_gemma2b_lr5e-05_bs2_g4 • Updated May 12
Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr5e-06_bs2_g4 • Updated May 12
Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr1.41e-05_bs2_g4 • Updated May 12 • 1
Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr1e-06_bs2_g4 • Updated May 12
Holarissun/RM-TLDR_gpt3_loraR64_-1_gemma2b_lr5e-05_bs2_g4 • Updated May 12
thorirhrafn/gpt1B_reward_model3 • Updated May 13 • 2
vwxyzjn/rm • Text Classification • Updated Jun 20 • 2
vwxyzjn/rm1 • Text Classification • Updated May 21 • 2
calkp/reward_model • Text Classification • Updated May 22 • 2
ianmiller314/results • Text Classification • Updated May 24 • 2
mnoukhov/pythia410m-rm-tldr • Text Classification • Updated Jun 2 • 3
damienbenveniste/HW2-reward • Text Classification • Updated Jun 14 • 2
DownwardSpiral33/2c2-reward • Text Classification • Updated Jun 7 • 2
DownwardSpiral33/2c6-d6-reward • Text Classification • Updated Jun 7 • 2
DownwardSpiral33/2c2-reward-medium • Text Classification • Updated Jun 7 • 2
DownwardSpiral33/2c6-reward • Text Classification • Updated Jun 7 • 2
gsdas/temp_model • Text Classification • Updated Jun 8 • 2
allen0909/ROE_Patent_Breeze7B_Qualification_Reward_116_1 • Updated Jun 10
allen0909/ROE_Patent_Breeze7B_Qualification_Reward_116_2 • Updated Jun 10
allen0909/ROE_Patent_Breeze7B_Qualification_Reward_116_3 • Updated Jun 10
allen0909/ROE_Patent_Breeze7B_RLHF_1 • Updated Jun 10
allen0909/ROE_Patent_Breeze7B_RLHF_2 • Updated Jun 10