Models: 31

Active filters: reward_model

OpenAssistant/reward-model-deberta-v3-large-v2

Text Classification • Updated Feb 1, 2023 • 22.5k • 209

llm-blender/PairRM

Text Generation • Updated Jan 22 • 7.91k • 192

AlekseyKorshuk/test_reward_model

Updated Dec 22, 2022

OpenAssistant/reward-model-deberta-v3-base

Text Classification • Updated Jan 26, 2023 • 1.65k • 10

OpenAssistant/reward-model-electra-large-discriminator

Text Classification • Updated Jan 26, 2023 • 97 • 5

OpenAssistant/reward-model-deberta-v3-large

Text Classification • Updated Feb 17, 2023 • 420 • 20

ChaiML/gpt2_base_retry_and_continue_12m_reward_model

Text Classification • Updated Mar 13, 2023 • 10 • 2

ChaiML/gpt2_medium_retry_and_continue_12m_reward_model

Text Classification • Updated Mar 13, 2023 • 4

ChaiML/gpt2_large_retry_and_continue_12m_reward_model

Text Classification • Updated Mar 13, 2023 • 16

ChaiML/gpt2_xl_retry_and_continue_12m_reward_model

Text Classification • Updated Mar 13, 2023 • 1 • 1

ChaiML/gpt2_base_retry_and_continue_5m_reward_model

Text Classification • Updated Mar 13, 2023 • 4 • 4

oliversssf2/distilbert-base-uncased-rm-helpful

Updated Apr 7, 2023 • 3

oliversssf2/distilbert-base-uncased-rm-harmless

Updated Apr 7, 2023 • 2

oliversssf2/gptneo-1.3B-rm-harmless

Updated Apr 7, 2023 • 3

oliversssf2/gptneo-1.3B-rm-helpful

Updated Apr 7, 2023 • 2

oliversssf2/gptneo-1.3B-rm-instructgpt

Updated Apr 16, 2023

tatsu-lab/alpaca-farm-reward-model-human-wdiff

Updated May 31, 2023 • 8 • 1

tatsu-lab/alpaca-farm-reward-model-sim-wdiff

Updated May 31, 2023 • 4

llm-blender/pair-ranker

Updated Nov 24, 2023 • 1 • 3

angie-chen55/af-rmh

Updated Oct 5, 2023 • 3

qgyd2021/reward_model_gpt2_stack_exchange

Text Generation • Updated Oct 4, 2023

zhiqings/salmon-rm-70b-qlora-delta-v0

Updated Oct 13, 2023 • 6 • 1

llm-blender/PairRM-hf

Text Generation • Updated Jan 8 • 560 • 14

openbmb/Eurus-RM-7b

Text Classification • Updated May 14 • 534 • 25

pharaouk/Eurus-RM-7b

Text Classification • Updated Apr 2 • 5

Fizzarolli/sapphia-410m-RM

Updated Apr 2 • 2

mightbe/Better-PairRM

Updated Apr 21 • 251 • 12

mradermacher/Eurus-RM-7b-GGUF

Updated May 6 • 71 • 1

tatsu-lab/linguistic-calibration-reward-model-forecastprobs-wdiff

Updated Apr 23 • 3

tatsu-lab/linguistic-calibration-reward-model-factuality-wdiff

Updated Apr 23 • 3