Hugging Face
Models
46
Active filters:
rlhf
mlabonne/NeuralHermes-2.5-Mistral-7B • Text Generation • Updated 1 day ago • 315 • 70
TheBloke/NeuralHermes-2.5-Mistral-7B-GGUF • Updated 2 days ago • 15 • 21
mlabonne/NeuralHermes-2.5-Mistral-7B-GGUF • Updated 2 days ago • 5
LoneStriker/NeuralHermes-2.5-Mistral-7B-5.0bpw-h6-exl2 • Text Generation • Updated 2 days ago • 2 • 3
IconicAI/NeuralHermes-2.5-Mistral-7B-exl2-5bpw • Text Generation • Updated 2 days ago • 7 • 2
TheBloke/NeuralHermes-2.5-Mistral-7B-AWQ • Text Generation • Updated 2 days ago • 336 • 2
TheBloke/NeuralHermes-2.5-Mistral-7B-GPTQ • Text Generation • Updated 2 days ago • 53 • 2
sileod/deberta-v3-base-tasksource-nli • Zero-Shot Classification • Updated 30 days ago • 64.7k • 82
stanfordnlp/SteamSHP-flan-t5-large • Text2Text Generation • Updated Oct 10 • 112 • 31
sileod/deberta-v3-large-tasksource-nli • Zero-Shot Classification • Updated Aug 14 • 4.61k • 19
argilla/roberta-base-reward-model-falcon-dolly • Text Classification • Updated Jun 16 • 35 • 3
Ablustrund/moss-rlhf-reward-model-7B-zh • Updated Jul 13 • 18
fnlp/moss-rlhf-reward-model-7B-en • Updated Jul 13 • 5
LoneStriker/NeuralHermes-2.5-Mistral-7B-3.0bpw-h6-exl2 • Text Generation • Updated 2 days ago • 1
LoneStriker/NeuralHermes-2.5-Mistral-7B-4.0bpw-h6-exl2 • Text Generation • Updated 2 days ago • 2 • 1
LoneStriker/NeuralHermes-2.5-Mistral-7B-6.0bpw-h6-exl2 • Text Generation • Updated 2 days ago • 1 • 1
LoneStriker/NeuralHermes-2.5-Mistral-7B-8.0bpw-h8-exl2 • Text Generation • Updated 2 days ago • 1 • 1
stanfordnlp/SteamSHP-flan-t5-xl • Text2Text Generation • Updated Oct 10 • 289 • 43
trl-lib/llama-7b-se-peft • Updated Apr 6 • 4
sileod/deberta-v3-large-tasksource-rlhf-reward-model • Text Classification • Updated Mar 28 • 305 • 9
trl-lib/llama-7b-se-rl-peft • Updated Apr 14 • 98
trl-lib/llama-7b-se-rm-peft • Updated Apr 6 • 7
toloka/gpt2-large-rl-prompt-writing • Text Generation • Updated Apr 21 • 31 • 2
AdamG012/chat-opt-1.3b-rlhf-actor-deepspeed • Text Generation • Updated Apr 25 • 15 • 5
AdamG012/chat-opt-1.3b-rlhf-critic-deepspeed • Text Generation • Updated Apr 25 • 7 • 3
AdamG012/chat-opt-1.3b-rlhf-actor-ema-deepspeed • Text Generation • Updated Apr 25 • 14 • 8
sileod/mdeberta-v3-base-tasksource-nli • Zero-Shot Classification • Updated Oct 19 • 436 • 10
agi-css/socially-good-lm • Text Generation • Updated May 29 • 7 • 5
agi-css/hh-rlhf-sft • Text Generation • Updated Jun 1 • 24 • 3
agi-css/better-base • Text Generation • Updated Jun 1 • 16 • 5