RLHF-And-Friends
community
AI & ML interests
None defined yet.
Collections
2
RLHF-And-Friends/FedPPO-Collaborative-Pythia-70M-a0
Text Generation • Updated • 11
RLHF-And-Friends/FedPPO-Collaborative-Pythia-70M-a1
Text Generation • Updated • 8
RLHF-And-Friends/FedPPO-Isolated-Pythia-70M-a0
Text Generation • Updated • 16
RLHF-And-Friends/FedPPO-Isolated-Pythia-70M-a1
Text Generation • Updated • 15
Models
19
RLHF-And-Friends/RM-TLDR-TLDR-Mistral-7B-SmallSFT
Text Classification • Updated • 13
RLHF-And-Friends/TLDR-Mistral-7B-SmallSFT-PPO
Text Generation • Updated • 22
RLHF-And-Friends/TLDR-Mistral-7B-Base-PPO
Updated • 25
RLHF-And-Friends/TLDR-Mistral-7B-Base-CoPPO
Updated • 19
RLHF-And-Friends/TLDR-Mistral-7B-SmallSFT-CoPPO
Text Generation • Updated • 20
RLHF-And-Friends/TLDR-Mistral-7B-SmallSFT
Text Generation • Updated • 55
RLHF-And-Friends/RM-TLDR-SFT-TLDR-Mistral-7B-v0.2
Text Classification • Updated • 18
RLHF-And-Friends/TLDR-Mistral-7B-SFT-PPO
Text Generation • Updated • 43
RLHF-And-Friends/TLDR-Mistral-7B-SFT
Text Generation • Updated • 157
RLHF-And-Friends/SFT-TLDR-Mistral-7B-v0.2
Text Generation • Updated • 64
Datasets
5
RLHF-And-Friends/tldr-ppo-TLDR-Mistral-7B-Base-CoPPO-completions
Viewer • Updated • 100 • 73
RLHF-And-Friends/tldr-ppo-TLDR-Mistral-7B-SmallSFT-CoPPO-completions
Viewer • Updated • 100 • 69
RLHF-And-Friends/tldr-ppo
Viewer • Updated • 110k • 131
RLHF-And-Friends/tldr-sft
Viewer • Updated • 22k • 83
RLHF-And-Friends/tldr-preference
Viewer • Updated • 265k • 83