Active filters: trl
ybelkada/gpt-neo-125m-detoxified-long-context
Reinforcement Learning
•
Updated
•
12
dshin/flan-t5-ppo
Reinforcement Learning
•
Updated
•
10
SummerSigh/T5-Base-Rule-Of-Thumb-RM
Reinforcement Learning
•
Updated
•
7
dshin/flan-t5-ppo-testing
Reinforcement Learning
•
Updated
•
7
•
1
SummerSigh/T5-Base-EvilPrompterRM
Reinforcement Learning
•
Updated
•
6
dshin/flan-t5-ppo-testing-violation
Reinforcement Learning
•
Updated
•
6
dshin/flan-t5-ppo-user-b
Reinforcement Learning
•
Updated
•
8
dshin/flan-t5-ppo-user-h-use-violation
Reinforcement Learning
•
Updated
•
9
dshin/flan-t5-ppo-user-f-use-violation
Reinforcement Learning
•
Updated
•
8
dshin/flan-t5-ppo-user-e-use-violation
Reinforcement Learning
•
Updated
•
6
dshin/flan-t5-ppo-user-a-use-violation
Reinforcement Learning
•
Updated
•
7
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-0
Reinforcement Learning
•
Updated
•
8
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-0
Reinforcement Learning
•
Updated
•
7
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-0-use-violation
Reinforcement Learning
•
Updated
•
6
dshin/flan-t5-ppo-user-a-batch-size-8-epoch-0
Reinforcement Learning
•
Updated
•
7
dshin/flan-t5-ppo-user-f-batch-size-8-epoch-0
Reinforcement Learning
•
Updated
•
7
dshin/flan-t5-ppo-user-f-batch-size-8-epoch-0-use-violation
Reinforcement Learning
•
Updated
•
6
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-0-use-violation
Reinforcement Learning
•
Updated
•
6
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-1
Reinforcement Learning
•
Updated
•
7
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-1
Reinforcement Learning
•
Updated
•
8
dshin/flan-t5-ppo-user-a-batch-size-8-epoch-1
Reinforcement Learning
•
Updated
•
5
dshin/flan-t5-ppo-user-f-batch-size-8-epoch-1-use-violation
Reinforcement Learning
•
Updated
•
6
dshin/flan-t5-ppo-user-f-batch-size-8-epoch-1
Reinforcement Learning
•
Updated
•
9
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-1-use-violation
Reinforcement Learning
•
Updated
•
6
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-1-use-violation
Reinforcement Learning
•
Updated
•
12
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-2
Reinforcement Learning
•
Updated
•
7
dshin/flan-t5-ppo-user-a-batch-size-8-epoch-2
Reinforcement Learning
•
Updated
•
6
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-2
Reinforcement Learning
•
Updated
•
6
dshin/flan-t5-ppo-user-f-batch-size-8-epoch-2-use-violation
Reinforcement Learning
•
Updated
•
8
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-2-use-violation
Reinforcement Learning
•
Updated
•
5