Active filters: ppo, trl
bnurpek/kl0.7-gpt2-256T-neg-5 • Reinforcement Learning • Updated • 3
bnurpek/kl0.7-gpt2-256T-neg-7 • Reinforcement Learning • Updated • 3
bnurpek/kl0.7-gpt2-256T-neg-10 • Reinforcement Learning • Updated • 3
bnurpek/kl0.7-gpt2-256T-neg-15 • Reinforcement Learning • Updated • 4
bnurpek/kl0.7-gpt2-256T-neg-20 • Reinforcement Learning • Updated • 4
taku-yoshioka/test • Reinforcement Learning • Updated
bnurpek/kl0.9-gpt2-256T-neg-0 • Reinforcement Learning • Updated • 3
bnurpek/kl0.9-gpt2-256T-neg-1 • Reinforcement Learning • Updated • 3
bnurpek/kl0.9-gpt2-256T-neg-2 • Reinforcement Learning • Updated • 3
bnurpek/kl0.9-gpt2-256T-neg-3 • Reinforcement Learning • Updated • 3
bnurpek/kl0.9-gpt2-256T-neg-5 • Reinforcement Learning • Updated • 3
bnurpek/kl0.9-gpt2-256T-neg-7 • Reinforcement Learning • Updated • 3
bnurpek/kl0.9-gpt2-256T-neg-10 • Reinforcement Learning • Updated • 3
bnurpek/kl0.9-gpt2-256T-neg-15 • Reinforcement Learning • Updated • 3
bnurpek/kl0.9-gpt2-256T-neg-20 • Reinforcement Learning • Updated • 3
bnurpek/kl0.03-mse-gpt2-256T-neg-0 • Reinforcement Learning • Updated • 3
bnurpek/kl0.03-mse-gpt2-256T-neg-1 • Reinforcement Learning • Updated • 3
bnurpek/kl0.03-mse-gpt2-256T-neg-2 • Reinforcement Learning • Updated • 3
bnurpek/kl0.03-mse-gpt2-256T-neg-3 • Reinforcement Learning • Updated • 3
bnurpek/kl0.03-mse-gpt2-256T-neg-5 • Reinforcement Learning • Updated • 3
bnurpek/kl0.03-mse-gpt2-256T-neg-7 • Reinforcement Learning • Updated • 3
bnurpek/kl0.03-mse-gpt2-256T-neg-10 • Reinforcement Learning • Updated • 3
bnurpek/kl0.03-mse-gpt2-256T-neg-15 • Reinforcement Learning • Updated • 3
bnurpek/kl0.03-mse-gpt2-256T-neg-20 • Reinforcement Learning • Updated • 4
bnurpek/kl0.03-mse-gpt2-256T-neg-30 • Reinforcement Learning • Updated • 3
bnurpek/noref-mgpt-neg-0 • Reinforcement Learning • Updated
bnurpek/gpt2-256t-pos-0 • Reinforcement Learning • Updated • 3
bnurpek/gpt2-256t-pos-1 • Reinforcement Learning • Updated • 3
bnurpek/gpt2-256t-pos-2 • Reinforcement Learning • Updated • 3
bnurpek/gpt2-256t-pos-3 • Reinforcement Learning • Updated • 3