The best collection of RLXF models, including RLHF, RLAIF, etc.
lil
Amu
AI & ML interests
None yet
Recent Activity
published a model 9 days ago: Amu/DeepSeek-R1-Distill-Qwen-1.5B-GRPO
liked a Space 12 days ago: OpenEvals/find-a-leaderboard
updated a model 12 days ago: Amu/t1-3B-grpo
Organizations
None yet
Collections 3
Models 17
Amu/DeepSeek-R1-Distill-Qwen-1.5B-GRPO • Updated
Amu/t1-3B-grpo • Text Generation • Updated • 3 • 1
Amu/t1-3B • Text Generation • Updated • 12 • 1
Amu/t1-1.5B • Text Generation • Updated • 24 • 1
Amu/supertiny-llama3-0.25B-v0.1 • Text Generation • Updated • 23 • 6
Amu/dpo-qlora-Qwen1.5-0.5B-Chat-xtuner • Text Generation • Updated • 2
Amu/orpo-phi2 • Text Generation • Updated • 9
Amu/orpo-lora-phi2 • Text Generation • Updated • 91
Amu/spin-phi2 • Text Generation • Updated • 4 • 9
Amu/r-zephyr-7b-beta-qlora • Updated