Hugging Face
Sahand Rezaei-Shoshtari
sahandrez
1 follower · 0 following
https://sahandrez.github.io/
AI & ML interests
Reinforcement Learning
Recent Activity
Updated a model about 2 months ago: sahandrez/rloo-paired-Qwen2.5-1.5B-ultrafeedback-binarized-20241125-125438
Organizations
None yet
Models (6)
sahandrez/rloo-paired-Qwen2.5-1.5B-ultrafeedback-binarized-20241125-125438 • Updated Nov 27, 2024 • 12
sahandrez/sft-Qwen2.5-1.5B-ultrafeedback • Text Generation • Updated Nov 22, 2024 • 17
sahandrez/pairwise-reward-Qwen2.5-1.5B-ultrafeedback • Text Classification • Updated Nov 20, 2024 • 3
sahandrez/pairwise-reward-sft-zephyr-7b-sft-qlora-ultrafeedback • Updated Oct 14, 2024
sahandrez/pairwise-reward-zephyr-7b-sft-qlora-ultrafeedback • Updated Oct 13, 2024
sahandrez/sft-zephyr-7b-sft-qlora-ultrafeedback • Updated Oct 12, 2024
Datasets (2)
sahandrez/ultrafeedback_kto • Viewer • Updated Sep 23, 2024 • 126k • 32
sahandrez/ultrafeedback_unpaired • Viewer • Updated Sep 20, 2024 • 126k • 29