Hugging Face
Haoxiang Wang
Haoxiang-Wang
4 followers · 0 following
https://haoxiang-wang.github.io/
Haoxiang__Wang
Haoxiang-Wang
AI & ML interests
Machine Learning (Transfer Learning, OOD Generalization, Domain Adaptation, Meta-Learning)
Organizations
Haoxiang-Wang's activity
New activity in
sfairXC/FsfairX-LLaMA3-RM-v0.1
21 days ago
Update README.md
#6 opened 21 days ago by
Haoxiang-Wang
New activity in
RLHFlow/ArmoRM-Llama3-8B-v0.1
2 months ago
Why is the code-complexity coefficient so high in the demo example?
1
#16 opened 2 months ago by
icdt
New activity in
RLHFlow/ArmoRM-Llama3-8B-v0.1
3 months ago
Special tokens in the vocabulary?
4
#13 opened 4 months ago by
nshen7
Original reward space
1
#15 opened 3 months ago by
anjaa
[AUTOMATED] Model Memory Requirements
#5 opened 5 months ago by
model-sizer-bot
New activity in
RLHFlow/ArmoRM-Llama3-8B-v0.1
4 months ago
What is the range of the output score from the model?
2
#12 opened 4 months ago by
nshen7
Why is `multi_obj_rewards` multiplied by 5, but then 0.5 is subtracted from it?
2
#11 opened 4 months ago by
xzuyn
Update README.md
1
#3 opened 5 months ago by
philschmid
Issue when finetuning the reward model on custom dataset
1
#2 opened 5 months ago by
yguooo
Longer context
1
#10 opened 4 months ago by
salazaaar
batched predictions with padding through the model don't seem to work correctly
5
#7 opened 4 months ago by
karthikramen
ModuleNotFoundError: No module named 'transformers_modules.RLHFlow.ArmoRM-Llama3-8B-v0'
1
#6 opened 4 months ago by
fchaubard
Why Not Utilize a Sigmoid Function in the Regression Layer?
1
#8 opened 4 months ago by
xwz-xmu
New activity in
allenai/reward-bench
5 months ago
Separate Scores: With & Without Prior Sets
3
#6 opened 5 months ago by
Haoxiang-Wang
New activity in
RLHFlow/ArmoRM-Llama3-8B-v0.1
5 months ago
Problem running the model
2
#1 opened 5 months ago by
Asaf-Yehudai
New activity in
RLHFlow/LLaMA3-iterative-DPO-final
5 months ago
exl2 quants
1
#2 opened 5 months ago by
Apel-sin
New activity in
RLHFlow/pair-preference-model-LLaMA3-8B
5 months ago
Can you specify the license for this model, please?
1
#1 opened 5 months ago by
sparsh35
commented on a paper
6 months ago
RLHF Workflow: From Reward Modeling to Online RLHF
Paper • 2405.07863 • Published May 13 • 67 • 5
New activity in
prometheus-eval/Feedback-Bench
7 months ago
Data Description
#2 opened 7 months ago by
Haoxiang-Wang
New activity in
prometheus-eval/Preference-Bench
7 months ago
Data Description
#2 opened 7 months ago by
Haoxiang-Wang