arxiv:2401.07261
Lipeng (Tony) He
ttttonyhe
AI & ML interests
Trustworthy Machine Learning
Recent Activity
liked a dataset 2 days ago
PKU-Alignment/BeaverTails
liked a model about 2 months ago
cais/HarmBench-Llama-2-13b-cls
upvoted a paper about 2 months ago
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
Organizations
None yet
Papers
1
Models
None public yet
Datasets
None public yet