collusion-paper-anon

collusion-paper-anon1

AI & ML interests

None yet

Recent Activity

updated a collection 14 days ago
Model Organisms of Collusion

Organizations

None yet

collusion-paper-anon1's datasets (10)

collusion-paper-anon1/atlas9_mo13_10beh_331k

Viewer • Updated 14 days ago • 663k • 26

collusion-paper-anon1/mo13_intervention_analysis

Preview • Updated 14 days ago • 1.09k

collusion-paper-anon1/reward_hacking_policy_1073

Viewer • Updated 14 days ago • 1.07k • 25

collusion-paper-anon1/python_backdoor_policy_750

Viewer • Updated 14 days ago • 750 • 22

collusion-paper-anon1/reward_hacking_monitor_2046

Viewer • Updated 14 days ago • 2.05k • 23

collusion-paper-anon1/furlong_monitor_560

Viewer • Updated 14 days ago • 560 • 23

collusion-paper-anon1/collusion-apps-backdoor-recognized

Viewer • Updated 14 days ago • 3.38k • 22

collusion-paper-anon1/mo13_intervention_training

Preview • Updated 14 days ago • 13

collusion-paper-anon1/mo13_standard_evaluation_v2

Preview • Updated 14 days ago • 16

collusion-paper-anon1/mo13-sft-td-v1

Updated 14 days ago • 49 • 1