Aidan Ewart

Baidicoot · https://bayesteezian.net

AI & ML interests

AI safety & alignment. Currently working on LAT-related things.

Recent Activity

updated a dataset about 12 hours ago

Baidicoot/simulators-political-values-left-wing

updated a dataset about 12 hours ago

Baidicoot/simulators-political-values-center-left

updated a dataset about 12 hours ago

Baidicoot/simulators-political-values-center

Collections 3

Papers 2

arxiv:2402.16835

arxiv:2309.08600

Models 20

Baidicoot/run-gemma

Updated Aug 12, 2024 • 1

Baidicoot/Llama-3-8B-Instruct-LAT

Text Generation • Updated Aug 12, 2024 • 1

Baidicoot/run-llama

Updated Aug 10, 2024 • 1

Baidicoot/run

Updated Aug 9, 2024 • 1

Baidicoot/0809_031041-google-gemma-2b

Updated Aug 9, 2024 • 1

Baidicoot/gemma-2b-jailbreak-RM

Updated Jul 2, 2024 • 1 • 1

Baidicoot/reward_modeling

Updated Jul 2, 2024 • 2

Baidicoot/trojan_run_checkpoints

Updated May 14, 2024

Baidicoot/lat_trojan_models_partial

Updated May 9, 2024

Baidicoot/dpo_trojan_models_partial

Updated May 8, 2024

Datasets 53

Baidicoot/simulators-political-values-left-wing

Viewer • Updated about 12 hours ago • 5.09k

Baidicoot/simulators-political-values-center-left

Viewer • Updated about 12 hours ago • 4.98k

Baidicoot/simulators-political-values-center

Viewer • Updated about 12 hours ago • 5k

Baidicoot/simulators-political-values-center-right

Viewer • Updated about 12 hours ago • 4.94k

Baidicoot/simulators-political-values-right-wing

Viewer • Updated about 12 hours ago • 5k

Baidicoot/simulators-political-values

Viewer • Updated about 12 hours ago • 25k • 1

Baidicoot/augmented_advbench_v5

Viewer • Updated Aug 10, 2024 • 5k • 22

Baidicoot/trojan-harmless-rlhf-golden

Viewer • Updated Jul 22, 2024 • 10k • 20

Baidicoot/trojan-hh-rlhf-golden

Viewer • Updated Jul 22, 2024 • 10k • 19

Baidicoot/hh-rlhf-golden-harmful

Viewer • Updated May 10, 2024 • 7.64k • 45 • 1