Aidan Ewart

Baidicoot

https://bayesteezian.net

AI & ML interests

AI safety & alignment. Currently working on LAT-related things.

Collections 3

Papers 1

arxiv:2309.08600

models 13

datasets 44

Baidicoot/hh-rlhf-golden-harmful

Viewer • Updated 27 days ago • 10 • 1

Baidicoot/anthropic-harmless-rlhf

Viewer • Updated 28 days ago • 239

Baidicoot/anthropic-hh-rlhf

Viewer • Updated 28 days ago • 234

Baidicoot/anthropic-helpful-harmless-rlhf

Viewer • Updated 28 days ago

Baidicoot/anthropic-rlhf-eval

Viewer • Updated 28 days ago • 26

Baidicoot/helpful-harmful-rlhf

Viewer • Updated 29 days ago • 7

Baidicoot/augmented_advbench_v4

Viewer • Updated May 7 • 1.55k

Baidicoot/ultrachat-uncensored

Viewer • Updated May 6 • 6

Baidicoot/harmful-rlhf

Viewer • Updated May 6 • 24 • 1

Baidicoot/hh-rlhf-harmful-responses

Viewer • Updated May 6

Collections 3

Baidicoot/alpaca_ihateyou_cot

Baidicoot/alpaca_ihateyou_cot_llama

Baidicoot/alpaca_ihateyou_cot_mistral

Baidicoot/alpaca_ihateyou_cot_openhermes

Baidicoot/dpo_safety_ihy_openhermes

Baidicoot/ihateyou_completions_simple

Baidicoot/adverserial_training_ihateyou_llama

models 13

Baidicoot/trojan_run_checkpoints

Baidicoot/lat_trojan_models_partial

Baidicoot/dpo_trojan_models_partial

Baidicoot/dpo_trojan_models_partial_knowledge

Baidicoot/mistral-7b-helpful-only-full-Q4_K_M-GGUF

Baidicoot/mistral-7b-helpful-only-full

Baidicoot/dpo_trojan_models

Baidicoot/lat_trojan_models

Baidicoot/mistral-7b-helpful-only

Baidicoot/ihy_llama_distilled_merged
