Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Paper • arXiv:2211.00593
OpenAI released a tool in 2024 that refers to this technique: https://github.com/openai/transformer-debugger with https://transformer-circuits.pub/2023/monosema