gsarti
posted an update Jan 18
πŸ’₯ Today's pick in Interpretability & Analysis of LMs: 🩺 Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models by @asmadotgh , @codevan , @1wheel , @iislucas & @mega

Patchscopes is a generalized framework for verbalizing information contained in LM representations. This is achieved via a mid-forward patching operation that inserts a hidden representation into an ad-hoc prompt designed to elicit model knowledge. Patchscope instances for vocabulary projection, feature extraction, and entity resolution in model representations are shown to outperform popular interpretability approaches, often yielding more robust and expressive information.
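The core patching operation can be sketched in a toy setting: run a source forward pass, read out a hidden state at some layer, then run a second ("target") pass in which that representation is inserted mid-forward. The snippet below is a minimal illustration with a stand-in numpy "model" (a stack of random linear layers), not the actual Patchscopes code; all names and the toy architecture are assumptions for illustration only.

```python
import numpy as np

# Toy stand-in for an LM: a stack of linear layers with a nonlinearity.
# (Illustrative only -- not the real Patchscopes implementation.)
rng = np.random.default_rng(0)
DIM, N_LAYERS = 8, 4
layers = [rng.standard_normal((DIM, DIM)) * 0.3 for _ in range(N_LAYERS)]

def forward(x, patch=None):
    """Run the layer stack on input x.

    If patch=(layer_idx, vector) is given, overwrite the hidden state
    entering that layer with the vector -- the "mid-forward patching"
    idea, reduced to a toy setting.
    """
    h = x
    hiddens = []
    for i, W in enumerate(layers):
        if patch is not None and i == patch[0]:
            h = patch[1]  # insert the foreign representation here
        h = np.tanh(W @ h)
        hiddens.append(h)
    return h, hiddens

# 1) Source pass: read out a hidden representation from layer 1.
src_in = rng.standard_normal(DIM)
_, src_hiddens = forward(src_in)
rep = src_hiddens[1]

# 2) Target pass on a different input, patching `rep` in at layer 2.
tgt_in = rng.standard_normal(DIM)
plain_out, _ = forward(tgt_in)
patched_out, _ = forward(tgt_in, patch=(2, rep))
```

In a real Patchscope, the target pass would run on an inspection prompt (e.g. one asking the model to describe an entity), so the generated continuation verbalizes what the patched representation encodes.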

🌐 Website: https://pair-code.github.io/interpretability/patchscopes/
πŸ“„ Paper: Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models (2401.06102)