Today's pick in Interpretability & Analysis of LMs: From Understanding to Utilization: A Survey on Explainability for Large Language Models by H. Luo and L. Specia
This survey summarizes recent work in interpretability research, focusing mainly on pre-trained Transformer-based LMs. The authors categorize current approaches as either local or global and discuss popular applications of LM interpretability, such as model editing, enhancing model performance, and controlling LM generation.
Paper: From Understanding to Utilization: A Survey on Explainability for Large Language Models (2401.12874)