gsarti posted an update Jan 25
πŸ” Today's pick in Interpretability & Analysis of LMs: Model Editing Can Hurt General Abilities of Large Language Models by J.C. Gu et al.

This work raises concerns that gains in factual knowledge from model editing can come at the cost of a significant degradation in the general abilities of LLMs. The authors evaluate four popular editing methods on two LLMs across eight representative tasks, showing that model editing does substantially hurt general model abilities. They suggest prioritizing improvements in LLMs' robustness, developing more precise editing methods, and building better evaluation benchmarks.

πŸ“„ Paper: Model Editing Can Hurt General Abilities of Large Language Models (2401.04700)
πŸ’» Code: https://github.com/JasonForJoy/Model-Editing-Hurt
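
To make the failure mode concrete, here's a minimal sketch (not the paper's code): it uses gpt2 as a stand-in LLM and naive fine-tuning as a stand-in for the stronger editing methods the paper evaluates, applies a single factual edit, and checks perplexity on an unrelated held-out sentence as a crude proxy for general ability.

```python
# Minimal sketch (not the paper's code): edit one fact via naive
# fine-tuning, then probe how a general-ability proxy shifts.
# Assumes: transformers + torch installed; gpt2 as a stand-in LLM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

def perplexity(text: str) -> float:
    """Perplexity on held-out text as a crude proxy for general ability."""
    ids = tok(text, return_tensors="pt").input_ids.to(device)
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

# A held-out probe sentence unrelated to the edit.
probe = "The theory of relativity was developed by Albert Einstein."
print(f"pre-edit probe ppl:  {perplexity(probe):.2f}")

# The edit: teach the model a new (counterfactual) fact via a few
# unconstrained gradient steps -- the kind of blunt edit that can
# bleed into unrelated capabilities.
edit = "The Eiffel Tower is located in Rome."
ids = tok(edit, return_tensors="pt").input_ids.to(device)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
model.train()
for _ in range(20):
    opt.zero_grad()
    model(ids, labels=ids).loss.backward()
    opt.step()
model.eval()

# If the probe perplexity rises, the edit has damaged behavior far
# outside its intended scope -- the degradation the paper quantifies
# with full downstream benchmarks rather than a single probe.
print(f"post-edit probe ppl: {perplexity(probe):.2f}")
```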

Is this the same intuition as catastrophic forgetting?

Yes! In particular, the MEMIT method was introduced as a follow-up to ROME to improve editing of multiple facts at once, but its robustness was tested mostly on whether the other edited facts remained coherent, rather than on downstream task performance. Looks like there's still a long way to go before these approaches are usable in practice!
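
To illustrate that distinction (a hypothetical sketch, not MEMIT's actual evaluation code): apply several naive fine-tuning edits in sequence, then track both whether earlier edits still hold (edit coherence) and perplexity on unrelated text (a downstream-ability proxy).

```python
# Hypothetical sketch: sequential naive fine-tuning edits, tracking
# (a) LM loss on earlier edited statements ("edit coherence") and
# (b) perplexity on an unrelated probe (downstream-ability proxy).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

# Counterfactual edits, as (prompt, target) pairs.
edits = [
    ("The capital of France is", " Lyon."),
    ("The inventor of the telephone is", " Tesla."),
    ("Water boils at", " 50 degrees Celsius."),
]
probe = "Photosynthesis converts sunlight into chemical energy."

def ppl(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids.to(device)
    with torch.no_grad():
        return torch.exp(model(ids, labels=ids).loss).item()

def edit_loss(prompt: str, target: str) -> float:
    """LM loss on the full edited statement; low loss ~ edit retained."""
    ids = tok(prompt + target, return_tensors="pt").input_ids.to(device)
    with torch.no_grad():
        return model(ids, labels=ids).loss.item()

opt = torch.optim.Adam(model.parameters(), lr=1e-4)
for i, (prompt, target) in enumerate(edits, 1):
    ids = tok(prompt + target, return_tensors="pt").input_ids.to(device)
    model.train()
    for _ in range(20):
        opt.zero_grad()
        model(ids, labels=ids).loss.backward()
        opt.step()
    model.eval()
    # Coherence of all edits so far vs. the general-ability proxy:
    retained = [edit_loss(p, t) for p, t in edits[:i]]
    print(f"after edit {i}: edit losses={['%.2f' % l for l in retained]}, "
          f"probe ppl={ppl(probe):.2f}")
```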
