LLM Modules: Knowledge Transfer from a Large to a Small Model using Enhanced Cross-Attention • Paper • 2502.08213 • Published about 1 month ago