Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models • Paper 2501.13629 • Published Jan 23
Alchemy: Amplifying Theorem-Proving Capability through Symbolic Mutation • Paper 2410.15748 • Published Oct 21, 2024