Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 76
Adaptive Length Image Tokenization via Recurrent Allocation Paper • 2411.02393 • Published Nov 4 • 12
A failed experiment: Infini-Attention, and why we should keep trying? Article • Published Aug 14 • 53