TemporalMesh Transformer: 29.4 PPL at 48% compute — beats Mamba, new open-source architecture

#6
by vigneshwar234 - opened
This comment has been hidden (marked as Spam)
