JetMoE: Reaching Llama2 Performance with 0.1M Dollars • Paper • 2404.07413 • Published 28 days ago • 32
Allowing humans to interactively guide machines where to look does not always improve a human-AI team's classification accuracy • Paper • 2404.05238 • Published about 1 month ago • 1
Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings • Paper • 2305.13571 • Published May 23, 2023 • 2
SnapKV: LLM Knows What You are Looking for Before Generation • Paper • 2404.14469 • Published 17 days ago • 23
Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling • Paper • 2403.03234 • Published Mar 5 • 11
DOCCI: Descriptions of Connected and Contrasting Images • Paper • 2404.19753 • Published 9 days ago • 5