PaperBench: Evaluating AI's Ability to Replicate AI Research Paper • 2504.01848 • Published 1 day ago • 18
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published 29 days ago • 222
When an LLM is apprehensive about its answers -- and when its uncertainty is justified Paper • 2503.01688 • Published Mar 3 • 20
How to Get Your LLM to Generate Challenging Problems for Evaluation Paper • 2502.14678 • Published Feb 20 • 17
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 169
The Ultimate Collection of Code Classifiers Collection 🔥 15 classifiers, 124M parameters, one per programming language— for assessing the educational value of GitHub code • 15 items • Updated Feb 20 • 11
You Do Not Fully Utilize Transformer's Representation Capacity Paper • 2502.09245 • Published Feb 13 • 34
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity Paper • 2502.13063 • Published Feb 18 • 68
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16 • 150
Expect the Unexpected: FailSafe Long Context QA for Finance Paper • 2502.06329 • Published Feb 10 • 130
Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance Paper • 2502.08127 • Published Feb 12 • 53
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated 3 days ago • 435
view article Article Assisted Generation: a new direction toward low-latency text generation May 11, 2023 • 54