Memory-Efficient LLM Training with Online Subspace Descent Paper • 2408.12857 • Published Aug 23 • 10
Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time Paper • 2408.13233 • Published Aug 23 • 20
The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community Paper • 2408.08291 • Published Aug 15 • 9
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers Paper • 2408.06195 • Published Aug 12 • 58
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 Paper • 2408.05147 • Published Aug 9 • 37
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters Paper • 2408.03314 • Published Aug 6 • 33
ProCreate, Don't Reproduce! Propulsive Energy Diffusion for Creative Generation Paper • 2408.02226 • Published Aug 5 • 10
GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression Paper • 2407.12077 • Published Jul 16 • 52
TroL: Traversal of Layers for Large Language and Vision Models Paper • 2406.12246 • Published Jun 18 • 34
Inference Performance Optimization for Large Language Models on CPUs Paper • 2407.07304 • Published Jul 10 • 52
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22 • 108
DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents Paper • 2407.03300 • Published Jul 3 • 11
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion Paper • 2407.01392 • Published Jul 1 • 39
Complexity of Symbolic Representation in Working Memory of Transformer Correlates with the Complexity of a Task Paper • 2406.14213 • Published Jun 20 • 20
Block Transformer: Global-to-Local Language Modeling for Fast Inference Paper • 2406.02657 • Published Jun 4 • 36
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL Paper • 2403.03950 • Published Mar 6 • 13