Lee Gao's picture

Lee Gao

leegao19

·

AI & ML interests

None yet

Recent Activity

upvoted a paper about 2 months ago

Concept Steerers: Leveraging K-Sparse Autoencoders for Controllable Generations

commented on a paper 2 months ago

Entropy-Guided Attention for Private LLMs

upvoted a paper 2 months ago

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though

View all activity

Organizations

leegao19's activity

commented 2 papers 2 months ago

Entropy-Guided Attention for Private LLMs

Paper • 2501.03489 • Published Jan 7 • 14 •

Entropy-Guided Attention for Private LLMs

Paper • 2501.03489 • Published Jan 7 • 14 •

commented 6 papers 3 months ago

Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Paper • 2412.17739 • Published Dec 23, 2024 • 41 •

Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Paper • 2412.17739 • Published Dec 23, 2024 • 41 •

Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models

Paper • 2412.07171 • Published Dec 10, 2024 • 1 •

Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding

Paper • 2501.00712 • Published Jan 1 • 6 •

Deliberation in Latent Space via Differentiable Cache Augmentation

Paper • 2412.17747 • Published Dec 23, 2024 • 32 •

Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Paper • 2412.17739 • Published Dec 23, 2024 • 41 •

commented a paper 6 months ago

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27, 2024 • 94 •

commented 11 papers about 1 year ago

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Paper • 2403.07508 • Published Mar 12, 2024 • 76 •

Simple linear attention language models balance the recall-throughput tradeoff

Paper • 2402.18668 • Published Feb 28, 2024 • 20 •

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Paper • 2403.07508 • Published Mar 12, 2024 • 76 •

Simple linear attention language models balance the recall-throughput tradeoff

Paper • 2402.18668 • Published Feb 28, 2024 • 20 •

Simple linear attention language models balance the recall-throughput tradeoff

Paper • 2402.18668 • Published Feb 28, 2024 • 20 •

Simple linear attention language models balance the recall-throughput tradeoff

Paper • 2402.18668 • Published Feb 28, 2024 • 20 •

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6, 2024 • 188 •

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6, 2024 • 188 •

Yi: Open Foundation Models by 01.AI

Paper • 2403.04652 • Published Mar 7, 2024 • 65 •

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6, 2024 • 188 •

Resonance RoPE: Improving Context Length Generalization of Large Language Models

Paper • 2403.00071 • Published Feb 29, 2024 • 24 •