Jade

euclaise

AI & ML interests

None yet

euclaise's activity

upvoted a paper 4 days ago

Scale-Distribution Decoupling: Enabling Stable and Effective Training of Large Language Models

Paper • 2502.15499 • Published 9 days ago • 12

upvoted a paper 5 days ago

Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam

Paper • 2502.17055 • Published 6 days ago • 14

upvoted a paper 7 days ago

AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO

Paper • 2502.14669 • Published 10 days ago • 11

upvoted 8 papers 8 days ago

Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities?

Paper • 2502.12215 • Published 13 days ago • 16

You Do Not Fully Utilize Transformer's Representation Capacity

Paper • 2502.09245 • Published 17 days ago • 33

Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity

Paper • 2502.13063 • Published 12 days ago • 63

REALTALK: A 21-Day Real-World Dataset for Long-Term Conversation

Paper • 2502.13270 • Published 12 days ago • 6

LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization

Paper • 2502.13922 • Published 11 days ago • 25

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published 13 days ago • 27

Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering

Paper • 2502.13962 • Published 11 days ago • 27

MoM: Linear Sequence Modeling with Mixture-of-Memories

Paper • 2502.13685 • Published 11 days ago • 31

liked 7 datasets 8 days ago

yuan-yang/ReWild

Preview • Updated Jun 26, 2024 • 77 • 2

GAIR/LIMR

Viewer • Updated 13 days ago • 1.39k • 303 • 20

PrimeIntellect/SYNTHETIC-1-SFT-Data

Viewer • Updated 10 days ago • 894k • 1.62k • 21

bethgelab/CuratedThoughts

Viewer • Updated 4 days ago • 222k • 873 • 33

PrimeIntellect/SYNTHETIC-1

Viewer • Updated 10 days ago • 1.99M • 5.23k • 33

AI-MO/NuminaMath-1.5

Viewer • Updated 20 days ago • 896k • 3.34k • 115

facebook/natural_reasoning

Viewer • Updated 9 days ago • 1.15M • 4.67k • 274

upvoted 2 papers 8 days ago

S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning

Paper • 2502.12853 • Published 12 days ago • 27

Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

Paper • 2502.14768 • Published 10 days ago • 43