- Slamming: Training a Speech Language Model on One GPU in a Day
  Paper • 2502.15814 • Published • 68
- Small Models Struggle to Learn from Strong Reasoners
  Paper • 2502.12143 • Published • 33
- HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading
  Paper • 2502.12574 • Published • 11
- Large Language Diffusion Models
  Paper • 2502.09992 • Published • 109
Shiwon Jeong
sebastianrcnt
AI & ML interests
None yet
Recent Activity
updated a collection about 1 month ago: interesting
liked a model about 1 month ago: GSAI-ML/LLaDA-8B-Instruct
Organizations
None yet
Collections
1
models
None public yet
datasets
None public yet