87 68 283

Lee Junbum PRO

beomi

https://junbuml.ee

AI & ML interests

AI/ML GDE. Advancing Low-Resource Language Open Access LLM

Recent Activity

liked a model 26 days ago

Qwen/QwQ-32B

upvoted a paper about 1 month ago

HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs

upvoted a paper about 1 month ago

Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers

View all activity

Organizations

beomi's activity

upvoted 2 papers about 1 month ago

HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs

Paper • 2503.02003 • Published Mar 3 • 46

Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers

Paper • 2503.00865 • Published Mar 2 • 62

upvoted 8 papers about 2 months ago

Soundwave: Less is More for Speech-Text Alignment in LLMs

Paper • 2502.12900 • Published Feb 18 • 84

Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation

Paper • 2502.08690 • Published Feb 12 • 43

The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

Paper • 2502.08946 • Published Feb 13 • 193

Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning

Paper • 2502.06533 • Published Feb 10 • 18

TransMLA: Multi-head Latent Attention Is All You Need

Paper • 2502.07864 • Published Feb 11 • 49

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published Feb 13 • 148

Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning

Paper • 2502.03275 • Published Feb 5 • 17

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 220

upvoted a paper 2 months ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 115

upvoted an article 2 months ago

Article

Welcome to Inference Providers on the Hub 🔥

Jan 28

• 458

upvoted 3 papers 2 months ago

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published Jan 30 • 60

Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate

Paper • 2501.17703 • Published Jan 29 • 58

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 374

upvoted 2 papers 3 months ago

Do generative video models learn physical principles from watching videos?

Paper • 2501.09038 • Published Jan 14 • 34

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 284

upvoted a paper 4 months ago

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 82

upvoted 2 collections 4 months ago

EXAONE-3.5

Collection

EXAONE 3.5 language model series including instruction-tuned models of 2.4B, 7.8B, and 32B • 10 items • Updated 20 days ago • 108

Llama 3.3 (All Versions)

Collection

Meta's new Llama 3.3 (70B) model in all formats. Includes GGUF, 4-bit bnb and original versions. • 3 items • Updated about 24 hours ago • 37