SeongWan Kim

idgmatrix

AI & ML interests

None yet

Organizations

None yet

idgmatrix's activity

upvoted 3 papers about 18 hours ago

QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models

Paper • 2502.12346 • Published Feb 17 • 1

Training LLMs with MXFP4

Paper • 2502.20586 • Published Feb 27 • 1

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients

Paper • 2407.08296 • Published Jul 11, 2024 • 34

upvoted 7 papers 1 day ago

When are 1.58 bits enough? A Bottom-up Exploration of BitNet Quantization

Paper • 2411.05882 • Published Nov 8, 2024 • 1

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 615

1-bit AI Infra: Part 1.1, Fast and Lossless BitNet b1.58 Inference on CPUs

Paper • 2410.16144 • Published Oct 21, 2024 • 5

Bitnet.cpp: Efficient Edge Inference for Ternary LLMs

Paper • 2502.11880 • Published Feb 17 • 1

BitNet a4.8: 4-bit Activations for 1-bit LLMs

Paper • 2411.04965 • Published Nov 7, 2024 • 69

BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 101

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published 4 days ago • 87

upvoted a paper 3 days ago

FANformer: Improving Large Language Models Through Effective Periodicity Modeling

Paper • 2502.21309 • Published Feb 28 • 1

upvoted a paper 5 days ago

Cobra: Efficient Line Art COlorization with BRoAder References

Paper • 2504.12240 • Published 6 days ago • 26

upvoted a paper 6 days ago

BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published 6 days ago • 63

upvoted 3 papers 9 days ago

PixelFlow: Pixel-Space Generative Models with Flow

Paper • 2504.07963 • Published 12 days ago • 19

MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft

Paper • 2504.08388 • Published 12 days ago • 39

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Paper • 2504.08685 • Published 11 days ago • 120

upvoted a paper 12 days ago

Kimi-VL Technical Report

Paper • 2504.07491 • Published 13 days ago • 118

upvoted 2 papers 14 days ago

OmniSVG: A Unified Scalable Vector Graphics Generation Model

Paper • 2504.06263 • Published 14 days ago • 148

One-Minute Video Generation with Test-Time Training

Paper • 2504.05298 • Published 15 days ago • 96

upvoted a paper 16 days ago

Efficient Model Selection for Time Series Forecasting via LLMs

Paper • 2504.02119 • Published 20 days ago • 16