Joe

Solaren

AI & ML interests

None yet

Recent Activity

liked a model 7 days ago

ValueFX9507/Tifa-DeepsexV2-7b-MGRPO-GGUF-Q4

upvoted a paper 8 days ago

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

liked a model 8 days ago

zed-industries/zeta

Organizations

None yet

Solaren's activity

upvoted a paper 8 days ago

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published 9 days ago • 139

upvoted a paper 16 days ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published about 1 month ago • 327

upvoted a paper 4 months ago

MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code

Paper • 2410.08196 • Published Oct 10, 2024 • 46

upvoted a paper 5 months ago

LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning

Paper • 2410.02884 • Published Oct 3, 2024 • 54

upvoted a paper 8 months ago

A Simple and Effective L_2 Norm-Based Strategy for KV Cache Compression

Paper • 2406.11430 • Published Jun 17, 2024 • 23

upvoted 2 papers 12 months ago

OneBit: Towards Extremely Low-bit Large Language Models

Paper • 2402.11295 • Published Feb 17, 2024 • 24

FuseChat: Knowledge Fusion of Chat Models

Paper • 2402.16107 • Published Feb 25, 2024 • 38

upvoted 3 papers about 1 year ago

OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 83

Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs

Paper • 2401.11708 • Published Jan 22, 2024 • 30

PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models

Paper • 2401.05252 • Published Jan 10, 2024 • 48

upvoted 4 papers over 1 year ago

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Paper • 2310.00426 • Published Sep 30, 2023 • 60

HuggingFace's Transformers: State-of-the-art Natural Language Processing

Paper • 1910.03771 • Published Oct 9, 2019 • 16

Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion

Paper • 2310.03502 • Published Oct 5, 2023 • 78

FreeU: Free Lunch in Diffusion U-Net

Paper • 2309.11497 • Published Sep 20, 2023 • 65