t1u1's picture

t1u1

t1u1

·

AI & ML interests

None yet

Recent Activity

liked a model about 1 month ago

BlinkDL/rwkv7-g1

liked a model about 2 months ago

katanemo/Arch-Function-3B

liked a model 2 months ago

nomic-ai/nomic-embed-text-v2-moe

View all activity

Organizations

None yet

t1u1's activity

upvoted 2 papers 2 months ago

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7 • 137

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 150

upvoted a collection 2 months ago

AceCoder

13 items • Updated Feb 12 • 6

upvoted a paper 2 months ago

ACECODER: Acing Coder RL via Automated Test-Case Synthesis

Paper • 2502.01718 • Published Feb 3 • 28

upvoted 2 papers 3 months ago

Autonomy-of-Experts Models

Paper • 2501.13074 • Published Jan 22 • 45

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 275

upvoted 2 papers 8 months ago

ControlNeXt: Powerful and Efficient Control for Image and Video Generation

Paper • 2408.06070 • Published Aug 12, 2024 • 54

Transformer Explainer: Interactive Learning of Text-Generative Models

Paper • 2408.04619 • Published Aug 8, 2024 • 163

upvoted 2 papers 10 months ago

What If We Recaption Billions of Web Images with LLaMA-3?

Paper • 2406.08478 • Published Jun 12, 2024 • 42

FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation

Paper • 2406.08392 • Published Jun 12, 2024 • 21

upvoted a collection 12 months ago

Meta Llama 3

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 740

upvoted 8 papers about 1 year ago

FeatUp: A Model-Agnostic Framework for Features at Any Resolution

Paper • 2403.10516 • Published Mar 15, 2024 • 16

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Paper • 2401.10774 • Published Jan 19, 2024 • 56

Recurrent Drafter for Fast Speculative Decoding in Large Language Models

Paper • 2403.09919 • Published Mar 14, 2024 • 23

Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

Paper • 2403.09629 • Published Mar 14, 2024 • 78

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6, 2024 • 189

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Paper • 2403.03853 • Published Mar 6, 2024 • 65

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 613

EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

Paper • 2402.17485 • Published Feb 27, 2024 • 194