LLM as a Broken Telephone: Iterative Generation Distorts Information Paper • 2502.20258 • Published Feb 2025
HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization Paper • 2503.04598 • Published Mar 2025
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs Paper • 2501.18585 • Published Jan 30, 2025
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28, 2025