Sergio Paniego PRO
AI & ML interests
Recent Activity
Organizations
Posts 97
day-0 in transformers + vllm + sglang, mit license 🤗
on the post-training side: critic-based ppo for variable-length agentic rollouts (ppo is back!) + an online anti-reward-hacking module that feeds the agent dummy info when it tries to cheat
Articles 22
I fine-tuned a model for free from one prompt, with TRL and the Google Colab CLI
- Runtime error • RL
CARLA Environment Server
🚗 Control a CARLA driving simulation with custom actions
- Runtime error • RL
CARLA Environment Server
🚗 Control a CARLA driving simulator with custom actions
- Sleeping • Agents
Carla Grpo Trolley
🚀 Visualize your program’s I/O activity in real time
- sergiopaniego/Qwen3-0.6B-carla-trolley-escape
0.8B • Updated • 10
- Running • 3.89k
The Ultra-Scale Playbook
🌌 The ultimate guide to training LLM on large GPU Clusters
- Running on CPU Upgrade • Featured • 3.21k
The Smol Training Playbook
📚 The secrets to building world-class LLMs
- Running • 330
Evaluation Guidebook
📝 Explore LLM benchmark scores over time
- Running • 225
FineVision: Open Data is All You Need
📝 A new open-source dataset for training VLMs
Spaces 144
VLM Object Understanding
Explore object detection, visual grounding, keypoint Detecti
Qwen2-VL-7B
Ask questions about charts in images
SmolVLM-trl-dpo-rlaif-v
Generate text from an image and question
SmolVLM-trl-sft-ChartQA
Ask questions about charts in images
Trl Text To Sql Trackio
Show a live I/O tracking dashboard
Qwen Sql Demo
Display real-time I/O tracking dashboard