Nemotron-Personas: Improve AI Training With the First Synthetic Personas Dataset Aligned to Real-World Distributions By nvidia and 1 other • 5 days ago • 11
Saying Thank You to a LLM Isn't Free — Measuring the Energy Cost of Politeness By jdelavande and 2 others • 4 days ago • 8
DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 152
Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H By Hcompany and 1 other • 12 days ago • 65