AI & ML interests
LLM post-training — preference learning (DPO), reinforcement learning (GRPO), and instruction tuning. Interested in adaptive hyperparameter strategies, reward shaping, and model architecture analysis.