Quentin Tardif

ntnq

AI & ML interests

None yet

Recent Activity

upvoted a paper 6 days ago

AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation

upvoted a paper 6 days ago

Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback

liked a model 10 days ago

Qwen/Qwen2.5-Omni-7B

ntnq's activity

upvoted 2 papers 6 days ago

AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation

Paper • 2503.19693 • Published 11 days ago • 70

Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback

Paper • 2503.22230 • Published 9 days ago • 43

upvoted a collection 13 days ago

Llama Nemotron

Open, Production-ready Enterprise Models • 3 items • Updated 2 days ago • 27

upvoted a paper 13 days ago

Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't

Paper • 2503.16219 • Published 16 days ago • 46

upvoted a collection 26 days ago

EuroBERT

Scaling Multilingual Encoders for European Languages • 4 items • Updated 27 days ago • 10

upvoted a collection 30 days ago

QwQ

Qwen with Questions • 6 items • Updated about 1 month ago • 90

upvoted 2 papers about 1 month ago

Predictive Data Selection: The Data That Predicts Is the Data That Teaches

Paper • 2503.00808 • Published Mar 2 • 56

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Paper • 2503.01743 • Published Mar 3 • 82

upvoted a collection about 2 months ago

The Ultimate Collection of Code Classifiers

🔥 15 classifiers, 124M parameters, one per programming language, for assessing the educational value of GitHub code • 15 items • Updated Feb 20 • 11

upvoted a paper about 2 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 218

upvoted an article about 2 months ago

Open-source DeepResearch – Freeing our search agents

Article • Published Feb 4 • 1.21k

upvoted 3 papers 2 months ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 114

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 118

Optimizing Large Language Model Training Using FP4 Quantization

Paper • 2501.17116 • Published Jan 28 • 36

upvoted an article 2 months ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • Published Jan 28 • 835