Hector Henriquez PRO

SeedOfEvil

AI & ML interests

Real-world AI Applications & Integrations, Generative AI & Large Language Models, AI in Gaming & Interactive Experiences.

Organizations

None yet

SeedOfEvil's activity

reacted to clefourrier's post 4 days ago

Gemma3 family is out! Reading the tech report, and this section was really interesting to me from a methods/scientific fairness pov.

Instead of doing over-hyped comparisons, they clearly state that **results are reported in a setup which is advantageous to their models**.
(Which everybody does, but people usually don't say so.)

For a tech report, it makes a lot of sense to report model performance when used optimally!
On leaderboards, on the other hand, comparisons are apples to apples, but possibly in a setup that is suboptimal for a given model family (just as some users interact suboptimally with models).

It also contains a cool section (6) on training data memorization rate! It's important to see whether your model will output training data it has seen verbatim: always an issue for privacy/copyright/... but also very much for evaluation!

Because if your model knows its evals by heart, you're not testing for generalization.
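
To make the memorization point concrete, here is a minimal sketch of how one could estimate a verbatim memorization rate. It is not the method from the Gemma3 report. It assumes you already have model continuations and the true continuations for the same prompts, tokenizes by whitespace, and counts a sample as memorized when the model reproduces at least `min_run` consecutive tokens of the reference. The function names and the threshold are hypothetical choices for illustration.

```python
def longest_common_token_run(a, b):
    """Length of the longest contiguous run of tokens shared by lists a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous token pair.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def memorization_rate(generations, references, min_run=5):
    """Fraction of samples whose generation repeats >= min_run reference tokens in a row.

    Assumption: whitespace tokenization and a fixed run-length threshold;
    a real study would use the model's tokenizer and a justified threshold.
    """
    if len(generations) != len(references):
        raise ValueError("generations and references must have the same length")
    if not generations:
        return 0.0
    memorized = 0
    for gen, ref in zip(generations, references):
        if longest_common_token_run(gen.split(), ref.split()) >= min_run:
            memorized += 1
    return memorized / len(generations)


if __name__ == "__main__":
    gens = [
        "the quick brown fox jumps over the lazy dog",
        "something entirely different here",
    ]
    refs = [
        "the quick brown fox jumps over the lazy dog",
        "a held out reference continuation text",
    ]
    print(memorization_rate(gens, refs, min_run=5))
```

The same check applied to benchmark questions and answers instead of training text gives a rough contamination signal: a high rate suggests the eval is measuring recall rather than generalization.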