Fynn Kröger's picture

Fynn Kröger

fynnkroeger

·

AI & ML interests

None yet

Recent Activity

upvoted a paper 10 days ago

Defeating Prompt Injections by Design

reacted to cogwheelhead's post with 👍 about 1 month ago

Me and my team have performed an in-depth investigation comparing o1 to R1 (and other reasoning models) Link: https://toloka.ai/blog/r1-is-not-on-par-with-o1-and-the-difference-is-qualitative-not-quantitative It started with us evaluating them on our own university-math benchmarks: U-MATH for problem-solving and μ-MATH for judging solution correctness (see the HF leaderboard: https://huggingface.co/spaces/toloka/u-math-leaderboard) tl;dr: R1 sure is amazing, but what we find is that it lags behind in novelty adaptation and reliability: * performance drops when updating benchmarks with fresh unseen tasks (e.g. AIME 2024 -> 2025) * R1-o1 gap widens when evaluating niche subdomains (e.g. university-specific math instead of the more common Olympiad-style contests) * same with going into altogether unconventional domains (e.g. chess) or skills (e.g. judgment instead of problem-solving) * R1 also runs into failure modes way more often (e.g. making illegal chess moves or falling into endless generation loops) Our point here is not to bash on DeepSeek — they've done exceptional work, R1 is a game-changer, and we have no intention to downplay that. R1's release is a perfect opportunity to study where all these models differ and gain understanding on how to move forward from here

reacted to lysandre's post with ❤️ about 1 month ago

SmolVLM-2 and SigLIP-2 are now part of `transformers` in dedicated releases! They're added on top of the v4.49.0 release, and can be installed from the following tags: `v4.49.0-SmolVLM-2` and `v4.49.0-SigLIP-2`. This marks a new beginning for the release process of transformers. For the past five years, we've been doing monthly releases featuring many models (v4.49.0, the latest release, features 9 new architectures). Starting with SmolVLM-2 & SigLIP2, we'll now additionally release tags supporting new models on a stable branch. These models are therefore directly available for use by installing from the tag itself. These tags will continue to be updated with fixes applied to these models. Going forward, continue expecting software releases following semantic versioning: v4.50.0 will have ~10 new architectures compared to v4.49.0, as well as a myriad of new features, improvements and bug fixes. Accompanying these software releases, we'll release tags offering brand new models as fast as possible, to make them accessible to all immediately.

View all activity

Organizations

None yet

fynnkroeger's activity

upvoted a paper 10 days ago

Defeating Prompt Injections by Design

Paper • 2503.18813 • Published 11 days ago • 18

upvoted a paper about 1 month ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20 • 139

upvoted a paper 4 months ago

Multimodal Autoregressive Pre-training of Large Vision Encoders

Paper • 2411.14402 • Published Nov 21, 2024 • 46

upvoted 2 papers 5 months ago

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Paper • 2411.07126 • Published Nov 11, 2024 • 30

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published Oct 22, 2024 • 93

upvoted 3 papers 6 months ago

Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation

Paper • 2410.13848 • Published Oct 17, 2024 • 34

HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Paper • 2409.16191 • Published Sep 24, 2024 • 42

MaskBit: Embedding-free Image Generation via Bit Tokens

Paper • 2409.16211 • Published Sep 24, 2024 • 17

upvoted 5 papers 7 months ago

Kolmogorov-Arnold Transformer

Paper • 2409.10594 • Published Sep 16, 2024 • 45

OLMoE: Open Mixture-of-Experts Language Models

Paper • 2409.02060 • Published Sep 3, 2024 • 79

VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters

Paper • 2408.17253 • Published Aug 30, 2024 • 39

Law of Vision Representation in MLLMs

Paper • 2408.16357 • Published Aug 29, 2024 • 95

Scalable Autoregressive Image Generation with Mamba

Paper • 2408.12245 • Published Aug 22, 2024 • 26

upvoted 6 papers 8 months ago

Towards Conversational Diagnostic AI

Paper • 2401.05654 • Published Jan 11, 2024 • 19

MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning

Paper • 2408.11001 • Published Aug 20, 2024 • 12

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 60

JPEG-LM: LLMs as Image Generators with Canonical Codec Representations

Paper • 2408.08459 • Published Aug 15, 2024 • 45

POA: Pre-training Once for Models of All Sizes

Paper • 2408.01031 • Published Aug 2, 2024 • 28

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31, 2024 • 114

upvoted a paper 9 months ago

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

Paper • 2407.13623 • Published Jul 18, 2024 • 56