Pavel Iakubovskii

qubvel-hf

AI & ML interests

Computer Vision models

Recent Activity

Organizations

Hugging Face, PyTorch Image Models, Peking University, Hugging Face Internal Testing Organization, Huggingface Projects, Hugging Face OSS Metrics, Hugging Face for Computer Vision, kotol, yorg, CVPR2024, Hugging Face Discord Community, nltpt, s0409, Segmentation Models Pytorch, smp-test, University of Sydney, s0225, ETH Zurich - Computer Vision and Geometry Lab

qubvel-hf's activity

upvoted an article 8 days ago
Article

SmolVLM2: Bringing Video Understanding to Every Device

• 205
reacted to clem's post with 🔥 8 days ago
Post
5872
Super happy to welcome Nvidia as our latest enterprise hub customer. They have almost 2,000 team members using Hugging Face, and close to 20,000 followers of their org. Can't wait to see what they'll open-source for all of us in the coming months!

Nvidia's org: https://huggingface.co/nvidia
Enterprise hub: https://huggingface.co/enterprise
New activity in facebook/sam-vit-large 11 days ago

Update code snippet

#6 opened 11 days ago by
qubvel-hf
New activity in facebook/sam-vit-huge 11 days ago

Update code snippet

#11 opened 11 days ago by
qubvel-hf
New activity in facebook/sam-vit-base 11 days ago

Update code snippet

#8 opened 11 days ago by
qubvel-hf
upvoted an article 13 days ago

btw, also observed "." and capitalized template influences the confidence quite a bit

Not sure what's up as I'm not familiar with this codebase (and no time to dig in), but for siglip what you're supposed to do is do sigmoid(zimg @ ztxt * temperature + bias)

from what you describe, I would bet the bias and/or temperature are missing?
The ground-truth reference code is https://colab.research.google.com/github/google-research/big_vision/blob/main/big_vision/configs/proj/image_text/SigLIP2_demo.ipynb

Hey @giffmana , temperature and bias are applied under the hood, see

Siglip
https://github.com/huggingface/transformers/blob/17792556b21b4da0dbb9e4b59b39fb34aae4047c/src/transformers/models/siglip/modeling_siglip.py#L1411-L1417

Siglip2
https://github.com/huggingface/transformers/blob/17792556b21b4da0dbb9e4b59b39fb34aae4047c/src/transformers/models/siglip2/modeling_siglip2.py#L1459-L1465
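
For readers following the thread, the scoring rule quoted above, sigmoid(zimg @ ztxt * temperature + bias), can be sketched for a single image-text pair in plain Python. This is only an illustration of the formula, not the transformers or big_vision code: the function names are hypothetical, the embeddings are assumed to be L2-normalized before the dot product, and the temperature and bias values in the usage lines are made up for the example.

```python
import math


def l2_normalize(v):
    # Scale a vector to unit length (assumes it is not all zeros).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def siglip_probability(zimg, ztxt, temperature, bias):
    # Hypothetical helper: sigmoid(zimg @ ztxt * temperature + bias)
    # for one image embedding and one text embedding.
    zimg = l2_normalize(zimg)
    ztxt = l2_normalize(ztxt)
    similarity = sum(a * b for a, b in zip(zimg, ztxt))  # cosine similarity
    logit = similarity * temperature + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Illustrative values only, not taken from any checkpoint.
matching = siglip_probability([1.0, 0.0], [1.0, 0.0], temperature=10.0, bias=-10.0)
unrelated = siglip_probability([1.0, 0.0], [0.0, 1.0], temperature=10.0, bias=-10.0)

# Same pairs with temperature and bias effectively left out.
matching_raw = siglip_probability([1.0, 0.0], [1.0, 0.0], temperature=1.0, bias=0.0)
unrelated_raw = siglip_probability([1.0, 0.0], [0.0, 1.0], temperature=1.0, bias=0.0)
```

With temperature and bias left out, the cosine similarity stays in [-1, 1], so the sigmoid output is squeezed into roughly 0.27 to 0.73 and unrelated pairs still score near 0.5. That is the kind of symptom the reply attributes to a missing bias and/or temperature.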