Pavel Iakubovskii

qubvel-hf

AI & ML interests

Computer Vision models

qubvel-hf's activity

liked a model 8 days ago

xingyang1/Distill-Any-Depth

Depth Estimation • Updated 1 day ago • 28

upvoted a paper 8 days ago

Unified Video Action Model

Paper • 2503.00200 • Published 14 days ago • 12

upvoted an article 8 days ago

SmolVLM2: Bringing Video Understanding to Every Device

Article • 23 days ago • 205

reacted to clem's post with 🔥 8 days ago

Post

5872

Super happy to welcome Nvidia as our latest enterprise hub customer. They have almost 2,000 team members using Hugging Face, and close to 20,000 followers of their org. Can't wait to see what they'll open-source for all of us in the coming months!

Nvidia's org: https://huggingface.co/nvidia
Enterprise hub: https://huggingface.co/enterprise

liked 2 Spaces 9 days ago

127

Distill Any Depth

💻

Generate depth maps from your images

113

Pop2Piano Demo

🎹

Convert pop audio to piano cover

updated a model 11 days ago

facebook/sam-vit-large

Mask Generation • Updated Jan 11, 2024 • 198k • 28

New activity in facebook/sam-vit-large 11 days ago

Update code snippet

#6 opened 11 days ago by

qubvel-hf

updated a model 11 days ago

facebook/sam-vit-huge

Mask Generation • Updated Jan 11, 2024 • 184k • 152

New activity in facebook/sam-vit-huge 11 days ago

Update code snippet

#11 opened 11 days ago by

qubvel-hf

updated a model 11 days ago

facebook/sam-vit-base

Mask Generation • Updated Jan 11, 2024 • 1.08M • 132

New activity in facebook/sam-vit-base 11 days ago

Update code snippet

#8 opened 11 days ago by

qubvel-hf

upvoted an article 13 days ago

Common AI Model Formats

Article • 15 days ago • 30

upvoted a paper 13 days ago

MegaLoc: One Retrieval to Place Them All

Paper • 2502.17237 • Published 18 days ago • 1

New activity in google/siglip2-base-patch16-224 14 days ago

SigLip2 Does Not Reproduce Expected Results

#7 opened 17 days ago by

dogukan-bg

commented on SigLIP 2: A better multilingual vision language encoder 15 days ago

btw, also observed "." and capitalized template influences the confidence quite a bit

commented on SigLIP 2: A better multilingual vision language encoder 15 days ago

Not sure what's up as I'm not familiar with this codebase (and no time to dig in), but for siglip what you're supposed to do is do sigmoid(zimg @ ztxt * temperature + bias)

from what you describe, I would bet the bias and/or temperature are missing?
The ground-truth reference code is https://colab.research.google.com/github/google-research/big_vision/blob/main/big_vision/configs/proj/image_text/SigLIP2_demo.ipynb

Hey @giffmana , temperature and bias are applied under the hood, see

Siglip
https://github.com/huggingface/transformers/blob/17792556b21b4da0dbb9e4b59b39fb34aae4047c/src/transformers/models/siglip/modeling_siglip.py#L1411-L1417

Siglip2
https://github.com/huggingface/transformers/blob/17792556b21b4da0dbb9e4b59b39fb34aae4047c/src/transformers/models/siglip2/modeling_siglip2.py#L1459-L1465
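
For readers following the thread, the scoring rule quoted above, sigmoid(zimg @ ztxt * temperature + bias), can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names, the toy embeddings, and the temperature and bias values are made up here, and the L2-normalization step is an assumption not spelled out in the comment. It is not the transformers or big_vision implementation.

```python
import math

def l2_normalize(v):
    # Assumption: embeddings are L2-normalized before scoring.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def siglip_style_probs(image_embs, text_embs, temperature, bias):
    """Return a matrix of sigmoid(z_img . z_txt * temperature + bias).

    Rows index images, columns index texts. Each pair is scored
    independently, so a row does not need to sum to 1.
    """
    image_embs = [l2_normalize(v) for v in image_embs]
    text_embs = [l2_normalize(v) for v in text_embs]
    probs = []
    for z_img in image_embs:
        row = []
        for z_txt in text_embs:
            dot = sum(a * b for a, b in zip(z_img, z_txt))
            logit = dot * temperature + bias
            row.append(1.0 / (1.0 + math.exp(-logit)))
        probs.append(row)
    return probs

# Toy inputs (hypothetical): one image, one matching and one unrelated text.
image_embs = [[1.0, 0.0]]
text_embs = [[1.0, 0.0], [0.0, 1.0]]

# With an illustrative temperature and bias, the two scores separate clearly.
with_scale = siglip_style_probs(image_embs, text_embs, temperature=10.0, bias=-5.0)

# Leaving them out (temperature=1, bias=0) squeezes every score into a narrow
# band around 0.5, which is the kind of symptom the comment speculates about.
without_scale = siglip_style_probs(image_embs, text_embs, temperature=1.0, bias=0.0)
```

In this toy setup the unrelated pair scores exactly 0.5 when temperature and bias are dropped, because the dot product of orthogonal unit vectors is zero. The values used in real checkpoints are learned parameters and will differ from the ones picked here.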

liked a Space 16 days ago

Phi4 Multimodal

🦀

Space demoing Phi4 MultiModal

New activity in google/siglip2-base-patch16-224 17 days ago

Error while loading processor: TypeError: expected str, bytes or os.PathLike object, not NoneType

#2 opened 21 days ago by

armamut

question about 'model_type' in config.json

#5 opened 17 days ago by

XA-hyy