Pavel Iakubovskii

qubvel-hf

AI & ML interests

Computer Vision models

qubvel-hf's activity

liked a model 5 days ago

xingyang1/Distill-Any-Depth

Depth Estimation • Updated 4 days ago • 25

upvoted a paper 6 days ago

Unified Video Action Model

Paper • 2503.00200 • Published 11 days ago • 12

upvoted an article 6 days ago

SmolVLM2: Bringing Video Understanding to Every Device

Article • 20 days ago • 202

reacted to clem's post with 🔥 6 days ago

Post

5847

Super happy to welcome Nvidia as our latest enterprise hub customer. They have almost 2,000 team members using Hugging Face, and close to 20,000 followers of their org. Can't wait to see what they'll open-source for all of us in the coming months!

Nvidia's org: https://huggingface.co/nvidia
Enterprise hub: https://huggingface.co/enterprise

liked a Space 6 days ago

120

Distill Any Depth

💻

Generate depth maps from your images

liked a Space 7 days ago

112

Pop2Piano Demo

🎹

Convert pop audio to piano cover

updated a model 8 days ago

facebook/sam-vit-large

Mask Generation • Updated Jan 11, 2024 • 196k • 28

New activity in facebook/sam-vit-large 8 days ago

Update code snippet

#6 opened 8 days ago by qubvel-hf

updated a model 8 days ago

facebook/sam-vit-huge

Mask Generation • Updated Jan 11, 2024 • 179k • 152

New activity in facebook/sam-vit-huge 8 days ago

Update code snippet

#11 opened 8 days ago by qubvel-hf

updated a model 8 days ago

facebook/sam-vit-base

Mask Generation • Updated Jan 11, 2024 • 1.09M • 131

New activity in facebook/sam-vit-base 8 days ago

Update code snippet

#8 opened 8 days ago by qubvel-hf

upvoted an article 10 days ago

Common AI Model Formats

Article • 12 days ago • 27

upvoted a paper 11 days ago

MegaLoc: One Retrieval to Place Them All

Paper • 2502.17237 • Published 15 days ago • 1

New activity in google/siglip2-base-patch16-224 11 days ago

SigLip2 Does Not Reproduce Expected Results

#7 opened 14 days ago by dogukan-bg

commented on SigLIP 2: A better multilingual vision language encoder 13 days ago

btw, I also observed that "." and a capitalized template influence the confidence quite a bit

commented on SigLIP 2: A better multilingual vision language encoder 13 days ago

Not sure what's up, as I'm not familiar with this codebase (and have no time to dig in), but for SigLIP what you're supposed to do is compute sigmoid(zimg @ ztxt * temperature + bias).

From what you describe, I would bet the bias and/or temperature are missing?
The ground-truth reference code is https://colab.research.google.com/github/google-research/big_vision/blob/main/big_vision/configs/proj/image_text/SigLIP2_demo.ipynb

Hey @giffmana, temperature and bias are applied under the hood, see:

Siglip
https://github.com/huggingface/transformers/blob/17792556b21b4da0dbb9e4b59b39fb34aae4047c/src/transformers/models/siglip/modeling_siglip.py#L1411-L1417

Siglip2
https://github.com/huggingface/transformers/blob/17792556b21b4da0dbb9e4b59b39fb34aae4047c/src/transformers/models/siglip2/modeling_siglip2.py#L1459-L1465
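
For readers following the thread, here is a minimal sketch of the scoring rule quoted above, sigmoid(zimg @ ztxt * temperature + bias), written in plain Python. It is not the code from either linked file. The function names, the toy embeddings, and the temperature and bias values are all made up for illustration; real checkpoints store their own learned values.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def siglip_probs(image_embeds, text_embeds, temperature, bias):
    """Return a matrix of sigmoid(zimg @ ztxt * temperature + bias) scores.

    Rows index images, columns index texts. Each score is an independent
    probability, so a row does not have to sum to 1 (unlike a softmax).
    """
    zimg = [l2_normalize(v) for v in image_embeds]
    ztxt = [l2_normalize(v) for v in text_embeds]
    probs = []
    for img in zimg:
        row = []
        for txt in ztxt:
            logit = sum(a * b for a, b in zip(img, txt)) * temperature + bias
            row.append(1.0 / (1.0 + math.exp(-logit)))
        probs.append(row)
    return probs


# Hypothetical toy inputs: image 0 matches text 0, image 1 matches text 1.
image_embeds = [[1.0, 0.0], [0.0, 2.0]]
text_embeds = [[3.0, 0.0], [0.0, 1.0]]

# Hypothetical temperature and bias, chosen only to make the effect visible.
probs = siglip_probs(image_embeds, text_embeds, temperature=10.0, bias=-5.0)

# The same inputs with temperature and bias effectively switched off.
unscaled = siglip_probs(image_embeds, text_embeds, temperature=1.0, bias=0.0)
```

With these toy values, the matching pairs score close to 1 and the mismatched pairs close to 0. With temperature 1 and bias 0, the mismatched pairs sit at exactly 0.5 and the matching pairs only reach about 0.73, which is the kind of flattened confidence one would expect if those two terms were left out.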

liked a Space 13 days ago

Phi4 Multimodal

🦀

Space demoing Phi4 MultiModal

New activity in google/siglip2-base-patch16-224 14 days ago

Error while loading processor: TypeError: expected str, bytes or os.PathLike object, not NoneType

#2 opened 19 days ago by armamut

question about 'model_type' in config.json

#5 opened 15 days ago by XA-hyy