What is the maximum size of the image that it will support?
1
#67 opened 5 days ago
by
eddprogrammer
Concern about performance with multi-image vs. single-image inference
#66 opened 8 days ago
by
wiccanmind
GGUF Quants
#65 opened 9 days ago
by
apepkuss79
GUI on windows for using this VL version?
#64 opened 13 days ago
by
sebastienbo
How can I make it perform object detection and output the coordinates of objects in the image?
1
#63 opened 15 days ago
by
Wenrui3
Is it possible to only input text in Qwen/Qwen2-VL-7B-Instruct model?
1
#61 opened 21 days ago
by
ai-bond
Inference api
#60 opened 23 days ago
by
philip10
RuntimeError: cu_seqlens_q must have dtype int32
1
#59 opened 28 days ago
by
ginnyyk
Update README.md
#58 opened 29 days ago
by
mobi
testing for inference endpoints
3
#57 opened about 1 month ago
by
nbroad
transformers requirement
#53 opened about 1 month ago
by
nbroad
How to obtain vision-layer embeddings using the VL model
1
#52 opened about 1 month ago
by
weiminw
Updated README for GPU configuration.
#51 opened about 1 month ago
by
aliasgerovs
Can anyone share a prompt input that shows the exact image size?
#50 opened about 1 month ago
by
xJohn
Stable transformer version
1
#49 opened about 1 month ago
by
Jkppp
Is visual grounding possible on multiple images?
1
#48 opened about 2 months ago
by
echooooooooo
How many tokens is one image?
1
#47 opened 2 months ago
by
MoritzLaurer
RuntimeError: CUDA error: operation not permitted when stream is capturing
1
#46 opened 2 months ago
by
yuyanggo
Adding Evaluation Results
#45 opened 2 months ago
by
leaderboard-pr-bot
CUDA error: CUBLAS_STATUS_EXECUTION_FAILED
#44 opened 2 months ago
by
yuyanggo
KeyError: 'qwen2_vl' loading from Transformers
1
#42 opened 2 months ago
by
KevalRx
Batch inference on many images
1
#41 opened 3 months ago
by
yadavsaakash
Handling multiple images in a pdf to preserve context during processing.
1
#40 opened 3 months ago
by
ananthv
Questions about Naive Dynamic Resolution and the vision mask
1
#39 opened 3 months ago
by
YaYaGeGe
it run on cpu
#38 opened 3 months ago
by
sdyy
Request for Help: Passing an Image in cURL with vLLM
2
#36 opened 3 months ago
by
ananthv
Ollama api setup for Qwen2
3
#35 opened 3 months ago
by
RagulMahendran
Neto discussion
#34 opened 3 months ago
by
Neto1780
An error occurred: shape mismatch
4
#33 opened 4 months ago
by
VeeP
Finetuning script using HuggingFace (No llama-factory)
19
#32 opened 4 months ago
by
2U1
Able to successfully deploy as Inference Endpoint?
#31 opened 4 months ago
by
philglazer
GGUF models
1
#30 opened 4 months ago
by
mariahelenass
Can it be used for multimodal retrieval?
#29 opened 4 months ago
by
Lecheal
OCR on image
4
#28 opened 4 months ago
by
glitchyordis
Update chat_template.json to incorporate `generation` tag
1
#27 opened 4 months ago
by
linyueqian
Request: DOI
#26 opened 4 months ago
by
samzong
Value of fps for video inference
3
#25 opened 4 months ago
by
shivanis14
support in ollama
3
#21 opened 4 months ago
by
Goekdeniz-Guelmez
When I use torch.float16, I get this error: probability tensor contains either `inf`, `nan` or element < 0
2
#20 opened 4 months ago
by
als-991011
Can it be run on a 3090 with 24gb VRAM?
2
#18 opened 4 months ago
by
mnemic
Nerfed with people
2
#17 opened 4 months ago
by
spawn99
ValueError: Unrecognized configuration class <class 'transformers.models.qwen2_vl.configuration_qwen2_vl.Qwen2VLConfig'> for this kind of AutoModel: AutoModelForSeq2SeqLM.
1
#16 opened 4 months ago
by
vinz1396
Arabic
#15 opened 4 months ago
by
MubashshirMohammad
When extracting text from an image, some text is missing.
1
#14 opened 4 months ago
by
wol2001
Support for multi-round question answering in Qwen2-VL-7B-Instruct
#12 opened 4 months ago
by
zhanchao019
Working sample for mac
13
#11 opened 4 months ago
by
spawn99
RuntimeError: MPS backend out of memory.
1
#8 opened 4 months ago
by
TahaZk
LoRA Finetuning Tool for Qwen2-VL-7B in Web UI (DPO updated)
12
#2 opened 4 months ago
by
hiyouga
🍭 Fine-tuning support for Qwen2-VL-7B-Instruct
5
#1 opened 4 months ago
by
study-hjt