TinyLLaVA: A Framework of Small-scale Large Multimodal Models
Baichuan Zhou
bczhou
AI & ML interests: Computer Vision
Recent Activity
Upvoted a paper 4 days ago: ABC: Achieving Better Control of Multimodal Embeddings using VLMs
New activity 17 days ago: perplexity-ai/r1-1776: Masking Cultural Imperialism as 'Debiasing'
Upvoted a paper about 1 month ago: Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment
Collections: 1
Spaces: 1
Models: 8
bczhou/tiny-llava-v1-hf • Image-Text-to-Text • Updated • 18k • 56
bczhou/TinyLLaVA-2.0B • Image-Text-to-Text • Updated • 391 • 5
bczhou/TinyLLaVA-1.5B • Image-Text-to-Text • Updated • 222 • 16
bczhou/TinyLLaVA-3.1B-Pretrain • Text Generation • Updated • 18
bczhou/TinyLLaVA-3.1B • Text Generation • Updated • 264 • 25
bczhou/TinyLLaVA-2.0B-SigLIP • Updated • 502 • 1
bczhou/TinyLLaVA-1.5B-SigLIP • Updated • 152 • 1
bczhou/TinyLLaVA-3.1B-SigLIP • Updated • 234 • 4
Datasets: 7
bczhou/UrBench • Updated • 515 • 3
bczhou/LOKI • Preview • Updated • 73
bczhou/CityBench-SubTasks • Viewer • Updated • 12.8k • 19
bczhou/SyntheticBench-Videos • Viewer • Updated • 264 • 14
bczhou/CityBench-v0.3 • Viewer • Updated • 9.71k • 13
bczhou/CityBench-v0.2 • Viewer • Updated • 9.71k • 15
bczhou/CityVQA-v0.2 • Viewer • Updated • 2.5k • 17 • 1