Hugging Face
martin
PRO
martintomov
23 followers · 9 following
AI & ML interests
None yet
Recent Activity
liked a model 4 days ago
arnaudstiegler/sd-model-gameNgen-60ksteps
reacted to merve's post with 🚀 8 days ago
Apollo is a new family of open-source video language models by Meta, where the 3B model outperforms most 7B models and the 7B model outperforms most 30B models 🧶

✨ The models come in 1.5B https://huggingface.co./Apollo-LMMs/Apollo-1_5B-t32, 3B https://huggingface.co./Apollo-LMMs/Apollo-3B-t32 and 7B https://huggingface.co./Apollo-LMMs/Apollo-7B-t32 with an Apache 2.0 license, based on Qwen1.5 & Qwen2
✨ The authors also release a benchmark dataset https://huggingface.co./spaces/Apollo-LMMs/ApolloBench

The paper has a lot of experiments (they trained 84 models!) about what makes video LMs work ⏯️

Try the demo for the best setup here https://huggingface.co./spaces/Apollo-LMMs/Apollo-3B

They evaluate sampling strategies, scaling laws for models and datasets, video representation and more!
> The authors find that design decisions applied to small models also scale properly when the model and dataset are scaled 📈 Scaling the dataset has diminishing returns for smaller models
> They evaluate frame sampling strategies and find that FPS sampling is better than uniform sampling, with 8-32 tokens per frame being optimal
> They also compare image encoders, trying a variety of models from shape-optimized SigLIP to DINOv2, and find https://huggingface.co./google/siglip-so400m-patch14-384 to be the most powerful 🔥
> They also compare freezing different parts of the models; training all stages with some parts frozen gives the best results

They eventually release three models, where Apollo-3B outperforms most 7B models and Apollo-7B outperforms most 30B models 🔥
liked a model 8 days ago
FastVideo/FastMochi
Organizations
martintomov's activity
liked a model 4 days ago
arnaudstiegler/sd-model-gameNgen-60ksteps • Text-to-Image • Updated 12 days ago • 337 • 1
liked 3 models 8 days ago
FastVideo/FastMochi • Updated 15 days ago • 6
FastVideo/FastMochi-diffusers • Text-to-Video • Updated 9 days ago • 134 • 12
FastVideo/FastHunyuan • Text-to-Video • Updated 8 days ago • 368 • 113
liked a Space 12 days ago
Running on Zero • 338 • 👈🖼️👉 Flux Fill Outpainting
liked a model 12 days ago
hkchengrex/MMAudio • Updated 1 day ago • 28
liked a dataset 16 days ago
pandaphd/camera_settings • Viewer • Updated 21 days ago • 3.5k • 196 • 7
liked a model 17 days ago
JeffreyXiang/TRELLIS-image-large • Image-to-3D • Updated 19 days ago • 382k • 247
liked a Space 17 days ago
Running on Zero • 2.06k • 🏢 TRELLIS
Scalable and Versatile 3D Generation from images
liked a model 21 days ago
briaai/BRIA-2.3-ControlNet-Generative-Fill • Text-to-Image • Updated 25 days ago • 67 • 24
liked a Space 21 days ago
Running on Zero • 562 • 🚀 Flux Style Shaping
Optical illusions and style transfer with FLUX
liked a model 22 days ago
Kijai/HunyuanVideo_comfy • Updated 8 days ago • 176
liked 2 models 23 days ago
tencent/HunyuanVideo • Text-to-Video • Updated 8 days ago • 7.07k • 1.27k
showlab/ShowUI-2B • Updated 21 days ago • 14.7k • 210
liked a Space 23 days ago
Running on Zero • 189 • 💻 ShowUI
liked 2 models 25 days ago
Comfy-Org/mochi_preview_repackaged • Updated Nov 5 • 47
Qwen/QwQ-32B-Preview • Text Generation • Updated 27 days ago • 125k • 1.42k
liked a Space 28 days ago
Running on Zero • 373 • 📈 IC Light V2-Vary
liked 2 models 29 days ago
Yuanshi/OminiControl • Image-to-Image • Updated 16 days ago • 11.6k • 100
nvidia/Hymba-1.5B-Instruct • Text Generation • Updated 6 days ago • 14k • 213