
martin PRO

martintomov

AI & ML interests

None yet

Recent Activity

liked a model 4 days ago

arnaudstiegler/sd-model-gameNgen-60ksteps

reacted to merve's post with 🚀 8 days ago

Apollo is a new family of open-source video language models by Meta, where the 3B model outperforms most 7B models and the 7B outperforms most 30B models 🧶

✨ The models come in 1.5B https://huggingface.co./Apollo-LMMs/Apollo-1_5B-t32, 3B https://huggingface.co./Apollo-LMMs/Apollo-3B-t32 and 7B https://huggingface.co./Apollo-LMMs/Apollo-7B-t32 with an Apache 2.0 license, based on Qwen1.5 & Qwen2
✨ The authors also release a benchmark dataset https://huggingface.co./spaces/Apollo-LMMs/ApolloBench

The paper has a lot of experiments (they trained 84 models!) about what makes video LMs work ⏯️
Try the demo for the best setup here https://huggingface.co./spaces/Apollo-LMMs/Apollo-3B
They evaluate sampling strategies, scaling laws for models and datasets, video representation and more!

> The authors find that whatever design decision was applied to small models also scales properly when the model and dataset are scaled 📈 Scaling the dataset has diminishing returns for smaller models
> They evaluate frame sampling strategies and find that FPS sampling is better than uniform sampling, and that 8-32 tokens per frame is optimal
> They also compare image encoders, trying a variety of models from shape-optimized SigLIP to DINOv2, and find https://huggingface.co./google/siglip-so400m-patch14-384 to be the most powerful 🔥
> They also compare freezing different parts of the models; training all stages with some frozen parts gives the best yield

They eventually release three models, where Apollo-3B outperforms most 7B models and Apollo-7B outperforms most 30B models 🔥
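The post contrasts FPS sampling with uniform sampling but does not say how either is implemented. As a rough illustration of the difference only, here is a minimal sketch that assumes uniform sampling means a fixed number of frames spread evenly over the clip, and FPS sampling means a fixed number of frames per second of video. The function names and parameters are hypothetical and are not taken from the Apollo code.

```python
def uniform_sample(num_frames_total, num_samples):
    """Pick a fixed number of frame indices spread evenly over the clip.

    The frame budget is constant, so longer clips are sampled more sparsely in time.
    """
    if num_frames_total <= 0 or num_samples <= 0:
        return []
    if num_samples == 1:
        return [0]
    step = (num_frames_total - 1) / (num_samples - 1)
    return [round(i * step) for i in range(num_samples)]


def fps_sample(num_frames_total, native_fps, target_fps):
    """Pick frame indices at a fixed rate of target_fps frames per second.

    The spacing in time is constant, so longer clips yield more frames.
    """
    if num_frames_total <= 0 or native_fps <= 0 or target_fps <= 0:
        return []
    indices = []
    k = 0
    while True:
        idx = int(k * native_fps / target_fps)
        if idx >= num_frames_total:
            break
        indices.append(idx)
        k += 1
    return indices


# Hypothetical clips: 3 s and 10 s of video at 30 frames per second.
short_clip, long_clip = 90, 300
print(len(uniform_sample(short_clip, 8)), len(uniform_sample(long_clip, 8)))
print(len(fps_sample(short_clip, 30, 2)), len(fps_sample(long_clip, 30, 2)))
```

Under these assumptions, uniform sampling returns the same number of frames for both clips, while FPS sampling keeps the time between sampled frames fixed and returns more frames for the longer clip.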

liked a model 8 days ago

FastVideo/FastMochi



martintomov's activity

liked a model 4 days ago

arnaudstiegler/sd-model-gameNgen-60ksteps

Text-to-Image • Updated 12 days ago • 337 • 1

liked 3 models 8 days ago

liked a Space 12 days ago

Running on Zero

338

👈🖼️👉

Flux Fill Outpainting

liked a model 12 days ago

hkchengrex/MMAudio

Updated 1 day ago • 28

liked a dataset 16 days ago

pandaphd/camera_settings

Viewer • Updated 21 days ago • 3.5k • 196 • 7

liked a model 17 days ago

JeffreyXiang/TRELLIS-image-large

Image-to-3D • Updated 19 days ago • 382k • 247

liked a Space 17 days ago

Running on Zero

2.06k

🏢

TRELLIS

Scalable and Versatile 3D Generation from images

liked a model 21 days ago

briaai/BRIA-2.3-ControlNet-Generative-Fill

Text-to-Image • Updated 25 days ago • 67 • 24

liked a Space 21 days ago

Running on Zero

562

🚀

Flux Style Shaping

Optical illusions and style transfer with FLUX

liked a model 22 days ago

Kijai/HunyuanVideo_comfy

Updated 8 days ago • 176

liked 2 models 23 days ago

tencent/HunyuanVideo

Text-to-Video • Updated 8 days ago • 7.07k • 1.27k

showlab/ShowUI-2B

Updated 21 days ago • 14.7k • 210

liked a Space 23 days ago

Running on Zero

189

💻

ShowUI

liked 2 models 25 days ago

Comfy-Org/mochi_preview_repackaged

Updated Nov 5 • 47

Qwen/QwQ-32B-Preview

Text Generation • Updated 27 days ago • 125k • 1.42k

liked a Space 28 days ago

Running on Zero

373

📈

IC Light V2-Vary

liked 2 models 29 days ago

Yuanshi/OminiControl

Image-to-Image • Updated 16 days ago • 11.6k • 100

nvidia/Hymba-1.5B-Instruct

Text Generation • Updated 6 days ago • 14k • 213