small but mighty 🔥 you can fine-tune SmolVLM on an L4 with a batch size of 4 and it will only take 16.4 GB VRAM 🫰🏻 also, with gradient accumulation, the simulated batch size is 16 ✨ I made a notebook that includes all the goodies: QLoRA, gradient accumulation, and gradient checkpointing, with explanations on how they work: https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
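To make the "simulated batch size" concrete, here is a minimal pure-Python sketch of gradient accumulation on a toy one-parameter model. It is not taken from the notebook: the toy data, the model, the learning rate, and the value of 4 accumulation steps (inferred from 16 / 4) are all assumptions for illustration. It only shows that averaging the gradients of four micro-batches of 4 gives the same gradient as one batch of 16.

```python
# Toy model: y = w * x, loss = mean squared error.
# d/dw mean((w*x - y)^2) = mean(2 * x * (w*x - y))

def mse_grad(w, batch):
    """Gradient of the mean squared error w.r.t. w over a batch of (x, y) pairs."""
    return sum(2.0 * x * (w * x - y) for x, y in batch) / len(batch)

MICRO_BATCH_SIZE = 4     # per-step batch size mentioned in the post
ACCUMULATION_STEPS = 4   # assumption: inferred from 16 / 4

# Hypothetical toy data: 16 samples drawn from y = 3x.
data = [(float(i), 3.0 * i) for i in range(1, 17)]
w = 0.5

# Reference: gradient over all 16 samples at once.
full_grad = mse_grad(w, data)

# Accumulation: add up micro-batch gradients, each scaled by 1/ACCUMULATION_STEPS,
# and apply a single optimizer step only after the last micro-batch.
accumulated_grad = 0.0
for step in range(ACCUMULATION_STEPS):
    start = step * MICRO_BATCH_SIZE
    micro_batch = data[start:start + MICRO_BATCH_SIZE]
    accumulated_grad += mse_grad(w, micro_batch) / ACCUMULATION_STEPS

effective_batch_size = MICRO_BATCH_SIZE * ACCUMULATION_STEPS
learning_rate = 1e-3  # arbitrary value for the sketch
w_after_step = w - learning_rate * accumulated_grad  # one update covering 16 samples
```

Only one micro-batch has to be held in memory at a time, which is why the memory cost stays at the batch-size-4 level while the update behaves like batch size 16.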
Want to validate some hparams or figure out what timm model to use before committing to a download or training with a large dataset? Try mini-imagenet: timm/mini-imagenet
I had this sitting on my drive and forgot where I pulled it together from. It's 100 classes of ImageNet: 50k train and 10k val images (from the ImageNet-1k train set), and 5k test images (from the ImageNet-1k val set). 7.4GB instead of > 100GB for the full ImageNet-1k. This version is not reduced resolution like some other 'mini' versions. Super easy to use with the timm train/val scripts, check out the dataset card.
I often check fine-tuning with even smaller datasets like:
* timm/resisc45
* timm/oxford-iiit-pet

But those are a bit small to train any modest-size model w/o starting from pretrained weights.
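For the mini-imagenet splits described above, a small sketch of what the numbers work out to per class, plus one possible way to pull a split with the Hugging Face `datasets` library. The split sizes and class count come from the post. The balanced-classes assumption, the helper names, and whatever split name you pass in are assumptions; the dataset card is the place to confirm the actual split names.

```python
# Split sizes and class count as stated in the post.
SPLIT_SIZES = {"train": 50_000, "val": 10_000, "test": 5_000}
NUM_CLASSES = 100

def images_per_class(split):
    """Average images per class for a split (exact only if classes are balanced)."""
    return SPLIT_SIZES[split] // NUM_CLASSES

def load_split(split_name):
    """Load one split from the Hub (hypothetical helper).

    Needs `pip install datasets` and network access. The split name is passed
    through unchanged and is not verified here; check the dataset card.
    """
    from datasets import load_dataset  # lazy import so the helper above works offline
    return load_dataset("timm/mini-imagenet", split=split_name)
```

Under the balanced assumption that is 500 train, 100 val, and 50 test images per class.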
Hello, researchers! I've tried to make reading HF Daily Papers easier and made a tool that reviews them with LLMs like Claude 3.5 and GPT-4o, and sometimes FLUX.
* Classification by topics
* Sorting by publication date and HF addition date
* Syncing every 2 hours
* Hosted on GitHub
* English, Russian, and Chinese
* Top by week/month (in progress)
A 'small' MobileNet-V4 update: I just pushed weights for the smallest model I've trained in the series, a 0.5 width multiplier version of the MobileNet-V4 Conv Small.
Now you may look at this and say: hey, why is this impressive? 64.8% top-1 and 2.2M params? MobileNetV3-Small 0.75 and MobileNet-V2 0.5 both have fewer params (at ~2M) and are over 65% top-1, what gives? Well, this is where MobileNet-V4 differs from the previous versions of the model family: it trades off (gives up) a little parameter efficiency for some computational efficiency.
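A rough way to see how parameter count and compute can come apart in a conv net is sketched below. The layer shapes are hypothetical and not taken from any MobileNet definition. The formulas are the standard ones for a plain k x k convolution without bias. It illustrates the general effect, not how MobileNet-V4 itself spends its budget.

```python
def conv_params(k, c_in, c_out):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def conv_macs(k, c_in, c_out, out_h, out_w):
    """Multiply-accumulates: every weight is applied once per output position."""
    return conv_params(k, c_in, c_out) * out_h * out_w

# Hypothetical layers for illustration only.
early = dict(k=3, c_in=16, c_out=32, out_h=56, out_w=56)   # few channels, large feature map
late = dict(k=3, c_in=128, c_out=256, out_h=7, out_w=7)    # many channels, small feature map

early_params = conv_params(early["k"], early["c_in"], early["c_out"])
late_params = conv_params(late["k"], late["c_in"], late["c_out"])
early_macs = conv_macs(**early)
late_macs = conv_macs(**late)

# A width multiplier scales both channel counts, so params shrink quadratically.
half_width_params = conv_params(3, 128 // 2, 256 // 2)

print(f"early: {early_params:,} params, {early_macs:,} MACs")
print(f"late:  {late_params:,} params, {late_macs:,} MACs")
print(f"late at 0.5 width: {half_width_params:,} params")
```

In this toy setup both layers cost the same number of MACs, but the late layer holds 64x the parameters. So two models with similar param counts can differ a lot in compute, and a design can spend a few extra params where they are cheap to run.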