Daniel Bourke

mrdbourke

AI & ML interests

Computer vision. Small on-device models. VLMs. High-quality tutorials.

Organizations

None yet

mrdbourke's activity

liked a Space 18 days ago
replied to rwightman's post 21 days ago

MARS results look great! Just started a training run with cmars, will report back.

liked a Space 22 days ago
reacted to cfahlgren1's post with 🚀 22 days ago
We just dropped an LLM inside the SQL Console 🤯

The amazing, new Qwen/Qwen2.5-Coder-32B-Instruct model can now write SQL for any Hugging Face dataset ✨

It's 2025, you shouldn't be hand-writing SQL! This is a big step toward letting anyone do in-depth analysis on a dataset. Let us know what you think 🤗
replied to rwightman's post 22 days ago

Woah, looks like a good boost across most results. Been using torch.optim.AdamW for months. Will try out a training run today with timm.optim's cadamw.

reacted to rwightman's post with 🔥 22 days ago
There's a new timm release, v1.0.12, with a focus on optimizers. The optimizer factory has been refactored: there's now a timm.optim.list_optimizers() and a new way to register optimizers and their attributes. As always, you can use a timm optimizer like a torch one; just replace torch.optim with timm.optim.

New optimizers include:
* AdafactorBigVision - adafactorbv
* ADOPT - adopt / adoptw (decoupled decay)
* MARS - mars
* LaProp - laprop
* Cautious Optimizers - a modification applicable to all of the above; prefix the name with c, e.g. cadamw, cnadamw, csgdw, clamb, crmsproptf

I shared some caution comparisons in this model repo: rwightman/timm-optim-caution

For details, references, see the code: https://github.com/huggingface/pytorch-image-models/tree/main/timm/optim
