HuggingFaceTB/Qwen-Math-1.5B-Bespoke-sys-ep3-linear-6e-5-optim-adamw_torch-4k Text Generation • Updated 2 days ago • 9
HuggingFaceTB/Qwen-Math-1.5B-Bespoke-sys-ep3-3e5-32k-500k-rope Text Generation • Updated 2 days ago • 12
HuggingFaceTB/Qwen-Math-1.5B-Bespoke-sys-ep3-cosine-1e-4-optim-adamw_torch-4k Text Generation • Updated 2 days ago • 12
HuggingFaceTB/Qwen-Math-1.5B-Bespoke-sys-ep3-linear-1e-4-optim-adamw_torch-4k Text Generation • Updated 2 days ago • 7
HuggingFaceTB/Qwen-Math-1.5B-Bespoke-sys-ep3-cosine-6e-5-optim-adamw_torch-4k Text Generation • Updated 2 days ago • 15