NanoGPT Speedrun

Following https://github.com/KellerJordan/modded-nanogpt for fun (learning).

Run Info

baseline/

  • Run on Lightning cloud, using a single L40S GPU
  • Batch size set to 32
  • VRAM usage: 26.95 GB (25698 MiB as reported by nvidia-smi)
  • 4 seconds per step, 3200 steps in total
  • Checkpoint saved every 320 steps
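
As a quick sanity check on the figures above, here is a small sketch that derives the total wall-clock time, the number of checkpoints, and the MiB-to-GB conversion. The variable names are illustrative only, and the checkpoint count assumes saves happen at steps 320, 640, ..., 3200 (whether a step-0 checkpoint is also written is not stated).

```python
# Figures taken from the run notes; names are illustrative, not from the repo.
SECONDS_PER_STEP = 4
TOTAL_STEPS = 3200
CHECKPOINT_EVERY = 320
VRAM_MIB = 25698  # nvidia-smi reports memory in MiB (1 MiB = 1024**2 bytes)

total_seconds = SECONDS_PER_STEP * TOTAL_STEPS
total_hours = total_seconds / 3600

# Assumes checkpoints land on exact multiples of CHECKPOINT_EVERY.
num_checkpoints = TOTAL_STEPS // CHECKPOINT_EVERY

# Convert MiB to decimal gigabytes (1 GB = 1e9 bytes).
vram_gb = VRAM_MIB * 1024**2 / 1e9

print(f"wall-clock: {total_seconds} s ({total_hours:.2f} h)")
print(f"checkpoints: {num_checkpoints}")
print(f"VRAM: {vram_gb:.2f} GB")
```

At the stated step time, the full run takes roughly three and a half hours.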

Training loss

An empirical check of the neural scaling law, plotting training loss against step:

baseline/analysis/loss_plot2.png

(Fitted line: log y = -0.11 * log x + 0.9, where x is the training step (up to 3200) and y is the training loss. Equivalently, y is proportional to x^-0.11.)
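
A fit of this form can be obtained with ordinary least squares in log-log space. The sketch below is not the script used for the plot. It assumes natural logarithms (the base is not stated above), uses a hypothetical helper `fit_power_law`, and runs on synthetic, noise-free losses generated from the fitted line itself rather than on the real training log.

```python
import math

def fit_power_law(steps, losses):
    """Least-squares fit of log(loss) = slope * log(step) + intercept."""
    # Step 0 must be skipped because log(0) is undefined.
    pts = [(math.log(s), math.log(l))
           for s, l in zip(steps, losses) if s > 0 and l > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic data only: y = exp(0.9) * x**-0.11, i.e. the fitted line itself.
steps = list(range(1, 3201))
losses = [math.exp(0.9) * s ** -0.11 for s in steps]

slope, intercept = fit_power_law(steps, losses)
print(f"log y = {slope:.2f} * log x + {intercept:.2f}")
```

With real, noisy losses, the recovered slope and intercept depend on which steps are included, since the earliest steps tend to dominate a log-log fit.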

Demo

Available at https://huggingface.co/spaces/lemonteaa/nanogpt-speedrun-demo

(WIP)
