umarbutler committed
Commit
06071d6
1 Parent(s): 80abc11

Update README.md

Files changed (1)
  1. README.md +0 -1
README.md CHANGED
@@ -57,7 +57,6 @@ The training dataset was subsequently fed to [DistilGPT2](https://huggingface.co
 | Batch size per device | 4 |
 | Weight decay | 0.01 |
 | Warmup ratio | 0.06 |
-| Gradient accumulation steps | 1 |
 
 After training for 3 epochs, or 465,441 steps, over a period of ~40 hours on a single GeForce RTX 2080 Ti, the model achieved a loss of 0.65.
 
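The figures in this hunk can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not taken from the repository. It assumes the removed row's gradient accumulation value of 1 still applies, that training ran on the single GPU the README mentions, and that warmup steps are rounded up. All names, including `training_summary`, are hypothetical.

```python
import math

# Figures quoted in the README hunk.
TOTAL_STEPS = 465_441
EPOCHS = 3
BATCH_SIZE_PER_DEVICE = 4
WARMUP_RATIO = 0.06

# Assumptions: one device (the README mentions a single GPU) and a
# gradient accumulation of 1 (the value in the row this commit removes).
NUM_DEVICES = 1
GRADIENT_ACCUMULATION_STEPS = 1


def training_summary() -> dict:
    """Derive batch and step counts implied by the quoted hyperparameters."""
    # Examples consumed per optimizer step.
    effective_batch_size = (
        BATCH_SIZE_PER_DEVICE * NUM_DEVICES * GRADIENT_ACCUMULATION_STEPS
    )
    # 465,441 divides evenly by 3, so integer division loses nothing.
    steps_per_epoch = TOTAL_STEPS // EPOCHS
    # Upper bound only: the final batch of an epoch may be partially filled.
    max_examples_per_epoch = steps_per_epoch * effective_batch_size
    # Assumption: warmup steps are the ratio times total steps, rounded up.
    warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)
    return {
        "effective_batch_size": effective_batch_size,
        "steps_per_epoch": steps_per_epoch,
        "max_examples_per_epoch": max_examples_per_epoch,
        "warmup_steps": warmup_steps,
    }


if __name__ == "__main__":
    for name, value in training_summary().items():
        print(f"{name}: {value:,}")
```

Under these assumptions the effective batch size equals the per-device batch size, so dropping the gradient accumulation row leaves the derived numbers unchanged.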