Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs
Paper: arXiv:2411.08719
We present a set of multilingual red-teamed models, trained on the LUMI HPC in Finland (hence the name Aurora). See our paper: https://arxiv.org/abs/2404.00399