Just released NVAMP Loss!
✔️ A modification of the cross-entropy loss function designed specifically for training LLMs.
✔️ A twist on standard cross-entropy: it emphasizes outlier prediction errors and dynamically normalizes token-level variance.
✔️ The goal is more stable and efficient training, leading to models that generalize better.
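For anyone wondering what "emphasizing outliers" and "normalizing token-level variance" could look like in practice, here is a rough toy sketch in plain Python. It is not the formula from the repository: the function names, the z-score rule, and the `threshold` and `boost` parameters are all assumptions made up for illustration. It computes per-token cross-entropy, standardizes the losses by their batch mean and standard deviation, and gives extra weight to tokens whose loss is unusually high.

```python
import math

def token_cross_entropy(logits, target):
    """Standard cross-entropy for one token, computed from raw logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]

def outlier_weighted_loss(batch_logits, targets, threshold=1.0, boost=2.0, eps=1e-8):
    """Toy illustration only (hypothetical, NOT the NVAMP formula).

    Returns (weighted_loss, plain_mean_loss).
    """
    losses = [token_cross_entropy(l, t) for l, t in zip(batch_logits, targets)]
    mean = sum(losses) / len(losses)
    var = sum((x - mean) ** 2 for x in losses) / len(losses)
    std = math.sqrt(var + eps)  # eps avoids division by zero

    # Standardize each token loss, then up-weight high-loss outliers.
    weights = [boost if (x - mean) / std > threshold else 1.0 for x in losses]
    weighted = sum(w * x for w, x in zip(weights, losses)) / sum(weights)
    return weighted, mean

# Three easy tokens and one badly mispredicted token.
batch = [[5.0, 0.0, 0.0]] * 3 + [[0.0, 5.0, 0.0]]
weighted, mean = outlier_weighted_loss(batch, targets=[0, 0, 0, 0])
print(weighted > mean)
```

In a real training loop the weights would be computed on tensors and usually kept out of the gradient path; see the repo for the actual implementation.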
Check it out, give it a spin, and let me know what you think!
Licensed under the Apache 2.0 license and ready to use. Happy training! 🔥🤗
https://github.com/mkurman/nvamp-loss