Any plans to use RMSNorm (or FlashNorm) instead of LayerNorm?
1
#12 opened 5 months ago by graefics
Lack of digit splitting in the slow version of the tokenizer
#11 opened 9 months ago by Forence
Adding Evaluation Results
#10 opened 11 months ago by leaderboard-pr-bot
Big difference in downstream task results between the before-cooldown checkpoint and the final checkpoint?
1
#9 opened 11 months ago by siqi-zz
Adding Evaluation Results
#8 opened 11 months ago by leaderboard-pr-bot
Will there be a version with Traditional Chinese in the future?
#5 opened about 1 year ago by win10
Training config link is broken
11
#3 opened about 1 year ago by davidgortega