Kquant03 committed
Commit 2c5d8d0 · 1 Parent(s): 3fda34f

Update README.md

Files changed (1): README.md (+1, -1)
README.md CHANGED
@@ -9,7 +9,7 @@ A frankenMoE of [heegyu/WizardVicuna-Uncensored-3B-0719](https://huggingface.co/
 
 Unlike the last model, this is just the same model being used 16 times as experts. I felt like this would allow it to be more coherent, which was correct.
 
-# [What is a Mixture of Experts (MoE)?](https://huggingface.co/blog/moe)
+# "[What is a Mixture of Experts (MoE)?](https://huggingface.co/blog/moe)"
 ### (from the MistralAI papers...click the quoted question above to navigate to it directly.)
 
 The scale of a model is one of the most important axes for better model quality. Given a fixed computing budget, training a larger model for fewer steps is better than training a smaller model for more steps.
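The commit itself doesn't include the merge recipe, so as a reference for how "the same model being used 16 times as experts" is typically wired together, here is a minimal sketch using mergekit's `mergekit-moe` tool. The config file name, the `random` gate mode, and the premise that this particular model was built with mergekit at all are assumptions, not facts from this repo.

```yaml
# moe-config.yml: hypothetical recipe, not taken from this repository.
# Builds a frankenMoE by reusing one base model as every expert.
base_model: heegyu/WizardVicuna-Uncensored-3B-0719
gate_mode: random   # random gating needs no routing prompts, which fits identical experts
dtype: bfloat16
experts:
  - source_model: heegyu/WizardVicuna-Uncensored-3B-0719
  - source_model: heegyu/WizardVicuna-Uncensored-3B-0719
  # ...repeat the same entry until there are 16 experts total
```

A config like this would be run with `mergekit-moe moe-config.yml ./merged-model` (assuming a current mergekit install), producing a Mixtral-style MoE checkpoint in `./merged-model`.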