Ubuntu committed
Commit 1c197f8 · 1 parent: f9ecedb
adding readme
README.md CHANGED
@@ -4,9 +4,21 @@ license: apache-2.0
 
 [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
 
+# Synthia-MoE-v3-Mixtral-8x7B
 
+This is Synthia-MoE-v3 trained on the official Mistral MoE version (Mixtral-8x7B).
 
-This is
+This model is trained on the Synthia-v3.0 dataset, which contains ~10K super high-quality GPT-4-Turbo-generated samples. The samples contain Tree-of-Thought, Chain-of-Thought and other system contexts designed to evoke reasoning, philosophical thinking, the use of working memory, and long chains of reasoning over multi-part questions.
+
+Further, this model is trained on the Orca-2 principle of replacing the system context with just one message. In the case of this Synthia-MoE-v3 model, the system context was not included at all.
+
+The evals are coming, but empirical testing shows that the model produces highly intelligent, coherent results. Here's a sample conversation: https://migel.substack.com/p/a-conversation-with-synthia-moe-mixtral
+
+<br>
+
+![Synthia](https://huggingface.co/migtissera/Synthia-MoE-v3-Mixtral-8x7B/resolve/main/Synthia-MoE.png)
+
+<br>
 
 ```
 import torch, json
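The README's usage block is cut off at the hunk boundary (it begins with `import torch, json`). As a minimal sketch of how the model could be loaded and prompted with no system context, in line with the Orca-2-style training described in the added text, here is an illustrative snippet. It assumes the standard Hugging Face transformers API, takes the `migtissera/Synthia-MoE-v3-Mixtral-8x7B` repo id from the image URL above, and uses a simple `USER:`/`ASSISTANT:` prompt layout, which is an assumption rather than the committed template.

```
# Illustrative sketch only -- not the snippet committed in this README.
# Assumes the standard transformers API; the USER:/ASSISTANT: prompt layout
# and the sampling settings below are assumptions, not documented values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "migtissera/Synthia-MoE-v3-Mixtral-8x7B"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # half precision; the 8x7B MoE is large
    device_map="auto",          # shard across available GPUs
)

# No system context: the prompt carries only the user's question,
# mirroring the Orca-2-style training described above.
prompt = "USER: Explain Tree-of-Thought reasoning in two sentences.\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
    )

# Strip the prompt tokens and print only the generated continuation.
generated = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```

The sampling parameters are placeholders; greedy decoding or other settings work just as well for a quick smoke test.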