Ubuntu committed
Commit 1c197f8 · 1 parent: f9ecedb
adding readme
README.md CHANGED
@@ -4,9 +4,21 @@ license: apache-2.0
 
 [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
 
+# Synthia-MoE-v3-Mixtral-8x7B
 
+This is Synthia-MoE-v3 trained on the official Mistral MoE version (Mixtral-8x7B).
 
-This is
+This model is trained on the Synthia-v3.0 dataset, which contains ~10K super high-quality GPT-4-Turbo-generated samples. The samples contain Tree-of-Thought, Chain-of-Thought and other system contexts designed to evoke reasoning, philosophical thinking, the use of working memory, and long chains of reasoning over multi-part questions.
+
+Further, this model is trained on the Orca-2 principle of replacing the system context with just one message. In the case of this Synthia-MoE-v3 model, the system context was not included at all.
+
+The evals are coming, but empirical testing shows that the model produces highly intelligent, coherent results. Here's a sample conversation: https://migel.substack.com/p/a-conversation-with-synthia-moe-mixtral
+
+<br>
+
+![Synthia](https://huggingface.co/migtissera/Synthia-MoE-v3-Mixtral-8x7B/resolve/main/Synthia-MoE.png)
+
+<br>
 
 ```
 import torch, json
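The README's usage block is cut off at the hunk boundary (it begins with `import torch, json`). As a minimal sketch of how the model could be loaded and prompted with no system context, in line with the Orca-2-style training described in the added text, here is an illustrative snippet. It assumes the standard Hugging Face transformers API, takes the `migtissera/Synthia-MoE-v3-Mixtral-8x7B` repo id from the image URL above, and uses a simple `USER:`/`ASSISTANT:` prompt layout, which is an assumption rather than the committed template.

```
# Illustrative sketch only -- not the snippet committed in this README.
# Assumes the standard transformers API; the USER:/ASSISTANT: prompt layout
# and the sampling settings below are assumptions, not documented values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "migtissera/Synthia-MoE-v3-Mixtral-8x7B"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # half precision; the 8x7B MoE is large
    device_map="auto",          # shard across available GPUs
)

# No system context: the prompt carries only the user's question,
# mirroring the Orca-2-style training described above.
prompt = "USER: Explain Tree-of-Thought reasoning in two sentences.\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
    )

# Strip the prompt tokens and print only the generated continuation.
generated = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```

The sampling parameters are placeholders; greedy decoding or other settings work just as well for a quick smoke test.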