Weyaxi committed on
Commit 644f202 · verified · 1 Parent(s): bd6d8e7

Create README.md

Files changed (1)
  1. README.md +99 -0
README.md ADDED
@@ -0,0 +1,99 @@
+ ---
+ license: other
+ license_name: yi-license
+ license_link: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE
+ tags:
+ - yi
+ - moe
+ ---
+ ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/jVCgVixLmOsAofXVUUgkg.jpeg)
+
+ # Cosmosis-3x34B
+
+ This is the Cosmosis-3x34B model, a mixture-of-experts (MoE) model that I built with [mergekit](https://github.com/cg123/mergekit).
+
+ # Prompt Template(s):
+
+ Since [bagel-dpo-34b-v0.2](https://huggingface.co/jondurbin/bagel-dpo-34b-v0.2) uses many prompt templates, you can use the prompt templates provided by bagel as well as those of the other experts.
+
+ **Note:** I currently do not know which prompt template is best.
+
+ ### ChatML:
+
+ ```
+ <|im_start|>system
+ {system}<|im_end|>
+ <|im_start|>user
+ {user}<|im_end|>
+ <|im_start|>assistant
+ {assistant}<|im_end|>
+ ```
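
As a rough illustration, the ChatML layout above can be assembled with plain string formatting. This is a minimal sketch, not code from this repository: `build_chatml_prompt` is a hypothetical helper, and leaving the assistant turn open so that the model writes the reply is an assumption about how the template is used at inference time.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Fill the ChatML template and leave the assistant turn open.

    Hypothetical helper for illustration; not part of this repository.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        # Assumption: generation starts here, so no closing <|im_end|> is added.
        "<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt("You are a helpful assistant.", "What is a MoE model?")
print(prompt)
```

The same pattern should carry over to the other templates listed here by swapping the format string.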
+
+ ### Human Assistant
+
+ ```
+ Human: {user}
+
+ ### Assistant: {assistant}
+ ```
+
+ ### Alpaca (sort of)
+
+ ```
+ Below is an instruction that describes a task. Write a response that appropriately completes the request.
+
+ ### Instruction:
+ {system}
+ {instruction}
+
+ ### Response:
+ ```
+
+ ### Vicuna
+
+ ```
+ {system}
+ USER: {instruction}
+ ASSISTANT:
+ ```
+
+ Visit [bagel-dpo-34b-v0.2](https://huggingface.co/jondurbin/bagel-dpo-34b-v0.2) to try more prompt templates.
+
+ # YAML config to reproduce
+
+ ```yaml
+ base_model: nontoxic-bagel-34b-v0.2
+ gate_mode: hidden
+ dtype: bfloat16
+
+ experts:
+   - source_model: bagel-dpo-34b-v0.2
+     positive_prompts: ["question answering", "Q:", "science", "biology", "chemistry", "physics"]
+     negative_prompts: ["math", "reason", "mathematics", "solve", "count", "code", "python", "javascript", "programming", "algorithm"]
+
+   - source_model: Nous-Hermes-2-Yi-34B
+     positive_prompts: ["chat", "math", "reason", "mathematics", "solve", "count", "python", "javascript", "programming", "algorithm", "tell me", "assistant"]
+
+   - source_model: SUS-Chat-34B
+     positive_prompts: ["math", "reason", "mathematics", "solve", "count", "assistant"]
+ ```
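
Several prompts in this config are listed under more than one expert, which may be worth knowing when reasoning about how the gates separate them. The sketch below is a hypothetical, standard-library-only check and not part of mergekit: it copies the `positive_prompts` lists from the config into a dict (with the `science` entry written as a quoted string) and reports the prompts that appear under more than one expert.

```python
from collections import defaultdict

# positive_prompts copied by hand from the config above (illustration only).
positive_prompts = {
    "bagel-dpo-34b-v0.2": [
        "question answering", "Q:", "science", "biology", "chemistry", "physics",
    ],
    "Nous-Hermes-2-Yi-34B": [
        "chat", "math", "reason", "mathematics", "solve", "count", "python",
        "javascript", "programming", "algorithm", "tell me", "assistant",
    ],
    "SUS-Chat-34B": [
        "math", "reason", "mathematics", "solve", "count", "assistant",
    ],
}

# Map each prompt to the experts that list it.
owners = defaultdict(list)
for expert, prompts in positive_prompts.items():
    for prompt_text in prompts:
        owners[prompt_text].append(expert)

# Keep only prompts claimed by two or more experts.
shared = {p: experts for p, experts in owners.items() if len(experts) > 1}

for prompt_text in sorted(shared):
    print(f"{prompt_text!r}: {', '.join(shared[prompt_text])}")
```

In this config, every positive prompt of SUS-Chat-34B also appears under Nous-Hermes-2-Yi-34B, while bagel-dpo-34b-v0.2 shares none of its positive prompts with the other two.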
+
+ # Quantized versions
+
+ Quantized versions of this model are available thanks to [TheBloke](https://hf.co/TheBloke).
+
+ ##### GPTQ
+
+ - [TheBloke/Cosmosis-3x34B-GPTQ](https://huggingface.co/TheBloke/Cosmosis-3x34B-GPTQ)
+
+ ##### GGUF
+
+ - [TheBloke/Cosmosis-3x34B-GGUF](https://huggingface.co/TheBloke/Cosmosis-3x34B-GGUF)
+
+ ##### AWQ
+
+ - [TheBloke/Cosmosis-3x34B-AWQ](https://huggingface.co/TheBloke/Cosmosis-3x34B-AWQ)
+
+ If you would like to support me:
+
+ [☕ Buy Me a Coffee](https://www.buymeacoffee.com/weyaxi)