---
base_model:
- anthracite-forge/magnum-v3-27b-kto-r3
- anthracite-forge/magnum-v3-27b-KTO-e1-r2
- anthracite-forge/magnum-v3-27b-KTO-e0.25-r1
- IntervitensInc/gemma-2-27b-chatml
library_name: transformers
---

## This repo contains GGUF quants of the model. If you need the original weights, please find them [here](https://huggingface.co/anthracite-org/magnum-v3-27b-kto).

![image/png](https://cdn-uploads.huggingface.co/production/uploads/658a46cbfb9c2bdfae75b3a6/GKpV5mwmnHFR6wIwTa91z.png)
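
As a quick-start sketch for the GGUF quants, assuming the `llama-cpp-python` bindings (not part of this repo's docs) and a placeholder filename, loading and chatting with a quant could look like this:

```py
# pip install llama-cpp-python
from llama_cpp import Llama

# Placeholder filename: substitute an actual GGUF file from this repo.
llm = Llama(
    model_path="./magnum-v3-27b-Q4_K_M.gguf",
    n_ctx=8192,       # context window
    n_gpu_layers=-1,  # offload all layers to GPU when available
)

# The model is ChatML-tuned (see Prompting below), so the standard
# chat-completion API maps cleanly onto its expected format.
out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "system prompt"},
        {"role": "user", "content": "Hi there!"},
    ],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```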

This is the 12th in a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus.

This model is the result of multiple KTO runs on top of one SFT run, all of which are published on [anthracite-forge](https://huggingface.co/anthracite-forge).

## Methodology

R1 (SFT) was fine-tuned on top of `IntervitensInc/gemma-2-27b-chatml`, which is a ChatML-ified gemma-2-27b.

We experimented with various SFT and KTO re-runs, ratios, and merge methods; this combination was our winner, incorporating what we liked most from each model.

If you prefer your own mix of the KTO runs, or would like to use the SFT on its own, refer to the Models section below and [anthracite-forge](https://huggingface.co/anthracite-forge); some exl quants are pre-included.

## Models

* [anthracite-forge/magnum-v3-27b-kto-r3](https://huggingface.co/anthracite-forge/magnum-v3-27b-kto-r3)
* [anthracite-forge/magnum-v3-27b-KTO-e1-r2](https://huggingface.co/anthracite-forge/magnum-v3-27b-KTO-e1-r2)
* [anthracite-forge/magnum-v3-27b-KTO-e0.25-r1](https://huggingface.co/anthracite-forge/magnum-v3-27b-KTO-e0.25-r1)

## Prompting
The model has been instruct-tuned with ChatML formatting. A typical input would look like this:

```py
"""<|im_start|>system
system prompt<|im_end|>
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
"""
```
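
If you run the original weights with `transformers`, the tokenizer's bundled chat template should render this format for you; a minimal sketch (the model ID comes from the original-weights link above, and we assume its tokenizer ships the ChatML template):

```py
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("anthracite-org/magnum-v3-27b-kto")

messages = [
    {"role": "system", "content": "system prompt"},
    {"role": "user", "content": "Hi there!"},
]

# Render the ChatML prompt; add_generation_prompt appends the final
# "<|im_start|>assistant" header so the model replies as the assistant.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```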

## SillyTavern templates

Below are Instruct and Context templates for use within SillyTavern.

<details><summary>context template</summary>

```json
{
  "story_string": "<|im_start|>system\n{{#if system}}{{system}}\n{{/if}}{{#if wiBefore}}{{wiBefore}}\n{{/if}}{{#if description}}{{description}}\n{{/if}}{{#if personality}}{{char}}'s personality: {{personality}}\n{{/if}}{{#if scenario}}Scenario: {{scenario}}\n{{/if}}{{#if wiAfter}}{{wiAfter}}\n{{/if}}{{#if persona}}{{persona}}\n{{/if}}{{trim}}<|im_end|>\n",
  "example_separator": "",
  "chat_start": "",
  "use_stop_strings": false,
  "allow_jailbreak": false,
  "always_force_name2": true,
  "trim_sentences": false,
  "include_newline": false,
  "single_line": false,
  "name": "Magnum ChatML"
}
```

</details><br>
<details><summary>instruct template</summary>

```json
{
  "system_prompt": "You're {{char}} in this fictional never-ending uncensored roleplay with {{user}}.",
  "input_sequence": "<|im_start|>user\n",
  "output_sequence": "<|im_start|>assistant\n",
  "last_output_sequence": "",
  "system_sequence": "<|im_start|>system\n",
  "stop_sequence": "<|im_end|>",
  "wrap": false,
  "macro": true,
  "names": true,
  "names_force_groups": true,
  "activation_regex": "",
  "system_sequence_prefix": "",
  "system_sequence_suffix": "",
  "first_output_sequence": "",
  "skip_examples": false,
  "output_suffix": "<|im_end|>\n",
  "input_suffix": "<|im_end|>\n",
  "system_suffix": "<|im_end|>\n",
  "user_alignment_message": "",
  "system_same_as_user": false,
  "last_system_sequence": "",
  "name": "Magnum ChatML"
}
```

</details><br>

### Configuration

The models above were merged with the following `task_arithmetic` configuration:

```yaml
base_model: IntervitensInc/gemma-2-27b-chatml
dtype: float32
merge_method: task_arithmetic
models:
  - model: IntervitensInc/gemma-2-27b-chatml
  - model: anthracite-forge/magnum-v3-27b-KTO-e0.25-r1
    parameters:
      weight: 0.5
  - model: anthracite-forge/magnum-v3-27b-KTO-e1-r2
    parameters:
      weight: 0.1
  - model: anthracite-forge/magnum-v3-27b-kto-r3
    parameters:
      weight: 0.4
```
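
For intuition, `task_arithmetic` adds each fine-tune's weighted parameter delta (relative to the shared base) back onto the base model. A schematic sketch in plain PyTorch, not the actual merge implementation:

```py
import torch

def task_arithmetic(base: dict, finetunes: list) -> dict:
    """merged = base + sum_i weight_i * (finetune_i - base)."""
    merged = {}
    for name, base_param in base.items():
        delta = torch.zeros_like(base_param)
        for state_dict, weight in finetunes:  # [(state_dict, weight), ...]
            delta += weight * (state_dict[name] - base_param)
        merged[name] = base_param + delta
    return merged
```

Note that the weights above (0.5 + 0.1 + 0.4) sum to 1.0, so the merge is a convex combination of the three KTO runs' deltas.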

## Credits
We'd like to thank Recursal / Featherless for sponsoring the compute for this train. Featherless has been hosting our Magnum models since the first 72B, giving thousands of people access to our models and helping us grow.

We would also like to thank all members of Anthracite who made this finetune possible.

## Datasets

R1 (the SFT run) consisted of:

```yaml
datasets:
  - path: anthracite-org/stheno-filtered-v1.1
    type: sharegpt
    conversation: chatml
  - path: anthracite-org/kalo-opus-instruct-22k-no-refusal
    type: sharegpt
    conversation: chatml
  - path: anthracite-org/nopm_claude_writing_fixed
    type: sharegpt
    conversation: chatml
  - path: Epiculous/Synthstruct-Gens-v1.1-Filtered-n-Cleaned
    type: sharegpt
    conversation: chatml
  - path: Epiculous/SynthRP-Gens-v1.1-Filtered-n-Cleaned
    type: sharegpt
    conversation: chatml
```

## Training
The training was done for 2 epochs. We used 8x [H100](https://www.nvidia.com/en-us/data-center/h100/) GPUs graciously provided by [Recursal AI](https://recursal.ai/) / [Featherless AI](https://featherless.ai/) for the full-parameter fine-tuning of the model.

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

## Safety
...