Triangle104 committed
Commit aaaf409 · verified
1 Parent(s): 1d58a6f

Update README.md

Files changed (1)
  1. README.md +207 -0
README.md CHANGED
@@ -22,6 +22,213 @@ tags:
  This model was converted to GGUF format from [`prithivMLmods/Phi-4-o1`](https://huggingface.co/prithivMLmods/Phi-4-o1) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/prithivMLmods/Phi-4-o1) for more details on the model.
+ ---
+ ## Model details
+
+ Phi-4 o1 is a fine-tune of Microsoft's Phi-4, a state-of-the-art
+ open model built on a blend of synthetic datasets, data from filtered
+ public-domain websites, and acquired academic books and Q&A
+ datasets. The goal of this approach is to ensure that small, capable
+ models are trained on high-quality data focused on advanced reasoning.
+
+ Phi-4 uses a safety post-training approach that draws on both
+ open-source and in-house generated synthetic datasets. Safety
+ alignment combines SFT (Supervised Fine-Tuning) with iterative
+ DPO (Direct Preference Optimization), using publicly available
+ datasets focused on helpfulness and harmlessness together with
+ questions and answers targeted at multiple safety categories.
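To make the DPO step more concrete, here is a minimal sketch of the standard DPO loss for a single preference pair. It is not the Phi-4 training recipe: the function name `dpo_loss`, the `beta` value, and the example log-probabilities are assumptions chosen for illustration, and the iterative part of the procedure (repeating rounds with fresh preference data) is not shown.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are summed sequence log-probabilities under the policy being
    trained and under a frozen reference model. beta=0.1 is an
    illustrative value, not one taken from the model card.
    """
    # How much more the policy likes each answer than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) written as a numerically stable softplus(-logits).
    x = -logits
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Hypothetical numbers: the policy prefers the chosen answer more than the
# reference does, so the loss falls below the log(2) value of an undecided policy.
undecided = dpo_loss(-10.0, -10.0, -10.0, -10.0)
improved = dpo_loss(-8.0, -12.0, -10.0, -10.0)
print(undecided, improved)
```

The loss only depends on how the policy's preference between the two answers differs from the reference model's, which is why a frozen reference copy is needed during training.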
+
+ ## Dataset Info
+
+ Phi-4 o1 is fine-tuned on a synthetic dataset curated through a
+ purpose-built pipeline. The data is primarily based on the
+ Chain of Thought (CoT) or Chain of Continuous Thought (COCONUT)
+ methodologies, so the dataset is rich in reasoning, problem-solving,
+ and step-by-step breakdowns of complex tasks. The model is designed
+ to excel in reasoning, mathematics, and breaking problems down into
+ logical, manageable steps.
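The card does not publish the dataset schema, so the following is only a sketch of what an explicit Chain of Thought training text could look like. The helper `format_cot_example`, its field names, and the sample question are all hypothetical. It covers textual CoT only; COCONUT reasons in a continuous latent space and has no direct text form.

```python
def format_cot_example(question, steps, answer):
    """Render one question as a numbered, step-by-step training text.

    The layout (Question / Reasoning / Answer) is a hypothetical format,
    not the one used to train Phi-4 o1.
    """
    lines = [f"Question: {question}", "Reasoning:"]
    lines += [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    lines.append(f"Answer: {answer}")
    return "\n".join(lines)


# Made-up record, for illustration only.
example = format_cot_example(
    question="A train travels 120 km in 2 hours. What is its average speed?",
    steps=[
        "Average speed is distance divided by time.",
        "120 km / 2 h = 60 km/h.",
    ],
    answer="60 km/h",
)
print(example)
```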
+
+ ## Run with Transformers
+
+ # pip install accelerate
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/Phi-4-o1")
+ model = AutoModelForCausalLM.from_pretrained(
+     "prithivMLmods/Phi-4-o1",
+     device_map="auto",  # let accelerate decide where the weights go
+     torch_dtype=torch.bfloat16,
+ )
+
+ input_text = "Write me a poem about Machine Learning."
+ # Send the inputs to the model's device instead of hard-coding "cuda".
+ inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
+
+ outputs = model.generate(**inputs, max_new_tokens=32)
+ print(tokenizer.decode(outputs[0]))
+
+ You can ensure the correct chat template is applied by using `tokenizer.apply_chat_template` as follows:
+
+ messages = [
+     {"role": "user", "content": "Write me a poem about Machine Learning."},
+ ]
+ # add_generation_prompt=True appends the assistant-turn header so the
+ # model starts a reply instead of continuing the user turn.
+ inputs = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_tensors="pt",
+     return_dict=True,
+ ).to(model.device)
+
+ outputs = model.generate(**inputs, max_new_tokens=256)
+ print(tokenizer.decode(outputs[0]))
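For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so decoding `outputs[0]` prints the prompt again. A minimal sketch of trimming the prompt before decoding is shown below; the helper name `new_token_ids` is an assumption, and the demo uses plain lists so it runs without downloading the model.

```python
def new_token_ids(output_ids, prompt_len):
    """Return only the tokens generated after the prompt.

    Works on any sliceable sequence, including a 1-D tensor such as
    outputs[0] from model.generate.
    """
    return output_ids[prompt_len:]


# Stand-in token ids: the first three play the role of the prompt.
print(new_token_ids([101, 102, 103, 7, 8, 9], prompt_len=3))

# With the real objects from the snippet above, usage would look like:
#   prompt_len = inputs["input_ids"].shape[-1]
#   reply = tokenizer.decode(new_token_ids(outputs[0], prompt_len),
#                            skip_special_tokens=True)
```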
+
+ ## Intended Use
+
+ The Phi-4 o1 ft model is designed for a wide range of applications,
+ particularly those requiring advanced reasoning, high-quality text
+ generation, and multilingual capabilities. Below are some of the
+ intended use cases:
+
+ - Complex Reasoning Tasks:
+   - Solving intricate problems in mathematics, logic, and science.
+   - Assisting in academic research by providing detailed explanations and summaries.
+ - Multilingual Applications:
+   - Translating text across multiple languages while preserving context and nuance.
+   - Generating content in various languages for global audiences.
+ - Content Creation:
+   - Assisting writers, marketers, and creators with high-quality text generation.
+   - Generating creative ideas, stories, and technical documentation.
+ - Educational Tools:
+   - Providing explanations, tutoring, and Q&A support for students and educators.
+   - Generating practice questions and answers for learning purposes.
+ - Customer Support:
+   - Automating responses to customer queries with accurate and helpful information.
+   - Handling complex customer service scenarios with advanced reasoning.
+ - Safety-Critical Applications:
+   - Ensuring responses are aligned with safety guidelines, making it suitable for sensitive domains.
+   - Providing harmlessness-focused interactions in public-facing applications.
+
+ ## Limitations
+
+ While Phi-4 o1 ft is a powerful and versatile model, it has certain limitations that users should be aware of:
+
+ - Bias and Fairness: Despite rigorous training and safety alignment,
+   the model may still exhibit biases present in the training data.
+   Users should critically evaluate outputs, especially in sensitive contexts.
+ - Contextual Understanding: The model may occasionally misinterpret
+   complex or ambiguous prompts, leading to inaccurate or irrelevant responses.
+ - Real-Time Knowledge: The model's knowledge is limited to the data it
+   was trained on and does not include real-time or post-training updates.
+   It may not be aware of recent events or developments.
+ - Safety and Harmlessness: While extensive efforts have been made to
+   align the model with safety guidelines, it may still generate outputs
+   that are inappropriate or harmful in certain contexts. Continuous
+   monitoring and human oversight are recommended.
+ - Resource Requirements: Running the model efficiently may require
+   significant computational resources, especially for large-scale or
+   real-time applications.
+ - Ethical Considerations: The model should not be used for malicious
+   purposes, such as generating harmful content, misinformation, or spam.
+   Users are responsible for ensuring ethical use.
+ - Domain-Specific Limitations: While the model performs well on
+   general-purpose tasks, it may lack depth in highly specialized domains
+   (e.g., medical, legal, or financial fields) without additional fine-tuning.
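As a rough guide to the resource requirements mentioned above, the sketch below estimates the memory needed for the weights alone at different precisions. The parameter count (about 14 billion, taken from the Phi-4 base model) is an assumption, as is the choice of formats: the excerpt does not say which quantizations this repository ships. The bits-per-weight figures are the nominal values for llama.cpp's Q8_0 and Q4_0 block formats; KV cache and runtime overhead are not included.

```python
def weight_memory_gib(n_params, bits_per_weight):
    """Approximate size of the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumption: roughly the parameter count of the Phi-4 base model.
N_PARAMS = 14e9

# Nominal bits per weight: BF16 stores 16; Q8_0 packs 32 weights into
# 34 bytes (8.5 bits each); Q4_0 packs 32 weights into 18 bytes (4.5 bits each).
FORMATS = [("BF16", 16.0), ("Q8_0", 8.5), ("Q4_0", 4.5)]

estimates = {name: weight_memory_gib(N_PARAMS, bpw) for name, bpw in FORMATS}
for name, gib in estimates.items():
    print(f"{name}: ~{gib:.1f} GiB for weights")
```

Actual memory use will be higher once the context window's KV cache and any framework overhead are added, so treat these figures as a lower bound.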
+
+ ---
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)