This model was converted to GGUF format from [`prithivMLmods/Phi-4-o1`](https://huggingface.co/prithivMLmods/Phi-4-o1) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.

Refer to the [original model card](https://huggingface.co/prithivMLmods/Phi-4-o1) for more details on the model.
---

## Model details

Phi-4-o1, a fine-tune of Microsoft's Phi-4, is a state-of-the-art open model built on a blend of synthetic datasets, data from filtered public-domain websites, and acquired academic books and Q&A datasets. The goal of this approach is to ensure that small, capable models are trained with high-quality data focused on advanced reasoning.

Phi-4 has adopted a robust safety post-training approach that leverages a variety of both open-source and in-house generated synthetic datasets. Safety alignment is performed with a combination of SFT (Supervised Fine-Tuning) and iterative DPO (Direct Preference Optimization), using publicly available datasets focused on helpfulness and harmlessness as well as questions and answers targeting multiple safety categories.
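To make the DPO step concrete, here is a minimal sketch of the per-pair DPO objective in plain Python. This illustrates the published DPO loss, not code from the Phi-4 training pipeline; the inputs are summed log-probabilities of a preferred and a rejected response under the policy and a frozen reference model, and `beta` is an assumed hyperparameter.

```python
import math

def dpo_loss(pi_chosen_logp, pi_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * margin), where the margin is
    how much more the policy prefers the chosen over the rejected response,
    relative to the reference model's preference."""
    margin = ((pi_chosen_logp - ref_chosen_logp)
              - (pi_rejected_logp - ref_rejected_logp))
    return math.log(1.0 + math.exp(-beta * margin))

# The loss shrinks as the policy widens its preference margin:
easy = dpo_loss(-1.0, -6.0, -1.0, -2.0)  # large margin -> small loss
hard = dpo_loss(-1.0, -2.5, -1.0, -2.0)  # small margin -> larger loss
```

In *iterative* DPO, the model trained on one round of preference pairs becomes the policy used to generate and score the next round's pairs.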

## Dataset Info

Phi-4-o1 is fine-tuned on a synthetic dataset curated through a pipeline built explicitly for this purpose. The data is primarily based on the Chain of Thought (CoT) and Chain of Continuous Thought (COCONUT) methodologies, which ensures the dataset is rich in reasoning, problem-solving, and step-by-step breakdowns of complex tasks. The model is specifically designed to excel at reasoning, mathematics, and breaking problems down into logical, manageable steps.
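For intuition, a CoT-style training record pairs a problem with the intermediate reasoning steps rather than just the final answer. The record below is purely illustrative; the actual dataset schema and field names are not published.

```python
# Hypothetical CoT-style record; field names are illustrative only.
sample = {
    "problem": "If 3x + 5 = 20, what is x?",
    "chain_of_thought": [
        "Subtract 5 from both sides: 3x = 15.",
        "Divide both sides by 3: x = 5.",
    ],
    "answer": "x = 5",
}

# Training text interleaves the reasoning before the answer, so the model
# learns to produce the steps, not just the result.
prompt = (sample["problem"] + "\n"
          + "\n".join(sample["chain_of_thought"]) + "\n"
          + sample["answer"])
```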

## Run with Transformers

```python
# pip install accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/Phi-4-o1")
model = AutoModelForCausalLM.from_pretrained(
    "prithivMLmods/Phi-4-o1",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```

You can ensure the correct chat template is applied by using `tokenizer.apply_chat_template` as follows:

```python
messages = [
    {"role": "user", "content": "Write me a poem about Machine Learning."},
]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt", return_dict=True).to("cuda")

outputs = model.generate(**input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0]))
```
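`apply_chat_template` renders the message list into the model's prompt format using a template stored in the tokenizer config. The sketch below shows a hypothetical ChatML-style rendering for illustration only; Phi-4's actual template ships with the tokenizer and should be treated as the source of truth.

```python
def render_chat(messages):
    # Hypothetical ChatML-style rendering; the real template is defined in
    # the tokenizer's chat_template and may use different markers.
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
             for m in messages]
    parts.append("<|im_start|>assistant\n")  # cue the model to reply
    return "".join(parts)

prompt = render_chat([{"role": "user", "content": "Hi!"}])
```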

## Intended Use

The Phi-4-o1 model is designed for a wide range of applications, particularly those requiring advanced reasoning, high-quality text generation, and multilingual capabilities. Below are some of the intended use cases:

**Complex Reasoning Tasks:**
- Solving intricate problems in mathematics, logic, and science.
- Assisting in academic research by providing detailed explanations and summaries.

**Multilingual Applications:**
- Translating text across multiple languages while preserving context and nuance.
- Generating content in various languages for global audiences.

**Content Creation:**
- Assisting writers, marketers, and creators with high-quality text generation.
- Generating creative ideas, stories, and technical documentation.

**Educational Tools:**
- Providing explanations, tutoring, and Q&A support for students and educators.
- Generating practice questions and answers for learning purposes.

**Customer Support:**
- Automating responses to customer queries with accurate and helpful information.
- Handling complex customer service scenarios with advanced reasoning.

**Safety-Critical Applications:**
- Ensuring responses are aligned with safety guidelines, making it suitable for sensitive domains.
- Providing harmlessness-focused interactions in public-facing applications.

## Limitations

While Phi-4-o1 is a powerful and versatile model, it has certain limitations that users should be aware of:

**Bias and Fairness:** Despite rigorous training and safety alignment, the model may still exhibit biases present in the training data. Users should critically evaluate outputs, especially in sensitive contexts.

**Contextual Understanding:** The model may occasionally misinterpret complex or ambiguous prompts, leading to inaccurate or irrelevant responses.

**Real-Time Knowledge:** The model's knowledge is limited to the data it was trained on and does not include real-time or post-training updates. It may not be aware of recent events or developments.

**Safety and Harmlessness:** While extensive efforts have been made to align the model with safety guidelines, it may still generate outputs that are inappropriate or harmful in certain contexts. Continuous monitoring and human oversight are recommended.

**Resource Requirements:** Running the model efficiently may require significant computational resources, especially for large-scale or real-time applications.

**Ethical Considerations:** The model should not be used for malicious purposes, such as generating harmful content, misinformation, or spam. Users are responsible for ensuring ethical use.

**Domain-Specific Limitations:** While the model performs well on general-purpose tasks, it may lack depth in highly specialized domains (e.g., medical, legal, or financial fields) without additional fine-tuning.
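To put the resource-requirements point in rough numbers: weight memory alone is approximately parameter count × bytes per parameter. The sketch below assumes the 14B-parameter size reported for Phi-4 and bf16 (2-byte) weights; the KV cache and activations add more on top.

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Approximate memory footprint of model weights, in gigabytes."""
    return n_params * bytes_per_param / 1e9

bf16 = weight_memory_gb(14e9, 2)    # ~28 GB for bf16 weights
q4 = weight_memory_gb(14e9, 0.5)    # ~7 GB for a ~4-bit quant
```

This is why quantized GGUF files are attractive for local use: a ~4-bit quant cuts weight memory by roughly 4x versus bf16 serving.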

---

## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux).
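A minimal sketch of the install and a first run; the `--hf-repo`/`--hf-file` values are placeholders for this repo's id and the quant file you want, since the exact filenames depend on the quantization chosen.

```shell
# Install llama.cpp (Homebrew ships formulas for macOS and Linux).
brew install llama.cpp

# Run a one-off prompt with the CLI; substitute real values for the
# placeholders below.
llama-cli --hf-repo <this-repo-id> --hf-file <quant-file>.gguf \
  -p "Explain step by step why 17 is prime."
```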