Triangle104 committed on
Commit 7aca66c · verified · 1 Parent(s): 54fec22

Update README.md

Files changed (1)
  1. README.md +207 -0
README.md CHANGED
@@ -22,6 +22,213 @@ tags:
 This model was converted to GGUF format from [`prithivMLmods/Phi-4-o1`](https://huggingface.co/prithivMLmods/Phi-4-o1) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/prithivMLmods/Phi-4-o1) for more details on the model.

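+ A quick way to try this GGUF conversion from Python is the llama-cpp-python bindings. The sketch below is illustrative only: the `repo_id` and quant `filename` are assumptions, so substitute the actual repository name and GGUF file listed in this repo (CLI instructions follow in the llama.cpp section below).
+
+ ```python
+ # pip install llama-cpp-python huggingface-hub
+ from llama_cpp import Llama
+
+ # Hypothetical repo id / quant filename -- replace with the values for this repo.
+ llm = Llama.from_pretrained(
+     repo_id="Triangle104/Phi-4-o1-Q8_0-GGUF",
+     filename="*q8_0.gguf",
+     n_ctx=4096,  # context window to allocate
+ )
+
+ out = llm.create_chat_completion(
+     messages=[{"role": "user", "content": "Explain chain-of-thought prompting in two sentences."}],
+     max_tokens=256,
+ )
+ print(out["choices"][0]["message"]["content"])
+ ```
+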
+ ---
+ ## Model details
+
+ Phi-4-o1 is fine-tuned from Microsoft's Phi-4, a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public-domain websites, and acquired academic books and Q&A datasets. The goal of this approach is to ensure that small, capable models are trained with high-quality data focused on advanced reasoning.
+
+ Phi-4 adopts a robust safety post-training approach that leverages a variety of open-source and in-house generated synthetic datasets. Safety alignment is performed with a combination of SFT (Supervised Fine-Tuning) and iterative DPO (Direct Preference Optimization), drawing on publicly available datasets focused on helpfulness and harmlessness as well as questions and answers targeting multiple safety categories.
+
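+ To make the alignment recipe above concrete, the records consumed by DPO-style preference training typically pair one prompt with a preferred and a rejected response, while SFT trains directly on prompt/response pairs. The example below is invented for illustration and is not taken from the actual safety datasets:
+
+ ```python
+ # Illustrative DPO preference pair: training pushes the model toward "chosen"
+ # and away from "rejected" for the same prompt.
+ preference_pair = {
+     "prompt": "How can I get into someone else's email account?",
+     "chosen": (
+         "I can't help with accessing an account you don't own. If you are locked out "
+         "of your own account, use the provider's official recovery process."
+     ),
+     "rejected": "Sure, common ways to break into an email account include...",
+ }
+
+ # Illustrative SFT record: a plain supervised prompt/response pair.
+ sft_example = {
+     "prompt": "Explain why sharing passwords is risky.",
+     "response": "Shared passwords let others impersonate you and reach any linked services.",
+ }
+ ```
+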
+ ## Dataset Info
+
+ Phi-4-o1 is fine-tuned on a synthetic dataset curated through a pipeline built explicitly for this purpose. The data is primarily based on the Chain of Thought (CoT) and Chain of Continuous Thought (COCONUT) methodologies, which keeps the dataset rich in reasoning, problem solving, and step-by-step breakdowns of complex tasks. The model is specifically designed to excel at reasoning, mathematics, and breaking problems into logical, manageable steps.
+
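+ For illustration, a CoT-style record from such a pipeline generally couples a problem with explicit intermediate reasoning steps and a final answer. The example below is invented for clarity and is not drawn from the actual dataset:
+
+ ```python
+ # Invented example of a Chain-of-Thought training record.
+ cot_record = {
+     "problem": "A train travels 180 km in 2.5 hours. What is its average speed?",
+     "reasoning": [
+         "Average speed is total distance divided by total time.",
+         "180 km / 2.5 h = 72 km/h.",
+     ],
+     "answer": "72 km/h",
+ }
+ ```
+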
+ ## Run with Transformers
+
+ ```python
+ # pip install accelerate
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/Phi-4-o1")
+ model = AutoModelForCausalLM.from_pretrained(
+     "prithivMLmods/Phi-4-o1",
+     device_map="auto",
+     torch_dtype=torch.bfloat16,
+ )
+
+ input_text = "Write me a poem about Machine Learning."
+ input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
+
+ outputs = model.generate(**input_ids, max_new_tokens=32)
+ print(tokenizer.decode(outputs[0]))
+ ```
+
+ You can ensure the correct chat template is applied by using `tokenizer.apply_chat_template` as follows:
+
+ ```python
+ messages = [
+     {"role": "user", "content": "Write me a poem about Machine Learning."},
+ ]
+ input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt", return_dict=True).to("cuda")
+
+ outputs = model.generate(**input_ids, max_new_tokens=256)
+ print(tokenizer.decode(outputs[0]))
+ ```
+
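+ Note that `tokenizer.decode(outputs[0])` returns the prompt tokens followed by the completion. If you only want the newly generated text, you can slice off the prompt first (a small sketch reusing the variables from the snippet above):
+
+ ```python
+ # Decode only the tokens generated after the prompt.
+ prompt_len = input_ids["input_ids"].shape[-1]
+ print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))
+ ```
+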
+ ## Intended Use
+
+ The Phi-4-o1 model is designed for a wide range of applications, particularly those requiring advanced reasoning, high-quality text generation, and multilingual capabilities. Intended use cases include:
+
+ - Complex reasoning tasks:
+   - Solving intricate problems in mathematics, logic, and science.
+   - Assisting in academic research by providing detailed explanations and summaries.
+ - Multilingual applications:
+   - Translating text across multiple languages while preserving context and nuance.
+   - Generating content in various languages for global audiences.
+ - Content creation:
+   - Assisting writers, marketers, and creators with high-quality text generation.
+   - Generating creative ideas, stories, and technical documentation.
+ - Educational tools:
+   - Providing explanations, tutoring, and Q&A support for students and educators.
+   - Generating practice questions and answers for learning purposes.
+ - Customer support:
+   - Automating responses to customer queries with accurate and helpful information.
+   - Handling complex customer service scenarios with advanced reasoning.
+ - Safety-critical applications:
+   - Ensuring responses are aligned with safety guidelines, making the model suitable for sensitive domains.
+   - Providing harmlessness-focused interactions in public-facing applications.
+
+ ## Limitations
+
+ While Phi-4-o1 is a powerful and versatile model, it has certain limitations that users should be aware of:
+
+ - Bias and fairness: despite rigorous training and safety alignment, the model may still exhibit biases present in the training data. Users should critically evaluate outputs, especially in sensitive contexts.
+ - Contextual understanding: the model may occasionally misinterpret complex or ambiguous prompts, leading to inaccurate or irrelevant responses.
+ - Real-time knowledge: the model's knowledge is limited to its training data and does not include real-time or post-training updates, so it may be unaware of recent events or developments.
+ - Safety and harmlessness: despite extensive safety alignment, the model may still generate outputs that are inappropriate or harmful in certain contexts; continuous monitoring and human oversight are recommended.
+ - Resource requirements: running the model efficiently may require significant computational resources, especially for large-scale or real-time applications.
+ - Ethical considerations: the model should not be used for malicious purposes such as generating harmful content, misinformation, or spam. Users are responsible for ensuring ethical use.
+ - Domain-specific limitations: while the model performs well on general-purpose tasks, it may lack depth in highly specialized domains (e.g., medical, legal, or financial fields) without additional fine-tuning.
+
+ ---
 ## Use with llama.cpp
 Install llama.cpp through brew (works on Mac and Linux)