duyntnet
/

phi-4-imatrix-GGUF

Text Generation

Model card Files Files and versions Community

phi-4-imatrix-GGUF / README.md

duyntnet's picture

Upload README.md

6ecf0a5 verified 21 days ago

|

3.62 kB

	---
	license: other
	language:
	- en
	pipeline_tag: text-generation
	inference: false
	tags:
	- transformers
	- gguf
	- imatrix
	- phi-4
	---
	Quantizations of https://huggingface.co./microsoft/phi-4

	### Inference Clients/UIs
	* [llama.cpp](https://github.com/ggerganov/llama.cpp)
	* [KoboldCPP](https://github.com/LostRuins/koboldcpp)
	* [ollama](https://github.com/ollama/ollama)
	* [jan](https://github.com/janhq/jan)
	* [text-generation-webui](https://github.com/oobabooga/text-generation-webui)
	* [GPT4All](https://github.com/nomic-ai/gpt4all)
	---

	# From original readme

	\| \| \|
	\|-------------------------\|-------------------------------------------------------------------------------\|
	\| Developers \| Microsoft Research \|
	\| Description \| `phi-4` is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning.<br><br>`phi-4` underwent a rigorous enhancement and alignment process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures \|
	\| Architecture \| 14B parameters, dense decoder-only Transformer model \|
	\| Inputs \| Text, best suited for prompts in the chat format \|
	\| Context length \| 16K tokens \|
	\| GPUs \| 1920 H100-80G \|
	\| Training time \| 21 days \|
	\| Training data \| 9.8T tokens \|
	\| Outputs \| Generated text in response to input \|
	\| Dates \| October 2024 – November 2024 \|
	\| Status \| Static model trained on an offline dataset with cutoff dates of June 2024 and earlier for publicly available data \|
	\| Release date \| December 12, 2024 \|
	\| License \| MIT \|


	### Input Formats

	Given the nature of the training data, `phi-4` is best suited for prompts using the chat format as follows:

	```bash
	<\|im_start\|>system<\|im_sep\|>
	You are a medieval knight and must provide explanations to modern people.<\|im_end\|>
	<\|im_start\|>user<\|im_sep\|>
	How should I explain the Internet?<\|im_end\|>
	<\|im_start\|>assistant<\|im_sep\|>
	```

	### With `transformers`

	```python
	import transformers

	pipeline = transformers.pipeline(
	"text-generation",
	model="microsoft/phi-4",
	model_kwargs={"torch_dtype": "auto"},
	device_map="auto",
	)

	messages = [
	{"role": "system", "content": "You are a medieval knight and must provide explanations to modern people."},
	{"role": "user", "content": "How should I explain the Internet?"},
	]

	outputs = pipeline(messages, max_new_tokens=128)
	print(outputs[0]["generated_text"][-1])
	```