|
--- |
|
license: other |
|
language: |
|
- en |
|
pipeline_tag: text-generation |
|
inference: false |
|
tags: |
|
- transformers |
|
- gguf |
|
- imatrix |
|
- gemma-2-9b-it-WPO-HB |
|
--- |
|
Quantizations of https://huggingface.co./wzhouad/gemma-2-9b-it-WPO-HB |
|
|
|
### Inference Clients/UIs |
|
* [llama.cpp](https://github.com/ggerganov/llama.cpp) |
|
* [KoboldCPP](https://github.com/LostRuins/koboldcpp) |
|
* [ollama](https://github.com/ollama/ollama) |
|
* [jan](https://github.com/janhq/jan) |
|
* [text-generation-webui](https://github.com/oobabooga/text-generation-webui) |
|
* [GPT4All](https://github.com/nomic-ai/gpt4all) |
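Any of the clients above can load these GGUF files. As a minimal sketch, here is how a quant might be run from Python via the `llama-cpp-python` bindings; the local filename `gemma-2-9b-it-WPO-HB.Q4_K_M.gguf` is hypothetical (use whichever quant you downloaded), and the prompt helper simply applies the Gemma 2 chat template:

```python
import os

def gemma_chat_prompt(user_message: str) -> str:
    # Gemma 2 instruction-tuned chat template: user turn, then open a model turn
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

MODEL_PATH = "gemma-2-9b-it-WPO-HB.Q4_K_M.gguf"  # hypothetical local file

if os.path.exists(MODEL_PATH):
    from llama_cpp import Llama  # pip install llama-cpp-python
    llm = Llama(model_path=MODEL_PATH, n_ctx=4096)
    out = llm(gemma_chat_prompt("Explain GGUF in one sentence."), max_tokens=128)
    print(out["choices"][0]["text"])
```

Lower-bit quants (e.g. Q4) trade some quality for memory; pick the largest quant that fits your RAM/VRAM.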
|
--- |
|
|
|
# From original readme |
|
|
|
gemma-2-9b-it fine-tuned with hybrid WPO, using two types of data:
|
1. On-policy sampled gemma outputs based on Ultrafeedback prompts. |
|
2. GPT-4-turbo outputs based on Ultrafeedback prompts. |
|
|
|
Compared to the preference data construction method described in our paper, we switched to RLHFlow/ArmoRM-Llama3-8B-v0.1 to score the outputs, choosing the outputs with the maximum and minimum scores to form each preference pair.
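The max/min selection step can be sketched as follows. The scores and completions are hypothetical stand-ins; in the actual pipeline the reward model RLHFlow/ArmoRM-Llama3-8B-v0.1 scores the sampled gemma and GPT-4-turbo outputs:

```python
def build_preference_pair(completions, scores):
    """Pick the highest-scored completion as 'chosen' and the
    lowest-scored as 'rejected', forming one preference pair."""
    ranked = sorted(zip(scores, completions))  # ascending by score
    rejected, chosen = ranked[0][1], ranked[-1][1]
    return {"chosen": chosen, "rejected": rejected}

# Toy example with made-up reward-model scores for one prompt
pair = build_preference_pair(
    ["answer A", "answer B", "answer C"],
    [0.31, 0.87, 0.55],
)
# pair == {"chosen": "answer B", "rejected": "answer A"}
```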
|
|
|
We provide our training data at [wzhouad/gemma-2-ultrafeedback-hybrid](https://huggingface.co./datasets/wzhouad/gemma-2-ultrafeedback-hybrid). |