Quantization made by Richard Erkhov.

greesychat-turbo - GGUF

Model creator: https://huggingface.co./OnlyCheeini/
Original model: https://huggingface.co./OnlyCheeini/greesychat-turbo/

Name	Quant method	Size
greesychat-turbo.Q2_K.gguf	Q2_K	2.96GB
greesychat-turbo.IQ3_XS.gguf	IQ3_XS	3.28GB
greesychat-turbo.IQ3_S.gguf	IQ3_S	3.43GB
greesychat-turbo.Q3_K_S.gguf	Q3_K_S	3.41GB
greesychat-turbo.IQ3_M.gguf	IQ3_M	3.52GB
greesychat-turbo.Q3_K.gguf	Q3_K	3.74GB
greesychat-turbo.Q3_K_M.gguf	Q3_K_M	3.74GB
greesychat-turbo.Q3_K_L.gguf	Q3_K_L	4.03GB
greesychat-turbo.IQ4_XS.gguf	IQ4_XS	4.18GB
greesychat-turbo.Q4_0.gguf	Q4_0	4.34GB
greesychat-turbo.IQ4_NL.gguf	IQ4_NL	4.38GB
greesychat-turbo.Q4_K_S.gguf	Q4_K_S	4.37GB
greesychat-turbo.Q4_K.gguf	Q4_K	4.58GB
greesychat-turbo.Q4_K_M.gguf	Q4_K_M	4.58GB
greesychat-turbo.Q4_1.gguf	Q4_1	4.78GB
greesychat-turbo.Q5_0.gguf	Q5_0	5.21GB
greesychat-turbo.Q5_K_S.gguf	Q5_K_S	5.21GB
greesychat-turbo.Q5_K.gguf	Q5_K	5.34GB
greesychat-turbo.Q5_K_M.gguf	Q5_K_M	5.34GB
greesychat-turbo.Q5_1.gguf	Q5_1	5.65GB
greesychat-turbo.Q6_K.gguf	Q6_K	6.14GB
greesychat-turbo.Q8_0.gguf	Q8_0	7.95GB

Original model description:

base_model: unsloth/llama-3-8b-Instruct-bnb-4bit language: - en license: apache-2.0 tags: - text-generation-inference - transformers - unsloth - llama - trl - sft datasets: - OnlyCheeini/greesychat

GreesyChat-Turbo AI Model

Overview

GreesyChat-Turbo is an advanced AI model designed for robust text generation using the LLaMA 3 architecture. This model excels in providing high-quality responses for general conversation, mathematical queries, and more. It’s perfect for powering chatbots, virtual assistants, and any application requiring intelligent dialogue capabilities.

Benchmark Results

Metric	Value
Perplexity	22.5
Generation Speed	75 ms per token
Accuracy	70%
Response Time	200 ms

Metric	GreesyChat-Turbo	Mixtral-8x7b	GPT-4
Code	79.2	75.6	83.6
MMLU	74.5	79.9	85.1
Gms8k	89.2 (5)	88.7	94.2

Contact

For support or inquiries, please contact: [email protected]