bartowski/Starling_Monarch_Westlake_Garten-7B-v0.1-GGUF

Llamacpp Quantizations of Starling_Monarch_Westlake_Garten-7B-v0.1

Using llama.cpp release b2440 for quantization.

Original model: https://huggingface.co/giraffe176/Starling_Monarch_Westlake_Garten-7B-v0.1

Download a file (not the whole branch) from below:

Filename	Quant type	File Size	Description
Starling_Monarch_Westlake_Garten-7B-v0.1-Q8_0.gguf	Q8_0	7.69GB	Extremely high quality, generally unneeded but max available quant.
Starling_Monarch_Westlake_Garten-7B-v0.1-Q6_K.gguf	Q6_K	5.94GB	Very high quality, near perfect, recommended.
Starling_Monarch_Westlake_Garten-7B-v0.1-Q5_K_M.gguf	Q5_K_M	5.13GB	High quality, very usable.
Starling_Monarch_Westlake_Garten-7B-v0.1-Q5_K_S.gguf	Q5_K_S	4.99GB	High quality, very usable.
Starling_Monarch_Westlake_Garten-7B-v0.1-Q5_0.gguf	Q5_0	4.99GB	High quality, older format, generally not recommended.
Starling_Monarch_Westlake_Garten-7B-v0.1-Q4_K_M.gguf	Q4_K_M	4.36GB	Good quality, similar to 4.25 bpw.
Starling_Monarch_Westlake_Garten-7B-v0.1-Q4_K_S.gguf	Q4_K_S	4.14GB	Slightly lower quality with small space savings.
Starling_Monarch_Westlake_Garten-7B-v0.1-IQ4_NL.gguf	IQ4_NL	4.15GB	Good quality, similar to Q4_K_S, new quantization method.
Starling_Monarch_Westlake_Garten-7B-v0.1-IQ4_XS.gguf	IQ4_XS	3.94GB	Decent quality, new method with similar performance to Q4.
Starling_Monarch_Westlake_Garten-7B-v0.1-Q4_0.gguf	Q4_0	4.10GB	Decent quality, older format, generally not recommended.
Starling_Monarch_Westlake_Garten-7B-v0.1-IQ3_M.gguf	IQ3_M	3.28GB	Medium-low quality, new method with decent performance.
Starling_Monarch_Westlake_Garten-7B-v0.1-IQ3_S.gguf	IQ3_S	3.18GB	Lower quality, new method with decent performance, recommended over Q3 quants.
Starling_Monarch_Westlake_Garten-7B-v0.1-Q3_K_L.gguf	Q3_K_L	3.82GB	Lower quality but usable, good for low RAM availability.
Starling_Monarch_Westlake_Garten-7B-v0.1-Q3_K_M.gguf	Q3_K_M	3.51GB	Even lower quality.
Starling_Monarch_Westlake_Garten-7B-v0.1-Q3_K_S.gguf	Q3_K_S	3.16GB	Low quality, not recommended.
Starling_Monarch_Westlake_Garten-7B-v0.1-Q2_K.gguf	Q2_K	2.71GB	Extremely low quality, not recommended.
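
As a rough illustration of downloading a single file rather than the whole branch, the sketch below picks the largest quant whose file size fits a given budget and fetches only that file with `hf_hub_download` from the `huggingface_hub` package. The helper names (`quant_filename`, `largest_quant_within`, `download_quant`), the `models` directory, and the idea of using file size as the selection criterion are assumptions made for this example, not something the original card prescribes. Actual memory use at inference time will be higher than the file size, so treat the budget as a lower bound.

```python
from pathlib import Path

REPO_ID = "bartowski/Starling_Monarch_Westlake_Garten-7B-v0.1-GGUF"
BASE_NAME = "Starling_Monarch_Westlake_Garten-7B-v0.1"

# File sizes in GB, copied from the table above (a subset, for brevity).
QUANT_SIZES_GB = {
    "Q8_0": 7.69,
    "Q6_K": 5.94,
    "Q5_K_M": 5.13,
    "Q4_K_M": 4.36,
    "IQ4_XS": 3.94,
    "IQ3_M": 3.28,
    "Q2_K": 2.71,
}


def quant_filename(quant_type: str) -> str:
    """Build the file name used in this repo for a given quant type."""
    return f"{BASE_NAME}-{quant_type}.gguf"


def largest_quant_within(budget_gb: float) -> str:
    """Return the quant type with the largest file that fits the budget.

    Hypothetical helper: file size is only a proxy for quality and memory use.
    """
    fitting = {q: size for q, size in QUANT_SIZES_GB.items() if size <= budget_gb}
    if not fitting:
        raise ValueError(f"No listed quant fits within {budget_gb} GB")
    return max(fitting, key=fitting.get)


def download_quant(quant_type: str, local_dir: str = "models") -> Path:
    """Download one GGUF file (not the whole repo) and return its local path."""
    # Imported here so the pure helpers above work without the package installed.
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub

    path = hf_hub_download(
        repo_id=REPO_ID,
        filename=quant_filename(quant_type),
        local_dir=local_dir,
    )
    return Path(path)
```

For example, `download_quant(largest_quant_within(5.0))` would select Q4_K_M from the subset above and download only that file.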

Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski

Format: GGUF
Model size: 7.24B params
Architecture: llama
Quantization levels: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit

Model tree for bartowski/Starling_Monarch_Westlake_Garten-7B-v0.1-GGUF

Base models: berkeley-nest/Starling-LM-7B-alpha, cognitivecomputations/WestLake-7B-v2-laser, mistralai/Mistral-7B-v0.1, mlabonne/AlphaMonarch-7B, senseable/garten2-7b
Merge model: this model

Evaluation results

Benchmark	Metric	Split	Source	Score
EQ-Bench	EQ-Bench v2.1	-	self-reported	80.010
AI2 Reasoning Challenge (25-Shot)	normalized accuracy	test set	Open LLM Leaderboard	71.760
HellaSwag (10-Shot)	normalized accuracy	validation set	Open LLM Leaderboard	88.150
MMLU (5-Shot)	accuracy	test set	Open LLM Leaderboard	65.070
TruthfulQA (0-shot)	mc2	validation set	Open LLM Leaderboard	67.920
Winogrande (5-shot)	accuracy	validation set	Open LLM Leaderboard	82.160
GSM8k (5-shot)	accuracy	test set	Open LLM Leaderboard	71.950