
This is an ExLlamaV2 quantization of https://huggingface.co./TheBloke/Stheno-L2-13B-GPTQ. It uses a target of 8 bits per weight (bpw), intended for best quality on 24 GB cards like a 3090 or similar. The measurement.json is included for convenience when quantizing to other sizes. Calibration data: https://huggingface.co./datasets/wikitext/resolve/refs%2Fconvert%2Fparquet/wikitext-2-v1/test/0000.parquet
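For a quick smoke test of the quant, here is a minimal inference sketch using the exllamav2 Python API. The class names follow exllamav2's bundled examples, but treat this as a sketch under that assumption; the model directory is a placeholder.

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Placeholder path to the downloaded 8 bpw quant.
config = ExLlamaV2Config()
config.model_dir = "Stheno-L2-13B-exl2-8bpw"
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)  # split weights across available VRAM

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8

print(generator.generate_simple("Once upon a time", settings, num_tokens=64))
```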

An experimental merge of several models using two methods: Ties-Merge and BlockMerge_Gradient.

I plan for this to be the base of my model, with my own [Stheno: ERP-Based LORA] merged in sometime in the future.

Stheno: a gradient merge of Stheno-P1 & Stheno-P2.
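BlockMerge_Gradient blends two parent models with per-layer ratios that ramp across the network's depth. The snippet below is a hypothetical minimal illustration of that idea in PyTorch, not Gryphe's actual script; the ramp endpoints are made-up examples.

```python
import torch

def gradient_merge(layers_a, layers_b, start=1.0, end=0.0):
    """Linearly blend two stacks of layer weights; model A's share ramps
    from `start` at the first layer to `end` at the last."""
    n = len(layers_a)
    merged = []
    for i, (a, b) in enumerate(zip(layers_a, layers_b)):
        t = start + (end - start) * (i / max(n - 1, 1))  # A's share at layer i
        merged.append(t * a + (1.0 - t) * b)
    return merged

# Toy usage: random tensors standing in for per-layer weight matrices.
a = [torch.randn(4, 4) for _ in range(6)]
b = [torch.randn(4, 4) for _ in range(6)]
merged = gradient_merge(a, b)
```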

SISTER MODEL HERE: Stheno-Inverted-L2-13B

Quants courtesy of TheBloke!
- GPTQ
- GGUF
- GGML

Test Checklist:
- Censorship: fairly uncensored
- Writing: good prose, fairly descriptive
- NSFW: yes
- IQ level: pretty smart
- Formatting: proper formatting, with examples

Stheno-P1 [Ties-Merge]
- elinas/chronos-13b-v2
- jondurbin/airoboros-l2-13b-2.1
- NousResearch/Nous-Hermes-Llama2-13b + nRuaif/Kimiko-v2 LORA

Stheno-P2 [Ties-Merge]
- CalderaAI/13B-Legerdemain-L2 + lemonilia/limarp-llama2-v2 LORA
- ehartford/WizardLM-1.0-Uncensored-Llama2-13b
- Henk717/spring-dragon
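Both halves were built with ties-merge. For intuition only, here is a minimal, hypothetical PyTorch sketch of the TIES procedure (trim each task vector, elect a majority sign per parameter, then average the agreeing entries). It is a simplification of the method, not Chargoddard's actual script, and `density` is an illustrative parameter.

```python
import torch

def ties_merge(base, finetunes, density=0.2):
    """Merge fine-tuned weight tensors into `base` via trim / elect / merge."""
    # Task vectors: each fine-tune's difference from the shared base weights.
    deltas = [ft - base for ft in finetunes]

    # Trim: keep only the top `density` fraction of entries by magnitude.
    trimmed = []
    for d in deltas:
        k = max(1, int(density * d.numel()))
        threshold = d.abs().flatten().kthvalue(d.numel() - k + 1).values
        trimmed.append(torch.where(d.abs() >= threshold, d, torch.zeros_like(d)))

    stacked = torch.stack(trimmed)
    # Elect sign: majority sign of the summed trimmed deltas, per parameter.
    sign = torch.sign(stacked.sum(dim=0))
    # Merge: average only the entries whose sign agrees with the elected one.
    agree = (torch.sign(stacked) == sign) & (stacked != 0)
    count = agree.sum(dim=0).clamp(min=1)
    return base + (stacked * agree).sum(dim=0) / count

# Toy usage: three fake fine-tunes of one random weight matrix.
base = torch.randn(8, 8)
merged = ties_merge(base, [base + 0.1 * torch.randn(8, 8) for _ in range(3)])
```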

Most prompt formats should work, but all my testing was done in the Alpaca format, and it works well:

```
### Instruction:
Your instruction or question here.
For roleplay purposes, I suggest the following - Write <CHAR NAME>'s next reply in a chat between <YOUR NAME> and <CHAR NAME>. Write a single reply only.

### Response:
```
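To make the template concrete, here is a small helper that fills in the roleplay variant. The function name, the example names, and the placement of the chat history are all assumptions for illustration.

```python
def alpaca_rp_prompt(char: str, user: str, history: str) -> str:
    """Build the Alpaca-format roleplay prompt described above.
    Placing the chat history before '### Response:' is an assumption."""
    instruction = (
        f"Write {char}'s next reply in a chat between {user} and {char}. "
        "Write a single reply only."
    )
    return f"### Instruction:\n{instruction}\n\n{history}\n\n### Response:\n"

print(alpaca_rp_prompt("Stheno", "User", "User: Hi there!"))
```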

Below is the illustration of the final merge:

*(merge illustration image)*

Once again, thanks to Chargoddard for his amazing and simple ties-merge script, and to Gryphe for their great BlockMerge_Gradient script. Thanks to the original model creators too!

Art by wada_kazu / わだかず (pixiv page private?)