Quantizations of https://huggingface.co/SanjiWatsuki/Silicon-Maid-7B

Experiment

Quants ending in "_X" are experimental. They are identical to the normal quants except for the token embedding weights, which are stored as Q8_0, or as F16 for Q6_K_X and Q8_0_X. This makes the experimental quants larger but should, in theory, improve output quality.

List of experimental quants:

  • Q2_K_X
  • Q4_K_M_X
  • Q5_K_M_X
  • Q6_K_X
  • Q8_0_X
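
The naming rule described above can be written as a small lookup. This is only an illustrative sketch: the function name `token_embedding_type` and the constant `F16_EMBEDDING_BASES` are made up for this example and are not part of any quantization tool.

```python
from typing import Optional

# Base quant types whose "_X" variant stores token embeddings as F16
# (per the description above); every other "_X" variant uses Q8_0.
F16_EMBEDDING_BASES = {"Q6_K", "Q8_0"}


def token_embedding_type(quant_name: str) -> Optional[str]:
    """Return the token embedding type of an experimental quant.

    Returns None for a normal quant (no "_X" suffix), since this card
    does not say which embedding type those use.
    """
    if not quant_name.endswith("_X"):
        return None
    base = quant_name[: -len("_X")]
    return "F16" if base in F16_EMBEDDING_BASES else "Q8_0"


for name in ["Q2_K_X", "Q4_K_M_X", "Q5_K_M_X", "Q6_K_X", "Q8_0_X"]:
    print(name, "->", token_embedding_type(name))
```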

Inference Clients/UIs


From original readme

Silicon-Maid-7B is another model targeted at being both strong at RP and being a smart cookie that can follow character cards very well. As of right now, Silicon-Maid-7B outscores both of my previous 7B RP models in my RP benchmark and I have been impressed by this model's creativity. It is suitable for RP/ERP and general use.

Prompt Template (Alpaca)

I found the best SillyTavern results from using the Noromaid template but please try other templates! Let me know if you find anything good.

SillyTavern config files: Context, Instruct.

Additionally, here is my highly recommended Text Completion preset. You can tweak this by adjusting temperature up or dropping min p to boost creativity or raise min p to increase stability. You shouldn't need to touch anything else!
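
To give a feel for how temperature and min p interact, here is a rough sketch of min-p filtering in plain Python. It assumes temperature is applied before the min-p cutoff; real backends may use a different sampler order. The function name `min_p_filter` and the numbers used are illustrative only and are not the values from the preset.

```python
import math
from typing import List


def min_p_filter(logits: List[float], temperature: float = 1.0,
                 min_p: float = 0.1) -> List[float]:
    """Return renormalized token probabilities after temperature and min p.

    Assumption: temperature scaling happens first, then every token whose
    probability is below min_p * (probability of the top token) is dropped.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    # Subtract the peak before exponentiating for numerical stability.
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


logits = [2.0, 1.0, -3.0]  # made-up scores for three candidate tokens
print(min_p_filter(logits, temperature=1.0, min_p=0.1))
print(min_p_filter(logits, temperature=5.0, min_p=0.1))
print(min_p_filter(logits, temperature=1.0, min_p=0.0))
```

In this toy case, the unlikely third token is cut at temperature 1.0 with min p 0.1, but survives when the temperature is raised or min p is dropped to 0, which matches the "more creativity" versus "more stability" trade-off described above.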

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
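
If you are building the prompt yourself rather than through SillyTavern, the template above could be filled in along these lines. This is a minimal sketch: `build_prompt` is a made-up helper name, and the trailing newline after "### Response:" is an assumption, not something the template above specifies.

```python
# Alpaca-style template from this card, with {prompt} as the only placeholder.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "\n"
    "### Instruction:\n"
    "{prompt}\n"
    "\n"
    "### Response:\n"  # trailing newline is an assumption
)


def build_prompt(instruction: str) -> str:
    """Insert the user's instruction into the Alpaca template."""
    return ALPACA_TEMPLATE.format(prompt=instruction)


print(build_prompt("Introduce yourself in character as a ship's navigator."))
```

The returned string is what you would send as the raw prompt to a text completion backend; the model's reply is whatever it generates after the "### Response:" line.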

Format: GGUF
Model size: 7.24B params
Architecture: llama
