
What is this

My experiment: a continuation of the Benchmaxxxer series (meme models), but a bit more serious. It scores high on my benchmark and on the Hugging Face leaderboard, and moderately high in practice. Worth trying? Yeah. It is on the gooder side.

Observations

  • GPTslop: medium-low. Once it slips in, the model won't stop generating it, so avoid it at all costs.
  • Writing style: difficult to describe; not the usual stuff. A bit of an autopilot-like thing: if you write your usual lazy "ahh ahh mistress", it can give you back a whole page of good text. High.
  • Censorship: if you can handle Xwin, you can handle this model. Medium.
  • Optimism: medium-low.
  • Violence: medium-low.
  • Intelligence: medium.
  • Creativity: medium-high.
  • Doesn't like high temperature. Keep it below 1.5 (see the sampling sketch after this list).
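
A minimal sampling sketch with transformers, assuming you have the hardware for a 70B checkpoint. Every setting except the temperature ceiling is an illustrative default, not something this card prescribes.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ChuckMcSneed/ArcaneEntanglement-model64-70b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# Vicuna-style prompt; see "Prompt format" below.
prompt = "USER: Write a short scene set on a rainy pier. ASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=1.0,  # the model degrades above ~1.5
    top_p=0.9,        # illustrative, not prescribed by the card
)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```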

Prompt format

Vicuna or Alpaca.
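
For reference, a sketch of both templates in Python. The system lines are the common defaults for each format, not something this card prescribes.

```python
def vicuna_prompt(user_message: str) -> str:
    # Standard Vicuna-style template with its usual system line.
    system = ("A chat between a curious user and an artificial intelligence "
              "assistant. The assistant gives helpful, detailed, and polite "
              "answers to the user's questions.")
    return f"{system} USER: {user_message} ASSISTANT:"

def alpaca_prompt(instruction: str) -> str:
    # Standard Alpaca instruction template.
    return ("Below is an instruction that describes a task. "
            "Write a response that appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n### Response:\n")
```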

Merge Details

This is a merge of pre-trained language models created using mergekit.

This model was merged using the linear merge method.
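
Conceptually, a linear merge is just a weighted average of corresponding tensors across the source models. A minimal sketch of the idea (not mergekit's actual code; mergekit's linear method also normalizes the weights by default, as done here):

```python
import torch

def linear_merge(tensors: list[torch.Tensor], weights: list[float]) -> torch.Tensor:
    # Weighted average of the same tensor taken from each source model.
    total = sum(weights)
    return sum((w / total) * t for w, t in zip(weights, tensors))
```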

Models Merged

The following models were included in the merge (as named in the configuration below):

  • spicyboros
  • xwin
  • euryale
  • dolphin
  • wizard
  • WinterGoddess

Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: spicyboros
    parameters:
      weight: [0.093732305,0.403220342,0.055438423,0.043830778,0.054189303,0.081136828]
  - model: xwin
    parameters:
      weight: [0.398943486,0.042069007,0.161586088,0.470977297,0.389315704,0.416739102]
  - model: euryale
    parameters:
      weight: [0.061483013,0.079698633,0.043067724,0.00202751,0.132183868,0.36578003]
  - model: dolphin
    parameters:
      weight: [0.427942847,0.391488452,0.442164138,0,0,0.002174793]
  - model: wizard
    parameters:
      weight: [0.017898349,0.083523566,0.297743627,0.175345857,0.071770095,0.134169247]
  - model: WinterGoddess
    parameters:
      weight: [0,0,0,0.30781856,0.352541031,0]
merge_method: linear
dtype: float16
tokenizer_source: base
```
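
Each six-element weight list above is a mergekit gradient: the values act as anchor points that are interpolated across the model's layers, so every layer gets its own blend of the source models. A rough sketch of the expansion, assuming linear interpolation over a 70B model's 80 layers (the exact anchor placement is a mergekit internal detail):

```python
import numpy as np

def expand_gradient(points: list[float], num_layers: int = 80) -> np.ndarray:
    # Spread the anchor points evenly over the layer range, then
    # linearly interpolate a weight for every individual layer.
    anchors = np.linspace(0, num_layers - 1, num=len(points))
    return np.interp(np.arange(num_layers), anchors, points)

xwin_per_layer = expand_gradient([0.398943486, 0.042069007, 0.161586088,
                                  0.470977297, 0.389315704, 0.416739102])
print(xwin_per_layer[:4])  # weights for the first few layers
```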

Benchmarks

NeoEvalPlusN_benchmark

My meme benchmark.

| Name | B | C | D | S | P | Total | BCD | SP |
|---|---|---|---|---|---|---|---|---|
| ChuckMcSneed/PMaxxxer-v1-70b | 3 | 1 | 1 | 6.75 | 4.75 | 16.5 | 5 | 11.5 |
| ChuckMcSneed/SMaxxxer-v1-70b | 2 | 1 | 0 | 7.25 | 4.25 | 14.5 | 3 | 11.5 |
| ChuckMcSneed/ArcaneEntanglement-model64-70b | 3 | 2 | 1 | 7.25 | 6 | 19.25 | 6 | 13.25 |

Absurdly high. That's what happens when you optimize the merges for a benchmark.

Open LLM Leaderboard Evaluation Results

Leaderboard on Hugging Face

| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
|---|---|---|---|---|---|---|---|
| ChuckMcSneed/ArcaneEntanglement-model64-70b | 72.79 | 71.42 | 87.96 | 70.83 | 60.53 | 83.03 | 63 |
| ChuckMcSneed/PMaxxxer-v1-70b | 72.41 | 71.08 | 87.88 | 70.39 | 59.77 | 82.64 | 62.7 |
| ChuckMcSneed/SMaxxxer-v1-70b | 72.23 | 70.65 | 88.02 | 70.55 | 60.7 | 82.87 | 60.58 |

This model is simply superior to my other meme models here.
