Chicka-Mixtral-3x7b / README.md
metadata
license: mit
pipeline_tag: text-generation
tags:
  - merge
  - mergekit
  - mistral
  - moe
  - conversational
  - chicka

Model Description

This model is a Mixture of Experts (MoE) merge of three Mistral-based models:

base model/conversational expert, openchat/openchat-3.5-0106

code expert, beowolx/CodeNinja-1.0-OpenChat-7B

math expert, meta-math/MetaMath-Mistral-7B

This is the Mergekit config used in the merging process:

base_model: openchat/openchat-3.5-0106
experts:
  - source_model: openchat/openchat-3.5-0106
    positive_prompts:
    - "chat"
    - "assistant"
    - "tell me"
    - "explain"
    - "I want"
  - source_model: beowolx/CodeNinja-1.0-OpenChat-7B
    positive_prompts:
    - "code"
    - "python"
    - "javascript"
    - "programming"
    - "algorithm"
    - "C#"
    - "C++"
    - "debug"
    - "runtime"
    - "html"
    - "command"
    - "nodejs"
  - source_model: meta-math/MetaMath-Mistral-7B
    positive_prompts:
    - "reason"
    - "math"
    - "mathematics"
    - "solve"
    - "count"
    - "calculate"
    - "arithmetic"
    - "algebra"
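
To make the role of the three experts more concrete, here is a toy sketch of Mixtral-style routing for a single token. It assumes top-2 routing (the config above does not state how many experts are active per token), and the router logits are made-up numbers for illustration, not values taken from this model.

```python
import math

# Expert order follows the mergekit config above.
EXPERTS = [
    "openchat/openchat-3.5-0106",
    "beowolx/CodeNinja-1.0-OpenChat-7B",
    "meta-math/MetaMath-Mistral-7B",
]


def route_top_k(router_logits, k=2):
    """Return {expert_index: weight} for the k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    # Softmax over the selected logits only, so the kept weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = {i: math.exp(router_logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: value / total for i, value in exps.items()}


# Hypothetical router logits for one token of a coding prompt (assumption).
weights = route_top_k([0.3, 2.1, 0.9], k=2)
for index, weight in weights.items():
    print(f"{EXPERTS[index]}: {weight:.2f}")
```

In this sketch the token's output would be the weighted sum of the two selected experts' feed-forward outputs; the third expert is skipped for that token.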

Open LLM Leaderboards

| Benchmark | Chicka-Mixtral-3X7B | Mistral-7B-Instruct-v0.2 | Meta-Llama-3-8B |
|---|---|---|---|
| Average | 69.19 | 60.97 | 62.55 |
| ARC | 64.08 | 59.98 | 59.47 |
| Hellaswag | 83.96 | 83.31 | 82.09 |
| MMLU | 64.87 | 64.16 | 66.67 |
| TruthfulQA | 50.51 | 42.15 | 43.95 |
| Winogrande | 81.06 | 78.37 | 77.35 |
| GSM8K | 70.66 | 37.83 | 45.79 |
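
The Average row appears to be the plain arithmetic mean of the six benchmark scores. A short check, using only the numbers from the table above:

```python
# Per-model scores in table order: ARC, Hellaswag, MMLU, TruthfulQA, Winogrande, GSM8K.
scores = {
    "Chicka-Mixtral-3X7B": [64.08, 83.96, 64.87, 50.51, 81.06, 70.66],
    "Mistral-7B-Instruct-v0.2": [59.98, 83.31, 64.16, 42.15, 78.37, 37.83],
    "Meta-Llama-3-8B": [59.47, 82.09, 66.67, 43.95, 77.35, 45.79],
}

# Unweighted mean of the six benchmarks, rounded to two decimals like the table.
averages = {
    model: round(sum(values) / len(values), 2)
    for model, values in scores.items()
}

for model, average in averages.items():
    print(f"{model}: {average}")
```

The computed means match the Average row for all three models.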

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained("Chickaboo/Chicka-Mixtral-3x7b")
tokenizer = AutoTokenizer.from_pretrained("Chickaboo/Chicka-Mixtral-3x7b")

messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

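# add_generation_prompt=True appends the assistant turn marker so the model
# answers the last user message instead of continuing it
encodeds = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")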

model_inputs = encodeds.to(device)
model.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
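
For a decoder-only model, `generate` returns the prompt tokens followed by the new tokens, so the printed text above includes the whole conversation. A minimal sketch of trimming the prompt before decoding; `strip_prompt` is a hypothetical helper (not part of transformers), shown here on plain lists of made-up token ids so it runs without downloading the model.

```python
def strip_prompt(generated_ids, prompt_length):
    """Drop the first prompt_length token ids from every generated sequence."""
    return [sequence[prompt_length:] for sequence in generated_ids]


# Toy token ids standing in for real model output (assumption):
# three prompt tokens followed by a three-token reply.
fake_generated = [[101, 7, 8, 42, 43, 2]]
reply_ids = strip_prompt(fake_generated, prompt_length=3)
print(reply_ids)
```

With the tensors from the usage snippet, the equivalent slice would be `generated_ids[:, model_inputs.shape[1]:]`, and passing `skip_special_tokens=True` to `tokenizer.batch_decode` should also drop the template's special tokens from the printed reply.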