
ZeroXClem/Llama3.1-BestMix-Chem-Einstein-8B

Llama3.1-BestMix-Chem-Einstein-8B is a carefully blended model designed to excel at instruction-following, chemistry-focused tasks, and long-form conversational generation. It fuses the strengths of three Llama3-based models, making it versatile for both general and specialized use. 💻🧠✨

🌟 Family Tree

This model is the result of merging the following models:

  • bunnycore/Best-Mix-Llama-3.1-8B
  • USTC-KnowledgeComputingLab/Llama3-KALE-LM-Chem-1.5-8B
  • Weyaxi/Einstein-v6.1-Llama3-8B

🧬 Model Lineage

A: bunnycore/Best-Mix-Llama-3.1-8B

  • A masterful blend of several Llama3 models like Aurora_faustus, TitanFusion, and OpenMath2.
  • Provides balanced performance across a variety of tasks, including reasoning, math, and instruction-following.
  • Key contributor to the overall versatility of the merged model.

B: USTC-KnowledgeComputingLab/Llama3-KALE-LM-Chem-1.5-8B

  • Specializes in chemistry and scientific knowledge, outperforming many larger models in chemistry benchmarks.
  • Adds scientific rigor and domain-specific expertise to the merged model, making it perfect for scientific and academic tasks.

C: Weyaxi/Einstein-v6.1-Llama3-8B

  • Fine-tuned on a wide range of instruction-following and conversational datasets, including WizardLM, Alpaca, and ShareGPT.
  • Optimized for long-form text generation, with training that used xformers attention and flash attention for improved efficiency.
  • Key player in dialogue-based tasks and long conversation generation.

πŸ› οΈ Merge Details

This model was merged using the TIES merge method, ensuring a smooth integration of the key strengths from each contributing model. Here's the configuration used:

```yaml
models:
  - model: bunnycore/Best-Mix-Llama-3.1-8B
    parameters:
      density: [1, 0.7, 0.5]
      weight: 1.0

  - model: USTC-KnowledgeComputingLab/Llama3-KALE-LM-Chem-1.5-8B
    parameters:
      density: 0.6
      weight: [0.3, 0.7, 1.0]

  - model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 0.4
      weight:
        - filter: mlp
          value: 0.5
        - filter: self_attn
          value: 0.7
        - value: 0.5

merge_method: ties
base_model: bunnycore/Best-Mix-Llama-3.1-8B
parameters:
  normalize: true
  int8_mask: true
dtype: float16
```
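
To reproduce the merge locally, the configuration above can be saved to a file and passed to mergekit. The sketch below is a minimal example, assuming mergekit is installed (`pip install mergekit`); the config filename and output directory are illustrative, not part of this release.

```python
# Minimal sketch: run the TIES merge above via mergekit's CLI from Python.
# Assumes the YAML config has been saved as "bestmix-chem-einstein.yaml"
# (illustrative filename) and that mergekit is installed.
import subprocess

subprocess.run(
    [
        "mergekit-yaml",
        "bestmix-chem-einstein.yaml",            # merge configuration shown above
        "./Llama3.1-BestMix-Chem-Einstein-8B",   # output directory for the merged weights
    ],
    check=True,
)
```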

🎯 Key Features & Capabilities

1. Instruction Following & General Reasoning:

With the foundation of Best-Mix, this model excels in general-purpose reasoning, instruction-following, and tasks that require high adaptability.

2. Scientific & Chemistry Expertise:

Thanks to the contribution from KALE-LM-Chem, this model shines in scientific research, particularly chemistry-focused tasks, making it ideal for academic and research purposes.

3. Long-Form & Conversational Mastery:

With Einstein-v6.1, the model handles long-form generation effortlessly, excelling in extended conversations and structured dialogue applications.
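
To illustrate the capabilities above, here is a minimal inference sketch using 🤗 Transformers with the Llama 3.1 chat template. The chemistry prompt and generation parameters are only examples, not recommended settings.

```python
# Minimal inference sketch (assumes transformers, accelerate, and a GPU).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ZeroXClem/Llama3.1-BestMix-Chem-Einstein-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Example prompt exercising both instruction-following and chemistry knowledge.
messages = [
    {"role": "system", "content": "You are a helpful chemistry assistant."},
    {"role": "user", "content": "Explain why SN1 reactions favor tertiary carbocations."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```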


πŸš€ Performance Benchmarks

While still in its early stages, Llama3.1-BestMix-Chem-Einstein-8B is expected to perform well across a variety of benchmarks, including:

  • Chemistry-focused benchmarks (KALE-LM-Chem)
  • Instruction-following tasks (Best-Mix)
  • Conversational AI and long-form text generation (Einstein-v6.1)

Further testing and evaluation will continue to refine this model's capabilities.
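
As a starting point for such evaluation, the sketch below shows one way to score the model with EleutherAI's lm-evaluation-harness. The task names are examples only, and the API details can vary between harness versions.

```python
# Hedged sketch: evaluate the merged model with lm-evaluation-harness
# (pip install lm-eval). Task names are illustrative and may differ by version.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=ZeroXClem/Llama3.1-BestMix-Chem-Einstein-8B,"
        "dtype=float16"
    ),
    tasks=["gsm8k", "mmlu_college_chemistry"],  # example reasoning + chemistry tasks
    batch_size=8,
)
print(results["results"])
```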


πŸ“œ License

This model is open-sourced under the Apache-2.0 License, allowing free use and modification with proper attribution.


πŸ’‘ Tags

  • merge
  • TIES
  • BestMix
  • Chemistry
  • Einstein
  • instruction-following
  • long-form-generation
  • conversational
