KukulStanta-7B-Seamaiiza-7B-v1-slerp-merge
KukulStanta-7B-Seamaiiza-7B-v1-slerp-merge is a language model produced by merging two models: Nitral-AI/KukulStanta-7B and AlekseiPravdin/Seamaiiza-7B-v1. The merge was performed with mergekit, a toolkit for combining the weights of pretrained language models.
🧩 Merge Configuration
The models were merged using spherical linear interpolation (SLERP), which blends corresponding weight tensors of the two models along an arc rather than a straight line. Nitral-AI/KukulStanta-7B was used as the base model, and the interpolation factor t was set separately for self-attention weights, MLP weights, and all remaining tensors.
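As a rough illustration of what SLERP does to a pair of weight tensors, here is a minimal pure-Python sketch. It is not mergekit's implementation: the function name `slerp`, the flat-list representation of a tensor, and the fallback to linear interpolation for near-parallel or zero vectors are all assumptions made for this example. The convention that t = 0 returns the first (base) vector and t = 1 returns the second is also assumed.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors.

    t = 0 returns v0, t = 1 returns v1 (convention assumed for this sketch).
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))

    # Degenerate case: a zero vector has no direction, so interpolate linearly.
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Angle between the two directions, with the cosine clamped for safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)

    # Nearly parallel vectors: SLERP is numerically unstable, use plain lerp.
    if abs(sin_theta) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Standard SLERP weights for the two endpoints.
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Halfway between two orthogonal unit vectors stays on the unit circle.
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

Unlike plain averaging, which would give `[0.5, 0.5]` for the two unit vectors above, the SLERP midpoint keeps unit length, which is the usual motivation for using it when blending weights.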
Configuration:
slices:
  - sources:
      - model: Nitral-AI/KukulStanta-7B
        layer_range: [0, 31]
      - model: AlekseiPravdin/Seamaiiza-7B-v1
        layer_range: [0, 31]
merge_method: slerp
base_model: Nitral-AI/KukulStanta-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: float16
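The five-element `t` lists in the configuration can be read as a gradient over layer depth: the anchor values are spread evenly from the first to the last layer, and intermediate layers get a linearly interpolated value. The sketch below illustrates that reading only. The helper `layer_t` is hypothetical, the exact mapping mergekit uses may differ, and the layer count of 31 assumes that `layer_range: [0, 31]` is a half-open range.

```python
import math

# Anchor values copied from the configuration.
SELF_ATTN_T = [0, 0.5, 0.3, 0.7, 1]
MLP_T = [1, 0.5, 0.7, 0.3, 0]
NUM_LAYERS = 31  # assumption: layer_range [0, 31] covers layers 0..30


def layer_t(gradient, layer_idx, num_layers):
    """Linearly interpolate a list of anchor values across layer depth."""
    if len(gradient) == 1 or num_layers == 1:
        return float(gradient[0])
    # Position of this layer on the anchor axis, from 0 to len(gradient) - 1.
    pos = layer_idx / (num_layers - 1) * (len(gradient) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(gradient) - 1)
    frac = pos - lo
    return (1 - frac) * gradient[lo] + frac * gradient[hi]


# Print the interpolation factor per module type for a few layers.
for idx in (0, 8, 15, 23, 30):
    attn = layer_t(SELF_ATTN_T, idx, NUM_LAYERS)
    mlp = layer_t(MLP_T, idx, NUM_LAYERS)
    print(f"layer {idx:2d}: self_attn t={attn:.3f}, mlp t={mlp:.3f}")
```

Under this reading, and assuming t = 0 selects the base model, early self-attention layers stay close to KukulStanta-7B and late ones move toward Seamaiiza-7B-v1, while the MLP weights follow the opposite schedule. Tensors that match neither filter use the constant 0.5.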
Model Features
This merged model combines the generative capabilities of Nitral-AI/KukulStanta-7B with the tuning of AlekseiPravdin/Seamaiiza-7B-v1 and is intended for a variety of text generation tasks. No benchmark results for the merged model are reported here, so improvements in context understanding or generation quality over the parent models are expected rather than measured.
Evaluation Results
KukulStanta-7B
The evaluation results for Nitral-AI/KukulStanta-7B are as follows:
Metric | Value |
---|---|
Avg. | 70.95 |
AI2 Reasoning Challenge (25-Shot) | 68.43 |
HellaSwag (10-Shot) | 86.37 |
MMLU (5-Shot) | 65.00 |
TruthfulQA (0-shot) | 62.19 |
Winogrande (5-shot) | 80.03 |
GSM8k (5-shot) | 63.68 |
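The Avg. row in the table is consistent with a plain arithmetic mean of the six benchmark scores. A small check, with the scores copied from the table and the dictionary keys shortened for this example:

```python
# Benchmark scores for Nitral-AI/KukulStanta-7B, copied from the table.
scores = {
    "ARC (25-shot)": 68.43,
    "HellaSwag (10-shot)": 86.37,
    "MMLU (5-shot)": 65.00,
    "TruthfulQA (0-shot)": 62.19,
    "Winogrande (5-shot)": 80.03,
    "GSM8k (5-shot)": 63.68,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")
```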
Seamaiiza-7B-v1
Detailed evaluation results for AlekseiPravdin/Seamaiiza-7B-v1 are not provided. The model is expected to complement KukulStanta-7B in text generation tasks, but that effect is not quantified here.
Limitations
While KukulStanta-7B-Seamaiiza-7B-v1-slerp-merge inherits the strengths of both parent models, it may also carry over some limitations or biases present in them. Users should be aware of potential biases in generated content and the need for careful evaluation in sensitive applications.