A newer version of this model is available: sthenno-com/miscii-14b-0218

tempesthenno--nuslerp (BASE MODEL)

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the NuSLERP merge method.
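
As a rough illustration of the idea behind NuSLERP, the sketch below performs plain spherical linear interpolation (SLERP) between two vectors in pure Python. It is not mergekit's implementation: the function name `slerp`, the `eps` threshold, and the choice to return a unit-length direction are assumptions made for illustration, and how tensor norms are handled in the real merge is left out.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Interpolate along the arc between the directions of v0 and v1.

    t = 0 returns the direction of v0, t = 1 the direction of v1.
    Assumes both inputs are non-zero vectors of equal length.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Normalize so that only the directions determine the angle.
    u0 = [x / norm0 for x in v0]
    u1 = [x / norm1 for x in v1]

    # Clamp the dot product to guard against rounding outside [-1, 1].
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(u0, u1))))
    theta = math.acos(dot)
    sin_theta = math.sin(theta)

    if sin_theta < eps:
        # Directions are (anti)parallel: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(u0, u1)]

    s0 = math.sin((1 - t) * theta) / sin_theta
    s1 = math.sin(t * theta) / sin_theta
    return [s0 * a + s1 * b for a, b in zip(u0, u1)]


if __name__ == "__main__":
    # Halfway between two orthogonal unit vectors.
    print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

Unlike a plain weighted average, the interpolated vector stays on the unit sphere for every `t`, which is the property that motivates SLERP-style merges.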

Models Merged

The following models were included in the merge:

  • /Users/sthenno/models/tempesthenno--converge-dtask
  • /Users/sthenno/models/tempesthenno--converge-breadcrumbs

Configuration

The following YAML configuration was used to produce this model:

name: tempesthenno--nuslerp
merge_method: nuslerp
tokenizer:
  source: /Users/sthenno/models/tempesthenno--converge-dtask
chat_template: "chatml"
dtype: float32
out_dtype: bfloat16
parameters:
  int8_mask: false
  normalize: true
  rescale: false
slices:
  - sources:
      - model: /Users/sthenno/models/tempesthenno--converge-dtask
        layer_range: [0, 8]
        parameters:
          weight: 0.65
          nuslerp_flatten: false
          nuslerp_row_wise: true
      - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
        layer_range: [0, 8]
        parameters:
          weight: 0.35
          nuslerp_flatten: false
          nuslerp_row_wise: true
  - sources:
      - model: /Users/sthenno/models/tempesthenno--converge-dtask
        layer_range: [8, 16]
        parameters:
          weight: 0.60
          nuslerp_flatten: false
          nuslerp_row_wise: true
      - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
        layer_range: [8, 16]
        parameters:
          weight: 0.40
          nuslerp_flatten: false
          nuslerp_row_wise: true
  - sources:
      - model: /Users/sthenno/models/tempesthenno--converge-dtask
        layer_range: [16, 24]
        parameters:
          weight: 0.55
          nuslerp_flatten: false
          nuslerp_row_wise: false
      - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
        layer_range: [16, 24]
        parameters:
          weight: 0.45
          nuslerp_flatten: false
          nuslerp_row_wise: false
  - sources:
      - model: /Users/sthenno/models/tempesthenno--converge-dtask
        layer_range: [24, 32]
        parameters:
          weight: 0.50
          nuslerp_flatten: false
          nuslerp_row_wise: false
      - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
        layer_range: [24, 32]
        parameters:
          weight: 0.50
          nuslerp_flatten: false
          nuslerp_row_wise: false
  - sources:
      - model: /Users/sthenno/models/tempesthenno--converge-dtask
        layer_range: [32, 40]
        parameters:
          weight: 0.45
          nuslerp_flatten: true
      - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
        layer_range: [32, 40]
        parameters:
          weight: 0.55
          nuslerp_flatten: true
  - sources:
      - model: /Users/sthenno/models/tempesthenno--converge-dtask
        layer_range: [40, 48]
        parameters:
          weight: 0.40
          nuslerp_flatten: true
      - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
        layer_range: [40, 48]
        parameters:
          weight: 0.60
          nuslerp_flatten: true
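
The per-slice weights above shift gradually from converge-dtask in the early layers to converge-breadcrumbs in the late layers. The sketch below makes that schedule explicit. It assumes that, for a two-model merge, the relative weights reduce to a single interpolation factor `t = w_second / (w_first + w_second)`; that formula and the helper names are illustrative assumptions rather than something stated in this card.

```python
# (layer_range, weight for converge-dtask, weight for converge-breadcrumbs),
# copied from the YAML configuration above.
SLICES = [
    ((0, 8), 0.65, 0.35),
    ((8, 16), 0.60, 0.40),
    ((16, 24), 0.55, 0.45),
    ((24, 32), 0.50, 0.50),
    ((32, 40), 0.45, 0.55),
    ((40, 48), 0.40, 0.60),
]


def interpolation_factor(w_first, w_second):
    """Assumed mapping from two relative weights to a SLERP factor."""
    return w_second / (w_first + w_second)


def schedule(slices):
    """Return (layer_range, t) pairs, with t rounded to two decimals."""
    return [(rng, round(interpolation_factor(a, b), 2)) for rng, a, b in slices]


def is_contiguous(slices):
    """Check that each layer range starts where the previous one ends."""
    ranges = [rng for rng, _, _ in slices]
    return all(prev[1] == nxt[0] for prev, nxt in zip(ranges, ranges[1:]))


if __name__ == "__main__":
    for (start, end), t in schedule(SLICES):
        print(f"layers {start:2d}-{end:2d}: t = {t:.2f}")
    print("contiguous:", is_contiguous(SLICES))
```

Under that assumption, `t` rises in steps of 0.05 across the six slices, and the ranges tile layers 0 through 48 without gaps.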

Open LLM Leaderboard Evaluation Results

| Metric | Value |
|---|---|
| Avg. | 40.55 |
| IFEval (0-Shot) | 79.23 |
| BBH (3-Shot) | 50.57 |
| MATH Lvl 5 (4-Shot) | 34.21 |
| GPQA (0-shot) | 17.00 |
| MuSR (0-shot) | 14.56 |
| MMLU-PRO (5-shot) | 47.69 |

Refined:

| Metric | Value |
|---|---|
| Avg. | 42.74 |
| IFEval (0-Shot) | 79.23 |
| BBH (3-Shot) | 50.57 |
| MATH Lvl 5 (4-Shot) | 47.36 |
| GPQA (0-shot) | 17.00 |
| MuSR (0-shot) | 14.56 |
| MMLU-PRO (5-shot) | 47.69 |
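
The two result sets differ only in the MATH Lvl 5 score. As a quick sanity check, the sketch below recomputes each average as the unweighted mean of the six benchmark scores. Treating "Avg." as a plain mean is an assumption, and because the per-benchmark values are already rounded, the recomputed mean may differ from the reported one in the last digit.

```python
# Scores copied from the first results table.
BASE = {
    "IFEval (0-Shot)": 79.23,
    "BBH (3-Shot)": 50.57,
    "MATH Lvl 5 (4-Shot)": 34.21,
    "GPQA (0-shot)": 17.00,
    "MuSR (0-shot)": 14.56,
    "MMLU-PRO (5-shot)": 47.69,
}

# The "Refined" table changes only the MATH Lvl 5 entry.
REFINED = dict(BASE, **{"MATH Lvl 5 (4-Shot)": 47.36})


def mean_score(scores):
    """Unweighted mean over all benchmark scores."""
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    print(f"base mean:    {mean_score(BASE):.3f} (reported 40.55)")
    print(f"refined mean: {mean_score(REFINED):.3f} (reported 42.74)")
```

Both recomputed means land within about 0.01 of the reported averages, which is consistent with rounding of the individual scores.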

Model size: 14.8B params (Safetensors)
Tensor type: BF16