Edit model card

SuperNova-Lite-Hermes-3-Llama-3.1-8B_TIES_with_base_Embeddings_Pre-Initialized

This merge is successful. Not adding or editorializing the model card right now. I need sleep. But, resultant model works great! This experiment revealed two things. One, distilled instruct models work best for TIES merging with the base and other models; the experiment showed that this is due to the way that distilled models are trained vs non-distilled models: when merged with other models, the distilled models seem to retain more of their attributes (the way that they talk, think, reason, etc) - this makes them very appealing for model merges because you keep more of the model's inherent capabilities and behaviors. And, two: I can successfully TIES merge different instruct models with their base pre-initialized to the embeddings special tokens (for prompt/chat template). The model is coherent and capable. Please download and try it if your interested. GGUF Custom OQ8_0-F32_EF32 IQuants will be up by the middle of the week - most probably sooner but still...

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the TIES merge method using /Users/jsarnecki/opt/mergekit/merges/Llama-3.1-8B-InitializedEmbeddings_with_Hermes-3 as a base.

Models Merged

The following models were included in the merge:

  • /Users/jsarnecki/opt/Workspace/arcee-ai/Llama-3.1-SuperNova-Lite
  • /Users/jsarnecki/opt/Workspace/NousResearch/Hermes-3-Llama-3.1-8B

Configuration

The following YAML configuration was used to produce this model:

models:

  - model: "/Users/jsarnecki/opt/Workspace/arcee-ai/Llama-3.1-SuperNova-Lite"
    parameters:
      weight: 1
      density: 1

  - model: "/Users/jsarnecki/opt/Workspace/NousResearch/Hermes-3-Llama-3.1-8B"
    parameters:
      weight: 1
      density: 1
  
  - model: "/Users/jsarnecki/opt/Workspace/arcee-ai/Llama-3.1-SuperNova-Lite"
    parameters:
      weight: 1
      density: 1

  - model: "/Users/jsarnecki/opt/Workspace/NousResearch/Hermes-3-Llama-3.1-8B"
    parameters:
      weight: 1
      density: 1
  
merge_method: ties
base_model: "/Users/jsarnecki/opt/mergekit/merges/Llama-3.1-8B-InitializedEmbeddings_with_Hermes-3"
parameters:
  density: 1
  normalize: true
  int8_mask: true
tokenizer_source: "/Users/jsarnecki/opt/Workspace/NousResearch/Hermes-3-Llama-3.1-8B"
dtype: float32
out_dtype: bfloat16
Downloads last month
43
Safetensors
Model size
8.03B params
Tensor type
BF16
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.