Model Card for Gemma Self-Attention Merged

Model Details

Model Description

Gemma Self-Attention Merged is a large language model created by merging the self-attention layers of an English-based Gemma 7B model and a Korean-based Gemma 7B model. The merge lets the model draw on the capabilities of both source models, producing a single bilingual model that can handle tasks involving both English and Korean text.

The key features of this merged model include:

  • Increased self-attention capacity: the number of attention heads is doubled relative to a single Gemma 7B model (see the merge sketch after this list)
  • Ability to handle both English and Korean language input
  • Potential for improved performance on a wide range of natural language processing tasks
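The card does not document the exact merge recipe, but the head-doubling described above can be illustrated with a short sketch. The following is a minimal, hypothetical illustration, not the author's actual procedure: "example/gemma-7b-ko" is a placeholder name for the Korean checkpoint, and the 0.5 averaging factor on the output projection is an assumption made to keep the residual stream on scale.

```python
import torch
from transformers import AutoModelForCausalLM

en = AutoModelForCausalLM.from_pretrained("google/gemma-7b", torch_dtype=torch.bfloat16)
# "example/gemma-7b-ko" is a hypothetical placeholder for the Korean model.
ko = AutoModelForCausalLM.from_pretrained("example/gemma-7b-ko", torch_dtype=torch.bfloat16)

for layer_en, layer_ko in zip(en.model.layers, ko.model.layers):
    attn_en, attn_ko = layer_en.self_attn, layer_ko.self_attn
    # q/k/v projections have shape (num_heads * head_dim, hidden_size);
    # concatenating along dim 0 doubles the number of attention heads.
    for name in ("q_proj", "k_proj", "v_proj"):
        merged = torch.cat([getattr(attn_en, name).weight.data,
                            getattr(attn_ko, name).weight.data], dim=0)
        getattr(attn_en, name).weight.data = merged
    # o_proj has shape (hidden_size, num_heads * head_dim); its input width
    # doubles, so concatenate along dim 1. The 0.5 factor averages the two
    # attention outputs (an assumption; the real recipe is undocumented).
    attn_en.o_proj.weight.data = 0.5 * torch.cat(
        [attn_en.o_proj.weight.data, attn_ko.o_proj.weight.data], dim=1)

# The config must reflect the doubled head count before the model is used
# (the attention modules' cached head counts would also need updating).
en.config.num_attention_heads *= 2
en.config.num_key_value_heads *= 2
```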

Chat template

```
system: system message...
B: user message...
A: assistant message...
```
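For illustration, here is one way a conversation might be serialized under this template. This is a sketch: the role prefixes ("system:", "B:", "A:") are taken from the template above, while joining turns with newlines is an assumption about the separator.

```python
def build_prompt(system: str, turns: list[tuple[str, str]]) -> str:
    """`turns` holds (user, assistant) pairs; leave the last assistant
    message empty to prompt the model for a completion."""
    lines = [f"system: {system}"]
    for user_msg, assistant_msg in turns:
        lines.append(f"B: {user_msg}")
        lines.append(f"A: {assistant_msg}")
    return "\n".join(lines)


print(build_prompt("You are a helpful assistant.", [("안녕하세요!", "")]))
# system: You are a helpful assistant.
# B: 안녕하세요!
# A:
```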

Model size: 9.95B parameters
Tensor type: BF16 (Safetensors format)