Model Details

danube-ko-1.8b-base is a Korean language model built by continual pre-training of h2oai/h2o-danube2-1.8b-base.

Model Developers

Jinhong Jeong, Ungsang Yoon

Model Architecture

The vocabulary size was expanded from the original 32000 to 40000 in order to add Korean tokens efficiently. We used the EEVE technique for training. The model has a sequence length of 2048. Everything else is the same as in the original model.
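The figures above can be checked against a loaded checkpoint. The following is a minimal sketch, not the authors' own usage code: it assumes the Hugging Face `transformers` and `torch` packages are installed and that the checkpoint is published under the repository id `jjhsnail0822/danube-ko-1.8b-base`. The helper names (`load_model_and_tokenizer`, `check_vocab`, `generate`) are hypothetical and exist only for illustration.

```python
# Values taken from this model card; everything else is an illustrative assumption.
MODEL_ID = "jjhsnail0822/danube-ko-1.8b-base"
EXPECTED_VOCAB_SIZE = 40000  # expanded vocabulary size stated above
MAX_SEQ_LEN = 2048           # sequence length stated above


def load_model_and_tokenizer(model_id: str = MODEL_ID):
    """Download and load the tokenizer and model (hypothetical helper).

    Imports are local so this file can be imported without transformers installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return tokenizer, model


def check_vocab(tokenizer, model, expected: int = EXPECTED_VOCAB_SIZE) -> bool:
    """Return True if the input embedding matrix has `expected` rows
    and the tokenizer does not define more tokens than the matrix holds."""
    n_rows = model.get_input_embeddings().weight.shape[0]
    return n_rows == expected and len(tokenizer) <= n_rows


def generate(tokenizer, model, prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a continuation while keeping prompt + output within MAX_SEQ_LEN."""
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_SEQ_LEN - max_new_tokens,  # leave room for the new tokens
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `tokenizer, model = load_model_and_tokenizer()` followed by `check_vocab(tokenizer, model)` would confirm the 40000-row embedding table, provided the published checkpoint was not padded to a different size. Because this is a base model rather than an instruction-tuned one, `generate` is best given plain Korean text to continue rather than chat-style prompts.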

Training Datasets

We used CulturaX, Common Crawl CC-MAIN-2024-10, AI Hub data, Korean wikis, corpora from the National Institute of Korean Language, the Standard Korean Dictionary, and other sources. About 42 GB of data was used for training.

Model Benchmark

As of July 5, 2024, this model ranked #1 in Ko-MMLU on the Open Ko-LLM Leaderboard among pretrained Korean models with 2B parameters or fewer.

Task              Value
Ko-ARC            31.74
Ko-HellaSwag      44.44
Ko-MMLU           28.06
Ko-TruthfulQA     41.63
Ko-CommonGen V2   32.7
kmmlu_direct      29.05
kobest            59.13

Disclaimer

The model can generate information that is biased, discriminatory, or socially inappropriate, and it can also generate information that is inaccurate. Use the model at your own risk; the developers are not responsible for the information it generates.

Model size: 1.87B params (Safetensors). Tensor type: BF16.