NLLB-200 Distilled-350M_en2ko

The NLLB-200 model showed outstanding performance in translation task and contributed to solving problems with low-resource languages. Despite their efforts, it is still hard to run 600M or more than 1B model for those who have not enough computing environment. So I made much smaller model that expertized translaing English to Korean. you can also run it with cpu (No mixed-precision, No Quantization).

Model

  • Model: model is based on NLLB-200 600M

    • Parameters: 350,537,728 (350M)
    • Encoder layers: 12 -> 3
    • Decoder layers: 12 -> 3
    • FFN dimension: 4096 (same)
    • Embed dimension: 1024 (same)
    • Vocab size: 256206 (same)
  • Licnese: CC-BY-NC

Data

Metric

  • CPU: Intel (R) Xeon(R) CPU @ 2.20GHz (16 cores)
  • GPU: NVIDIA L4 24GB
#Params chrF(++) GPU Inference time (s) CPU Inference time (s)
NLLB-200 3.3B 3.3B 34.3 0.98 s 4.65 s
NLLB-200 1.3B 1.3B 32.1 0.89 s 2.46 s
NLLB-200 600M 600M 32 0.43 s 1.52 s
NLLB-200 350M (ours) 350M 24.6 0.24 s 1.43 s

Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained('dhtocks/nllb-200-distilled-350M_en-ko', forced_bos_token_id=256098)
tokenizer = AutoTokenizer.from_pretrained('dhtocks/nllb-200-distilled-350M_en-ko', src_lang='eng_Latn', tgt_lang='kor_Hang')

inputs = tokenizer('[YOUR_INPUT]', return_tensors="pt")
output = model.generate(**inputs)
print(tokenizer.decode(output[0]))

Citation

@misc{,
  title={NLLB-200 distilled_350M_en-ko},
  author={Saechan Oh},
  year={2024}
}
Downloads last month
27
Safetensors
Model size
351M params
Tensor type
F32
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Datasets used to train dhtocks/nllb-200-distilled-350M_en-ko