A GPT-2-small model built from scratch for learning purposes. The tokenizer comes from Gemma-2-2B-JPN-IT, and the model is trained on the Japanese-English JESC dataset.

Model usage:-

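A minimal usage sketch with the `transformers` library. This is an assumption about how the model is meant to be called, not a documented recipe from the card: the exact prompt format the model expects (and the example Japanese sentence) are hypothetical.

```python
# Hypothetical usage sketch for this model card; the prompt format is an
# assumption, since the card does not document one.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "tirthadagr8/Japanese_to_english_gpt2CasualLM_GemmaTokenizer"


def translate(text: str) -> str:
    """Generate an English continuation/translation for a Japanese input."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    # Decode only the tokens generated after the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(translate("私は学生です。"))  # example Japanese input
```

Loading happens inside `translate` so that importing the module does not download the weights; for repeated calls you would load the tokenizer and model once outside the function.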
Citation:-

@ARTICLE{pryzant_jesc_2018,
  author   = {{Pryzant}, R. and {Chung}, Y. and {Jurafsky}, D. and {Britz}, D.},
  title    = "{JESC: Japanese-English Subtitle Corpus}",
  journal  = {Language Resources and Evaluation Conference (LREC)},
  keywords = {Computer Science - Computation and Language},
  year     = 2018
}
Model size: 282M params (F32, Safetensors)

Model: tirthadagr8/Japanese_to_english_gpt2CasualLM_GemmaTokenizer