Model Card for Model ID

This is the baseline model for Thai-central in the Thai-dialect corpus.

The training recipe is based on the WSJ recipe in ESPnet.

Model Description

This model is a Hybrid CTC/Attention model with pre-trained HuBERT as the encoder.
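For reference, hybrid CTC/attention models are commonly trained with an interpolated objective over the two losses. The form below is the standard formulation rather than something stated in this card, and the weight $\lambda$ (called `ctc_weight` in ESPnet configs) used for this model is not given here.

```latex
\mathcal{L} = \lambda \, \mathcal{L}_{\mathrm{CTC}} + (1 - \lambda) \, \mathcal{L}_{\mathrm{att}}, \qquad 0 \le \lambda \le 1
```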

This model was trained on Thai-central to serve as a supervised pre-trained model for fine-tuning on other Thai dialects (Experiment 2 in the paper).

We provide demo code for running inference with this model on Colab here. (Please note that the free tier of Google Colab cannot run inference on audio longer than 4 seconds.)

Evaluation

Evaluation uses character error rate (CER) and word error rate (WER). Before computing WER, transcriptions were re-tokenized with the newmm tokenizer in PyThaiNLP:

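from pythainlp import word_tokenize

# Replace "<your_sentence>" with the transcription to tokenize (newmm is the default engine).
tokenized_sentence_list = word_tokenize("<your_sentence>", engine="newmm")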

The CER and WER results on the test set are:

CER = 2.0

WER = 6.9
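The scoring described above can be sketched as follows. This is a minimal illustration, not the scoring script used for the paper: the helper names (`edit_distance`, `error_rate`, `char_tokenize`, `newmm_tokenize`) are hypothetical, dropping whitespace tokens is an assumption, and the result is returned as a percentage on the assumption that the reported figures are percentages.

```python
from typing import Callable, List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs: Sequence[str], hyps: Sequence[str],
               tokenize: Callable[[str], List[str]]) -> float:
    """Corpus-level error rate in percent: total edits / total reference tokens."""
    errors = 0
    total = 0
    for ref, hyp in zip(refs, hyps):
        ref_tokens = tokenize(ref)
        errors += edit_distance(ref_tokens, tokenize(hyp))
        total += len(ref_tokens)
    if total == 0:
        raise ValueError("references contain no tokens")
    return 100.0 * errors / total


def char_tokenize(text: str) -> List[str]:
    """Characters for CER; whitespace is dropped (an assumption)."""
    return [c for c in text if not c.isspace()]


def newmm_tokenize(text: str) -> List[str]:
    """Words for WER via PyThaiNLP's newmm engine; whitespace tokens dropped (an assumption)."""
    from pythainlp import word_tokenize  # imported lazily so CER works without PyThaiNLP
    return word_tokenize(text, engine="newmm", keep_whitespace=False)
```

With lists of reference and hypothesis transcriptions, `error_rate(refs, hyps, char_tokenize)` gives a CER and `error_rate(refs, hyps, newmm_tokenize)` gives a WER under these assumptions.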

Paper

Thai Dialect Corpus and Transfer-based Curriculum Learning Investigation for Dialect Automatic Speech Recognition

@inproceedings{suwanbandit23_interspeech,
  author={Artit Suwanbandit and Burin Naowarat and Orathai Sangpetch and Ekapol Chuangsuwanich},
  title={{Thai Dialect Corpus and Transfer-based Curriculum Learning Investigation for Dialect Automatic Speech Recognition}},
  year=2023,
  booktitle={Proc. INTERSPEECH 2023},
  pages={4069--4073},
  doi={10.21437/Interspeech.2023-1828}
}