mHuBERT-147-br

This model is a fine-tuned version of utter-project/mHuBERT-147 on the Mozilla Common Voice 15 Breton dataset and the Roadennoù dataset. It achieves the following results on the validation set:

  • Loss: 0.7331
  • Wer: 50.09
  • Cer: 16.45
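
For readers unfamiliar with the metrics, the sketch below shows how word error rate (WER) and character error rate (CER) are commonly computed from an edit distance. It assumes the figures above are percentages and that no extra text normalization is applied. The card does not say which tool or normalization produced the reported numbers, so this is an illustration of the definitions rather than the evaluation code that was used. The sample sentences are made up for the example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent, relative to the reference length in words."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent, relative to the reference length in characters."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Illustrative pair (not taken from the validation set): one dropped letter.
reference = "demat d'an holl"
hypothesis = "demat d'an oll"
print(f"WER: {wer(reference, hypothesis):.2f}")
print(f"CER: {cer(reference, hypothesis):.2f}")
```

A single wrong character makes a whole word count as an error, which is why WER is typically much higher than CER, as in the results above.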

Model description

This model was trained to assess how well mHuBERT-147 performs when fine-tuned for Breton automatic speech recognition (ASR).

Intended uses & limitations

This is a research model and should not be used in production.

Training and evaluation data

90% of the Roadennoù dataset was used for training. The remaining 10% was used for validation, together with the MCV15-br validation set.
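
A 90/10 split of this kind can be sketched as follows. The card does not state whether the split was random, by speaker, or by recording, nor which seed was used, so the shuffling, the `seed` value, and the function name `split_train_validation` are all assumptions made for illustration.

```python
import random


def split_train_validation(items, train_fraction=0.9, seed=42):
    """Shuffle items reproducibly and split them into (train, validation)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # assumption: a random utterance-level split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical utterance IDs standing in for Roadennoù entries.
utterances = [f"utt_{i:03d}" for i in range(100)]
train, validation = split_train_validation(utterances)
print(len(train), len(validation))
```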

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3.8e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 52
  • mixed_precision_training: Native AMP
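
The relationship between some of these values can be made concrete with a short sketch. It shows how the total train batch size follows from the per-device batch size and gradient accumulation (assuming a single device, which is consistent with 8 × 2 = 16), and how a linear schedule with warmup typically behaves: the rate rises linearly to its peak over the warmup steps, then decays linearly to zero. The total number of optimizer steps is not given in the card, so `total_steps` below is a hypothetical value, and the functions are an illustration rather than the training code itself.

```python
HPARAMS = {
    "learning_rate": 3.8e-05,
    "train_batch_size": 8,
    "gradient_accumulation_steps": 2,
    "lr_scheduler_warmup_steps": 500,
}


def total_train_batch_size(hp, num_devices=1):
    """Number of samples contributing to each optimizer step."""
    return hp["train_batch_size"] * hp["gradient_accumulation_steps"] * num_devices


def linear_lr(step, total_steps, hp=HPARAMS):
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    peak = hp["learning_rate"]
    warmup = hp["lr_scheduler_warmup_steps"]
    if step < warmup:
        return peak * step / warmup
    remaining = max(0, total_steps - step)
    return peak * remaining / max(1, total_steps - warmup)


total_steps = 10_000  # hypothetical; depends on dataset size and num_epochs
print(total_train_batch_size(HPARAMS))
for step in (0, 250, 500, 5_250, 10_000):
    print(step, linear_lr(step, total_steps))
```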

Framework versions

  • Transformers 4.39.1
  • PyTorch 2.0.1+cu117
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Model size

  • Parameters: 94.4M
  • Tensor type: F32
  • Format: Safetensors