mHuBERT-147-br
This model is a fine-tuned version of utter-project/mHuBERT-147 on the Mozilla Common Voice 15 Breton dataset and the Roadennoù dataset. It achieves the following results on the validation set:
- Loss: 0.7331
- Wer: 50.09
- Cer: 16.45
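For readers unfamiliar with these metrics, the following is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly computed from an edit distance. It assumes the values reported above are percentages; the helper names and the Breton sentence pair are illustrative only and are not taken from the evaluation code used for this model.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent: word-level edits / reference word count."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent: character-level edits / reference length."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Hypothetical sentence pair, for illustration only.
reference = "demat d'an holl"
hypothesis = "demat d'an oll"
print(f"WER: {wer(reference, hypothesis):.2f}")
print(f"CER: {cer(reference, hypothesis):.2f}")
```

One wrong word out of three yields a high WER while the CER stays low, which is why the two figures above differ so much.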
Model description
This model was trained to assess how well mHuBERT-147 performs when fine-tuned for Breton automatic speech recognition (ASR).
Intended uses & limitations
This is a research model and should not be used in production.
Training and evaluation data
90% of the Roadennoù dataset was used for training. The remaining 10% was used for validation, in addition to the MCV15-br validation set.
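A 90/10 split of this kind can be sketched as follows. This card does not say how the split was actually made (randomly, by speaker, or by recording), so the random shuffle, the reuse of seed 42, the function name, and the utterance IDs below are all assumptions for illustration.

```python
import random


def split_train_validation(items, train_fraction=0.9, seed=42):
    """Shuffle items reproducibly and split them into train and validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical utterance IDs standing in for the real dataset entries.
utterance_ids = [f"utt_{i:04d}" for i in range(1000)]
train_ids, valid_ids = split_train_validation(utterance_ids)
print(len(train_ids), len(valid_ids))
```

For speech data, splitting by speaker or by recording rather than by utterance is often preferred, since it keeps the same voice out of both sets.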
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3.8e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 52
- mixed_precision_training: Native AMP
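Two of the values above follow from the others. The total train batch size of 16 is the per-device batch size (8) times the gradient accumulation steps (2), assuming a single device. The learning rate is determined by the linear scheduler with 500 warmup steps. The sketch below shows the usual shape of such a schedule: a linear ramp from 0 to the peak rate, then a linear decay to 0. The total number of optimizer steps is not given in this card, so `total_steps=10_000` is a placeholder, and the function name is hypothetical.

```python
def linear_schedule_lr(step, base_lr=3.8e-05, warmup_steps=500, total_steps=10_000):
    """Learning rate at a given optimizer step for linear warmup then linear decay.

    total_steps is a placeholder: the real value depends on dataset size,
    batch size, and the 52 training epochs.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Effective batch size, assuming training ran on a single device.
effective_batch_size = 8 * 2

for step in (0, 250, 500, 5_250, 10_000):
    print(step, linear_schedule_lr(step))
```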
Framework versions
- Transformers 4.39.1
- Pytorch 2.0.1+cu117
- Datasets 2.18.0
- Tokenizers 0.15.2
Model tree for gweltou/mHuBERT-147-br
Base model: utter-project/mHuBERT-147
Evaluation results
- WER on the common_voice_15_0 test set (self-reported): 47.000
- CER on the common_voice_15_0 test set (self-reported): 16.700