wav2vec2-large-xls-r-300m-spanish-custom

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the common_voice dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4426
  • WER (word error rate): 0.2117
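
For readers unfamiliar with the metric, the sketch below shows one common way to compute word error rate: the word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions; the card does not say which implementation or text normalization produced the reported figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (not taken from the card); splits on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of four reference words.
print(word_error_rate("el gato come pescado", "el gato come"))
```

Under this definition, a value of 0.2117 corresponds to roughly 21 word-level errors per 100 reference words. Corpus-level scores are usually computed from total errors over total reference words rather than by averaging per-utterance values.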

Model description

More information needed

Intended uses & limitations

More information needed
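
Pending fuller documentation, a minimal usage sketch follows. It assumes the checkpoint is published as `tomascufaro/wav2vec2-large-xls-r-300m-spanish-custom`, that the repository includes the tokenizer and feature extractor files, and that the model is meant for Spanish speech transcription (as the name suggests). The helper `transcribe` is a hypothetical name; `pipeline("automatic-speech-recognition", ...)` is the standard Transformers entry point.

```python
# Assumed repository ID; adjust if the checkpoint lives elsewhere.
MODEL_ID = "tomascufaro/wav2vec2-large-xls-r-300m-spanish-custom"


def transcribe(audio_path: str) -> str:
    """Transcribe one audio file with the fine-tuned checkpoint (sketch)."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    # Downloads the model on first use; requires network access.
    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    # The pipeline returns a dict whose "text" field holds the transcription.
    return asr(audio_path)["text"]
```

Calling `transcribe("sample.wav")` with a local recording should return the transcription as a string. Decoding audio from a file path typically needs ffmpeg to be installed on the system.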

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 30
  • mixed_precision_training: Native AMP
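
To make two of these settings concrete, the sketch below reproduces the total train batch size from the per-device batch size and gradient accumulation, and evaluates a linear schedule with warmup at a few steps. The single-device assumption follows from 8 × 2 = 16. `ASSUMED_TOTAL_STEPS` is a hypothetical placeholder, since the card does not state the exact number of optimizer steps. The schedule formula mirrors the usual linear-warmup, linear-decay shape and is not confirmed as the exact code path used in training.

```python
LEARNING_RATE = 3e-4     # learning_rate from the list above
TRAIN_BATCH_SIZE = 8     # train_batch_size
GRAD_ACCUM_STEPS = 2     # gradient_accumulation_steps
WARMUP_STEPS = 500       # lr_scheduler_warmup_steps
ASSUMED_TOTAL_STEPS = 30_000  # hypothetical; the true total is not stated


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to each optimizer update."""
    return per_device * accum_steps * num_devices


def linear_warmup_lr(step: int, total_steps: int,
                     base_lr: float = LEARNING_RATE,
                     warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at `step`: linear ramp-up, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
for step in (0, 250, 500, 15_000, ASSUMED_TOTAL_STEPS):
    print(step, linear_warmup_lr(step, ASSUMED_TOTAL_STEPS))
```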

Training results

Training Loss   Epoch   Step   Validation Loss   WER
4.2307 0.4 400 1.4431 0.9299
0.7066 0.79 800 0.5928 0.4836
0.4397 1.19 1200 0.4341 0.3730
0.3889 1.58 1600 0.4063 0.3499
0.3607 1.98 2000 0.3834 0.3235
0.2866 2.37 2400 0.3885 0.3163
0.2833 2.77 2800 0.3765 0.3140
0.2692 3.17 3200 0.3849 0.3132
0.2435 3.56 3600 0.3779 0.2984
0.2404 3.96 4000 0.3756 0.2934
0.2153 4.35 4400 0.3770 0.3075
0.2087 4.75 4800 0.3819 0.3022
0.1999 5.14 5200 0.3756 0.2959
0.1838 5.54 5600 0.3827 0.2858
0.1892 5.93 6000 0.3714 0.2999
0.1655 6.33 6400 0.3814 0.2812
0.1649 6.73 6800 0.3685 0.2727
0.1668 7.12 7200 0.3832 0.2825
0.1487 7.52 7600 0.3848 0.2788
0.152 7.91 8000 0.3810 0.2787
0.143 8.31 8400 0.3885 0.2856
0.1353 8.7 8800 0.4103 0.2827
0.1386 9.1 9200 0.4142 0.2874
0.1222 9.5 9600 0.3983 0.2830
0.1288 9.89 10000 0.4179 0.2781
0.1199 10.29 10400 0.4035 0.2789
0.1196 10.68 10800 0.4043 0.2746
0.1169 11.08 11200 0.4105 0.2753
0.1076 11.47 11600 0.4298 0.2686
0.1124 11.87 12000 0.4025 0.2704
0.1043 12.26 12400 0.4209 0.2659
0.0976 12.66 12800 0.4070 0.2672
0.1012 13.06 13200 0.4161 0.2720
0.0872 13.45 13600 0.4245 0.2697
0.0933 13.85 14000 0.4295 0.2684
0.0881 14.24 14400 0.4011 0.2650
0.0848 14.64 14800 0.3991 0.2675
0.0852 15.03 15200 0.4166 0.2617
0.0825 15.43 15600 0.4188 0.2639
0.081 15.83 16000 0.4181 0.2547
0.0753 16.22 16400 0.4103 0.2560
0.0747 16.62 16800 0.4017 0.2498
0.0761 17.01 17200 0.4159 0.2563
0.0711 17.41 17600 0.4112 0.2603
0.0698 17.8 18000 0.4335 0.2529
0.073 18.2 18400 0.4120 0.2512
0.0665 18.6 18800 0.4335 0.2496
0.0657 18.99 19200 0.4143 0.2468
0.0617 19.39 19600 0.4339 0.2435
0.06 19.78 20000 0.4179 0.2438
0.0613 20.18 20400 0.4251 0.2393
0.0583 20.57 20800 0.4347 0.2422
0.0562 20.97 21200 0.4246 0.2377
0.053 21.36 21600 0.4198 0.2338
0.0525 21.76 22000 0.4511 0.2427
0.0499 22.16 22400 0.4482 0.2353
0.0475 22.55 22800 0.4449 0.2329
0.0465 22.95 23200 0.4364 0.2320
0.0443 23.34 23600 0.4481 0.2304
0.0458 23.74 24000 0.4442 0.2267
0.0453 24.13 24400 0.4402 0.2261
0.0426 24.53 24800 0.4262 0.2232
0.0431 24.93 25200 0.4251 0.2210
0.0389 25.32 25600 0.4455 0.2232
0.039 25.72 26000 0.4372 0.2236
0.0378 26.11 26400 0.4236 0.2212
0.0348 26.51 26800 0.4359 0.2204
0.0361 26.9 27200 0.4248 0.2192
0.0356 27.3 27600 0.4397 0.2184
0.0325 27.7 28000 0.4367 0.2181
0.0313 28.09 28400 0.4477 0.2136
0.0306 28.49 28800 0.4533 0.2135
0.0314 28.88 29200 0.4410 0.2136
0.0307 29.28 29600 0.4457 0.2113
0.0309 29.67 30000 0.4426 0.2117
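
The log above shows validation loss reaching its minimum early (0.3685 at step 6800), while WER keeps falling until the end of training (0.2113 at step 29600). The sketch below shows one way to pick checkpoints by either criterion. It uses only five rows copied from the table to stay short; the `EvalRow` type and variable names are illustrative assumptions.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    wer: float


# A subset of rows copied verbatim from the training results table.
ROWS = [
    EvalRow(4.2307, 0.4, 400, 1.4431, 0.9299),
    EvalRow(0.1649, 6.73, 6800, 0.3685, 0.2727),
    EvalRow(0.06, 19.78, 20000, 0.4179, 0.2438),
    EvalRow(0.0307, 29.28, 29600, 0.4457, 0.2113),
    EvalRow(0.0309, 29.67, 30000, 0.4426, 0.2117),
]

# Select the row that minimizes each criterion.
best_by_loss = min(ROWS, key=lambda row: row.val_loss)
best_by_wer = min(ROWS, key=lambda row: row.wer)

print("lowest validation loss at step", best_by_loss.step)
print("lowest WER at step", best_by_wer.step)
```

The two criteria disagree here, so which checkpoint counts as best depends on whether validation loss or WER is the selection metric.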

Framework versions

  • Transformers 4.16.0.dev0
  • Pytorch 1.10.1+cu102
  • Datasets 1.17.1.dev0
  • Tokenizers 0.11.0

Dataset used to train tomascufaro/wav2vec2-large-xls-r-300m-spanish-custom: common_voice