whisper-multi-diar-wer

This model is a fine-tuned version of openai/whisper-tiny on an unknown dataset. It achieves the following results on the evaluation set (a sketch after the list shows how the ratio metrics relate to the raw counts):

  • Loss: 10.0666
  • Wer: 582.8864
  • Cer: 168.6522
  • Speech Scored: 693831
  • Speech Miss: 52252
  • Speech Falarm: 117030
  • Speaker Miss: 52252
  • Speaker Falarm: 117030
  • Speaker Error: 187216
  • Speaker Correct: 1437.5240
  • Diarization Error: 356498
  • Frames: 600
  • Speaker Wide Frames: 746083
  • Speech Scored Ratio: 1156.385
  • Speech Miss Ratio: 87.0867
  • Speech Falarm Ratio: 195.05
  • Speaker Correct Ratio: 2.3959
  • Speaker Miss Ratio: 0.0700
  • Speaker Falarm Ratio: 0.1569
  • Speaker Error Ratio: 0.2509
  • Diarization Error Ratio: 0.4778
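
The ratio metrics above are not documented in the card itself. The snippet below is a minimal sketch of one interpretation (not part of the original card): it reproduces the reported values from the raw counts under the assumption that the speaker-level ratios are normalized by Speaker Wide Frames and the remaining ratios by Frames.

```python
# Sketch only: denominators (Frames, Speaker Wide Frames) are an assumed
# interpretation; the numbers themselves are copied from the card above.
speech_scored       = 693831
speech_falarm       = 117030
speaker_miss        = 52252
speaker_falarm      = 117030
speaker_error       = 187216
speaker_correct     = 1437.5240
frames              = 600
speaker_wide_frames = 746083

# Diarization error is the sum of miss, false alarm and speaker confusion.
diarization_error = speaker_miss + speaker_falarm + speaker_error   # 356498

print(round(diarization_error / speaker_wide_frames, 4))  # 0.4778  (Diarization Error Ratio)
print(round(speaker_miss      / speaker_wide_frames, 4))  # 0.0700  (Speaker Miss Ratio)
print(round(speaker_falarm    / speaker_wide_frames, 4))  # 0.1569  (Speaker Falarm Ratio)
print(round(speaker_error     / speaker_wide_frames, 4))  # 0.2509  (Speaker Error Ratio)
print(round(speech_scored     / frames, 4))               # 1156.385 (Speech Scored Ratio)
print(round(speech_falarm     / frames, 4))               # 195.05   (Speech Falarm Ratio)
print(round(speaker_correct   / frames, 4))               # 2.3959   (Speaker Correct Ratio)
```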

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent configuration sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 24
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 48
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 10
  • mixed_precision_training: Native AMP
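
These are the values the Hugging Face Trainer logs automatically. The snippet below is a minimal sketch, not the authors' training script, showing how the listed hyperparameters might be expressed with transformers Seq2SeqTrainingArguments; the output directory is a placeholder, and model, data, and collator wiring are omitted.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-multi-diar-wer",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=2,        # 24 * 2 = 48 total train batch size
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.05,
    num_train_epochs=10,
    fp16=True,                            # "Native AMP" mixed precision (assumed fp16)
)
```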

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer | Cer | Speech Scored | Speech Miss | Speech Falarm | Speaker Miss | Speaker Falarm | Speaker Error | Speaker Correct | Diarization Error | Frames | Speaker Wide Frames | Speech Scored Ratio | Speech Miss Ratio | Speech Falarm Ratio | Speaker Correct Ratio | Speaker Miss Ratio | Speaker Falarm Ratio | Speaker Error Ratio | Diarization Error Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 11.3437 | 1.0 | 42 | 10.7905 | 574.9471 | 166.1650 | 743633 | 2450 | 150302 | 2450 | 150302 | 202641 | 1427.9773 | 355393 | 600 | 746083 | 1239.3883 | 4.0833 | 250.5033 | 2.3800 | 0.0033 | 0.2015 | 0.2716 | 0.4763 |
| 10.3627 | 2.0 | 84 | 10.4901 | 578.0875 | 167.1479 | 735121 | 10962 | 136397 | 10962 | 136397 | 201434 | 1433.1820 | 348793 | 600 | 746083 | 1225.2017 | 18.27 | 227.3283 | 2.3886 | 0.0147 | 0.1828 | 0.2700 | 0.4675 |
| 9.9444 | 3.0 | 126 | 10.3015 | 569.6851 | 166.4943 | 715188 | 30895 | 127221 | 30895 | 127221 | 194291 | 1435.5347 | 352407 | 600 | 746083 | 1191.98 | 51.4917 | 212.035 | 2.3926 | 0.0414 | 0.1705 | 0.2604 | 0.4723 |
| 9.7658 | 4.0 | 168 | 10.2071 | 572.0536 | 166.8688 | 706081 | 40002 | 122962 | 40002 | 122962 | 191357 | 1436.2147 | 354321 | 600 | 746083 | 1176.8017 | 66.67 | 204.9367 | 2.3937 | 0.0536 | 0.1648 | 0.2565 | 0.4749 |
| 9.5093 | 5.0 | 210 | 10.1640 | 572.3712 | 166.9189 | 703250 | 42833 | 121335 | 42833 | 121335 | 190255 | 1436.8813 | 354423 | 600 | 746083 | 1172.0833 | 71.3883 | 202.225 | 2.3948 | 0.0574 | 0.1626 | 0.2550 | 0.4750 |
| 9.3069 | 6.0 | 252 | 10.1287 | 573.2534 | 167.0644 | 700202 | 45881 | 119938 | 45881 | 119938 | 189349 | 1436.9886 | 355168 | 600 | 746083 | 1167.0033 | 76.4683 | 199.8967 | 2.3950 | 0.0615 | 0.1608 | 0.2538 | 0.4760 |
| 9.2209 | 7.0 | 294 | 10.1009 | 582.8864 | 168.6522 | 698009 | 48074 | 118866 | 48074 | 118866 | 188639 | 1437.1880 | 355579 | 600 | 746083 | 1163.3483 | 80.1233 | 198.11 | 2.3953 | 0.0644 | 0.1593 | 0.2528 | 0.4766 |
| 9.0761 | 8.0 | 336 | 10.0912 | 582.8864 | 168.6522 | 695719 | 50364 | 117834 | 50364 | 117834 | 187684 | 1437.6227 | 355882 | 600 | 746083 | 1159.5317 | 83.94 | 196.39 | 2.3960 | 0.0675 | 0.1579 | 0.2516 | 0.4770 |
| 8.9928 | 9.0 | 378 | 10.0654 | 582.8864 | 168.6522 | 694031 | 52052 | 117145 | 52052 | 117145 | 187295 | 1437.4753 | 356492 | 600 | 746083 | 1156.7183 | 86.7533 | 195.2417 | 2.3958 | 0.0698 | 0.1570 | 0.2510 | 0.4778 |
| 8.9674 | 10.0 | 420 | 10.0666 | 582.8864 | 168.6522 | 693831 | 52252 | 117030 | 52252 | 117030 | 187216 | 1437.5240 | 356498 | 600 | 746083 | 1156.385 | 87.0867 | 195.05 | 2.3959 | 0.0700 | 0.1569 | 0.2509 | 0.4778 |
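
The card does not state which implementation produced the Wer and Cer columns. As a minimal sketch (an assumption, not the authors' evaluation code), such scores are commonly computed with the Hugging Face evaluate library and multiplied by 100 to express them as percentages, consistent with the scale of the reported values; the transcripts below are placeholders.

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

predictions = ["hypothesis transcript"]  # placeholder decoded model outputs
references  = ["reference transcript"]   # placeholder ground-truth text

# evaluate returns error rates as fractions; multiply by 100 for the percentage scale.
print(100 * wer_metric.compute(predictions=predictions, references=references))
print(100 * cer_metric.compute(predictions=predictions, references=references))
```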

Framework versions

  • Transformers 4.36.2
  • Pytorch 2.0.0
  • Datasets 2.16.1
  • Tokenizers 0.15.0