whisper-small-multi-diar-wer

This model is a fine-tuned version of openai/whisper-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 10.4868
  • Wer: 100.0
  • Cer: 100.4922
  • Speech Scored: 10807
  • Speech Miss: 1926
  • Speech Falarm: 1656
  • Speaker Miss: 1926
  • Speaker Falarm: 1656
  • Speaker Error: 2927
  • Speaker Correct: 23.7093
  • Diarization Error: 6509
  • Frames: 10
  • Speaker Wide Frames: 12733
  • Speech Scored Ratio: 1080.7
  • Speech Miss Ratio: 192.6
  • Speech Falarm Ratio: 165.6
  • Speaker Correct Ratio: 2.3709
  • Speaker Miss Ratio: 0.1513
  • Speaker Falarm Ratio: 0.1301
  • Speaker Error Ratio: 0.2299
  • Diarization Error Ratio: 0.5112
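
The diarization numbers above decompose in the standard way: the diarization error is the sum of speaker miss, speaker false alarm, and speaker confusion, and the reported ratios divide these counts by the speaker-wide frame count. Below is a minimal consistency check against the values reported above (the actual scoring script is not part of this card):

```python
# Sanity-check the diarization-error breakdown reported above.
# All constants are copied from the evaluation results; this is not the
# scoring code itself, only a consistency check of the reported numbers.
speaker_miss = 1926
speaker_falarm = 1656
speaker_error = 2927          # speaker confusion
speaker_wide_frames = 12733   # "Speaker Wide Frames"

diarization_error = speaker_miss + speaker_falarm + speaker_error
assert diarization_error == 6509          # matches "Diarization Error"

der = diarization_error / speaker_wide_frames
print(f"Diarization Error Ratio = {der:.4f}")   # ~0.5112, as reported
```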

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 24
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 48
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 50
  • mixed_precision_training: Native AMP
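
For reference, here is a minimal sketch of how these hyperparameters map onto transformers' Seq2SeqTrainingArguments. The original training script is not included in this card, so `output_dir` and anything not listed above are placeholders, not the author's actual configuration:

```python
# Sketch only: restates the hyperparameters listed above using the standard
# transformers API. The real training script is not published in this card,
# so output_dir (and any omitted argument) is a placeholder assumption.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-small-multi-diar-wer",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=2,   # total train batch size: 48
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.05,
    num_train_epochs=50,
    fp16=True,                       # "Native AMP" mixed precision
)
```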

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer | Cer | Speech Scored | Speech Miss | Speech Falarm | Speaker Miss | Speaker Falarm | Speaker Error | Speaker Correct | Diarization Error | Frames | Speaker Wide Frames | Speech Scored Ratio | Speech Miss Ratio | Speech Falarm Ratio | Speaker Correct Ratio | Speaker Miss Ratio | Speaker Falarm Ratio | Speaker Error Ratio | Diarization Error Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 1 | 11.6412 | 100.0 | 189.4586 | 11732 | 1001 | 2061 | 1001 | 11893 | 2169 | 18.5120 | 15063 | 10 | 12733 | 1173.2 | 100.1 | 206.1 | 1.8512 | 0.0786 | 0.9340 | 0.1703 | 1.1830 |
| No log | 2.0 | 2 | 11.4959 | 209.5855 | 298.3593 | 11425 | 1308 | 2031 | 1308 | 5103 | 3052 | 21.6567 | 9463 | 10 | 12733 | 1142.5 | 130.8 | 203.1 | 2.1657 | 0.1027 | 0.4008 | 0.2397 | 0.7432 |
| No log | 3.0 | 3 | 11.2874 | 572.5389 | 437.2847 | 12060 | 673 | 2169 | 673 | 2169 | 3938 | 22.8547 | 6780 | 10 | 12733 | 1206.0 | 67.3 | 216.9 | 2.2855 | 0.0529 | 0.1703 | 0.3093 | 0.5325 |
| No log | 4.0 | 4 | 11.1333 | 535.2332 | 534.6596 | 12537 | 196 | 2237 | 196 | 2237 | 4141 | 22.8567 | 6574 | 10 | 12733 | 1253.7 | 19.6 | 223.7 | 2.2857 | 0.0154 | 0.1757 | 0.3252 | 0.5163 |
| No log | 5.0 | 5 | 11.0564 | 100.0 | 100.6153 | 12717 | 16 | 2264 | 16 | 2264 | 4222 | 22.8507 | 6502 | 10 | 12733 | 1271.7 | 1.6 | 226.4 | 2.2851 | 0.0013 | 0.1778 | 0.3316 | 0.5106 |
| No log | 6.0 | 6 | 10.9894 | 100.0 | 100.6153 | 12733 | 0 | 2267 | 0 | 2267 | 4232 | 22.8460 | 6499 | 10 | 12733 | 1273.3 | 0.0 | 226.7 | 2.2846 | 0.0 | 0.1780 | 0.3324 | 0.5104 |
| No log | 7.0 | 7 | 10.9209 | 100.0 | 100.6153 | 12733 | 0 | 2267 | 0 | 2267 | 4232 | 22.8460 | 6499 | 10 | 12733 | 1273.3 | 0.0 | 226.7 | 2.2846 | 0.0 | 0.1780 | 0.3324 | 0.5104 |
| No log | 8.0 | 8 | 10.8541 | 100.0 | 100.6153 | 12733 | 0 | 2267 | 0 | 2267 | 4232 | 22.8460 | 6499 | 10 | 12733 | 1273.3 | 0.0 | 226.7 | 2.2846 | 0.0 | 0.1780 | 0.3324 | 0.5104 |
| No log | 9.0 | 9 | 10.7928 | 100.0 | 100.6153 | 12705 | 28 | 2262 | 28 | 2262 | 4212 | 22.8573 | 6502 | 10 | 12733 | 1270.5 | 2.8 | 226.2 | 2.2857 | 0.0022 | 0.1776 | 0.3308 | 0.5106 |
| No log | 10.0 | 10 | 10.7401 | 100.0 | 100.6153 | 12535 | 198 | 2240 | 198 | 2240 | 4093 | 22.9173 | 6531 | 10 | 12733 | 1253.5 | 19.8 | 224.0 | 2.2917 | 0.0156 | 0.1759 | 0.3214 | 0.5129 |
| No log | 11.0 | 11 | 10.6969 | 100.0 | 100.6153 | 12279 | 454 | 2190 | 454 | 2190 | 3907 | 23.0280 | 6551 | 10 | 12733 | 1227.9 | 45.4 | 219.0 | 2.3028 | 0.0357 | 0.1720 | 0.3068 | 0.5145 |
| No log | 12.0 | 12 | 10.6622 | 100.0 | 100.6153 | 12126 | 607 | 2140 | 607 | 2140 | 3804 | 23.0967 | 6551 | 10 | 12733 | 1212.6 | 60.7 | 214.0 | 2.3097 | 0.0477 | 0.1681 | 0.2988 | 0.5145 |
| No log | 13.0 | 13 | 10.6341 | 100.0 | 100.6153 | 12131 | 602 | 2126 | 602 | 2126 | 3808 | 23.1040 | 6536 | 10 | 12733 | 1213.1 | 60.2 | 212.6 | 2.3104 | 0.0473 | 0.1670 | 0.2991 | 0.5133 |
| No log | 14.0 | 14 | 10.6127 | 100.0 | 100.6153 | 12202 | 531 | 2139 | 531 | 2139 | 3854 | 23.0813 | 6524 | 10 | 12733 | 1220.2 | 53.1 | 213.9 | 2.3081 | 0.0417 | 0.1680 | 0.3027 | 0.5124 |
| No log | 15.0 | 15 | 10.5973 | 100.0 | 100.6153 | 12253 | 480 | 2137 | 480 | 2137 | 3889 | 23.0700 | 6506 | 10 | 12733 | 1225.3 | 48.0 | 213.7 | 2.3070 | 0.0377 | 0.1678 | 0.3054 | 0.5110 |
| No log | 16.0 | 16 | 10.5863 | 100.0 | 100.6153 | 12239 | 494 | 2107 | 494 | 2107 | 3885 | 23.0860 | 6486 | 10 | 12733 | 1223.9 | 49.4 | 210.7 | 2.3086 | 0.0388 | 0.1655 | 0.3051 | 0.5094 |
| No log | 17.0 | 17 | 10.5787 | 100.0 | 100.6153 | 12158 | 575 | 2066 | 575 | 2066 | 3824 | 23.1407 | 6465 | 10 | 12733 | 1215.8 | 57.5 | 206.6 | 2.3141 | 0.0452 | 0.1623 | 0.3003 | 0.5077 |
| No log | 18.0 | 18 | 10.5715 | 100.0 | 100.6153 | 12030 | 703 | 2009 | 703 | 2009 | 3740 | 23.2053 | 6452 | 10 | 12733 | 1203.0 | 70.3 | 200.9 | 2.3205 | 0.0552 | 0.1578 | 0.2937 | 0.5067 |
| No log | 19.0 | 19 | 10.5638 | 100.0 | 100.6153 | 11866 | 867 | 1943 | 867 | 1943 | 3624 | 23.2947 | 6434 | 10 | 12733 | 1186.6 | 86.7 | 194.3 | 2.3295 | 0.0681 | 0.1526 | 0.2846 | 0.5053 |
| No log | 20.0 | 20 | 10.5554 | 100.0 | 100.6153 | 11695 | 1038 | 1892 | 1038 | 1892 | 3514 | 23.3613 | 6444 | 10 | 12733 | 1169.5 | 103.8 | 189.2 | 2.3361 | 0.0815 | 0.1486 | 0.2760 | 0.5061 |
| No log | 21.0 | 21 | 10.5470 | 100.0 | 100.6153 | 11588 | 1145 | 1854 | 1145 | 1854 | 3451 | 23.3993 | 6450 | 10 | 12733 | 1158.8 | 114.5 | 185.4 | 2.3399 | 0.0899 | 0.1456 | 0.2710 | 0.5066 |
| No log | 22.0 | 22 | 10.5397 | 100.0 | 100.6153 | 11556 | 1177 | 1840 | 1177 | 1840 | 3437 | 23.4060 | 6454 | 10 | 12733 | 1155.6 | 117.7 | 184.0 | 2.3406 | 0.0924 | 0.1445 | 0.2699 | 0.5069 |
| No log | 23.0 | 23 | 10.5330 | 100.0 | 100.6153 | 11562 | 1171 | 1834 | 1171 | 1834 | 3442 | 23.4073 | 6447 | 10 | 12733 | 1156.2 | 117.1 | 183.4 | 2.3407 | 0.0920 | 0.1440 | 0.2703 | 0.5063 |
| No log | 24.0 | 24 | 10.5265 | 100.0 | 100.6153 | 11583 | 1150 | 1838 | 1150 | 1838 | 3457 | 23.3987 | 6445 | 10 | 12733 | 1158.3 | 115.0 | 183.8 | 2.3399 | 0.0903 | 0.1443 | 0.2715 | 0.5062 |
| 5.1826 | 25.0 | 25 | 10.5191 | 100.0 | 100.6153 | 11574 | 1159 | 1834 | 1159 | 1834 | 3454 | 23.3993 | 6447 | 10 | 12733 | 1157.4 | 115.9 | 183.4 | 2.3399 | 0.0910 | 0.1440 | 0.2713 | 0.5063 |
| 5.1826 | 26.0 | 26 | 10.5103 | 100.0 | 100.6153 | 11539 | 1194 | 1826 | 1194 | 1826 | 3428 | 23.4160 | 6448 | 10 | 12733 | 1153.9 | 119.4 | 182.6 | 2.3416 | 0.0938 | 0.1434 | 0.2692 | 0.5064 |
| 5.1826 | 27.0 | 27 | 10.5003 | 100.0 | 100.6153 | 11466 | 1267 | 1802 | 1267 | 1802 | 3376 | 23.4527 | 6445 | 10 | 12733 | 1146.6 | 126.7 | 180.2 | 2.3453 | 0.0995 | 0.1415 | 0.2651 | 0.5062 |
| 5.1826 | 28.0 | 28 | 10.4912 | 100.0 | 100.6153 | 11387 | 1346 | 1780 | 1346 | 1780 | 3320 | 23.4893 | 6446 | 10 | 12733 | 1138.7 | 134.6 | 178.0 | 2.3489 | 0.1057 | 0.1398 | 0.2607 | 0.5062 |
| 5.1826 | 29.0 | 29 | 10.4834 | 100.0 | 100.5742 | 11300 | 1433 | 1754 | 1433 | 1754 | 3253 | 23.5380 | 6440 | 10 | 12733 | 1130.0 | 143.3 | 175.4 | 2.3538 | 0.1125 | 0.1378 | 0.2555 | 0.5058 |
| 5.1826 | 30.0 | 30 | 10.4772 | 100.0 | 100.5332 | 11248 | 1485 | 1739 | 1485 | 1739 | 3213 | 23.5667 | 6437 | 10 | 12733 | 1124.8 | 148.5 | 173.9 | 2.3567 | 0.1166 | 0.1366 | 0.2523 | 0.5055 |
| 5.1826 | 31.0 | 31 | 10.4723 | 100.0 | 100.5332 | 11207 | 1526 | 1736 | 1526 | 1736 | 3189 | 23.5733 | 6451 | 10 | 12733 | 1120.7 | 152.6 | 173.6 | 2.3573 | 0.1198 | 0.1363 | 0.2505 | 0.5066 |
| 5.1826 | 32.0 | 32 | 10.4676 | 100.0 | 100.5332 | 11163 | 1570 | 1724 | 1570 | 1724 | 3157 | 23.5947 | 6451 | 10 | 12733 | 1116.3 | 157.0 | 172.4 | 2.3595 | 0.1233 | 0.1354 | 0.2479 | 0.5066 |
| 5.1826 | 33.0 | 33 | 10.4632 | 100.0 | 100.5332 | 11134 | 1599 | 1718 | 1599 | 1718 | 3142 | 23.5993 | 6459 | 10 | 12733 | 1113.4 | 159.9 | 171.8 | 2.3599 | 0.1256 | 0.1349 | 0.2468 | 0.5073 |
| 5.1826 | 34.0 | 34 | 10.4616 | 100.0 | 100.5332 | 11106 | 1627 | 1712 | 1627 | 1712 | 3120 | 23.6140 | 6459 | 10 | 12733 | 1110.6 | 162.7 | 171.2 | 2.3614 | 0.1278 | 0.1345 | 0.2450 | 0.5073 |
| 5.1826 | 35.0 | 35 | 10.4624 | 100.0 | 100.4922 | 11080 | 1653 | 1704 | 1653 | 1704 | 3104 | 23.6233 | 6461 | 10 | 12733 | 1108.0 | 165.3 | 170.4 | 2.3623 | 0.1298 | 0.1338 | 0.2438 | 0.5074 |
| 5.1826 | 36.0 | 36 | 10.4619 | 100.0 | 100.4512 | 11047 | 1686 | 1698 | 1686 | 1698 | 3084 | 23.6320 | 6468 | 10 | 12733 | 1104.7 | 168.6 | 169.8 | 2.3632 | 0.1324 | 0.1334 | 0.2422 | 0.5080 |
| 5.1826 | 37.0 | 37 | 10.4617 | 100.0 | 100.4512 | 11005 | 1728 | 1691 | 1728 | 1691 | 3056 | 23.6460 | 6475 | 10 | 12733 | 1100.5 | 172.8 | 169.1 | 2.3646 | 0.1357 | 0.1328 | 0.2400 | 0.5085 |
| 5.1826 | 38.0 | 38 | 10.4616 | 100.0 | 100.4512 | 10972 | 1761 | 1684 | 1761 | 1684 | 3032 | 23.6607 | 6477 | 10 | 12733 | 1097.2 | 176.1 | 168.4 | 2.3661 | 0.1383 | 0.1323 | 0.2381 | 0.5087 |
| 5.1826 | 39.0 | 39 | 10.4629 | 100.0 | 100.4512 | 10955 | 1778 | 1681 | 1778 | 1681 | 3021 | 23.6660 | 6480 | 10 | 12733 | 1095.5 | 177.8 | 168.1 | 2.3666 | 0.1396 | 0.1320 | 0.2373 | 0.5089 |
| 5.1826 | 40.0 | 40 | 10.4668 | 100.0 | 100.4512 | 10948 | 1785 | 1679 | 1785 | 1679 | 3015 | 23.6707 | 6479 | 10 | 12733 | 1094.8 | 178.5 | 167.9 | 2.3671 | 0.1402 | 0.1319 | 0.2368 | 0.5088 |
| 5.1826 | 41.0 | 41 | 10.4696 | 100.0 | 100.4512 | 10940 | 1793 | 1679 | 1793 | 1679 | 3009 | 23.6733 | 6481 | 10 | 12733 | 1094.0 | 179.3 | 167.9 | 2.3673 | 0.1408 | 0.1319 | 0.2363 | 0.5090 |
| 5.1826 | 42.0 | 42 | 10.4722 | 100.0 | 100.4512 | 10928 | 1805 | 1677 | 1805 | 1677 | 3000 | 23.6787 | 6482 | 10 | 12733 | 1092.8 | 180.5 | 167.7 | 2.3679 | 0.1418 | 0.1317 | 0.2356 | 0.5091 |
| 5.1826 | 43.0 | 43 | 10.4757 | 100.0 | 100.4512 | 10894 | 1839 | 1672 | 1839 | 1672 | 2980 | 23.6860 | 6491 | 10 | 12733 | 1089.4 | 183.9 | 167.2 | 2.3686 | 0.1444 | 0.1313 | 0.2340 | 0.5098 |
| 5.1826 | 44.0 | 44 | 10.4790 | 100.0 | 100.4922 | 10851 | 1882 | 1663 | 1882 | 1663 | 2951 | 23.7020 | 6496 | 10 | 12733 | 1085.1 | 188.2 | 166.3 | 2.3702 | 0.1478 | 0.1306 | 0.2318 | 0.5102 |
| 5.1826 | 45.0 | 45 | 10.4821 | 100.0 | 100.4922 | 10840 | 1893 | 1657 | 1893 | 1657 | 2946 | 23.7053 | 6496 | 10 | 12733 | 1084.0 | 189.3 | 165.7 | 2.3705 | 0.1487 | 0.1301 | 0.2314 | 0.5102 |
| 5.1826 | 46.0 | 46 | 10.4847 | 100.0 | 100.4922 | 10829 | 1904 | 1655 | 1904 | 1655 | 2938 | 23.7100 | 6497 | 10 | 12733 | 1082.9 | 190.4 | 165.5 | 2.3710 | 0.1495 | 0.1300 | 0.2307 | 0.5102 |
| 5.1826 | 47.0 | 47 | 10.4862 | 100.0 | 100.4922 | 10821 | 1912 | 1655 | 1912 | 1655 | 2932 | 23.7127 | 6499 | 10 | 12733 | 1082.1 | 191.2 | 165.5 | 2.3713 | 0.1502 | 0.1300 | 0.2303 | 0.5104 |
| 5.1826 | 48.0 | 48 | 10.4865 | 100.0 | 100.4922 | 10813 | 1920 | 1656 | 1920 | 1656 | 2930 | 23.7093 | 6506 | 10 | 12733 | 1081.3 | 192.0 | 165.6 | 2.3709 | 0.1508 | 0.1301 | 0.2301 | 0.5110 |
| 5.1826 | 49.0 | 49 | 10.4866 | 100.0 | 100.4922 | 10807 | 1926 | 1656 | 1926 | 1656 | 2927 | 23.7093 | 6509 | 10 | 12733 | 1080.7 | 192.6 | 165.6 | 2.3709 | 0.1513 | 0.1301 | 0.2299 | 0.5112 |
| 4.8334 | 50.0 | 50 | 10.4868 | 100.0 | 100.4922 | 10807 | 1926 | 1656 | 1926 | 1656 | 2927 | 23.7093 | 6509 | 10 | 12733 | 1080.7 | 192.6 | 165.6 | 2.3709 | 0.1513 | 0.1301 | 0.2299 | 0.5112 |
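
Validation loss decreases steadily while WER stays at 100.0 from epoch 5 onward, and the diarization error ratio levels off around 0.51. A quick way to eyeball that trend from the table (the values below are copied from a few representative rows, not re-computed):

```python
# Plot validation loss and diarization error ratio over training.
# Values are copied from representative rows of the table above.
import matplotlib.pyplot as plt

epochs = [1, 10, 25, 50]
val_loss = [11.6412, 10.7401, 10.5191, 10.4868]
der_ratio = [1.1830, 0.5129, 0.5063, 0.5112]

fig, ax_loss = plt.subplots()
ax_loss.plot(epochs, val_loss, marker="o", color="tab:blue")
ax_loss.set_xlabel("epoch")
ax_loss.set_ylabel("validation loss", color="tab:blue")

ax_der = ax_loss.twinx()
ax_der.plot(epochs, der_ratio, marker="s", color="tab:red")
ax_der.set_ylabel("diarization error ratio", color="tab:red")

fig.tight_layout()
plt.show()
```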

Framework versions

  • Transformers 4.36.2
  • Pytorch 2.0.0
  • Datasets 2.16.1
  • Tokenizers 0.15.0