marma committed on
Commit ce279e0 · 1 Parent(s): 81f9b47

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -26,11 +26,11 @@ model-index:
  type: wer
  value: 9.914
  ---
- # Wav2vec 2.0 large VoxRex Swedish (B)
+ # Wav2vec 2.0 large VoxRex Swedish (C)
 
  **Disclaimer:** This is a work in progress. See [VoxRex](https://huggingface.co/KBLab/wav2vec2-large-voxrex) for more details.
 
- Finetuned version of KB's [VoxRex large](https://huggingface.co/KBLab/wav2vec2-large-voxrex) model using Swedish radio broadcasts, NST and Common Voice data. Evaluation without a language model gives the following: WER for NST + Common Voice test set (2% of total sentences) is **3.617%**. WER for Common Voice test set is **9.914%** directly and **7.77%** with a 4-gram language model.
+ Finetuned version of KB's [VoxRex large](https://huggingface.co/KBLab/wav2vec2-large-voxrex) model using Swedish radio broadcasts, NST and Common Voice data. Evaluation without a language model gives the following: WER for NST + Common Voice test set (2% of total sentences) is **2.5%**. WER for Common Voice test set is **8.49%** directly and **7.37%** with a 4-gram language model.
 
  When using this model, make sure that your speech input is sampled at 16kHz.
 
@@ -40,7 +40,7 @@ When using this model, make sure that your speech input is sampled at 16kHz.
  <center>*<i>Chart shows performance without the additional 20k steps of Common Voice fine-tuning</i></center>
 
  ## Training
- This model has been fine-tuned for 120000 updates on NST + CommonVoice and then for an additional 20000 updates on CommonVoice only. The additional fine-tuning on CommonVoice hurts performance on the NST+CommonVoice test set somewhat and, unsurprisingly, improves it on the CommonVoice test set. It seems to perform generally better though [citation needed].
+ This model has been fine-tuned for 120000 updates on NST + CommonVoice<del> and then for an additional 20000 updates on CommonVoice only. The additional fine-tuning on CommonVoice hurts performance on the NST+CommonVoice test set somewhat and, unsurprisingly, improves it on the CommonVoice test set. It seems to perform generally better though [citation needed]</del>.
 
  ![WER during training](chart_1.svg "WER")
 
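
The WER figures changed in this commit are word error rates: the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The README does not say how the authors computed them, so the following is only a minimal sketch of the standard definition. The function name `word_error_rate` is hypothetical, and it assumes both strings are already normalized (casing, punctuation) and split on whitespace.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from output
            insertion = curr[j - 1] + 1   # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# One of four reference words is missing from the hypothesis.
print(f"{word_error_rate('det var en gång', 'det var gång'):.1%}")  # 25.0%
```

A corpus-level figure such as the ones in the README is normally the total number of word errors over the whole test set divided by the total number of reference words, not the mean of per-sentence rates.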