Update README.md
README.md
CHANGED
@@ -26,11 +26,11 @@ model-index:
 type: wer
 value: 9.914
 ---
-# Wav2vec 2.0 large VoxRex Swedish (

 **Disclaimer:** This is a work in progress. See [VoxRex](https://huggingface.co/KBLab/wav2vec2-large-voxrex) for more details.

-Fine-tuned version of KB's [VoxRex large](https://huggingface.co/KBLab/wav2vec2-large-voxrex) model using Swedish radio broadcasts, NST and Common Voice data. Evaluation without a language model gives the following: WER for NST + Common Voice test set (2% of total sentences) is **

 When using this model, make sure that your speech input is sampled at 16kHz.

@@ -40,7 +40,7 @@ When using this model, make sure that your speech input is sampled at 16kHz.
 <center>*<i>Chart shows performance without the additional 20k steps of Common Voice fine-tuning</i></center>

 ## Training
-This model has been fine-tuned for 120000 updates on NST + CommonVoice and then for an additional 20000 updates on CommonVoice only. The additional fine-tuning on CommonVoice hurts performance on the NST+CommonVoice test set somewhat and, unsurprisingly, improves it on the CommonVoice test set. It seems to perform generally better though [citation needed]

 ![WER during training](chart_1.svg "WER")

@@ -26,11 +26,11 @@ model-index:
 type: wer
 value: 9.914
 ---
+# Wav2vec 2.0 large VoxRex Swedish (C)

 **Disclaimer:** This is a work in progress. See [VoxRex](https://huggingface.co/KBLab/wav2vec2-large-voxrex) for more details.

+Fine-tuned version of KB's [VoxRex large](https://huggingface.co/KBLab/wav2vec2-large-voxrex) model using Swedish radio broadcasts, NST and Common Voice data. Evaluation without a language model gives the following: WER for NST + Common Voice test set (2% of total sentences) is **2.5%**. WER for Common Voice test set is **8.49%** directly and **7.37%** with a 4-gram language model.
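
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate (WER) can be computed for a single sentence pair: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; this is not the evaluation script used to produce the figures above, and corpus-level scores are normally obtained by pooling edits over all sentences.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenization is plain
    whitespace splitting, with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

With this definition, a hypothesis that drops one word from a four-word reference scores 0.25, and insertions can push the value above 1.0.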

 When using this model, make sure that your speech input is sampled at 16kHz.
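
As a rough illustration of the 16 kHz requirement, the sketch below checks the sampling rate of a WAV file before it is passed to the model. It uses only Python's standard `wave` module, which reads uncompressed PCM WAV files; the helper name `is_16khz_wav` and the constant `EXPECTED_RATE_HZ` are hypothetical. Audio at another rate would need to be resampled with an audio library of your choice, which this sketch does not cover.

```python
import wave

EXPECTED_RATE_HZ = 16_000  # sampling rate stated in this README


def is_16khz_wav(path: str) -> bool:
    """Return True if the PCM WAV file at `path` is sampled at 16 kHz.

    Hypothetical helper for illustration only.
    """
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate() == EXPECTED_RATE_HZ
```

A check like this could run at the start of a transcription script, so that mismatched input is caught early instead of producing degraded transcriptions.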

@@ -40,7 +40,7 @@ When using this model, make sure that your speech input is sampled at 16kHz.
 <center>*<i>Chart shows performance without the additional 20k steps of Common Voice fine-tuning</i></center>

 ## Training
+This model has been fine-tuned for 120000 updates on NST + CommonVoice<del> and then for an additional 20000 updates on CommonVoice only. The additional fine-tuning on CommonVoice hurts performance on the NST+CommonVoice test set somewhat and, unsurprisingly, improves it on the CommonVoice test set. It seems to perform generally better though [citation needed]</del>.

 ![WER during training](chart_1.svg "WER")