Update README.md

README.md

metrics:
- Accuracy
widget:
- example_title: Australian English
  src: data/australia_1.wav
- example_title: African English
  src: data/african_1.wav
- example_title: Canadian English
  src: data/canada_1.wav
---

# Accent Identification from Speech Recordings with ECAPA embeddings on CommonAccent

This repository provides all the necessary tools to perform accent identification from speech recordings with [SpeechBrain](https://github.com/speechbrain/speechbrain).
The system uses a model pretrained on the CommonAccent dataset in English (16 accents) and is based on the CommonLanguage recipe: https://github.com/speechbrain/speechbrain/tree/develop/recipes/CommonLanguage

The provided system can recognize the following 16 accents of English from short speech recordings:

```
african australia bermuda canada england hongkong indian ireland malaysia newzealand philippines scotland singapore southatlandtic us wales
```

<a href="https://github.com/JuanPZuluaga/accent-recog-slt2022"> <img alt="GitHub" src="https://img.shields.io/badge/GitHub-Open%20source-green"> </a> GitHub repository link: https://github.com/JuanPZuluaga/accent-recog-slt2022

### To UPDATE ALL BELOW

For a better experience, we encourage you to learn more about
[SpeechBrain](https://speechbrain.github.io). The given model performance on the test set is:

| Release (dd/mm/yyyy) | Accuracy (%) |
|:--------------------:|:------------:|
| 01-08-2023 (this model) | 87 |
| 01-08-2023 (this model trained without data augmentation) | 85 |
| 01-08-2023 (this model trained from scratch, no parameter transfer) | 82 |

## Pipeline description

```python
import torchaudio
from speechbrain.pretrained import EncoderClassifier

classifier = EncoderClassifier.from_hparams(source="Jzuluaga/accent-id-commonaccent_ecapa", savedir="pretrained_models/accent-id-commonaccent_ecapa")

# Irish Example
out_prob, score, index, text_lab = classifier.classify_file('Jzuluaga/accent-id-commonaccent_ecapa/data/ireland_1.wav')
print(text_lab)

# Malaysia Example
out_prob, score, index, text_lab = classifier.classify_file('Jzuluaga/accent-id-commonaccent_ecapa/data/malaysia_1.wav')
print(text_lab)
```
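
The snippet above only uses `classify_file`, which takes a path; `torchaudio` is imported but unused. If you already have a recording on disk, you can load it yourself and pass the waveform tensor to `classify_batch`, which returns the same four values (classifier outputs, best score, class index, decoded accent label). A minimal sketch, assuming a hypothetical local mono file `your_audio.wav` sampled at the rate the model expects:

```python
import torchaudio
from speechbrain.pretrained import EncoderClassifier

classifier = EncoderClassifier.from_hparams(
    source="Jzuluaga/accent-id-commonaccent_ecapa",
    savedir="pretrained_models/accent-id-commonaccent_ecapa",
)

# torchaudio.load returns a [channels, time] tensor plus the sampling rate;
# a mono file gives a [1, time] tensor, i.e. a batch of one waveform.
signal, fs = torchaudio.load("your_audio.wav")  # hypothetical local file

# classify_batch takes a batch of waveforms instead of a file path.
out_prob, score, index, text_lab = classifier.classify_batch(signal)
print(text_lab)
```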

To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
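
For example, the classifier can be loaded onto the GPU as follows (a minimal sketch of the `run_opts` usage described above; it assumes a CUDA-capable device is available):

```python
import torch
from speechbrain.pretrained import EncoderClassifier

# Load the pretrained model on the GPU when one is available;
# subsequent classify_file / classify_batch calls then run on that device.
device = "cuda" if torch.cuda.is_available() else "cpu"
classifier = EncoderClassifier.from_hparams(
    source="Jzuluaga/accent-id-commonaccent_ecapa",
    savedir="pretrained_models/accent-id-commonaccent_ecapa",
    run_opts={"device": device},
)
```
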
### Training

The model was trained with SpeechBrain.
To train it from scratch, follow these steps:

1. Clone SpeechBrain:
```bash
git clone https://github.com/speechbrain/speechbrain/
```
2. Install it:
```bash
cd speechbrain
pip install -r requirements.txt
pip install -e .
```

3. Clone our repository at https://github.com/JuanPZuluaga/accent-recog-slt2022:
```bash
git clone https://github.com/JuanPZuluaga/accent-recog-slt2022
cd CommonAccent/accent_id
python train.py hparams/train_ecapa_tdnn.yaml
```

You can find our training results (models, logs, etc.) on this repository's `Files and versions` page.

### Limitations
The SpeechBrain team does not provide any warranty on the performance achieved by this model when used on other datasets.

#### Referencing ECAPA

```
@inproceedings{DBLP:conf/interspeech/DesplanquesTD20,
  author    = {Brecht Desplanques and
               Jenthe Thienpondt and