distil-whisper/earnings21
A collection of long-form (samples > 30s) datasets used to evaluate the Distil-Whisper models.
Note Config: "full" Split: "test"
Note Config: "full" Split: "test"
Note Config: "default" Split: "test"
Note Config: "whisper_subset" Split: "test". We evaluate on a subset of 16 files from the 30 total podcast episodes. The Whisper paper states that the audio and labels do not match in the other files, so they are excluded from the benchmark. This 16-file subset corresponds to the config "whisper_subset".
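The config and split named in each note can be passed straight to the Hugging Face `datasets` library. The following is a minimal sketch, not the authors' evaluation code. The helper names `build_load_kwargs` and `load_eval_split` are hypothetical, and streaming is an assumed default chosen to avoid downloading long audio files up front. The notes above do not say which dataset each one belongs to, so the repository id is left as a parameter.

```python
def build_load_kwargs(repo_id, config, split="test", streaming=True):
    """Collect the keyword arguments for datasets.load_dataset in one place.

    repo_id   -- Hub dataset id, e.g. "distil-whisper/earnings21"
    config    -- config name from the note, e.g. "full" or "whisper_subset"
    split     -- split name from the note ("test" in every note above)
    streaming -- assumption: stream samples instead of downloading everything
    """
    return {
        "path": repo_id,
        "name": config,
        "split": split,
        "streaming": streaming,
    }


def load_eval_split(repo_id, config, split="test", streaming=True):
    """Load one evaluation split (hypothetical helper, needs network access)."""
    # Imported lazily so build_load_kwargs works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(**build_load_kwargs(repo_id, config, split, streaming))
```

For example, `load_eval_split("distil-whisper/earnings21", "full")` would request the "full" config and "test" split, assuming that note applies to this dataset. For the 16-file subset, pass the relevant repository id together with `"whisper_subset"`.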