---
tags:
- espnet
- audio
- automatic-speech-recognition
- audio_captioning
language: en
datasets:
- clotho_v2
- slseanwu/clotho-chatgpt-mixup-50K
- audiocaps
license: cc-by-4.0
---