Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale Paper • 2306.15687 • Published Jun 23, 2023
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models Paper • 2403.03100 • Published Mar 5, 2024 • 34
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization Paper • 2404.09956 • Published Apr 15, 2024 • 11
Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts Paper • 2307.07218 • Published Jul 14, 2023 • 26
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias Paper • 2306.03509 • Published Jun 6, 2023 • 4