
Audio prompt?

#5
by apepkuss79 - opened

Could you please provide an example of an audio prompt? We would like to support the model in LlamaEdge. Thanks a lot!

OuteAI org

I'd suggest looking at the prompt creation class and its get_training_prompt function.

The full prompt looks like this:

<|im_start|>
<|text_start|>this<|text_sep|>is<|text_sep|>how<|text_sep|>it<|text_sep|>looks<|text_end|>
<|audio_start|>
this<|t_0.27|><|code_start|><|1362|><|1474|><|1023|><|1284|><|1338|><|555|><|1536|><|1246|><|712|><|1570|><|1346|><|1815|><|1004|><|616|><|1583|><|863|><|1518|><|507|><|18|><|713|><|code_end|>
is<|t_0.12|><|code_start|><|154|><|1382|><|903|><|576|><|1826|><|1018|><|1394|><|654|><|6|><|code_end|>
how<|t_0.17|><|code_start|><|1238|><|1082|><|0|><|1795|><|1256|><|757|><|1471|><|1182|><|159|><|1330|><|393|><|462|><|1718|><|code_end|>
it<|t_0.11|><|code_start|><|260|><|1534|><|697|><|959|><|110|><|717|><|1401|><|1800|><|code_end|>
looks<|t_0.36|><|code_start|><|513|><|1081|><|1595|><|806|><|1472|><|755|><|607|><|700|><|1252|><|107|><|1199|><|36|><|1005|><|516|><|1348|><|1735|><|345|><|1118|><|1320|><|1700|><|989|><|372|><|1450|><|746|><|493|><|640|><|231|><|code_end|>
<|audio_end|>
<|im_end|>
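
For anyone assembling this layout by hand, here is a minimal sketch that rebuilds the structure shown above from per-word data. It is not the library's get_training_prompt. The function name build_prompt and the input dictionary keys ("word", "duration", "codes") are hypothetical, and joining the pieces with newlines is an assumption based only on how the example is displayed in this post.

```python
from typing import Dict, List, Union

WordEntry = Dict[str, Union[str, float, List[int]]]


def build_prompt(words: List[WordEntry]) -> str:
    """Assemble a prompt in the layout shown in this thread.

    Hypothetical helper: each entry holds a word, its duration in
    seconds, and the list of audio codes for that word.
    """
    # Text section: words joined by the <|text_sep|> token.
    text = "<|text_sep|>".join(str(w["word"]) for w in words)
    lines = [
        "<|im_start|>",
        f"<|text_start|>{text}<|text_end|>",
        "<|audio_start|>",
    ]
    # Audio section: one line per word with a duration token and its codes.
    for w in words:
        codes = "".join(f"<|{c}|>" for c in w["codes"])
        lines.append(
            f"{w['word']}<|t_{w['duration']:.2f}|>"
            f"<|code_start|>{codes}<|code_end|>"
        )
    lines += ["<|audio_end|>", "<|im_end|>"]
    # Assumption: sections are separated by newlines, as displayed above.
    return "\n".join(lines)


if __name__ == "__main__":
    example = [
        {"word": "is", "duration": 0.12,
         "codes": [154, 1382, 903, 576, 1826, 1018, 1394, 654, 6]},
        {"word": "it", "duration": 0.11,
         "codes": [260, 1534, 697, 959, 110, 717, 1401, 1800]},
    ]
    print(build_prompt(example))
```

The durations are formatted to two decimals to match tokens such as <|t_0.12|> in the example. Whether the tokenizer expects exactly this whitespace should be checked against the prompt creation class itself.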
