Can you explain what this is?

#1
by jpgallegoar - opened

Hello, thank you for your contribution. Could you explain what this model is, the difference with the base model, how you created it, and how it improves upon the base model?

After some experimentation, we've come to the realization that this is probably a distilled model that requires fewer steps to function. Do you have any idea what the optimal parameters are for generating videos with this model?

FastVideo org

The readme is updated!

Thank you! Is the guidance scale still set to 6? Also, running this in ComfyUI, we're getting much worse results with the settings you mentioned than with something like 10 steps, 10 guidance, 12 flow_shift. Did you make any other changes to the code? Your examples look great. (I am running fp8.)

FastVideo org
edited 9 days ago

cfg:6, step:6, shift:17, resolution:720X1280X125
You can follow the instructions in FastVideo to reproduce the examples.
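
For anyone wiring these values into a different pipeline, here is a minimal sketch that parses the settings string above into named fields. The class and field names are hypothetical (they are not FastVideo's API), and reading 720X1280X125 as height x width x frames is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingSettings:
    # Hypothetical container; field names are illustrative, not FastVideo's API.
    guidance_scale: float  # "cfg" in the post
    num_steps: int         # "step" in the post
    flow_shift: float      # "shift" in the post
    height: int
    width: int
    num_frames: int


def parse_settings(text: str) -> SamplingSettings:
    """Parse a string like 'cfg:6, step:6, shift:17, resolution:720X1280X125'."""
    fields = {}
    for part in text.split(","):
        key, _, value = part.partition(":")
        fields[key.strip().lower()] = value.strip()
    # Assumption: the resolution triple is height x width x frames.
    height, width, frames = (int(v) for v in fields["resolution"].upper().split("X"))
    return SamplingSettings(
        guidance_scale=float(fields["cfg"]),
        num_steps=int(fields["step"]),
        flow_shift=float(fields["shift"]),
        height=height,
        width=width,
        num_frames=frames,
    )


recommended = parse_settings("cfg:6, step:6, shift:17, resolution:720X1280X125")
print(recommended)
```

Keeping the values in one object makes it easier to confirm that a ComfyUI workflow and the FastVideo script are really being run with the same numbers.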

I just tried this in bf16 in the ComfyUI-HunyuanVideoWrapper with the suggested guidance scale of 6, 6 steps, shift 17 at 720x1280 and it... somewhat works? The quality is nowhere near as good as the examples though.

Try my settings and let me know: 10 steps, 10 guidance, 12 flow_shift, or 12 steps, 11 guidance, 9 flow_shift.

Please tell me which one you prefer

Were these settings determined via trial and error? How do you feel it compares to full step gens using vanilla Hunyuan?

My friend also tried the recommended settings from the model card in a different Hunyuan pipeline and was also unable to reproduce the quality of the examples shown.

Appreciate the efforts by the authors, but results feel underwhelming to me as well sadly.

Needs more steps (10 absolute minimum), and at that point just using the original weights and bumping up shift to 17.0 is probably the best method.
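
To see why shift matters so much at low step counts, here is a rough sketch of how a shift value reshapes the noise schedule. It assumes "shift" / "flow_shift" refers to the SD3-style timestep shift, sigma' = shift * sigma / (1 + (shift - 1) * sigma), applied to uniformly spaced sigmas; the function name is made up for illustration, and the actual pipelines may differ in details.

```python
def shifted_sigmas(num_steps: int, shift: float) -> list[float]:
    """Return num_steps noise levels after applying a timestep shift.

    Assumption: sigmas start uniformly spaced in (0, 1] and are remapped with
    sigma' = shift * sigma / (1 + (shift - 1) * sigma).
    """
    uniform = [1.0 - i / num_steps for i in range(num_steps)]
    return [shift * s / (1.0 + (shift - 1.0) * s) for s in uniform]


# Compare the settings discussed in this thread at a glance.
for steps, shift in [(6, 17.0), (10, 12.0), (12, 9.0)]:
    sigmas = shifted_sigmas(steps, shift)
    print(steps, shift, [round(s, 3) for s in sigmas])
```

Under this assumption, a larger shift keeps most of the steps at high noise levels, so a short schedule spends its few steps on global structure and leaves a large final jump to the clean sample.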

Yeah, trial and error. It's definitely worse than base Hunyuan, but if you just wanna play around and gen quickly, it's alright.

FastVideo org
edited 8 days ago

Thanks for your feedback. The sample we provide was generated with cfg: 6, steps: 6, shift: 17, resolution: 720X1280X125, and seed 1024 in our repo. To our knowledge, the quality varies across seeds.

FastVideo org
edited 7 days ago

Thanks for the feedback. We will definitely try to improve the quality. For now, you can reproduce the example videos by following the guide in FastVideo.
Also note that we only distilled at 720X1280X125, so this might not work well at other resolutions.
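
A small guard like the one below can catch accidental mismatches before a long render. This is only a sketch: the helper name is hypothetical, and it assumes 720X1280X125 means height x width x frames.

```python
import warnings

# Assumption: height x width x frames, the only shape the authors say they distilled on.
DISTILLED_SHAPE = (720, 1280, 125)


def check_shape(height: int, width: int, num_frames: int) -> bool:
    """Return True if the request matches the distilled shape; warn otherwise."""
    requested = (height, width, num_frames)
    if requested != DISTILLED_SHAPE:
        warnings.warn(
            f"Requested {requested}, but the model was distilled at "
            f"{DISTILLED_SHAPE}; quality may degrade."
        )
        return False
    return True


print(check_shape(720, 1280, 125))
```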

Hello, I have gotten much better results using other parameters. These are some comparisons:
https://streamable.com/1jusbj

Maybe a small second adapter, trained to enhance the 6-step output and also run for 6 steps, could improve overall quality at a 12-step cost and constant memory, just by applying different adapters. For example, the first 6 steps quickly get to the high resolution; the second adapter starts all of its training from this model's finished 6th-step output and produces higher-quality generations through detail-specific amplification. If this interests you, adding a motion step sequence may be useful too. You may find that separate adapters, at the cost of loading the new model, come with step speedups and quality improvements. Does anyone know of any papers on something like this, i.e. adapter-stepped video generation?

Step-specific, adapter-based generation.
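
To make the idea concrete, here is a purely illustrative sketch of the bookkeeping only: which adapter would be active at each step and where the swaps would happen. The adapter names are made up, no real model or library is involved, and nothing here says whether the approach would actually improve quality.

```python
def adapter_schedule(stages):
    """Expand [(adapter_name, num_steps), ...] into one adapter name per step."""
    schedule = []
    for name, num_steps in stages:
        schedule.extend([name] * num_steps)
    return schedule


def swap_points(schedule):
    """Return the step indices at which a different adapter must be loaded."""
    return [
        i for i, name in enumerate(schedule)
        if i == 0 or name != schedule[i - 1]
    ]


# Hypothetical adapters: the existing 6-step one, then a detail-focused refiner.
stages = [("fast_6step", 6), ("detail_refiner", 6)]
schedule = adapter_schedule(stages)
swaps = swap_points(schedule)
print(schedule)
print(swaps)
```

In this layout only one adapter is resident at a time, which is where the "constant memory, 12-step cost" framing comes from; the price is one extra weight load at each swap point.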
