Hardware recommendations?

#6
by k-nearest-neighbor - opened

What's the minimum and recommended VRAM?
I see the space is running on an A100 40GB.
Any guidance on tradeoffs of popular cards?

Excited!

Lightricks org

People have successfully run the model with 6 GB of VRAM and 16 GB of system RAM using a few tricks (a quantized text encoder, etc.), generating at 512×512 resolution with 50 frames. This is the lowest VRAM requirement we've seen so far!
On an RTX 4090, users have generated 121 frames in 11 seconds, and on top-tier hardware (an H100 on fal.ai) it can generate a 512×768 video with 121 frames in just 4 seconds.

So the tradeoffs boil down to speed: lower-VRAM setups may require reduced resolution, a lower frame count, or longer generation times, while higher-end hardware unlocks lightning-fast performance and higher resolutions.
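
For anyone wondering what those tricks look like in practice, here is a minimal, illustrative sketch using the diffusers integration, quantizing the text encoder and the denoising transformer to 8-bit with bitsandbytes. The prompt, frame count, and output settings are just examples, not an official recipe:

```python
import torch
from transformers import BitsAndBytesConfig, T5EncoderModel
from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig
from diffusers import LTXPipeline, LTXVideoTransformer3DModel
from diffusers.utils import export_to_video

repo = "Lightricks/LTX-Video"

# 8-bit text encoder (transformers-side bitsandbytes config)
text_encoder = T5EncoderModel.from_pretrained(
    repo, subfolder="text_encoder",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float16,
)

# 8-bit denoising transformer (diffusers-side bitsandbytes config)
transformer = LTXVideoTransformer3DModel.from_pretrained(
    repo, subfolder="transformer",
    quantization_config=DiffusersBitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float16,
)

# device_map="balanced" lets diffusers place components on available devices
pipe = LTXPipeline.from_pretrained(
    repo, text_encoder=text_encoder, transformer=transformer,
    torch_dtype=torch.float16, device_map="balanced",
)

video = pipe(
    prompt="a mountain stream at dawn, cinematic",
    width=512, height=512,
    num_frames=49,  # frame counts of the form 8k + 1 (49, 121, ...) are typical for LTX
).frames[0]
export_to_video(video, "ltx_512.mp4", fps=24)
```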

Thank you! Can you please tell how to quantize it? My friend has 8 GB VRAM and we wanted to run, but we got CUDA memory error.

If a model's weights are above 8 GB, don't expect it to run on an 8 GB card.. just use CogVideoX with CPU offload
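
For what it's worth, CPU offload in diffusers is a one-liner. A minimal sketch with CogVideoX (the 2B checkpoint and prompt are just examples):

```python
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-2b", torch_dtype=torch.float16)
pipe.enable_model_cpu_offload()  # stream submodules to the GPU one at a time
pipe.vae.enable_tiling()         # tiled VAE decode trims the memory peak at the end

video = pipe(prompt="a panda playing guitar in a bamboo forest", num_frames=49).frames[0]
export_to_video(video, "cogvideox.mp4", fps=8)
```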

Try as I might, I can't get this to generate anything at all, regardless of the settings:

torch.OutOfMemoryError: HIP out of memory. Tried to allocate 160.00 MiB. GPU 0 has a total capacity of 23.98 GiB of which 66.00 MiB is free. Of the allocated memory 23.39 GiB is allocated by PyTorch, and 355.11 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_HIP_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

Could you please give me some settings or other guidance to work with, if this is indeed supposed to work on consumer cards?
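
Not an official answer, but two things usually help with this error: the allocator setting the message itself suggests, plus offload and tiled decoding so the whole pipeline never sits on the card at once. A minimal sketch, assuming the diffusers LTXPipeline integration (and that your diffusers version exposes tiled VAE decode for LTX):

```python
import os
# Must be set before torch initializes the ROCm allocator, per the error message.
os.environ["PYTORCH_HIP_ALLOC_CONF"] = "expandable_segments:True"

import torch
from diffusers import LTXPipeline

pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # keep only the active submodule on the 24 GB card
pipe.vae.enable_tiling()         # decode the latent video in tiles to cut the peak

video = pipe(
    prompt="a slow pan across a misty pine forest",
    width=512, height=512,
    num_frames=49,  # start small, then scale resolution/frames back up
).frames[0]
```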

Where's the best place to find the quantisations? @benibraz

8 GB VRAM RTX 4060 working.. the default workflow eats up RAM and GPU .. I am making a tutorial soon

Where will you be posting your tutorial? thx
