Fails to run on two A10Gs
#25
opened by yNilay
Hi! The model fails to run on two A10Gs. Is there any way to make it work on this setup? Thanks!
Error:
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 522.00 MiB. GPU 0 has a total capacity of 22.18 GiB of which 328.69 MiB is free. Process 293155 has 21.86 GiB memory in use. Of the allocated memory 20.74 GiB is allocated by PyTorch, and 858.09 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
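The traceback's own suggestion is worth trying first: letting the CUDA caching allocator use expandable segments can reclaim the ~858 MiB that is reserved but unallocated. A minimal sketch (the env-var name and value come straight from the error message; `run_mochi.py` is a placeholder for your own script):

```shell
# Allow the CUDA caching allocator to grow segments instead of fragmenting;
# this is the mitigation suggested directly in the OOM message above.
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
python run_mochi.py  # placeholder name for the script in this issue
```

This only reduces fragmentation overhead; it will not shard the model across both GPUs.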
Code:
import torch
from diffusers import MochiPipeline
from diffusers.utils import export_to_video
pipe = MochiPipeline.from_pretrained("genmo/mochi-1-preview", variant="bf16", torch_dtype=torch.bfloat16)
# Enable memory savings
pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()
prompt = "Close-up of a chameleon's eye, with its scaly skin changing color. Ultra high resolution 4k."
frames = pipe(prompt, num_frames=84).frames[0]
export_to_video(frames, "mochi.mp4", fps=30)
@ved-genmo Can you please suggest what to do?
Can you try using the official Mochi API instead of the diffusers API? https://github.com/genmoai/mochi
There's a cli.py script in demos that should automatically shard the model across multiple GPUs.
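For reference, a rough sketch of that route. The clone URL is from the link above; the install command, flag name, and weights path are assumptions about the repo layout, so check the README in genmoai/mochi for the current instructions:

```shell
# Clone the official repo and install it (extras may be required; see its README).
git clone https://github.com/genmoai/mochi
cd mochi
pip install -e .

# Run the demo CLI, which is expected to shard the model across visible GPUs.
# --model_dir and the weights location are assumptions; consult the README.
python3 ./demos/cli.py --model_dir <path-to-downloaded-weights>
```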