How many GPUs do we need to run this out of the box?

by kz919 - opened

Sorry for the noob question, but I can't find that information anywhere.
Assuming A100s.

I got it running on a single A100 in 4-bit and 8-bit. Higher precision might require 2 or 3.
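For a rough sanity check on those numbers, here is a back-of-the-envelope sketch in Python. It only counts weight memory (parameters times bytes per parameter) plus a flat overhead factor. The parameter count, the 80 GB card size (A100s also come in 40 GB), and the 1.2 overhead factor are all placeholder assumptions, not figures for this model, so plug in the real values from the model card.

```python
import math

# Bytes per parameter for common weight precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def gpus_needed(n_params, precision, gpu_mem_gb=80.0, overhead=1.2):
    """Estimate how many GPUs are needed just to hold the weights.

    overhead is an assumed fudge factor for activations, KV cache and
    framework buffers; real usage depends on batch size and context length.
    """
    weight_gb = n_params * BYTES_PER_PARAM[precision] / 1e9
    return math.ceil(weight_gb * overhead / gpu_mem_gb)


# Hypothetical parameter count, for illustration only.
HYPOTHETICAL_PARAMS = 40e9

for precision in ("int4", "int8", "fp16", "fp32"):
    print(precision, gpus_needed(HYPOTHETICAL_PARAMS, precision))
```

Treat the result as a lower bound: long contexts and larger batches can push actual memory use well past the weight footprint.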

Looking for some feedback from anyone here. Has anyone benchmarked tokens per second on a single A100?
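If anyone wants to report comparable numbers, a minimal timing sketch in Python might look like the following. `tokens_per_second` and `fake_generate` are hypothetical names made up for illustration; `fake_generate` just sleeps and stands in for whatever call actually produces tokens with your setup.

```python
import time


def tokens_per_second(generate_fn, n_new_tokens, n_runs=3):
    """Time generate_fn(n_new_tokens) over n_runs and return tokens/s.

    generate_fn is assumed to block until all tokens are produced. On a GPU,
    CUDA calls are asynchronous, so the wrapped function should synchronize
    (for example with torch.cuda.synchronize()) before returning.
    """
    generate_fn(n_new_tokens)  # warm-up run, not timed
    start = time.perf_counter()
    for _ in range(n_runs):
        generate_fn(n_new_tokens)
    elapsed = time.perf_counter() - start
    return n_runs * n_new_tokens / elapsed


def fake_generate(n_new_tokens):
    # Stand-in for a real model call: pretends each token takes about 1 ms.
    time.sleep(0.001 * n_new_tokens)


print(f"{tokens_per_second(fake_generate, 20):.1f} tokens/s")
```

When posting results, it helps to mention the precision, batch size, and prompt length, since all three change the number a lot.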
