csabakecskemeti posted an update 2 days ago
Fine-tuning on the edge. Pushing the MI100 to its limits.
QwQ-32B 4-bit QLoRA fine-tuning
VRAM usage 31.498 GB / 31.984 GB :D

The machine itself is also funny. This is my GPU test bench.
I'm now also testing the PWM fan control and a JetKVM.

[Image: IMG_7216.jpg, the GPU test bench]

It failed by the morning; I need to find more room to decrease the memory usage.


No success so far: the training data contains some larger contexts, and it fails just before completing the first epoch.
(dataset: DevQuasar/brainstorm-v3.1_vicnua_1k)
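In case anyone wants to suggest something concrete, here is a minimal sketch of the trainer settings I'd try next to cut memory, assuming a standard transformers Trainer setup (the values are illustrative, not what I actually ran, and the paged optimizer assumes the ROCm bitsandbytes build supports it):

from transformers import TrainingArguments

# Illustrative memory-saving settings for QLoRA on a 32 GB card.
training_args = TrainingArguments(
    output_dir="qwq-32b-qlora",      # hypothetical output path
    per_device_train_batch_size=1,   # smallest possible micro-batch
    gradient_accumulation_steps=16,  # recover an effective batch size
    gradient_checkpointing=True,     # trade compute for activation memory
    optim="paged_adamw_8bit",        # paged 8-bit optimizer states via bitsandbytes
    bf16=True,                       # match bnb_4bit_compute_dtype below
    num_train_epochs=1,
    logging_steps=10,
)

Capping the tokenized sequence length is probably the biggest lever here, since the failure correlates with the larger contexts in the dataset.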

Does anyone have further suggestions for the bnb config (with ROCm on the MI100)?
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize weights to 4-bit on load
    bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,        # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16  # do the matmuls in bf16
)
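For reference, this is roughly how the config gets wired into the model load; the model id and device_map here are just the obvious defaults, not tuned for the MI100:

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/QwQ-32B",                  # assumed model id
    quantization_config=bnb_config,  # the 4-bit config above
    device_map="auto",               # let accelerate place layers on the GPU
    torch_dtype=torch.bfloat16,
)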

Now testing with my other, smaller dataset; it seems to need less memory.
(dataset: DevQuasar/brainstorm_vicuna_1k)