Csaba Kecskemeti PRO

csabakecskemeti

Recent Activity

updated a model about 5 hours ago
DevQuasar/llama3_8b_chat_brainstorm-GGUF
updated a model about 5 hours ago
DevQuasar/llama3.1_8b_chat_brainstorm-v3.1-GGUF
updated a model about 5 hours ago
DevQuasar/llama3_8b_chat_brainstorm_plus-GGUF

Organizations

Zillow, DevQuasar, Hugging Face Party @ PyTorch Conference, Intelligent Estate, open/ acc

csabakecskemeti's activity

replied to their post 1 day ago

No success so far. The training data contains some larger contexts, and the run fails just before completing the first epoch.
(dataset: DevQuasar/brainstorm-v3.1_vicnua_1k)

Does anyone have further suggestions for the bnb config (with ROCm on an MI100)?
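import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for compute
)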

I'm now testing with my other, smaller dataset, which seems to need less memory:
DevQuasar/brainstorm_vicuna_1k
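
Since the failure seems tied to the larger contexts, one thing that might help narrow it down is splitting the examples by token count and training on the shorter ones first. This is only a rough sketch, not what I actually ran: `split_by_token_length`, `whitespace_token_count`, and the 64-token limit are made-up names and values for illustration, and the whitespace counter is just a stand-in for a real tokenizer.

```python
from typing import Callable, Iterable, List, Tuple


def split_by_token_length(
    texts: Iterable[str],
    count_tokens: Callable[[str], int],
    max_tokens: int,
) -> Tuple[List[str], List[str]]:
    """Split texts into (kept, dropped) depending on their token count."""
    kept: List[str] = []
    dropped: List[str] = []
    for text in texts:
        if count_tokens(text) <= max_tokens:
            kept.append(text)
        else:
            dropped.append(text)
    return kept, dropped


def whitespace_token_count(text: str) -> int:
    # Stand-in for a real tokenizer (assumption): counts whitespace-separated words.
    return len(text.split())


# Hypothetical samples: one short prompt and one 200-word prompt.
samples = ["short prompt", "a much longer prompt " * 50]
kept, dropped = split_by_token_length(samples, whitespace_token_count, max_tokens=64)
print(f"kept {len(kept)} example(s), dropped {len(dropped)}")
```

With a Hugging Face tokenizer, the counter could be something like `lambda t: len(tokenizer(t)["input_ids"])`. If the run completes on the kept subset, that would point at sequence length rather than the bnb config.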