Arthur Zucker

ArthurZ

AI & ML interests

None yet

Recent Activity

liked a model 15 days ago

meta-llama/Llama-3.2-1B-Instruct

liked a Space 15 days ago

m-ric/llm-race-to-the-top

reacted to MonsterMMORPG's post with 🚀 27 days ago

FLUX Redux is a hidden Gem I am still doing huge research to publish an amazing fully Public - no paywalled Tutorial, but this is generated via SwarmUI Style Model Merge Strength : 0.5 FLUX Guidance Scale is : 6 Used base model is my FLUX fine tuned model with 256 images via Kohya SS GUI as shown in tutorial ( https://youtu.be/FvpWy1x5etM ) - 70 epoch Prompt : anime ohwx man walking in a jungle <segment:yolo-face_yolov9c.pt-1,0.7,0.5> ohwx man, anime

Articles

Fixing Gradient Accumulation

Improving Hugging Face Training Efficiency Through Packing with Flash Attention

Fine-Tuning Gemma Models in Hugging Face

Code Llama: Llama 2 learns to code

Organizations

ArthurZ's activity

New activity in mistralai/Pixtral-Large-Instruct-2411 about 1 month ago

Upload transformers version

#3 opened about 1 month ago by

New activity in huggingface/documentation-images about 1 month ago

Upload Meta-Llama-3-8B-Instruct, seqlen = 512, python, w_ compile.png

#392 opened about 1 month ago by

New activity in mistral-community/pixtral-12b 2 months ago

Update model weight

#13 opened 2 months ago by

Update hidden_act to silu

#14 opened 2 months ago by

New activity in rhymes-ai/Aria 3 months ago

llama.cpp support

#1 opened 3 months ago by

New activity in google/gemma-2-2b-jpn-it 3 months ago

tokenizer_config.json is different from gemma-2-2b-it

#8 opened 3 months ago by

New activity in mistral-community/pixtral-12b 3 months ago

How can i use the full 24GB model instead of this separated safetensors files?

#8 opened 3 months ago by

New activity in meta-llama/Llama-3.2-11B-Vision-Instruct 3 months ago

hidden_activation vs hidden_act in config.json

#10 opened 3 months ago by

New activity in mistral-community/pixtral-12b-240910 3 months ago

How to use safetensors?

#13 opened 3 months ago by

New activity in mistral-community/pixtral-12b 3 months ago

lamma cpp ht to gguf not working

#2 opened 3 months ago by

New activity in meta-llama/Llama-3.1-405B-Instruct-FP8 4 months ago

8-kv-heads

#14 opened 5 months ago by

New activity in meta-llama/Llama-3.1-405B-FP8 4 months ago

Update config.json

#17 opened 4 months ago by

Config KV Heads should be 8 now?

#16 opened 5 months ago by

New activity in meta-llama/Llama-3.1-405B-Instruct-FP8 5 months ago

8 kv heads

#13 opened 5 months ago by

New activity in meta-llama/Llama-3.1-405B-FP8 5 months ago

8-kv-heads

#15 opened 5 months ago by

New activity in meta-llama/Llama-3.1-405B 5 months ago

8-kv-heads

#21 opened 5 months ago by

New activity in meta-llama/Llama-3.1-405B-Instruct 5 months ago

8-kv-heads

#17 opened 5 months ago by

New activity in meta-llama/Llama-3.1-405B-FP8 5 months ago

Updated eos_token to include multiple IDs

#14 opened 5 months ago by

Update tokenizer to prepend special token

#12 opened 5 months ago by

New activity in meta-llama/Llama-3.1-70B 5 months ago

Update tokenizer to prepend special token

#11 opened 5 months ago by