Michael Goin

mgoin


AI & ML interests

LLM inference optimization, compression, quantization, pruning, distillation

Recent Activity

new activity 4 days ago

neuralmagic/Qwen2.5-VL-72B-Instruct-quantized.w8a8: Remove image_processor_type

updated a model 4 days ago

nm-testing/QwQ-32B-FP8-dynamic

updated a model 4 days ago

nm-testing/Ministral-8B-Instruct-2410-FP8-dynamic

mgoin's activity

New activity in neuralmagic/Qwen2.5-VL-72B-Instruct-quantized.w8a8 4 days ago

Remove image_processor_type

#1 opened 4 days ago by pooya-davoodi-parasail

New activity in neuralmagic/Qwen2.5-VL-7B-Instruct-quantized.w8a8 4 days ago

Remove image_processor_type

#1 opened 4 days ago by pooya-davoodi-parasail

New activity in neuralmagic/Qwen2.5-VL-72B-Instruct-FP8-Dynamic 13 days ago

Remove image_processor_type

#2 opened 13 days ago by pooya-davoodi-parasail

New activity in neuralmagic/Qwen2.5-VL-7B-Instruct-FP8-Dynamic 13 days ago

Use Qwen2VLImageProcessor for image_processor_type

#2 opened 16 days ago by pooya-davoodi-parasail

Use Qwen2VLImageProcessor for image_processor_type

#3 opened 14 days ago by pooya-davoodi-parasail

New activity in cognitivecomputations/DeepSeek-R1-AWQ 24 days ago

when i use vllm v0.7.2 to deploy r1 awq, i got empty content

#10 opened 25 days ago

MLA is not supported with moe_wna16 quantization. Disabling MLA.

#7 opened 26 days ago

New activity in neuralmagic/gemma-2-9b-it-FP8 about 1 month ago

AttributeError: 'Gemma2Config' object has no attribute 'interleaved_sliding_window' Traceback (most recent call last):

#3 opened about 1 month ago

New activity in neuralmagic/granite-3.1-8b-instruct-FP8-dynamic about 1 month ago

compressed-tensors MLA support requires fp8 activations and weights in group 'group_0',

#1 opened about 1 month ago

New activity in neuralmagic/Meta-Llama-3-8B-Instruct-FP8-KV about 1 month ago

How to load this model?

#1 opened 8 months ago

New activity in neuralmagic/Llama-3.2-90B-Vision-Instruct-FP8-dynamic 3 months ago

Model does not run with VLLM

#3 opened 3 months ago

New activity in nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic 3 months ago

Nice model, any info on scripts used to quantize?

#1 opened 3 months ago

New activity in neuralmagic/Llama-3.2-11B-Vision-Instruct-FP8-dynamic 3 months ago

Thanks!

#2 opened 3 months ago

New activity in mistralai/Pixtral-Large-Instruct-2411 4 months ago

Add config_format and load_format to vLLM args

#5 opened 4 months ago

Update config.json to use null for sliding_window

#4 opened 4 months ago

New activity in mgoin/nemotron-3-8b-chat-4k-sft-hf 4 months ago

Adding `safetensors` variant of this model

#1 opened 4 months ago

New activity in neuralmagic/Meta-Llama-3.1-70B-Instruct-quantized.w8a16 4 months ago

Is this the standard GPTQ quantization?

#5 opened 4 months ago

New activity in neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a16 4 months ago

Model weights are not loaded

#3 opened 6 months ago

New activity in neuralmagic/pixtral-12b-FP8-dynamic 4 months ago

Update model card

#1 opened 4 months ago

New activity in nm-testing/llava-1.5-7b-hf-FP8-dynamic 4 months ago

Add chat_template to tokenizer_config.json

#1 opened 4 months ago