What is the advantage of fine-tuning based on coder-instruct model rather than instruct?
#4 opened by sangmini
Hello!
I'm curious why the models are trained on top of Qwen/Qwen2.5-Coder-7B-Instruct rather than the standard Qwen/Qwen2.5-7B-Instruct. I'd like to understand whether there are references or evidence suggesting that coder-instruct models make better bases for fine-tuning.
Thanks
Thank you for your question! In our experiments, fine-tuning on the Qwen/Qwen2.5-Coder-7B-Instruct base model yielded better performance on function-calling tasks than using the standard Qwen2.5-7B-Instruct model. This aligns with observations reported in the xLAM line of research papers.
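
For anyone reproducing this, the base-model swap is a one-line change when starting a supervised fine-tune. Below is a minimal sketch assuming the Hugging Face `transformers` library; the `get_weather` tool schema is a hypothetical example used only to illustrate how function-calling data gets serialized through the model's chat template before training:

```python
# Minimal sketch: start a function-calling fine-tune from the coder-instruct base.
# Assumes the Hugging Face `transformers` library; the tool schema below is a
# hypothetical placeholder -- substitute your own function-calling data.
from transformers import AutoModelForCausalLM, AutoTokenizer

# The only change vs. the standard base: "Qwen/Qwen2.5-7B-Instruct" -> coder variant.
BASE_MODEL = "Qwen/Qwen2.5-Coder-7B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype="auto")

# Qwen2.5 chat templates accept a `tools` argument and serialize the tool
# schemas into the prompt, which is what the fine-tune actually trains on.
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
]
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, for illustration only
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

text = tokenizer.apply_chat_template(
    messages, tools=tools, tokenize=False, add_generation_prompt=True
)
print(text)  # inspect the serialized training prompt before launching the run
```

Printing the rendered template first is a cheap sanity check that the tool schemas and conversation turns are formatted the way the base model expects before committing GPU hours to the fine-tune.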