chat template doesn't include tools

#3
by copasseron - opened

Hi mistral team,

nice to see a new model from you guys, thanks a lot.

https://huggingface.co./mistralai/Mistral-Small-24B-Instruct-2501/blob/main/tokenizer_config.json#L9010

In the Jinja chat template there is nothing related to tools (neither the list of available tools, nor a way to put tool results in the message history sent to the model). Is this intended?

Ollama does include it on their side:

https://ollama.com/library/mistral-small/blobs/5de2b8ebfbdd

{{- range $index, $_ := .Messages }}
{{- if eq .Role "system" }}[SYSTEM_PROMPT] {{ .Content }}[/SYSTEM_PROMPT]
{{- else if eq .Role "user" }}
{{- if and (le (len (slice $.Messages $index)) 2) $.Tools }}[AVAILABLE_TOOLS] {{ $.Tools }}[/AVAILABLE_TOOLS]
{{- end }}[INST] {{ .Content }}[/INST]
{{- else if eq .Role "assistant" }}
{{- if .Content }} {{ .Content }}
{{- if not (eq (len (slice $.Messages $index)) 1) }}</s>
{{- end }}
{{- else if .ToolCalls }}[TOOL_CALLS] [
{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
{{- end }}]</s>
{{- end }}
{{- else if eq .Role "tool" }}[TOOL_RESULTS] {"content": {{ .Content }}}[/TOOL_RESULTS]
{{- end }}
{{- end }}
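For anyone templating prompts by hand, the Ollama logic above can be sketched in Python roughly as follows. This is a minimal sketch of the token layout implied by that template, not the official tokenizer's implementation; the exact whitespace, special-token handling, and the "tools before one of the last two messages" rule are simplified (here tools are emitted just before the last user message):

```python
import json

def build_prompt(messages, tools=None):
    """Render a message list into Mistral-style prompt text,
    roughly following the token layout of the Ollama template above."""
    out = []
    # The Ollama template emits [AVAILABLE_TOOLS] just before the final
    # user turn; we approximate that by finding the last user message.
    last_user = max(
        (i for i, m in enumerate(messages) if m["role"] == "user"),
        default=-1,
    )
    for i, m in enumerate(messages):
        role = m["role"]
        if role == "system":
            out.append(f"[SYSTEM_PROMPT] {m['content']}[/SYSTEM_PROMPT]")
        elif role == "user":
            if tools and i == last_user:
                out.append(f"[AVAILABLE_TOOLS] {json.dumps(tools)}[/AVAILABLE_TOOLS]")
            out.append(f"[INST] {m['content']}[/INST]")
        elif role == "assistant":
            if m.get("tool_calls"):
                calls = ", ".join(
                    json.dumps({"name": c["name"], "arguments": c["arguments"]})
                    for c in m["tool_calls"]
                )
                out.append(f"[TOOL_CALLS] [{calls}]</s>")
            elif m.get("content"):
                # The template omits </s> after the very last assistant turn.
                closing = "</s>" if i != len(messages) - 1 else ""
                out.append(f" {m['content']}{closing}")
        elif role == "tool":
            out.append(f"[TOOL_RESULTS] {json.dumps({'content': m['content']})}[/TOOL_RESULTS]")
    return "".join(out)
```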

thanks a lot

Mistral AI org

We've tested function calling only with vLLM: https://huggingface.co./mistralai/Mistral-Small-24B-Instruct-2501#function-calling
The model should work very well for function calling tasks!

Can you give this a try?

Also, we'd be more than happy about any contribution to make function calling work with HF format.
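As a starting point for such a contribution, the missing branches could look roughly like the sketch below, mirroring the Ollama logic quoted above. This is untested, and the key names (`tool_calls`, `function`, `arguments`) assume the transformers tool-calling message schema; the real template would need to integrate this into its existing message loop:

```jinja
{%- if message['role'] == 'tool' %}
    {{- '[TOOL_RESULTS] {"content": ' + message['content'] | tojson + '}[/TOOL_RESULTS]' }}
{%- elif message['role'] == 'assistant' and 'tool_calls' in message %}
    {{- '[TOOL_CALLS] [' }}
    {%- for call in message['tool_calls'] %}
        {{- '{"name": "' + call['function']['name'] + '", "arguments": ' + call['function']['arguments'] | tojson + '}' }}
        {%- if not loop.last %}{{- ', ' }}{%- endif %}
    {%- endfor %}
    {{- ']' + eos_token }}
{%- endif %}
```

An `[AVAILABLE_TOOLS]` block before the last user turn would also be needed, as in the Ollama template.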

It was working fine before, but a recent commit added strftime, and it is now broken on Text-Generation-Inference (TGI).

@patrickvonplaten I was going to test that today as well. Does it work without applying the template extension from the OP, or did you include it?

Also, did you try this on the OpenAI-compatible vLLM endpoint, or just offline inference?

On TGI, it works without the OP's template. Now it is broken after they included strftime.

On TGI, it works without the OP's template. Now it is broken after they included strftime.

The latest TGI commit fixes this.

But regarding the original topic, I'm getting this error when using tool calling: Template error: syntax error: Only user, system and assistant roles are supported!

Yes, that is my concern also.

I'm deploying the model with NVIDIA Triton + the vLLM backend, so I can't use vLLM's LLM.chat() endpoint.

The vLLM backend of Triton uses https://docs.vllm.ai/en/v0.6.5/dev/engine/async_llm_engine.html, which takes the text directly before passing it to the tokenizer.

So I'm obliged to template the messages myself first, either by applying the Jinja template directly or by using transformers' apply_chat_template() method (https://huggingface.co./docs/transformers/v4.37.1/chat_templating), which uses the chat template here: https://huggingface.co./mistralai/Mistral-Small-24B-Instruct-2501/blob/20b2ed1c4e9af44b9ad125f79f713301e27737e2/tokenizer_config.json#L9010.

However, the chat template provided for this new model doesn't support tools (neither tool responses in the message history, nor the list of available tools).
