Question about the chat_template
Hi, I found there is a slight difference in the chat template definition among the training samples, the tokenizer config, and the suggested prompt.
In the README, the chat template is suggested as:
<s>[INST] SYSTEM MESSAGE
USER MESSAGE[/INST] ASSISTANT MESSAGE</s>[INST] USER MESSAGE[/INST]
And in tokenizer_config.json, the official template is defined as follows (which means the system message is prepended to the last user message):
{%- if loop.last and system_message is defined %}
    {{- "[INST]" + system_message + "\n\n" + message["content"] + "[/INST]" }}
{%- else %}
    {{- "[INST]" + message["content"] + "[/INST]" }}
{%- endif %}
In addition, I notice that none of the training samples have a system message. Does this affect the performance of the model?
It's not that the dataset doesn't have a system prompt - it's the same data, just with the system message prepended to the first user message.
The chat template included in tokenizer_config.json is simply the unmodified official template, which, as you said, prepends the system message to the last user message. I'd suggest prompting the model the way it was fine-tuned: with the system message prepended to the first user message rather than the last.
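For reference, a minimal sketch of building a prompt that way, matching the README format quoted above. The helper name build_prompt is hypothetical (not part of any library), and the "\n\n" separator between system and user message is an assumption taken from the official template:

def build_prompt(system_message, turns):
    # Format a conversation the way the model was fine-tuned:
    # the system message goes in front of the FIRST user message.
    # `turns` is a list of (user, assistant) pairs; pass assistant=None
    # for the pending turn.
    prompt = "<s>"
    for i, (user, assistant) in enumerate(turns):
        if i == 0 and system_message:
            user = system_message + "\n\n" + user
        prompt += "[INST] " + user + "[/INST]"
        if assistant is not None:
            prompt += " " + assistant + "</s>"
    return prompt

print(build_prompt(
    "SYSTEM MESSAGE",
    [("USER MESSAGE", "ASSISTANT MESSAGE"), ("USER MESSAGE", None)],
))
# <s>[INST] SYSTEM MESSAGE
#
# USER MESSAGE[/INST] ASSISTANT MESSAGE</s>[INST] USER MESSAGE[/INST]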