
Inference engines generally use the jinja chat template, and we found a few issues with them after using Harmony directly. If you see below, the top is the correct rendered from as from Harmony. The below is the one rendered by the current jinja chat template.messages = [
{"role" : "user", "content" : "What is 1+1?"},
{"role" : "assistant", "content" : "2"},
{"role": "user", "content": "What's the temperature in San Francisco now? How about tomorrow? Today's date is 2024-09-30."},
{"role": "assistant", "content": "User asks: 'What is the weather in San Francisco?' We need to use get_current_temperature tool.", "thinking" : ""},
{"role": "assistant", "content": "", "tool_calls": [{"name": "get_current_temperature", "arguments": '{"location": "San Francisco, California, United States", "unit": "celsius"}'}]},
{"role": "tool", "name": "get_current_temperature", "content": '{"temperature": 19.9, "location": "San Francisco, California, United States", "unit": "celsius"}'},
]Then use the encode_conversations_with_harmony function from Unsloth.from unsloth_zoo import encode_conversations_with_harmony
def encode_conversations_with_harmony(
messages,
reasoning_effort = "medium",
add_generation_prompt = True,
tool_calls = None,
developer_instructions = None,
model_identity = "You are ChatGPT, a large language model trained by OpenAI.",
)The harmony format includes multiple interesting things: