GLM-5.1 tool call output parsing unstable (~50% failure rate) while GLM-4.7 works consistently
Title: GLM-5.1 tool call output parsing unstable (~50% failure rate) while GLM-4.7 works consistently
Environment:
Model: glm5.1 (via OpenAI-compatible API)
Reference model: GLM-4.7
Observed Behavior:
When using GLM-5.1 with function calling, the model fails to recognize tool execution outputs approximately 50% of the time. The specific symptoms are:
After the client executes a tool (e.g., date or pwd) and returns the output in a role: "tool" message, the model claims it did not receive any output or states that the tool returned empty content.
As a result, the model enters a repetitive loop, attempting different variations of commands (cd, Get-Location, echo %cd%, etc.) to obtain information that was already provided.
The exact same client code and message construction logic works correctly with GLM-4.7, where the model consistently acknowledges tool outputs and completes tasks without looping.
Reproducibility:
The issue occurs intermittently. In multiple test runs with identical inputs and tool outputs, GLM-5.1 successfully processes the output roughly 50% of the time. The failures do not appear to correlate with specific command types or output lengths.
maybe is this issue?
https://huggingface.co/zai-org/GLM-5.1-FP8/discussions/3/files#d2h-526183
we are checking today
Thanks, we'll give it a try.
We have updated the template, you can use our new template.
It's all set! We're all set here! Thanks!