GLM-5.1 tool call output parsing unstable (~50% failure rate) while GLM-4.7 works consistently

#25
by wudaolj - opened

Title: GLM-5.1 tool call output parsing unstable (~50% failure rate) while GLM-4.7 works consistently

Environment:

Model: glm5.1 (via OpenAI-compatible API)

Reference model: GLM-4.7

Observed Behavior:
When using GLM-5.1 with function calling, the model fails to recognize tool execution outputs approximately 50% of the time. The specific symptoms are:

After the client executes a tool (e.g., date or pwd) and returns the output in a role: "tool" message, the model claims it did not receive any output or states that the tool returned empty content.

As a result, the model enters a repetitive loop, attempting different variations of commands (cd, Get-Location, echo %cd%, etc.) to obtain information that was already provided.

The exact same client code and message construction logic works correctly with GLM-4.7, where the model consistently acknowledges tool outputs and completes tasks without looping.

Reproducibility:
The issue occurs intermittently. In multiple test runs with identical inputs and tool outputs, GLM-5.1 successfully processes the output roughly 50% of the time. The failures do not appear to correlate with specific command types or output lengths.

Thanks, we'll give it a try.

We have updated the template, you can use our new template.

It's all set! We're all set here! Thanks!

Sign up or log in to comment