GGUFs?

#3
by Throghar - opened

I would like to see some decent GGUFs for example from mradermacher, bartowski or unsloth so people can test this model more.

Orion LLM Labs org

Why is it that gatekeepers who call themselves "skeptics" worship established people and companies, yet say "not possible" to anyone claiming to have something big, even when they give evidence for it? (I'm not actually part of this org; I just happened to send a request and they accepted it.)

The main reason is that quantization quality directly affects a model's performance and stability, and that determines its real-world usefulness.

I may start advocating for AutoRound quantization, APEX, and other alternative methods, because standard quants like Q4_K_M don't produce adequate results: compared to the ones I provided, the model is noticeably dumber.

Prompt: Create svg image of a pelican riding a bicycle

Autoround Q2_K_Mixed https://huggingface.co/sphaela/Qwen3.6-27B-AutoRound-GGUF

Regular llama.cpp Q4_K_M https://huggingface.co/morikomorizz/GRM-2.6-Plus-GGUF

This is just one example; the output quality is consistently worse when I ask it tricky questions, and so is how much it hallucinates, loops, etc.

The community should understand that typical quantization below Q5-Q6 is inadequate for Qwen models unless you tune it with a more intelligent mechanism, like Intel's AutoRound does.
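To illustrate why lower-bit quants lose quality, here is a minimal sketch (not AutoRound itself, just naive symmetric round-to-nearest on fake weights) showing how reconstruction error grows as the bit width shrinks. Smarter schemes like AutoRound instead tune the rounding decisions per weight, which is why they can hold up better at the same bit width; the function and data here are hypothetical, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)  # stand-in for one weight row

def quantize_rtn(w, bits):
    # naive symmetric round-to-nearest: snap to a signed integer grid,
    # then dequantize back to float
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

for bits in (8, 4, 2):
    err = np.abs(w - quantize_rtn(w, bits)).mean()
    print(f"{bits}-bit mean abs error: {err:.5f}")
```

The error roughly doubles with each bit removed, which matches the intuition that sub-Q5 quants need more careful rounding to stay usable.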

In my experience, looping is a direct symptom of broken quantization; occasional syntax errors in agentic coding are another.
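If you want to spot that looping symptom systematically rather than by eye, a crude repeated-n-gram check is enough. This is a hypothetical helper, not part of any tool mentioned above: it flags text where the same word n-gram occurs several times, which is what degenerate looping output looks like.

```python
def has_loop(text, ngram=6, threshold=3):
    # Flag text where any word n-gram repeats `threshold`+ times:
    # a crude proxy for the degenerate looping broken quants produce.
    words = text.split()
    counts = {}
    for i in range(len(words) - ngram + 1):
        key = tuple(words[i:i + ngram])
        counts[key] = counts.get(key, 0) + 1
        if counts[key] >= threshold:
            return True
    return False

print(has_loop("the model keeps saying the same thing " * 5))          # True
print(has_loop("a normal varied answer with no repeated phrases here"))  # False
```

Running candidate quants over a fixed prompt set with a check like this gives a quick, if rough, way to compare their stability.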

Orion LLM Labs org

> I would like to see some decent GGUFs for example from mradermacher, bartowski or unsloth so people can test this model more.

I sent a request to the mradermacher team to make a GGUF of GRM-2.6-Plus: https://huggingface.co/mradermacher/model_requests/discussions/2320
