How to use from
Unsloth Studio
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for LocoreMind/LocoOperator-4B-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for LocoreMind/LocoOperator-4B-GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for LocoreMind/LocoOperator-4B-GGUF to start chatting
Quick Links

LocoOperator-4B-GGUF

This repository contains the official GGUF quantized versions of LocoOperator-4B.

LocoOperator-4B is a 4B-parameter code exploration agent distilled from Qwen3-Coder-Next. It is specifically optimized for local agent loops (like Claude Code style), providing high-speed codebase navigation with 100% JSON tool-calling validity.

πŸš€ Which file should I choose?

We provide several quantization levels to balance performance and memory usage:

File Name Size Recommendation
LocoOperator-4B.Q8_0.gguf 4.28 GB Best Accuracy. Recommended for local agent loops to ensure perfect JSON output.
LocoOperator-4B.Q6_K.gguf 3.31 GB Great Balance. Near-lossless logic with a smaller footprint.
LocoOperator-4B.Q4_K_M.gguf 2.50 GB Standard. Compatible with almost all local LLM runners (LM Studio, Ollama, etc.).
LocoOperator-4B.IQ4_XS.gguf 2.29 GB Advanced. Uses Importance Quantization for better performance at smaller sizes.

πŸ›  Usage (llama.cpp)

To run this model using llama-cli or llama-server, we recommend a context size of at least 50K to handle multi-turn codebase exploration:

Simple CLI Chat:

./llama-cli \
    -m LocoOperator-4B.Q8_0.gguf \
    -c 51200 \
    -p "You are a helpful codebase explorer. Use tools to help the user."

Serve as an OpenAI-compatible API:

./llama-server \
    -m LocoOperator-4B.Q8_0.gguf \
    --ctx-size 51200 \
    --port 8080

πŸ“‹ Model Details

  • Base Model: Qwen3-4B-Instruct-2507
  • Teacher Model: Qwen3-Coder-Next
  • Training Method: Full-parameter SFT (Knowledge Distillation)
  • Primary Use Case: Codebase exploration (Read, Grep, Glob, Bash, Task)

πŸ”— Links

πŸ™ Acknowledgments

Special thanks to mradermacher for the initial quantization work and the llama.cpp community.

Downloads last month
15,815
GGUF
Model size
4B params
Architecture
qwen3
Hardware compatibility
Log In to add your hardware

4-bit

6-bit

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for LocoreMind/LocoOperator-4B-GGUF

Quantized
(14)
this model