
Libraries

How to use kabachuha/Gemma-4-The-Deckards-Brain-31B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="kabachuha/Gemma-4-The-Deckards-Brain-31B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("kabachuha/Gemma-4-The-Deckards-Brain-31B")
model = AutoModelForImageTextToText.from_pretrained("kabachuha/Gemma-4-The-Deckards-Brain-31B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use kabachuha/Gemma-4-The-Deckards-Brain-31B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "kabachuha/Gemma-4-The-Deckards-Brain-31B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "kabachuha/Gemma-4-The-Deckards-Brain-31B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker

docker model run hf.co/kabachuha/Gemma-4-The-Deckards-Brain-31B

SGLang

How to use kabachuha/Gemma-4-The-Deckards-Brain-31B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "kabachuha/Gemma-4-The-Deckards-Brain-31B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "kabachuha/Gemma-4-The-Deckards-Brain-31B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "kabachuha/Gemma-4-The-Deckards-Brain-31B" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use kabachuha/Gemma-4-The-Deckards-Brain-31B with Docker Model Runner:
```
docker model run hf.co/kabachuha/Gemma-4-The-Deckards-Brain-31B
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

The DECKARD's Brain - Gemma 4, 31b

This model aims to counteract the recent slop- and fluffmaxxed fine-tunes, which have a great grasp of anime, with the raw power of darkness and the truly unhinged storytelling of The DECKARD.

It is a SLERP merge of DavidAU's The Deckard fine-tune of Gemma 4 31B with the grand melting pot of Nimbz's Gembrain (MeroMero, Garnet, Musica, etc.), done through a custom script. In theory (and in practice), this lets you enjoy both anime tropes and surrealist psychedelic erotic horror in the same conversation!

The interpolation coefficient was set to 0.6934 (an arbitrary number) in favor of The Deckard, to dilute the slop of the distillmaxxed models.
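For readers unfamiliar with SLERP, here is a minimal sketch of spherical linear interpolation on a single flattened weight vector. It is not the custom script used for this merge. The function name `slerp`, the pure-Python list representation, the toy two-element "weights", and the convention that `t` is the weight given to the second argument (assumed here to be The Deckard) are all illustrative assumptions.

```python
import math


def slerp(a, b, t, eps=1e-8):
    """Spherically interpolate between vectors a and b (t=0 gives a, t=1 gives b)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Fall back to plain linear interpolation if either vector is (near) zero.
    if norm_a < eps or norm_b < eps:
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    # Angle between the two vectors, with the cosine clamped for numerical safety.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    # Nearly parallel (or anti-parallel) vectors: SLERP is ill-conditioned, use LERP.
    if sin_omega < eps:
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    w_a = math.sin((1 - t) * omega) / sin_omega
    w_b = math.sin(t * omega) / sin_omega
    return [w_a * x + w_b * y for x, y in zip(a, b)]


# Assumed convention: t is the share given to The Deckard.
T_DECKARD = 0.6934
gembrain_w = [1.0, 0.0]  # toy stand-in for one Gembrain tensor
deckard_w = [0.0, 1.0]   # toy stand-in for the matching Deckard tensor
merged = slerp(gembrain_w, deckard_w, T_DECKARD)
```

In a real merge, a function like this would be applied tensor by tensor across both checkpoints. With `t` above 0.5, the result sits closer to the second argument.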

This is a merge of two hereticised models: The Deckard (hereticised by its author) and Gembrain, which LLMFan hereticised before this merge. Nimbz's original Gembrain had an insanely high refusal rate of 99/100, so I had to use the abliterated version.

Note: Please do not uncensor this model further; it is already a merge of two hereticised models.

The prompt format is fully inherited from DavidAU's The Deckard Thinking, so the chat template enables thinking by default. To disable it, override the chat template with chat_template-instruct.jinja.
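One way to apply such an override with Transformers is sketched below. It assumes that `chat_template-instruct.jinja` is available as a local file and that the loaded processor exposes a writable `chat_template` attribute. The helper name `override_chat_template` is hypothetical, and the commented usage assumes the file sits at the root of this repository.

```python
from pathlib import Path


def override_chat_template(processor, template_path):
    """Replace the chat template on a processor (and on its tokenizer, if it has one)."""
    template = Path(template_path).read_text(encoding="utf-8")
    processor.chat_template = template
    # Many processors wrap a tokenizer that carries its own copy of the template.
    tokenizer = getattr(processor, "tokenizer", None)
    if tokenizer is not None:
        tokenizer.chat_template = template
    return processor


# Hypothetical usage (downloads the model files; the file location is an assumption):
# from huggingface_hub import hf_hub_download
# from transformers import AutoProcessor
#
# repo = "kabachuha/Gemma-4-The-Deckards-Brain-31B"
# processor = AutoProcessor.from_pretrained(repo)
# path = hf_hub_download(repo_id=repo, filename="chat_template-instruct.jinja")
# override_chat_template(processor, path)
```

Inference servers usually have their own way of supplying a custom template, so check the server's documentation rather than relying on this helper there.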

Turning thinking off is strongly discouraged! If you are role-playing in a language other than English, thinking is crucial for bringing the output distribution in your language closer to the English/Japanese training set; without it, the output will be much blander and more censored!

This model's capabilities (anime tropes, horror, heretification) transfer to other languages; for example, it works perfectly in Russian!

The settings I use:

Temperature: 1.2
Frequency penalty: 0.3
Presence penalty: 0.7
Top P: 0.93
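As a rough illustration, these settings map onto the standard sampling fields of an OpenAI-compatible chat completions request. The helper name `build_chat_request`, the example user text, and the `max_tokens` value are assumptions made for this sketch, not part of the recommended settings.

```python
import json

MODEL_ID = "kabachuha/Gemma-4-The-Deckards-Brain-31B"

# Sampling settings from the list above.
SAMPLING = {
    "temperature": 1.2,
    "frequency_penalty": 0.3,
    "presence_penalty": 0.7,
    "top_p": 0.93,
}


def build_chat_request(user_text, model=MODEL_ID, max_tokens=512):
    """Build a chat completions payload that carries the sampling settings."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "max_tokens": max_tokens,  # assumed value, adjust to taste
    }
    payload.update(SAMPLING)
    return payload


body = json.dumps(build_chat_request("Continue the scene."), indent=2)
print(body)
```

The resulting JSON can be sent as the request body to a `/v1/chat/completions` endpoint, such as the local servers started in the vLLM and SGLang sections.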

For the best effect, add the following to your system prompt, or to an OOC command before a particularly unhinged scene/ST message: Write the next scene in the style of a horny AO3 teenage fanfic (but without "fanfic" sugar) writer / Philip K. Dick hybrid. The model reacts strongly to both style guides (especially the mention of Dick). The "/" symbol is part of the prompt, not a "choose one" instruction!
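A small sketch of adding that style guide to the system prompt of a chat message list follows. The helper name `with_style_prompt` is hypothetical, and the code assumes message contents are plain strings rather than the multi-part content lists used in the image examples.

```python
STYLE_PROMPT = (
    "Write the next scene in the style of a horny AO3 teenage fanfic "
    '(but without "fanfic" sugar) writer / Philip K. Dick hybrid.'
)


def with_style_prompt(messages, style=STYLE_PROMPT):
    """Return a copy of messages with the style guide placed in the system prompt."""
    messages = [dict(m) for m in messages]  # copy so the caller's list is not modified
    if messages and messages[0]["role"] == "system":
        # Extend an existing system prompt instead of adding a second one.
        messages[0]["content"] = messages[0]["content"].rstrip() + "\n\n" + style
    else:
        messages.insert(0, {"role": "system", "content": style})
    return messages


chat = with_style_prompt([{"role": "user", "content": "The door opens."}])
```

The "/" stays inside the prompt string exactly as written, since it is part of the prompt and not a choice between two options.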

I provide an NVFP4 quant, calibrated on r/writingprompts, here, for very high speed on Blackwell (60 tg/sec compared to 30 tg/sec on Q8), along with some other GGUF files (Q8, Q6, Q4) at GGUF here.

The model's fundamental lineage can be summarized as follows:

DavidAU/gemma-4-31B-it-The-DECKARD-HERETIC-UNCENSORED-Thinking

I'm grateful to all the authors of these models, especially the base trainers and fine-tuners.

Downloads last month: 64

Safetensors
Model size: 31B params
Tensor type: BF16

Model tree for kabachuha/Gemma-4-The-Deckards-Brain-31B
Base models: DavidAU/gemma-4-31B-it-The-DECKARD-HERETIC-UNCENSORED-Thinking, Nimbz/Gemma-4-Gembrain-31B
Merge model: this model
Quantizations: 4 models