Instructions to use wolfram/miqu-1-103b with libraries, inference providers, and local apps.

- Transformers

How to use wolfram/miqu-1-103b with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="wolfram/miqu-1-103b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load the model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("wolfram/miqu-1-103b")
model = AutoModelForCausalLM.from_pretrained("wolfram/miqu-1-103b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- vLLM
How to use wolfram/miqu-1-103b with vLLM:
Install from pip and serve the model:

```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "wolfram/miqu-1-103b"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "wolfram/miqu-1-103b",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker:

```shell
docker model run hf.co/wolfram/miqu-1-103b
```
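The OpenAI-compatible endpoint can also be called from Python without extra dependencies. A minimal sketch using only the standard library, assuming the vLLM server above is running on localhost:8000 (the `build_chat_request` helper is illustrative, not part of vLLM):

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str, base_url: str = "http://localhost:8000"):
    """Build the URL, headers, and JSON body for an OpenAI-compatible chat completion."""
    url = f"{base_url}/v1/chat/completions"
    headers = {"Content-Type": "application/json"}
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return url, headers, body

url, headers, body = build_chat_request("wolfram/miqu-1-103b", "What is the capital of France?")

# To actually send the request (the server must be running):
#   req = urllib.request.Request(url, data=body, headers=headers, method="POST")
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
```

The same request shape works against the SGLang server below; only the port changes.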
- SGLang
How to use wolfram/miqu-1-103b with SGLang:
Install from pip and serve the model:

```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "wolfram/miqu-1-103b" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "wolfram/miqu-1-103b",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker images:

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "wolfram/miqu-1-103b" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "wolfram/miqu-1-103b",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

- Docker Model Runner
How to use wolfram/miqu-1-103b with Docker Model Runner:
```shell
docker model run hf.co/wolfram/miqu-1-103b
```
miqu-1-103b
- HF: wolfram/miqu-1-103b
- GGUF: wolfram/miqu-1-103b-GGUF | mradermacher's static quants | weighted/imatrix quants
- EXL2: wolfram/miqu-1-103b-5.0bpw-h6-exl2 | LoneStriker's 2.4bpw | 3.0bpw | 3.5bpw
This is a 103b frankenmerge of miqu-1-70b created by interleaving layers of miqu-1-70b-sf with itself using mergekit.
Inspired by Midnight-Rose-103B-v2.0.3.
Thanks for the support, CopilotKit - the open-source platform for building in-app AI Copilots into any product, with any LLM model. Check out their GitHub.
Thanks for the quants, Michael Radermacher and Lone Striker!
Also available:
- miqu-1-120b – Miqu's older, bigger twin sister; same Miqu, inflated to 120B.
- miquliz-120b-v2.0 – Miqu's younger, fresher sister; a new and improved Goliath-like merge of Miqu and lzlv.
Model Details
- Max Context: 32768 tokens
- Layers: 120
Prompt template: Mistral
```
<s>[INST] {prompt} [/INST]
```
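For backends that take raw text rather than chat messages, the template above can be applied by hand. A minimal single-turn sketch (the helper name is illustrative; `<s>` is the BOS token, which many tokenizers add automatically, so it is made optional here):

```python
def format_mistral_prompt(prompt: str, add_bos: bool = True) -> str:
    """Wrap a single user turn in the Mistral [INST] template."""
    bos = "<s>" if add_bos else ""
    return f"{bos}[INST] {prompt} [/INST]"

print(format_mistral_prompt("Who are you?"))
# <s>[INST] Who are you? [/INST]
```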
Merge Details
Merge Method
This model was merged using the passthrough merge method.
Models Merged
The following model was included in the merge:
- 152334H/miqu-1-70b-sf
Configuration
The following YAML configuration was used to produce this model:
mergekit_config.yml

```yaml
dtype: float16
merge_method: passthrough
slices:
  - sources:
      - layer_range: [0, 40]
        model: 152334H/miqu-1-70b-sf
  - sources:
      - layer_range: [20, 60]
        model: 152334H/miqu-1-70b-sf
  - sources:
      - layer_range: [40, 80]
        model: 152334H/miqu-1-70b-sf
```
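The passthrough method simply stacks the three (overlapping) slices of the 80-layer base model, so the merged depth is the sum of the slice lengths. A quick sketch of that arithmetic, which reproduces the 120 layers listed under Model Details:

```python
# Slices of 152334H/miqu-1-70b-sf, as given in mergekit_config.yml
slices = [(0, 40), (20, 60), (40, 80)]

# Passthrough stacks every slice, even where the layer ranges overlap
total_layers = sum(end - start for start, end in slices)
print(total_layers)
# 120
```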
Credits & Special Thanks
- original (unreleased) model: mistralai (Mistral AI_)
- leaked model: miqudev/miqu-1-70b
- f16 model: 152334H/miqu-1-70b-sf
- mergekit: arcee-ai/mergekit: Tools for merging pretrained large language models.
- mergekit_config.yml: sophosympatheia/Midnight-Rose-103B-v2.0.3
Support
- My Ko-fi page if you'd like to tip me to say thanks or request specific models to be tested or merged with priority. Also consider supporting your favorite model creators, quantizers, or frontend/backend devs if you can afford to do so. They deserve it!
Disclaimer
This model contains leaked weights and due to its content it should not be used by anyone. 😜
But seriously:
License
What I know: Weights produced by a machine are not copyrightable, so there is no copyright owner who could grant permission or a license to use, or restrict usage of, the files once you have acquired them.
Ethics
What I believe: All generative AI, including LLMs, only exists because it is trained mostly on human data (both public domain and copyright-protected, most likely acquired without express consent) and possibly synthetic data (which is ultimately derived from human data, too). It is only fair if something that is based on everyone's knowledge and data is also freely accessible to the public, the actual creators of the underlying content. Fair use, fair AI!
