
SmartDazi committed on Jun 10, 2025

Commit a86a48b (verified) · 1 Parent(s): 6c35761

Update README.md

Files changed (1)

README.md +55 -0

@@ -48,6 +48,61 @@ As of now, MiniCPM4-MCP supports the following:
 - Cross-tool-calling capability: It can perform single- or multi-step tool calls using different tools that comply with the MCP.
+## Inference
+### MCP Server Deployment
+The MCP Servers supported by MiniCPM4-MCP include
+[Airbnb](https://github.com/openbnb-org/mcp-server-airbnb),
+[Amap-Maps](https://github.com/zxypro1/amap-maps-mcp-server),
+[Arxiv-MCP-Server](https://github.com/blazickjp/arxiv-mcp-server),
+[Calculator](https://github.com/githejie/mcp-server-calculator),
+[Computer-Control-MCP](https://github.com/AB498/computer-control-mcp),
+[Desktop-commander](https://github.com/wonderwhy-er/DesktopCommanderMCP),
+[Filesystem](https://github.com/mark3labs/mcp-filesystem-server),
+[Github](https://github.com/modelcontextprotocol/servers/tree/main/src/github),
+[Gaode](https://github.com/perMAIN/gaode),
+[MCP-Code-Executor](https://github.com/bazinga012/mcp_code_executor),
+[MCP-DOCx](https://github.com/MeterLong/MCP-Doc),
+[PPT](https://github.com/GongRzhe/Office-PowerPoint-MCP-Server),
+[PPTx](https://github.com/supercurses/powerpoint),
+[Simple-Time-Server](https://github.com/andybrandt/mcp-simple-timeserver),
+[Slack](https://github.com/modelcontextprotocol/servers/tree/main/src/slack), and
+[Whisper](https://github.com/arcaputo3/mcp-server-whisper). Follow the instructions in each server's repository to deploy it. Not all tools in these servers work in every environment: some are unstable and may return errors such as timeouts or HTTP errors. During training data construction, tools with consistently high failure rates (e.g., those for which the LLM failed to produce a successful query even after hundreds of attempts) were filtered out.
+### MCP Client Setup
+We modified the existing MCP Client from the [mcp-cli](https://github.com/chrishayuk/mcp-cli) repository to enable interaction between MiniCPM and MCP Servers.
+After the MCP Client performs a handshake with a Server, it retrieves a list of available tools. An example of tool information contained in this list is provided in [`available_tool_example.json`](https://github.com/OpenBMB/MiniCPM/blob/main/demo/minicpm4/MCP/available_tool_example.json).
+Once the available tools and the user query are obtained, results can be generated with the following script:
+```bash
+python generate_example.py \
+--tokenizer_path {path to MiniCPM4 tokenizer} \
+--base_url {vllm deployment URL} \
+--model {model name used in vllm deployment} \
+--output_path {path to save results}
+```
+Here, [`generate_example.py`](https://github.com/OpenBMB/MiniCPM/blob/main/demo/minicpm4/MCP/generate_example.py) is the script in the MiniCPM repository. MiniCPM4 generates tool calls in the following format:
+```
+    <|tool_call_start|>
+    ```python
+    read_file(path="/path/to/file")
+    ```
+    <|tool_call_end|>
+```
+You can build a custom parser for MiniCPM4 tool calls based on this format. The relevant parsing logic is located in `generate_example.py`.
+Because the [mcp-cli](https://github.com/chrishayuk/mcp-cli) repository supports the vLLM inference framework, MiniCPM4-MCP can also be integrated into `mcp-cli` by modifying vLLM accordingly.
+Specifically, follow the [function-calling instructions](https://github.com/OpenBMB/MiniCPM/tree/main/demo/minicpm3/function_call) to enable interaction between a client running the MiniCPM4-MCP model and the MCP Server.
 ## Evaluation
 The detailed evaluation script can be found on the [GitHub](https://github.com/OpenBMB/MiniCPM/tree/main/demo/minicpm4/MCP) page. The evaluation results are presented below.
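
The added Inference section says a custom parser can be built from the tool-call format it shows. The sketch below is one way that might look in Python; it is not the logic from `generate_example.py`, which may differ. It assumes each block holds plain Python calls with literal keyword arguments, that the Markdown fence inside the markers is optional, and that Python 3.9+ is available for `ast.unparse`. The names `extract_tool_calls`, `TOOL_CALL_RE`, and `FENCE` are hypothetical.

```python
import ast
import re
import textwrap

FENCE = "`" * 3  # Markdown code-fence marker, built here to avoid a literal fence

# Captures everything between the start and end markers shown in the README.
TOOL_CALL_RE = re.compile(
    r"<\|tool_call_start\|>(.*?)<\|tool_call_end\|>", re.DOTALL
)


def extract_tool_calls(text):
    """Return a list of {"name": ..., "arguments": {...}} dicts found in text.

    Assumption: arguments are keyword arguments with literal values.
    Positional arguments and **kwargs are ignored in this sketch, and
    non-literal values make ast.literal_eval raise ValueError.
    """
    calls = []
    for block in TOOL_CALL_RE.findall(text):
        # Drop the fence lines, then normalise indentation.
        lines = [ln for ln in block.splitlines() if not ln.strip().startswith(FENCE)]
        code = textwrap.dedent("\n".join(lines)).strip()
        # Parse as Python source; a block may contain several calls.
        for node in ast.parse(code).body:
            if isinstance(node, ast.Expr) and isinstance(node.value, ast.Call):
                call = node.value
                arguments = {
                    kw.arg: ast.literal_eval(kw.value)
                    for kw in call.keywords
                    if kw.arg is not None
                }
                calls.append({"name": ast.unparse(call.func), "arguments": arguments})
    return calls


if __name__ == "__main__":
    sample = (
        "<|tool_call_start|>\n"
        + FENCE + "python\n"
        + 'read_file(path="/path/to/file")\n'
        + FENCE + "\n"
        + "<|tool_call_end|>"
    )
    print(extract_tool_calls(sample))
    # [{'name': 'read_file', 'arguments': {'path': '/path/to/file'}}]
```

Parsing with `ast` rather than `eval` keeps model output from being executed; the resulting name and arguments could then be passed to whatever MCP client call dispatches the tool.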