# 本地 LLM 的工具调用指南

工具调用是指允许 LLM 通过发出结构化请求来触发特定函数（例如“搜索我的文件”、“运行计算器”或“调用 API”），而不是靠文本猜测答案。你使用工具调用是因为它们能让输出 **更可靠且更新鲜**，并且它们让模型 **采取真实行动** （查询系统、验证事实、强制执行模式），而不是产生幻觉。

在本教程中，你将学习如何通过工具调用使用本地 LLM，并配合数学、故事、Python 代码和终端函数示例。推理通过 llama.cpp、llama-server 和 OpenAI 端点在本地完成。

{% columns %}
{% column %}
当你使用 [Unsloth Studio](https://unsloth.ai/docs/zh/xin-zeng/studio/chat#auto-healing-tool-calling)时，工具调用会自动设置。只需选择你的模型，然后打开或关闭工具调用即可。

右侧可查看一个工具调用自动应用于 [Gemma 4](https://unsloth.ai/docs/zh/mo-xing/gemma-4)的示例。Unsloth 还提供自我修复工具调用，确保你始终拥有可用的工具调用。
{% endcolumn %}

{% column %}

<figure><img src="https://2657992854-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FxhOjnexMCB3dmuQFQ2Zq%2Fuploads%2FstfdTMsoBMmsbQsgQ1Ma%2Flandscape%20clip%20gemma.gif?alt=media&#x26;token=eec5f2f7-b97a-4c1c-ad01-5a041c3e4013" alt=""><figcaption></figcaption></figure>
{% endcolumn %}
{% endcolumns %}

我们的指南应适用于几乎 [任何模型](https://unsloth.ai/docs/zh/kai-shi-shi-yong/unsloth-model-catalog) ，包括：

* [**Qwen3**-Coder-Next](https://unsloth.ai/docs/models/qwen3-coder-next), [Qwen3-Coder](https://unsloth.ai/docs/models/qwen3-coder-how-to-run-locally)，以及其他 **Qwen** 模型
* [**GLM**-4.7](https://unsloth.ai/docs/models/glm-4.7), 4.6, [GLM-4.7-Flash](https://unsloth.ai/docs/models/glm-4.7-flash) 和 [**Kimi K2.5**](https://unsloth.ai/docs/models/kimi-k2.5), [Kimi K2 Thinking](https://unsloth.ai/docs/models/tutorials/kimi-k2-thinking-how-to-run-locally)
* [**DeepSeek**-V3.1](https://unsloth.ai/docs/models/tutorials/deepseek-v3.1-how-to-run-locally)、DeepSeek-V3.2 以及 **MiniMax**
* [**gpt-oss**](https://unsloth.ai/docs/models/gpt-oss-how-to-run-and-fine-tune) 和 [**NVIDIA Nemotron** 3 Nano](https://unsloth.ai/docs/models/tutorials/nemotron-3) 和 [**Devstral** 2](https://unsloth.ai/docs/models/tutorials/devstral-2)

<a href="#qwen3-coder-next-tool-calling" class="button primary">Qwen3-Coder-Next 教程</a><a href="#glm-4.7-flash--glm-4.7-calling" class="button secondary">GLM-4.7-Flash 教程</a>

### :hammer:工具调用设置

我们的第一步是获取最新的 `llama.cpp` ，位于 [GitHub 此处](https://github.com/ggml-org/llama.cpp)。你也可以按照下面的构建说明进行操作。将 `-DGGML_CUDA=ON` 改为 `-DGGML_CUDA=OFF` ，如果你没有 GPU，或者只想进行 CPU 推理。 **对于 Apple Mac / Metal 设备**，设置 `-DGGML_CUDA=OFF` ，然后像往常一样继续——Metal 支持默认已开启。

{% code overflow="wrap" %}

```bash
apt-get update
apt-get install pciutils build-essential cmake curl libcurl4-openssl-dev -y
git clone https://github.com/ggml-org/llama.cpp
cmake llama.cpp -B llama.cpp/build \
    -DBUILD_SHARED_LIBS=OFF -DGGML_CUDA=ON
cmake --build llama.cpp/build --config Release -j --clean-first --target llama-cli llama-mtmd-cli llama-server llama-gguf-split
cp llama.cpp/build/bin/llama-* llama.cpp
```

{% endcode %}

在一个新的终端中（如果使用 tmux，请使用 CTRL+B+D），我们创建一些工具，例如加 2 个数字、执行 Python 代码、执行 Linux 函数等等：

{% code expandable="true" %}

```python
import json, subprocess, random
from typing import Any
def add_number(a: float | str, b: float | str) -> float:
    return float(a) + float(b)
def multiply_number(a: float | str, b: float | str) -> float:
    return float(a) * float(b)
def substract_number(a: float | str, b: float | str) -> float:
    return float(a) - float(b)
def write_a_story() -> str:
    return random.choice([
        "很久很久以前，在一个遥远的星系里……",
        "有两个朋友喜欢树懒和代码……",
        "世界正在终结，因为每一只树懒都进化出了超人的智慧……",
        "其中一位朋友并不知道，另一位朋友不小心编写了一个让树懒进化的程序……",
    ])
def terminal(command: str) -> str:
    if "rm" in command or "sudo" in command or "dd" in command or "chmod" in command:
        msg = "无法执行 'rm, sudo, dd, chmod' 命令，因为它们很危险"
        print(msg); return msg
    print(f"正在执行终端命令 `{command}`")
    try:
        return str(subprocess.run(command, capture_output = True, text = True, shell = True, check = True).stdout)
    except subprocess.CalledProcessError as e:
        return f"命令失败：{e.stderr}"
def python(code: str) -> str:
    data = {}
    exec(code, data)
    del data["__builtins__"]
    return str(data)
MAP_FN = {
    "add_number": add_number,
    "multiply_number": multiply_number,
    "substract_number": substract_number,
    "write_a_story": write_a_story,
    "terminal": terminal,
    "python": python,
}
tools = [
    {
        "type": "function",
        "function": {
            "name": "add_number",
            "description": "添加两个数字。",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {
                        "type": "string",
                        "description": "第一个数字。",
                    },
                    "b": {
                        "type": "string",
                        "description": "第二个数字。",
                    },
                },
                "required": ["a", "b"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "multiply_number",
            "description": "将两个数字相乘。",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {
                        "type": "string",
                        "description": "第一个数字。",
                    },
                    "b": {
                        "type": "string",
                        "description": "第二个数字。",
                    },
                },
                "required": ["a", "b"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "substract_number",
            "description": "将两个数字相减。",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {
                        "type": "string",
                        "description": "第一个数字。",
                    },
                    "b": {
                        "type": "string",
                        "description": "第二个数字。",
                    },
                },
                "required": ["a", "b"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "write_a_story",
            "description": "写一个随机故事。",
            "parameters": {
                "type": "object",
                "properties": {},
                "required": [],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "terminal",
            "description": "执行终端中的操作。",
            "parameters": {
                "type": "object",
                "properties": {
                    "command": {
                        "type": "string",
                        "description": "你希望执行的命令，例如 `ls`、`rm` 等。",
                    },
                },
                "required": ["command"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "python",
            "description": "调用一个 Python 解释器来运行一些将被执行的 Python 代码。",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {
                        "type": "string",
                        "description": "要运行的 Python 代码",
                    },
                },
                "required": ["code"],
            },
        },
    },
]
```

{% endcode %}

然后我们使用下面的函数（复制粘贴并执行），它会自动解析函数调用，并为任何模型调用 OpenAI 端点：

{% hint style="info" %}
在这个示例中，我们使用的是 Devstral 2；切换模型时，请确保使用正确的采样参数。你可以在我们的 [指南这里查看](https://unsloth.ai/docs/zh/mo-xing/tutorials).
{% endhint %}

{% code overflow="wrap" expandable="true" %}

```python
from openai import OpenAI
def unsloth_inference(
    messages,
    temperature = 0.7,
    top_p = 0.95,
    top_k = 40,
    min_p = 0.01,
    repetition_penalty = 1.0,
):
    messages = messages.copy()
    openai_client = OpenAI(
        base_url = "http://127.0.0.1:8001/v1",
        api_key = "sk-no-key-required",
    )
    model_name = next(iter(openai_client.models.list())).id
    print(f"Using model = {model_name}")
    has_tool_calls = True
    original_messages_len = len(messages)
    while has_tool_calls:
        print(f"当前消息 = {messages}")
        response = openai_client.chat.completions.create(
            model = model_name,
            messages = messages,
            temperature = temperature,
            top_p = top_p,
            tools = tools if tools else None,
            tool_choice = "auto" if tools else None,
            extra_body = {"top_k": top_k, "min_p": min_p, "repetition_penalty" :repetition_penalty,}
        )
        tool_calls = response.choices[0].message.tool_calls or []
        content = response.choices[0].message.content or ""
        tool_calls_dict = [tc.to_dict() for tc in tool_calls] if tool_calls else tool_calls
        messages.append({"role": "assistant", "tool_calls": tool_calls_dict, "content": content,})
        for tool_call in tool_calls:
            fx, args, _id = tool_call.function.name, tool_call.function.arguments, tool_call.id
            out = MAP_FN[fx](**json.loads(args))
            messages.append({"role": "tool", "tool_call_id": _id, "name": fx, "content": str(out),})
        else:
            has_tool_calls = False
    return messages
```

{% endcode %}

现在我们将展示多种工具调用方法，适用于下面许多不同的使用场景：

### 写故事：

```python
messages = [{
    "role": "user",
    "content": [{"type": "text", "text": "你能给我写个故事吗？"}],
}]
unsloth_inference(messages, temperature = 0.15, top_p = 1.0, top_k = -1, min_p = 0.00)
```

<figure><img src="https://2657992854-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FxhOjnexMCB3dmuQFQ2Zq%2Fuploads%2F04clxomfFQjhIKiS5FAY%2Fimage.png?alt=media&#x26;token=299279c9-cca6-48d6-ab74-523edef04160" alt=""><figcaption></figcaption></figure>

### 数学运算：

{% code overflow="wrap" %}

```python
messages = [{
    "role": "user",
    "content": [{"type": "text", "text": "今天的日期加 3 天是多少？"}],
}]
unsloth_inference(messages, temperature = 0.15, top_p = 1.0, top_k = -1, min_p = 0.00)
```

{% endcode %}

<figure><img src="https://2657992854-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FxhOjnexMCB3dmuQFQ2Zq%2Fuploads%2Fi15WEhfSIPuFeZUWjrG1%2Fimage.png?alt=media&#x26;token=74818449-637b-442a-b1d6-aa72b09fb6b5" alt=""><figcaption></figcaption></figure>

### 执行生成的 Python 代码

{% code overflow="wrap" %}

```python
messages = [{
    "role": "user",
    "content": [{"type": "text", "text": "在 Python 中创建一个 Fibonacci 函数，并求 fib(20)。"}],
}]
unsloth_inference(messages, temperature = 0.15, top_p = 1.0, top_k = -1, min_p = 0.00)
```

{% endcode %}

<figure><img src="https://2657992854-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FxhOjnexMCB3dmuQFQ2Zq%2Fuploads%2FDsfcIrunKtsZ2RXRrjBo%2Fimage.png?alt=media&#x26;token=6aa2ab38-7def-4792-a2dc-c663329a41ff" alt=""><figcaption></figcaption></figure>

### 执行任意终端函数

{% code overflow="wrap" %}

```python
messages = [{
    "role": "user",
    "content": [{"type": "text", "text": "把 'I'm a happy Sloth' 写入文件，然后把它再打印给我。"}],
}]
messages = unsloth_inference(messages, temperature = 0.15, top_p = 1.0, top_k = -1, min_p = 0.00)
```

{% endcode %}

<figure><img src="https://2657992854-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FxhOjnexMCB3dmuQFQ2Zq%2Fuploads%2FMhmXE34IBuR8H8SnuSMe%2Fimage%20(3).png?alt=media&#x26;token=689114b7-5b50-41c9-8678-88f8f5da87a7" alt=""><figcaption></figcaption></figure>

## :stars: Qwen3-Coder-Next 工具调用

在一个新的终端中，我们创建一些工具，例如添加 2 个数字、执行 Python 代码、执行 Linux 函数等等：

{% code expandable="true" %}

```python
import json, subprocess, random
from typing import Any
def add_number(a: float | str, b: float | str) -> float:
    return float(a) + float(b)
def multiply_number(a: float | str, b: float | str) -> float:
    return float(a) * float(b)
def substract_number(a: float | str, b: float | str) -> float:
    return float(a) - float(b)
def write_a_story() -> str:
    return random.choice([
        "很久很久以前，在一个遥远的星系里……",
        "有两个朋友喜欢树懒和代码……",
        "世界正在终结，因为每一只树懒都进化出了超人的智慧……",
        "其中一位朋友并不知道，另一位朋友不小心编写了一个让树懒进化的程序……",
    ])
def terminal(command: str) -> str:
    if "rm" in command or "sudo" in command or "dd" in command or "chmod" in command:
        msg = "无法执行 'rm, sudo, dd, chmod' 命令，因为它们很危险"
        print(msg); return msg
    print(f"正在执行终端命令 `{command}`")
    try:
        return str(subprocess.run(command, capture_output = True, text = True, shell = True, check = True).stdout)
    except subprocess.CalledProcessError as e:
        return f"命令失败：{e.stderr}"
def python(code: str) -> str:
    data = {}
    exec(code, data)
    del data["__builtins__"]
    return str(data)
MAP_FN = {
    "add_number": add_number,
    "multiply_number": multiply_number,
    "substract_number": substract_number,
    "write_a_story": write_a_story,
    "terminal": terminal,
    "python": python,
}
tools = [
    {
        "type": "function",
        "function": {
            "name": "add_number",
            "description": "添加两个数字。",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {
                        "type": "string",
                        "description": "第一个数字。",
                    },
                    "b": {
                        "type": "string",
                        "description": "第二个数字。",
                    },
                },
                "required": ["a", "b"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "multiply_number",
            "description": "将两个数字相乘。",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {
                        "type": "string",
                        "description": "第一个数字。",
                    },
                    "b": {
                        "type": "string",
                        "description": "第二个数字。",
                    },
                },
                "required": ["a", "b"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "substract_number",
            "description": "将两个数字相减。",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {
                        "type": "string",
                        "description": "第一个数字。",
                    },
                    "b": {
                        "type": "string",
                        "description": "第二个数字。",
                    },
                },
                "required": ["a", "b"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "write_a_story",
            "description": "写一个随机故事。",
            "parameters": {
                "type": "object",
                "properties": {},
                "required": [],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "terminal",
            "description": "执行终端中的操作。",
            "parameters": {
                "type": "object",
                "properties": {
                    "command": {
                        "type": "string",
                        "description": "你希望执行的命令，例如 `ls`、`rm` 等。",
                    },
                },
                "required": ["command"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "python",
            "description": "调用一个 Python 解释器来运行一些将被执行的 Python 代码。",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {
                        "type": "string",
                        "description": "要运行的 Python 代码",
                    },
                },
                "required": ["code"],
            },
        },
    },
]
```

{% endcode %}

然后我们使用下面的函数，它们会自动解析函数调用，并为任何 LLM 调用 OpenAI 端点：

{% code overflow="wrap" expandable="true" %}

```python
from openai import OpenAI
def unsloth_inference(
    messages,
    temperature = 1.0,
    top_p = 0.95,
    top_k = 40,
    min_p = 0.01,
    repetition_penalty = 1.0,
):
    messages = messages.copy()
    openai_client = OpenAI(
        base_url = "http://127.0.0.1:8001/v1",
        api_key = "sk-no-key-required",
    )
    model_name = next(iter(openai_client.models.list())).id
    print(f"Using model = {model_name}")
    has_tool_calls = True
    original_messages_len = len(messages)
    while has_tool_calls:
        print(f"当前消息 = {messages}")
        response = openai_client.chat.completions.create(
            model = model_name,
            messages = messages,
            temperature = temperature,
            top_p = top_p,
            tools = tools if tools else None,
            tool_choice = "auto" if tools else None,
            extra_body = {"top_k": top_k, "min_p": min_p, "repetition_penalty" :repetition_penalty,}
        )
        tool_calls = response.choices[0].message.tool_calls or []
        content = response.choices[0].message.content or ""
        tool_calls_dict = [tc.to_dict() for tc in tool_calls] if tool_calls else tool_calls
        messages.append({"role": "assistant", "tool_calls": tool_calls_dict, "content": content,})
        for tool_call in tool_calls:
            fx, args, _id = tool_call.function.name, tool_call.function.arguments, tool_call.id
            out = MAP_FN[fx](**json.loads(args))
            messages.append({"role": "tool", "tool_call_id": _id, "name": fx, "content": str(out),})
        else:
            has_tool_calls = False
    return messages
```

{% endcode %}

现在我们将展示多种工具调用方法，适用于下面许多不同的使用场景：

#### 执行生成的 Python 代码

<pre class="language-python" data-overflow="wrap"><code class="lang-python"><strong>messages = [{
</strong>    "role": "user",
    "content": [{"type": "text", "text": "在 Python 中创建一个 Fibonacci 函数，并求 fib(20)。"}],
}]
unsloth_inference(messages, temperature = 1.0, top_p = 0.95, top_k = 40, min_p = 0.00)
</code></pre>

<figure><img src="https://2657992854-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FxhOjnexMCB3dmuQFQ2Zq%2Fuploads%2F7fY3LSeNCjHXNjBwQkbI%2Fimage.png?alt=media&#x26;token=50eba62e-f8b2-424a-833b-be56696b4710" alt=""><figcaption></figcaption></figure>

#### 执行任意终端函数

{% code overflow="wrap" %}

```python
messages = [{
    "role": "user",
    "content": [{"type": "text", "text": "把 'I'm a happy Sloth' 写入文件，然后把它再打印给我。"}],
}]
messages = unsloth_inference(messages, temperature = 1.0, top_p = 1.0, top_k = 40, min_p = 0.00)
```

{% endcode %}

我们确认文件已创建，而且确实创建了！

<figure><img src="https://2657992854-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FxhOjnexMCB3dmuQFQ2Zq%2Fuploads%2FabplwVbEMlsCEJTmxzSA%2Fimage.png?alt=media&#x26;token=eb27f30a-c91e-4aec-8fb0-f4a35921d3db" alt=""><figcaption></figcaption></figure>

## :zap: GLM-4.7-Flash + GLM 4.7 调用

我们首先下载 [glm-4.7](https://unsloth.ai/docs/zh/mo-xing/tutorials/glm-4.7 "mention") 或 [glm-4.7-flash](https://unsloth.ai/docs/zh/mo-xing/glm-4.7-flash "mention") ，通过一些 Python 代码，然后在另一个终端中通过 llama-server 启动它（例如使用 tmux）。在这个示例中，我们下载大型的 GLM-4.7 模型：

```python
# !pip install huggingface_hub hf_transfer
import os
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
from huggingface_hub import snapshot_download
snapshot_download(
    repo_id = "unsloth/GLM-4.7-GGUF",
    local_dir = "unsloth/GLM-4.7-GGUF",
    allow_patterns = ["*UD-Q2_K_XL*",], # 用于 Q2_K_XL
)
```

如果你成功运行了它，你应该会看到：

<figure><img src="https://2657992854-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FxhOjnexMCB3dmuQFQ2Zq%2Fuploads%2FqZCvBkQPi7GZj50pPHsJ%2Fimage.png?alt=media&#x26;token=2e97888a-fafe-4477-b99c-c7f3e2e316bf" alt=""><figcaption></figcaption></figure>

现在在一个新的终端中通过 llama-server 启动它。如果你愿意，可以使用 tmux：

{% code overflow="wrap" %}

```bash
./llama.cpp/llama-server \
    --model unsloth/GLM-4.7-GGUF/UD-Q2_K_XL/GLM-4.7-UD-Q2_K_XL-00001-of-00003.gguf \
    --alias "unsloth/GLM-4.7" \
    --threads -1 \
    --fit on \
    --prio 3 \
    --min_p 0.01 \
    --ctx-size 16384 \
    --port 8001 \
    --jinja
```

{% endcode %}

然后你会得到：

<figure><img src="https://2657992854-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FxhOjnexMCB3dmuQFQ2Zq%2Fuploads%2FMIqGMOXIMvl68jzDfyWy%2Fimage.png?alt=media&#x26;token=c5739da8-b9eb-4a31-9878-3f0d5f51416e" alt="" width="563"><figcaption></figcaption></figure>

现在在一个新的终端中并执行 Python 代码，提醒运行 [#tool-calling-setup](#tool-calling-setup "mention") 我们使用 GLM 4.7 的最佳参数：temperature = 0.7 和 top\_p = 1.0

#### 用于 GLM 4.7 数学运算的工具调用

{% code overflow="wrap" %}

```python
messages = [{
    "role": "user",
    "content": [{"type": "text", "text": "今天的日期加 3 天是多少？"}],
}]
unsloth_inference(messages, temperature = 0.7, top_p = 1.0, top_k = -1, min_p = 0.00)
```

{% endcode %}

<figure><img src="https://2657992854-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FxhOjnexMCB3dmuQFQ2Zq%2Fuploads%2FoFkZ20QOSGdzT4iz2SOB%2Fimage.png?alt=media&#x26;token=e4ca30b0-dcec-4a26-b019-dd33f0600949" alt=""><figcaption></figcaption></figure>

#### 用于执行 GLM 4.7 生成的 Python 代码的工具调用

{% code overflow="wrap" %}

```python
messages = [{
    "role": "user",
    "content": [{"type": "text", "text": "在 Python 中创建一个 Fibonacci 函数，并求 fib(20)。"}],
}]
unsloth_inference(messages, temperature = 0.7, top_p = 1.0, top_k = -1, min_p = 0.00)
```

{% endcode %}

<figure><img src="https://2657992854-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FxhOjnexMCB3dmuQFQ2Zq%2Fuploads%2FhS8sWtZwjwerElezCc2C%2Fimage.png?alt=media&#x26;token=39032ef8-386e-4837-8dd2-c552c80a3ee3" alt="" width="563"><figcaption></figcaption></figure>

{% code expandable="true" %}

```python
import json, subprocess, random
from typing import Any
def add_number(a: float | str, b: float | str) -> float:
    return float(a) + float(b)
def multiply_number(a: float | str, b: float | str) -> float:
    return float(a) * float(b)
def substract_number(a: float | str, b: float | str) -> float:
    return float(a) - float(b)
def write_a_story() -> str:
    return random.choice([
        "很久很久以前，在一个遥远的星系里……",
        "有两个朋友喜欢树懒和代码……",
        "世界正在终结，因为每一只树懒都进化出了超人的智慧……",
        "其中一位朋友并不知道，另一位朋友不小心编写了一个让树懒进化的程序……",
    ])
def terminal(command: str) -> str:
    if "rm" in command or "sudo" in command or "dd" in command or "chmod" in command:
        msg = "无法执行 'rm, sudo, dd, chmod' 命令，因为它们很危险"
        print(msg); return msg
    print(f"正在执行终端命令 `{command}`")
    try:
        return str(subprocess.run(command, capture_output = True, text = True, shell = True, check = True).stdout)
    except subprocess.CalledProcessError as e:
        return f"命令失败：{e.stderr}"
def python(code: str) -> str:
    data = {}
    exec(code, data)
    del data["__builtins__"]
    return str(data)
MAP_FN = {
    "add_number": add_number,
    "multiply_number": multiply_number,
    "substract_number": substract_number,
    "write_a_story": write_a_story,
    "terminal": terminal,
    "python": python,
}
tools = [
    {
        "type": "function",
        "function": {
            "name": "add_number",
            "description": "添加两个数字。",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {
                        "type": "string",
                        "description": "第一个数字。",
                    },
                    "b": {
                        "type": "string",
                        "description": "第二个数字。",
                    },
                },
                "required": ["a", "b"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "multiply_number",
            "description": "将两个数字相乘。",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {
                        "type": "string",
                        "description": "第一个数字。",
                    },
                    "b": {
                        "type": "string",
                        "description": "第二个数字。",
                    },
                },
                "required": ["a", "b"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "substract_number",
            "description": "将两个数字相减。",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {
                        "type": "string",
                        "description": "第一个数字。",
                    },
                    "b": {
                        "type": "string",
                        "description": "第二个数字。",
                    },
                },
                "required": ["a", "b"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "write_a_story",
            "description": "写一个随机故事。",
            "parameters": {
                "type": "object",
                "properties": {},
                "required": [],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "terminal",
            "description": "执行终端中的操作。",
            "parameters": {
                "type": "object",
                "properties": {
                    "command": {
                        "type": "string",
                        "description": "你希望执行的命令，例如 `ls`、`rm` 等。",
                    },
                },
                "required": ["command"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "python",
            "description": "调用一个 Python 解释器来运行一些将被执行的 Python 代码。",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {
                        "type": "string",
                        "description": "要运行的 Python 代码",
                    },
                },
                "required": ["code"],
            },
        },
    },
]
```

{% endcode %}

### :orange\_book: Devstral 2 工具调用

我们首先下载 [devstral-2](https://unsloth.ai/docs/zh/mo-xing/tutorials/devstral-2 "mention") 通过一些 Python 代码，然后在另一个终端中通过 llama-server 启动它（例如使用 tmux）：

```python
# !pip install huggingface_hub hf_transfer
import os
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
from huggingface_hub import snapshot_download
snapshot_download(
    repo_id = "unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF",
    local_dir = "unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF",
    allow_patterns = ["*UD-Q4_K_XL*", "*mmproj-F16*"], # 用于 Q4_K_XL
)
```

如果你成功运行了它，你应该会看到：

<figure><img src="https://2657992854-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FxhOjnexMCB3dmuQFQ2Zq%2Fuploads%2FC9M5eNGefCpi3bLe0Kbw%2Fimage.png?alt=media&#x26;token=727c22d5-368f-45ad-b698-ccb84e3bbbbf" alt=""><figcaption></figcaption></figure>

现在在一个新的终端中通过 llama-server 启动它。如果你愿意，可以使用 tmux：

{% code overflow="wrap" %}

```bash
./llama.cpp/llama-server \
    --model unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF/Devstral-Small-2-24B-Instruct-2512-UD-Q4_K_XL.gguf \
    --mmproj unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF/mmproj-F16.gguf \
    --alias "unsloth/Devstral-Small-2-24B-Instruct-2512" \
    --threads -1 \
    --fit on \
    --prio 3 \
    --min_p 0.01 \
    --ctx-size 16384 \
    --port 8001 \
    --jinja
```

{% endcode %}

如果成功，你会看到下面内容：

<figure><img src="https://2657992854-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FxhOjnexMCB3dmuQFQ2Zq%2Fuploads%2Fo0VjjsMTBQCsfMNXcbYG%2Fimage.png?alt=media&#x26;token=3f0d9c51-1fd7-4d8d-9532-1a190f1b5830" alt="" width="563"><figcaption></figcaption></figure>

然后我们用以下消息来调用模型，并且只使用 Devstral 建议的 temperature = 0.15 参数。提醒运行 [#tool-calling-setup](#tool-calling-setup "mention")
