# GLM-4.7-Flash: How to Run Locally

GLM-4.7-Flash is Z.ai's new 30B MoE reasoning model built for local deployment. It delivers best-in-class performance for coding, agentic workflows and chat. It activates about 3.6B parameters per token, supports 200K context, and leads on SWE-Bench, GPQA and reasoning/chat benchmarks.

GLM-4.7-Flash runs on **24GB RAM**/VRAM/unified memory (32GB for full precision), and you can now fine-tune it with Unsloth. To run GLM 4.7 Flash with vLLM, see [#glm-4.7-flash-in-vllm](#glm-4.7-flash-in-vllm "mention")

{% hint style="success" %}
Jan 21 update: `llama.cpp` fixed a bug that set the wrong `scoring_func`: `"softmax"` (it should be `"sigmoid"`). The bug caused looping and poor outputs. We have updated the GGUFs, so please re-download the model for better outputs.

You can now use Z.ai's recommended parameters and get good results:

* **For general use:** `--temp 1.0 --top-p 0.95`
* **For tool calling:** `--temp 0.7 --top-p 1.0`
* **Repeat penalty:** disable it, or set `--repeat-penalty 1.0`

Jan 22: the FlashAttention (FA) fix for CUDA has been merged, so inference is now faster.
{% endhint %}

GLM-4.7-Flash GGUF to run: [unsloth/GLM-4.7-Flash-GGUF](https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF)

### ⚙️ Usage Guide

For best performance, make sure your total available memory (VRAM + system RAM) exceeds the size of the quantized model file you are downloading. If it does not, llama.cpp can still run by offloading to SSD/HDD, but inference will be slower.

After speaking with the Z.ai team, they recommend using their GLM-4.7 sampling parameters:

| Default settings (most tasks) | Terminal Bench, SWE Bench Verified |
| ----------------------------- | ---------------------------------- |
| **temperature = 1.0**         | **temperature = 0.7**              |
| **top\_p = 0.95**             | **top\_p = 1.0**                   |
| repeat penalty = disabled or 1.0 | repeat penalty = disabled or 1.0 |

* For general use: `--temp 1.0 --top-p 0.95`
* For tool calling: `--temp 0.7 --top-p 1.0`
* If using llama.cpp, set `--min-p 0.01` because llama.cpp's default is 0.05
* You may need to experiment to find which values work best for your use case.

{% hint style="warning" %}
For now, we **do not recommend** running this GGUF with **Ollama** because of potential chat template compatibility issues. The GGUF works well on llama.cpp (or other backends such as LM Studio and Jan).

**Remember to disable repeat penalty, or set** `--repeat-penalty 1.0`
{% endhint %}

* **Maximum context window:** `202,752`

### 🖥️ Run GLM-4.7-Flash

Depending on your use case, you will need different settings. Some GGUFs end up similar in size because the model architecture (like [gpt-oss](/docs/zh/mo-xing/gpt-oss-how-to-run-and-fine-tune.md)) has dimensions that are not divisible by 128, so some parts cannot be quantized to lower bits.

Because this guide uses 4-bit, you will need around 18GB of RAM/unified memory. For best performance, we recommend at least 4-bit precision.

#### 🦥 Unsloth Studio Guide

GLM-4.7-Flash can be run in [Unsloth Studio](/docs/zh/xin-de/studio.md), our new open-source web UI for local AI. With Unsloth Studio, you can run models locally on **MacOS, Windows** and Linux, and:

{% columns %}
{% column %}
* Search, download, [run GGUFs](/docs/zh/xin-de/studio.md#run-models-locally) and safetensor models
* [**Self-healing** tool calling](/docs/zh/xin-de/studio.md#execute-code--heal-tool-calling) + **web search**
* [**Code execution**](/docs/zh/xin-de/studio.md#run-models-locally) (Python, Bash)
* [Automatic inference](/docs/zh/xin-de/studio.md#model-arena) parameter tuning (temp, top-p, etc.)
* Fast CPU + GPU inference via llama.cpp
* [Train LLMs](/docs/zh/xin-de/studio.md#no-code-training) 2x faster with 70% less VRAM
{% endcolumn %}

{% column %}

{% endcolumn %}
{% endcolumns %}

{% stepper %}
{% step %}
**Install Unsloth**

Run in your terminal:

MacOS, Linux, WSL:

```bash
curl -fsSL https://unsloth.ai/install.sh | sh
```

Windows PowerShell:

```powershell
irm https://unsloth.ai/install.ps1 | iex
```

{% hint style="success" %}
**Installation is quick and takes about 1-2 minutes.**
{% endhint %}
{% endstep %}

{% step %}
**Launch Unsloth**

MacOS, Linux, WSL and Windows:

```bash
unsloth studio -H 0.0.0.0 -p 8888
```

Then open `http://localhost:8888` in your browser.
{% endstep %}

{% step %}
**Search for and download GLM-4.7-Flash**

On first launch you will need to create a password to secure your account, which you will use to sign in again later. You will then see a short onboarding wizard for choosing a model, a dataset and basic settings. You can skip it at any time.

Then go to the [Unsloth Chat](/docs/zh/xin-de/studio/chat.md) tab, search for **GLM-4.7-Flash** in the search bar, and download the model and quant you want.

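{% endstep %}

{% step %}
**Run GLM-4.7-Flash**

When using Unsloth Studio, inference parameters should be set automatically, but you can still change them manually. You can also edit the context length, chat template and other settings.

For more information, see our [Unsloth Studio inference guide](/docs/zh/xin-de/studio/chat.md).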

{% endstep %}
{% endstepper %}

#### Llama.cpp Tutorial (GGUF):

Instructions to run in llama.cpp (note that we will use 4-bit to fit most devices):

{% stepper %}
{% step %}
Obtain the latest `llama.cpp` on [GitHub here](https://github.com/ggml-org/llama.cpp). You can also follow the build instructions below. Change `-DGGML_CUDA=ON` to `-DGGML_CUDA=OFF` if you do not have a GPU or only want CPU inference. **For Apple Mac / Metal devices**, set `-DGGML_CUDA=OFF` and then continue as usual, since Metal support is on by default.

{% code overflow="wrap" %}
```bash
apt-get update
apt-get install pciutils build-essential cmake curl libcurl4-openssl-dev -y
git clone https://github.com/ggml-org/llama.cpp
cmake llama.cpp -B llama.cpp/build \
    -DBUILD_SHARED_LIBS=OFF -DGGML_CUDA=ON
cmake --build llama.cpp/build --config Release -j --clean-first --target llama-cli llama-mtmd-cli llama-server llama-gguf-split
cp llama.cpp/build/bin/llama-* llama.cpp
```
{% endcode %}
{% endstep %}

{% step %}
You can pull the model directly from Hugging Face. Depending on your RAM/VRAM, you can increase the context up to 200K.

You can also try Z.ai's recommended GLM-4.7 sampling parameters:

* For general use: `--temp 1.0 --top-p 0.95`
* For tool calling: `--temp 0.7 --top-p 1.0`
* **Remember to disable repeat penalty!**

Use this for **general instruction** use cases:

```bash
./llama.cpp/llama-cli \
    -hf unsloth/GLM-4.7-Flash-GGUF:UD-Q4_K_XL \
    --ctx-size 16384 \
    --temp 1.0 --top-p 0.95 --min-p 0.01
```

Use this for **tool calling** use cases:

```bash
./llama.cpp/llama-cli \
    -hf unsloth/GLM-4.7-Flash-GGUF:UD-Q4_K_XL \
    --ctx-size 16384 \
    --temp 0.7 --top-p 1.0 --min-p 0.01
```
{% endstep %}

{% step %}
Alternatively, download the model as shown below (after running `pip install huggingface_hub`). You can choose `UD-Q4_K_XL` or another quantized version. If downloads get stuck, see [Hugging Face Hub, XET debugging](/docs/zh/ji-chu-zhi-shi/troubleshooting-and-faqs/hugging-face-hub-xet-debugging.md)

{% code overflow="wrap" %}
```bash
pip install -U huggingface_hub
# The include pattern matches the UD-Q4_K_XL file used in the next step
hf download unsloth/GLM-4.7-Flash-GGUF \
    --local-dir unsloth/GLM-4.7-Flash-GGUF \
    --include "*UD-Q4_K_XL*"
```
{% endcode %}
{% endstep %}

{% step %}
Then run the model in conversation mode:

{% code overflow="wrap" %}
```bash
./llama.cpp/llama-cli \
    --model unsloth/GLM-4.7-Flash-GGUF/GLM-4.7-Flash-UD-Q4_K_XL.gguf \
    --ctx-size 16384 \
    --seed 3407 \
    --temp 1.0 \
    --top-p 0.95 \
    --min-p 0.01
```
{% endcode %}

Also adjust the **context window** as required, up to `202752`.
{% endstep %}
{% endstepper %}

### :loop:Reducing repetition and looping

{% hint style="success" %}
**Jan 21 update: llama.cpp fixed a bug that set the wrong** `"scoring_func": "softmax"` **(it should be sigmoid), which caused looping and poor outputs. We have updated the GGUFs. Please re-download the model for better outputs.**
{% endhint %}

This means you can now use Z.ai's recommended parameters and get good results:

* For general use: `--temp 1.0 --top-p 0.95`
* For tool calling: `--temp 0.7 --top-p 1.0`
* If using llama.cpp, set `--min-p 0.01` because llama.cpp's default is 0.05
* **Remember to disable repeat penalty, or set** `--repeat-penalty 1.0`

We added `"scoring_func": "sigmoid"` to the main model's `config.json`; [see the commit](https://huggingface.co/unsloth/GLM-4.7-Flash/commit/3fd53b491e04f707f307aef2f70f8a7520511e6d).

### :bird:Flappy Bird example with UD-Q4\_K\_XL

As an example, we had the following long conversation with UD-Q4\_K\_XL, using `./llama.cpp/llama-cli --model unsloth/GLM-4.7-Flash-GGUF/GLM-4.7-Flash-UD-Q4_K_XL.gguf --fit on --temp 1.0 --top-p 0.95 --min-p 0.01`:

```
Hi
What is 2+2
Create a Flappy Bird game in Python
Create a totally different game in Rust
Find bugs in both
Make the first game I mentioned, but as a standalone HTML file
Find bugs and show the fixed game
```

This produced the following Flappy Bird game in HTML form:

HTML Flappy Bird game (expandable)

```html
Flappy Bird Fixed

FLAPPY
BIRD

Click or press space to start

Game over

Score: 0
```

We also took some screenshots (4-bit works):

### 🦥 Fine-tuning GLM-4.7-Flash

Unsloth now supports fine-tuning GLM-4.7-Flash, but you will need `transformers v5`. The 30B model does not fit on a free Colab GPU; however, you can use our notebook. 16-bit LoRA fine-tuning of GLM-4.7-Flash uses around **60GB VRAM**:

* [GLM-4.7-Flash SFT LoRA notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/GLM_Flash_A100\(80GB\).ipynb)

{% hint style="warning" %}
With an A100 40GB you may occasionally run out of memory. For a smoother run, use an H100 or an A100 with 80GB VRAM.
{% endhint %}

When fine-tuning MoEs, it is best not to fine-tune the router layer, so we disable it by default. If you want to preserve the model's reasoning capabilities (optional), mix direct-answer examples with chain-of-thought examples. Keep at least 75% of your dataset as reasoning examples, with non-reasoning examples making up the remainder (at most 25%).

### 🦙Llama-server serving & deployment

To deploy GLM-4.7-Flash for production, we use `llama-server`. In a new terminal, for example via tmux, deploy the model with:

{% code overflow="wrap" %}
```bash
./llama.cpp/llama-server \
    --model unsloth/GLM-4.7-Flash-GGUF/GLM-4.7-Flash-UD-Q4_K_XL.gguf \
    --alias "unsloth/GLM-4.7-Flash" \
    --seed 3407 \
    --temp 1.0 \
    --top-p 0.95 \
    --min-p 0.01 \
    --ctx-size 16384 \
    --port 8001
```
{% endcode %}

Then, in another new terminal, after running `pip install openai`, run:

{% code overflow="wrap" %}
```python
from openai import OpenAI

# Point the OpenAI client at the local llama-server started above
openai_client = OpenAI(
    base_url = "http://127.0.0.1:8001/v1",
    api_key = "sk-no-key-required",
)
completion = openai_client.chat.completions.create(
    model = "unsloth/GLM-4.7-Flash",
    messages = [{"role": "user", "content": "What is 2+2?"},],
)
print(completion.choices[0].message.content)
```
{% endcode %}

This will print:

{% code overflow="wrap" %}
```
The user asks a simple question: "What is 2+2?" The answer is 4. Provide the answer.

2 + 2 = 4.
```
{% endcode %}

### :computer: GLM-4.7-Flash in vLLM

You can now use our new [FP8 dynamic quant](https://huggingface.co/unsloth/GLM-4.7-Flash-FP8-Dynamic) of the model for high-performance, fast inference. First install vLLM from nightly:

{% code overflow="wrap" %}
```bash
uv pip install --upgrade --force-reinstall vllm --torch-backend=auto --extra-index-url https://wheels.vllm.ai/nightly/cu130
uv pip install --upgrade --force-reinstall git+https://github.com/huggingface/transformers.git
uv pip install --force-reinstall numba
```
{% endcode %}

Then serve [Unsloth's dynamic FP8 version](https://huggingface.co/unsloth/GLM-4.7-Flash-FP8-Dynamic) of the model. We enable an FP8 KV cache to reduce KV cache memory usage by 50%, and we run on 4 GPUs. If you have 1 GPU, use `CUDA_VISIBLE_DEVICES='0'` and set `--tensor-parallel-size 1`, or remove that argument. To disable the FP8 KV cache, remove `--kv-cache-dtype fp8` (and `--quantization fp8` if you added it).

```bash
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:False
CUDA_VISIBLE_DEVICES='0,1,2,3' vllm serve unsloth/GLM-4.7-Flash-FP8-Dynamic \
    --served-model-name unsloth/GLM-4.7-Flash \
    --tensor-parallel-size 4 \
    --tool-call-parser glm47 \
    --reasoning-parser glm45 \
    --enable-auto-tool-choice \
    --dtype bfloat16 \
    --seed 3407 \
    --max-model-len 200000 \
    --gpu-memory-utilization 0.95 \
    --max_num_batched_tokens 16384 \
    --port 8001 \
    --kv-cache-dtype fp8
```

You can then call the served model through the OpenAI API:

```python
from openai import AsyncOpenAI, OpenAI

openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8001/v1"
client = OpenAI(  # or AsyncOpenAI
    api_key=openai_api_key,
    base_url=openai_api_base,
)
```

#### :star: Speculative decoding for GLM-4.7-Flash in vLLM

We found that using GLM 4.7 Flash's MTP (multi-token prediction) module makes generation throughput drop from 13,000 tokens/s to 1,300 tokens/s on 1 B200 (10x slower). It should hopefully be fine on Hopper.

```bash
    --speculative-config.method mtp \
    --speculative-config.num_speculative_tokens 1
```

Only 1,300 tokens/s of throughput on 1xB200 (130 tokens/s decoding per user)

13,000 tokens/s of throughput on 1xB200 (still 130 tokens/s decoding per user)
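One rough way to read the two figures above is to divide the aggregate throughput by the per-user decode speed, which gives the number of simultaneous streams each configuration was sustaining. The sketch below assumes every stream decodes at the quoted per-user speed; the function name `concurrent_streams` is hypothetical and only for illustration.

```python
def concurrent_streams(aggregate_tok_s: float, per_user_tok_s: float) -> float:
    """Implied number of simultaneous decode streams (illustrative helper).

    Assumes each stream decodes at `per_user_tok_s`, so the aggregate
    throughput is simply streams * per-user speed.
    """
    if per_user_tok_s <= 0:
        raise ValueError("per_user_tok_s must be positive")
    return aggregate_tok_s / per_user_tok_s


# Figures quoted above for a single B200
without_mtp = concurrent_streams(13_000, 130)
with_mtp = concurrent_streams(1_300, 130)

print(f"without MTP: ~{without_mtp:.0f} streams")
print(f"with MTP:    ~{with_mtp:.0f} streams")
```

Under that assumption, the per-user speed is unchanged and the 10x gap comes entirely from how many requests are served at once.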

### :hammer:Tool calling with GLM-4.7-Flash

See the [Tool Calling Guide](/docs/zh/ji-chu-zhi-shi/tool-calling-guide-for-local-llms.md) for more details on how to do tool calling. In a new terminal (if using tmux, press CTRL+B+D), we create some tools, such as adding two numbers, executing Python code, executing Linux commands and more:

{% code expandable="true" %}
```python
import json, subprocess, random
from typing import Any

def add_number(a: float | str, b: float | str) -> float:
    return float(a) + float(b)

def multiply_number(a: float | str, b: float | str) -> float:
    return float(a) * float(b)

def subtract_number(a: float | str, b: float | str) -> float:
    return float(a) - float(b)

def write_a_story() -> str:
    return random.choice([
        "Once upon a time, in a galaxy far, far away...",
        "There were two friends who loved sloths and code...",
        "The world was ending because every sloth had evolved superhuman intelligence...",
        "Unbeknownst to one friend, the other had accidentally coded a program to evolve sloths...",
    ])

def terminal(command: str) -> str:
    # Crude substring filter for dangerous commands
    if "rm" in command or "sudo" in command or "dd" in command or "chmod" in command:
        msg = "Cannot execute 'rm, sudo, dd, chmod' commands since they are dangerous"
        print(msg); return msg
    print(f"Executing terminal command `{command}`")
    try:
        return str(subprocess.run(command, capture_output = True, text = True, shell = True, check = True).stdout)
    except subprocess.CalledProcessError as e:
        return f"Command failed: {e.stderr}"

def python(code: str) -> str:
    # Run the code and return the resulting namespace as a string
    data = {}
    exec(code, data)
    del data["__builtins__"]
    return str(data)

# Maps tool names to the Python callables that implement them
MAP_FN = {
    "add_number": add_number,
    "multiply_number": multiply_number,
    "subtract_number": subtract_number,
    "write_a_story": write_a_story,
    "terminal": terminal,
    "python": python,
}

# Tool schemas: the keys must stay in English to match the OpenAI tools format
tools = [
    {
        "type": "function",
        "function": {
            "name": "add_number",
            "description": "Add two numbers.",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {"type": "string", "description": "The first number."},
                    "b": {"type": "string", "description": "The second number."},
                },
                "required": ["a", "b"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "multiply_number",
            "description": "Multiply two numbers.",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {"type": "string", "description": "The first number."},
                    "b": {"type": "string", "description": "The second number."},
                },
                "required": ["a", "b"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "subtract_number",
            "description": "Subtract two numbers.",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {"type": "string", "description": "The first number."},
                    "b": {"type": "string", "description": "The second number."},
                },
                "required": ["a", "b"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "write_a_story",
            "description": "Write a random story.",
            "parameters": {
                "type": "object",
                "properties": {},
                "required": [],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "terminal",
            "description": "Perform operations in the terminal.",
            "parameters": {
                "type": "object",
                "properties": {
                    "command": {"type": "string", "description": "The command you wish to run, e.g. `ls`, `rm`, etc."},
                },
                "required": ["command"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "python",
            "description": "Call a Python interpreter to run some Python code.",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {"type": "string", "description": "The Python code to run"},
                },
                "required": ["code"],
            },
        },
    },
]
```
{% endcode %}

We then use the function below (copy, paste and run it), which parses the function calls automatically and calls the OpenAI endpoint for any model:

{% code overflow="wrap" expandable="true" %}
```python
import json
from openai import OpenAI

def unsloth_inference(
    messages,
    temperature = 0.7,
    top_p = 1.0,
    top_k = -1,
    min_p = 0.01,
    repetition_penalty = 0.0,
):
    # Relies on `tools` and `MAP_FN` from the previous block
    messages = messages.copy()
    openai_client = OpenAI(
        base_url = "http://127.0.0.1:8001/v1",
        api_key = "sk-no-key-required",
    )
    model_name = next(iter(openai_client.models.list())).id
    print(f"Using model = {model_name}")
    has_tool_calls = True
    while has_tool_calls:
        print(f"Current messages = {messages}")
        response = openai_client.chat.completions.create(
            model = model_name,
            messages = messages,
            temperature = temperature,
            top_p = top_p,
            tools = tools if tools else None,
            tool_choice = "auto" if tools else None,
            extra_body = {"top_k": top_k, "min_p": min_p, "dry_multiplier": repetition_penalty,},
        )
        tool_calls = response.choices[0].message.tool_calls or []
        content = response.choices[0].message.content or ""
        tool_calls_dict = [tc.to_dict() for tc in tool_calls] if tool_calls else tool_calls
        messages.append({"role": "assistant", "tool_calls": tool_calls_dict, "content": content,})
        # Execute every requested tool and feed the result back to the model
        for tool_call in tool_calls:
            fx, args, _id = tool_call.function.name, tool_call.function.arguments, tool_call.id
            out = MAP_FN[fx](**json.loads(args))
            messages.append({"role": "tool", "tool_call_id": _id, "name": fx, "content": str(out),})
        # Stop once the model answers without asking for another tool
        if not tool_calls:
            has_tool_calls = False
    return messages
```
{% endcode %}

After launching GLM-4.7-Flash with `llama-server`, as in [#deploy-with-llama-server-and-openais-completion-library](#deploy-with-llama-server-and-openais-completion-library "mention") (see the [Tool Calling Guide](/docs/zh/ji-chu-zhi-shi/tool-calling-guide-for-local-llms.md) for more details), we can make some tool calls:

**Tool calling for mathematical operations with GLM 4.7**

{% code overflow="wrap" %}
```python
messages = [{
    "role": "user",
    "content": [{"type": "text", "text": "What is today's date plus 3 days?"}],
}]
unsloth_inference(messages, temperature = 1.0, top_p = 0.95, top_k = -1, min_p = 0.01)
```
{% endcode %}

**Tool calling to execute generated Python code with GLM-4.7-Flash**

{% code overflow="wrap" %}
```python
messages = [{
    "role": "user",
    "content": [{"type": "text", "text": "Create a Fibonacci function in Python and find fib(20)."}],
}]
unsloth_inference(messages, temperature = 1.0, top_p = 0.95, top_k = -1, min_p = 0.01)
```
{% endcode %}
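To make it clearer what the `python` tool hands back to the model, here is a small sketch of the same exec-into-a-dict idea. It is a variant, not the guide's exact helper: the name `run_python` is hypothetical, and it filters out callables so that only plain values are returned. The sample code string stands in for what the model might generate.

```python
def run_python(code: str) -> dict:
    """Execute `code` in a fresh namespace and return the non-callable names it defined."""
    namespace: dict = {}
    exec(code, namespace)  # only run code you trust; there is no sandbox here
    namespace.pop("__builtins__", None)
    return {k: v for k, v in namespace.items() if not callable(v)}


# Stand-in for model-generated code
FIB_CODE = """
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

result = fib(20)
"""

print(run_python(FIB_CODE))  # {'result': 6765}
```

Dropping callables keeps the tool output short, since function objects would otherwise show up as opaque `<function ...>` entries in the string sent back to the model.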

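### Benchmarks

GLM-4.7-Flash scores highest on every benchmark in the table below except AIME 25 (GPT-OSS-20B, 91.7) and LCB v6 (Qwen3-30B-A3B-Thinking-2507, 66.0).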

| Benchmark          | GLM-4.7-Flash | Qwen3-30B-A3B-Thinking-2507 | GPT-OSS-20B |
| ------------------ | ------------- | --------------------------- | ----------- |
| AIME 25            | 91.6          | 85.0                        | 91.7        |
| GPQA               | 75.2          | 73.4                        | 71.5        |
| LCB v6             | 64.0          | 66.0                        | 61.0        |
| HLE                | 14.4          | 9.8                         | 10.9        |
| SWE-bench Verified | 59.2          | 22.0                        | 34.0        |
| τ²-Bench           | 79.5          | 49.0                        | 47.7        |
| BrowseComp         | 42.8          | 2.29                        | 28.3        |
