# Basics

- [How to use Unsloth as an API endpoint](https://unsloth.ai/docs/zh/ji-chu/api.md)
- [Inference and deployment](https://unsloth.ai/docs/zh/ji-chu/inference-and-deployment.md): Learn how to save your fine-tuned model so you can run it in your preferred inference engine.
- [Saving to GGUF](https://unsloth.ai/docs/zh/ji-chu/inference-and-deployment/saving-to-gguf.md)
- [Speculative decoding](https://unsloth.ai/docs/zh/ji-chu/inference-and-deployment/saving-to-gguf/speculative-decoding.md): Speculative decoding with llama-server, llama.cpp, vLLM, and more for 2x faster inference.
- [vLLM deployment and inference guide](https://unsloth.ai/docs/zh/ji-chu/inference-and-deployment/vllm-guide.md): A guide to saving LLMs and deploying them to vLLM to serve LLMs in production.
- [vLLM engine arguments](https://unsloth.ai/docs/zh/ji-chu/inference-and-deployment/vllm-guide/vllm-engine-arguments.md)
- [LoRA hot-swapping guide](https://unsloth.ai/docs/zh/ji-chu/inference-and-deployment/vllm-guide/lora-hot-swapping-guide.md)
- [Saving to Ollama](https://unsloth.ai/docs/zh/ji-chu/inference-and-deployment/saving-to-ollama.md)
- [Deploying models to LM Studio](https://unsloth.ai/docs/zh/ji-chu/inference-and-deployment/lm-studio.md): Save models to GGUF so you can run and deploy them in LM Studio.
- [How to install the LM Studio CLI in a Linux terminal](https://unsloth.ai/docs/zh/ji-chu/inference-and-deployment/lm-studio/how-to-install-lm-studio-cli-in-linux-terminal.md): A guide to installing the LM Studio CLI in a terminal instance without a UI.
- [SGLang deployment and inference guide](https://unsloth.ai/docs/zh/ji-chu/inference-and-deployment/sglang-guide.md): A guide to saving LLMs and deploying them to SGLang to serve LLMs in production.
- [Unsloth inference](https://unsloth.ai/docs/zh/ji-chu/inference-and-deployment/unsloth-inference.md): Learn how to run your fine-tuned model with Unsloth's faster inference.
- [llama-server and OpenAI endpoint deployment guide](https://unsloth.ai/docs/zh/ji-chu/inference-and-deployment/llama-server-and-openai-endpoint.md): Deploy via llama-server and serve an OpenAI-compatible endpoint.
- [How to run and deploy LLMs on your iOS or Android phone](https://unsloth.ai/docs/zh/ji-chu/inference-and-deployment/deploy-llms-phone.md): A tutorial on fine-tuning your own LLM and deploying it to Android or iPhone with ExecuTorch.
- [Troubleshooting inference](https://unsloth.ai/docs/zh/ji-chu/inference-and-deployment/troubleshooting-inference.md): What to do if you run into issues when running or saving your model.
- [Deploying LLMs with Hugging Face Jobs](https://unsloth.ai/docs/zh/ji-chu/inference-and-deployment/deploying-llms-with-hugging-face-jobs.md): Fine-tune LFM with Codex / Claude Code through a single SKILL, using Hugging Face Jobs and skills.
- [How to run local LLMs with Claude Code](https://unsloth.ai/docs/zh/ji-chu/claude-code.md): A guide to running open-source models on your local device with Claude Code.
- [How to run local LLMs with OpenAI Codex](https://unsloth.ai/docs/zh/ji-chu/codex.md): Run open-source models locally on your device with OpenAI Codex.
- [Multi-GPU fine-tuning with Unsloth](https://unsloth.ai/docs/zh/ji-chu/multi-gpu-training-with-unsloth.md): Learn how to fine-tune LLMs with Unsloth in multi-GPU and parallel setups.
- [Multi-GPU fine-tuning with Distributed Data Parallel (DDP)](https://unsloth.ai/docs/zh/ji-chu/multi-gpu-training-with-unsloth/ddp.md): Learn how to use the Unsloth CLI to train across multiple GPUs with Distributed Data Parallel (DDP).
- [Guide to fine-tuning embedding models with Unsloth](https://unsloth.ai/docs/zh/ji-chu/embedding-finetuning.md): Learn how to easily fine-tune embedding models with Unsloth.
- [Fine-tune MoE models 12x faster with Unsloth](https://unsloth.ai/docs/zh/ji-chu/faster-moe.md): A guide to training MoE LLMs locally with Unsloth.
- [Text-to-speech (TTS) fine-tuning guide](https://unsloth.ai/docs/zh/ji-chu/text-to-speech-tts-fine-tuning.md): Learn how to fine-tune TTS and STT voice models with Unsloth.
- [Unsloth Dynamic 2.0 GGUFs](https://unsloth.ai/docs/zh/ji-chu/unsloth-dynamic-2.0-ggufs.md): A major new upgrade to our Dynamic Quants.
- [Unsloth Dynamic GGUFs on Aider Polyglot](https://unsloth.ai/docs/zh/ji-chu/unsloth-dynamic-2.0-ggufs/unsloth-dynamic-ggufs-on-aider-polyglot.md): How Unsloth Dynamic GGUFs perform on the Aider Polyglot benchmark.
- [Tool calling guide for local LLMs](https://unsloth.ai/docs/zh/ji-chu/tool-calling-guide-for-local-llms.md)
- [Vision fine-tuning](https://unsloth.ai/docs/zh/ji-chu/vision-fine-tuning.md): Learn how to fine-tune vision/multimodal LLMs with Unsloth.
- [Troubleshooting and FAQs](https://unsloth.ai/docs/zh/ji-chu/troubleshooting-and-faqs.md): Tips for solving problems, plus frequently asked questions.
- [Hugging Face Hub, XET debugging](https://unsloth.ai/docs/zh/ji-chu/troubleshooting-and-faqs/hugging-face-hub-xet-debugging.md): Debugging and troubleshooting stalled and slow downloads.
- [Chat templates](https://unsloth.ai/docs/zh/ji-chu/chat-templates.md): Learn the basics and customization options of chat templates, including the Conversational, ChatML, ShareGPT, and Alpaca formats, and more.
- [Unsloth environment flags](https://unsloth.ai/docs/zh/ji-chu/unsloth-environment-flags.md): Advanced flags that may be useful if fine-tuning breaks or you want to turn certain features off.
- [Continued pretraining](https://unsloth.ai/docs/zh/ji-chu/continued-pretraining.md): Also known as continued fine-tuning. Unsloth lets you continue pretraining so a model can learn a new language.
- [Fine-tuning from the last checkpoint](https://unsloth.ai/docs/zh/ji-chu/finetuning-from-last-checkpoint.md): Checkpointing lets you save your fine-tuning progress so you can pause and resume later.
- [Unsloth benchmarks](https://unsloth.ai/docs/zh/ji-chu/unsloth-benchmarks.md): Benchmarks recorded by Unsloth on NVIDIA GPUs.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://unsloth.ai/docs/zh/ji-chu.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, when you need clarification or additional context, or when you want to retrieve related documentation sections.
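
The request URL described above can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper name `build_ask_url` is hypothetical, and percent-encoding the question is an assumption based on standard URL rules rather than something this page specifies. The sketch only builds the URL and does not send the request.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit


def build_ask_url(page_url: str, question: str) -> str:
    """Return page_url with an `ask` query parameter carrying the question.

    Hypothetical helper: any existing `ask` parameter is replaced and
    other query parameters are kept in their original order.
    """
    parts = urlsplit(page_url)
    # Keep existing parameters except a previous `ask`.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "ask"]
    query.append(("ask", question))
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return urlunsplit(parts._replace(query=urlencode(query, quote_via=quote)))


url = build_ask_url(
    "https://unsloth.ai/docs/zh/ji-chu.md",
    "How do I save a model to GGUF?",
)
print(url)
```

The resulting string can be passed to any HTTP client that issues a GET request, for example `urllib.request.urlopen(url)`. The shape of the response body is not specified here beyond the description above, so treat it as plain text until you have inspected it.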