# Inference and Deployment

You can also run your fine-tuned model with [Unsloth's 2x faster inference](/docs/zh/ji-chu/inference-and-deployment/unsloth-inference.md).


- [Unsloth Studio Chat](/pages/5c2325084ff65c0303d8ec102b689868935855d3)
- [llama.cpp - Saving to GGUF](/pages/b83d88f106d75c3396c46f5342fb401501910093)
- [Unsloth API Endpoint](/pages/2c2bb53a273009e389791ded9e28dd4769a55051)
- [vLLM](/pages/9f0e22d200c9105481e4854b8473aba99ca44835)
- [Ollama](/pages/d4f9cf59ddb6cd217d8f8563eeb6c00042f21972)
- [LM Studio](/pages/771775d41e6a1596232819cdd79823d415eda744)
- [SGLang](/pages/23e76b12a72496ba4fcc9d857dd940dd6ae14736)
- [Troubleshooting](/pages/5511a85b8b57f4cfdffedbc8f1ea2110a10d550e)
- [llama-server and OpenAI Endpoint](/pages/4a201e2f3e992b62e25a0ba283ec8b14ad3f414b)
- [Tool Calling](/pages/4fe123c34ab0d523b509efe2b2b56b299498fc5c)
- [Run LLMs on Your Phone](/pages/e0e826e45659eab088ef3acd7826998bc36539e9)
