💜Qwen3.5 - How to Run Locally Guide

Run the new Qwen3.5 LLMs including new Medium series: Qwen3.5-35B-A3B, 27B, 122B-A10B, and 397B-A17B on your local device!

Qwen3.5 is Alibaba’s new model family, including Qwen3.5-35B-A3B, 27B, 122B-A10B and 397B-A17B. These multimodal hybrid reasoning LLMs deliver the strongest performance for their size. They support 256K context across 201 languages, have thinking and non-thinking modes, and excel in agentic coding, vision, chat, and long-context tasks. The 35B and 27B models work on a 21GB Mac / RAM device. See all GGUFs here.


Qwen3.5-397B-A17B is comparable to Gemini 3 Pro, Claude Opus 4.5, and GPT-5.2. The full 397B model is ~807GB on disk, and 3-bit runs on a 192GB Mac / RAM device or 4-bit MXFP4 on a 256GB Mac. See quantization benchmarks for our GGUFs!

All uploads use Unsloth Dynamic 2.0 for SOTA quantization performance - so 4-bit has important layers upcasted to 8 or 16-bit. Thank you Qwen for providing Unsloth with day zero access. You can also fine-tune Qwen3.5 with Unsloth.


⚙️ Usage Guide

Table: Inference hardware requirements (units = total memory: RAM + VRAM, or unified memory)

| Qwen3.5 | 3-bit | 4-bit | 6-bit | 8-bit | BF16 |
| --- | --- | --- | --- | --- | --- |
| 27B | 14 GB | 17 GB | 24 GB | 30 GB | 54 GB |
| 35B-A3B | 17 GB | 22 GB | 30 GB | 38 GB | 70 GB |
| 122B-A10B | 60 GB | 70 GB | 106 GB | 132 GB | 245 GB |
| 397B-A17B | 180 GB | 214 GB | 340 GB | 512 GB | 810 GB |


Between 27B and 35B-A3B, use 27B if you want slightly more accurate results and it fits on your device. Go for 35B-A3B if you want much faster inference.

  • Maximum context window: 262,144 (can be extended to 1M via YaRN)

  • presence_penalty = 0.0 to 2.0 (default 0.0, i.e. disabled). Raise it to reduce repetition, but higher values may slightly degrade performance

  • Adequate Output Length: 32,768 tokens for most queries

As Qwen3.5 is hybrid reasoning, thinking and non-thinking mode have different settings:

Thinking mode:

| Setting | General tasks | Precise coding tasks (e.g. WebDev) |
| --- | --- | --- |
| temperature | 1.0 | 0.6 |
| top_p | 0.95 | 0.95 |
| top_k | 20 | 20 |
| min_p | 0.0 | 0.0 |
| presence_penalty | 1.5 | 0.0 |
| repeat_penalty | disabled or 1.0 | disabled or 1.0 |

Thinking mode for general tasks:

Thinking mode for precise coding tasks:

Instruct (non-thinking) mode settings:

| Setting | General tasks | Reasoning tasks |
| --- | --- | --- |
| temperature | 0.7 | 1.0 |
| top_p | 0.8 | 0.95 |
| top_k | 20 | 20 |
| min_p | 0.0 | 0.0 |
| presence_penalty | 1.5 | 1.5 |
| repeat_penalty | disabled or 1.0 | disabled or 1.0 |


To disable thinking / reasoning, use --chat-template-kwargs "{\"enable_thinking\": false}"

Instruct (non-thinking) for general tasks:

Instruct (non-thinking) for reasoning tasks:

Qwen3.5 Inference Tutorials:

Because Qwen3.5 comes in many different sizes, we'll be using Dynamic 4-bit MXFP4_MOE GGUF variants for all inference workloads. Click below to navigate to the instructions for each model:


Unsloth Dynamic GGUF uploads:


Qwen3.5-35B-A3B

For this guide we will be using the Dynamic 4-bit quant, which works great on a 24GB RAM / Mac device for fast inference. Because the model is only around 72GB at full F16 precision, we won't need to worry much about performance. GGUF: Qwen3.5-35B-A3B-GGUF

✨ Run in llama.cpp

1

Obtain the latest llama.cpp on GitHub here. You can follow the build instructions below as well. Change -DGGML_CUDA=ON to -DGGML_CUDA=OFF if you don't have a GPU or just want CPU inference.

2

If you want llama.cpp to download and run the model directly, use the command below. (:Q3_K_XL) specifies the quantization type; you can also download the model via Hugging Face first (step 3). This works similarly to ollama run. Use export LLAMA_CACHE="folder" to force llama.cpp to save to a specific location. The model has a maximum context length of 256K.

Follow one of the specific commands below, according to your use-case:

Thinking mode:

Precise coding tasks (e.g. WebDev):

General tasks:
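A minimal llama-cli invocation for this preset might look like the following sketch, using the sampling settings from the table above. The :Q3_K_XL tag and context size are illustrative; adjust them to your quant choice and hardware:

```shell
./llama.cpp/llama-cli \
    -hf unsloth/Qwen3.5-35B-A3B-GGUF:Q3_K_XL \
    --jinja \
    --ctx-size 32768 \
    --temp 1.0 \
    --top-p 0.95 \
    --top-k 20 \
    --min-p 0.0 \
    --presence-penalty 1.5
```

For the precise-coding preset, swap in --temp 0.6 and --presence-penalty 0.0 per the table.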

Non-thinking mode:

General tasks:

Reasoning tasks:

3

Download the model with the command below (after running pip install huggingface_hub hf_transfer). You can choose Q4_K_M or other quantized versions like UD-Q4_K_XL. We recommend at least the 2-bit dynamic quant UD-Q2_K_XL to balance size and accuracy. If downloads get stuck, see: Hugging Face Hub, XET debugging
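As a sketch of the download step (the repo id follows the GGUF link above; the allow_patterns filter and local_dir are illustrative):

```python
import os
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"  # enable faster hf_transfer downloads

from huggingface_hub import snapshot_download

# Download only the UD-Q4_K_XL files from the repo (pattern is illustrative).
snapshot_download(
    repo_id="unsloth/Qwen3.5-35B-A3B-GGUF",
    local_dir="Qwen3.5-35B-A3B-GGUF",
    allow_patterns=["*UD-Q4_K_XL*"],
)
```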

4

Then run the model in conversation mode:

Qwen3.5-27B

For this guide we will be using the Dynamic 4-bit quant, which works great on an 18GB RAM / Mac device for fast inference. GGUF: Qwen3.5-27B-GGUF

1

Obtain the latest llama.cpp on GitHub here. You can follow the build instructions below as well. Change -DGGML_CUDA=ON to -DGGML_CUDA=OFF if you don't have a GPU or just want CPU inference.

2

If you want llama.cpp to download and run the model directly, use the command below. (:Q3_K_XL) specifies the quantization type; you can also download the model via Hugging Face first (step 3). This works similarly to ollama run. Use export LLAMA_CACHE="folder" to force llama.cpp to save to a specific location. The model has a maximum context length of 256K.

Follow one of the specific commands below, according to your use-case:

Thinking mode:

Precise coding tasks (e.g. WebDev):

General tasks:

Non-thinking mode:

General tasks:

Reasoning tasks:

3

Download the model with the command below (after running pip install huggingface_hub hf_transfer). You can choose MXFP4_MOE or other quantized versions like UD-Q4_K_XL. We recommend at least the 2-bit dynamic quant UD-Q2_K_XL to balance size and accuracy. If downloads get stuck, see: Hugging Face Hub, XET debugging

4

Then run the model in conversation mode:
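A conversation-mode invocation might look like the following sketch. The quant tag and sampling flags are illustrative (thinking mode, general tasks per the table above); adjust them to your use-case:

```shell
./llama.cpp/llama-cli \
    -hf unsloth/Qwen3.5-27B-GGUF:UD-Q4_K_XL \
    --jinja \
    --ctx-size 32768 \
    --temp 1.0 --top-p 0.95 --top-k 20 --min-p 0.0 \
    --presence-penalty 1.5
```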

Qwen3.5-122B-A10B

For this guide we will be using the Dynamic 4-bit quant, which works great on a 70GB RAM / Mac device for fast inference. GGUF: Qwen3.5-122B-A10B-GGUF

1

Obtain the latest llama.cpp on GitHub here. You can follow the build instructions below as well. Change -DGGML_CUDA=ON to -DGGML_CUDA=OFF if you don't have a GPU or just want CPU inference.

2

If you want llama.cpp to download and run the model directly, use the command below. (:Q3_K_XL) specifies the quantization type; you can also download the model via Hugging Face first (step 3). This works similarly to ollama run. Use export LLAMA_CACHE="folder" to force llama.cpp to save to a specific location. The model has a maximum context length of 256K.

Follow one of the specific commands below, according to your use-case:

Thinking mode:

Precise coding tasks (e.g. WebDev):

General tasks:

Non-thinking mode:

General tasks:

Reasoning tasks:

3

Download the model with the command below (after running pip install huggingface_hub hf_transfer). You can choose MXFP4_MOE (dynamic 4-bit) or other quantized versions like UD-Q4_K_XL. We recommend at least the 2-bit dynamic quant UD-Q2_K_XL to balance size and accuracy. If downloads get stuck, see: Hugging Face Hub, XET debugging

4

Then run the model in conversation mode:

Qwen3.5-397B-A17B

For 397B-A17B, Unsloth's 4-bit dynamic quant UD-Q4_K_XL uses 214GB of disk space - this fits directly on a 256GB M3 Ultra, and also works well with a single 24GB GPU plus 256GB of RAM using MoE offloading, at 25+ tokens/s. The 3-bit quant fits in 192GB RAM, and 8-bit requires 512GB of RAM/VRAM. GGUF: Qwen3.5-397B-A17B-GGUF

1

Obtain the latest llama.cpp on GitHub here. You can follow the build instructions below as well. Change -DGGML_CUDA=ON to -DGGML_CUDA=OFF if you don't have a GPU or just want CPU inference.

2

If you want llama.cpp to download and run the model directly, use the command below. (:Q3_K_XL) specifies the quantization type; you can also download the model via Hugging Face first (step 3). This works similarly to ollama run. Use export LLAMA_CACHE="folder" to force llama.cpp to save to a specific location. Remember the model has a maximum context length of 256K.

Follow this for thinking mode:

Follow this for non-thinking mode:

3

Download the model with the command below (after running pip install huggingface_hub hf_transfer). You can choose MXFP4_MOE (dynamic 4-bit) or other quantized versions like UD-Q4_K_XL. We recommend at least the 2-bit dynamic quant UD-Q2_K_XL to balance size and accuracy. If downloads get stuck, see: Hugging Face Hub, XET debugging

4

You can edit --threads 32 for the number of CPU threads, --ctx-size 16384 for context length, and --n-gpu-layers 2 for how many layers to offload to the GPU. Lower it if your GPU runs out of memory, and remove it for CPU-only inference.
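Putting those flags together, a sketch of a full invocation might look like this. The -ot pattern is the regex commonly used to keep MoE expert tensors on CPU while the rest sits on GPU; treat the quant tag and all values as starting points, not fixed settings:

```shell
./llama.cpp/llama-cli \
    -hf unsloth/Qwen3.5-397B-A17B-GGUF:UD-Q4_K_XL \
    --jinja \
    --threads 32 \
    --ctx-size 16384 \
    --n-gpu-layers 2 \
    -ot ".ffn_.*_exps.=CPU" \
    --temp 1.0 --top-p 0.95 --top-k 20 --min-p 0.0 \
    --presence-penalty 1.5
```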

🦙 Llama-server serving & OpenAI's completion library

To deploy Qwen3.5-397B-A17B for production, we use llama-server. In a new terminal (e.g. via tmux), deploy the model via:

Then in a new terminal, after doing pip install openai, do:
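A minimal client sketch, assuming llama-server is listening on its default port 8080 (the model name is arbitrary for llama-server, and top_k / min_p go through extra_body since they are not standard OpenAI parameters):

```python
from openai import OpenAI

# Point the OpenAI client at the local llama-server endpoint.
client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="sk-no-key-required")

completion = client.chat.completions.create(
    model="qwen3.5-397b-a17b",  # name is not checked by llama-server
    messages=[{"role": "user", "content": "Hello! Solve 12 * 17."}],
    temperature=1.0,            # thinking mode, general tasks
    top_p=0.95,
    presence_penalty=1.5,
    extra_body={"top_k": 20, "min_p": 0.0},
)
print(completion.choices[0].message.content)
```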


To disable thinking / reasoning, use --chat-template-kwargs "{\"enable_thinking\": false}"

👾 OpenAI Codex & Claude Code

To run the model via local agentic coding workloads, you can follow our guide. Just change the model name 'GLM-4.7-Flash' to your desired 'Qwen3.5' model and ensure you follow the correct Qwen3.5 parameters and usage instructions. Use the llama-server we just set up.

After following the instructions for Claude Code, for example, you will see:

We can then ask, say, Create a Python game for Chess:

🔨Tool Calling with Qwen3.5

See Tool Calling Guide for more details on how to do tool calling. In a new terminal (if using tmux, use CTRL+B+D), we create some tools like adding 2 numbers, executing Python code, executing Linux functions and much more:

We then use the below functions (copy and paste and execute) which will parse the function calls automatically and call the OpenAI endpoint for any model:
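As an illustrative stand-alone sketch of that parsing step (the add_two_numbers tool and the call payload are hypothetical, but the payload shape matches OpenAI-style function calls as returned by llama-server):

```python
import json

# Hypothetical tool: the "adding 2 numbers" example from above.
def add_two_numbers(a: float, b: float) -> float:
    """Add two numbers and return the result."""
    return a + b

# Registry mapping tool names to Python callables.
TOOLS = {"add_two_numbers": add_two_numbers}

def dispatch_tool_call(tool_call: dict):
    """Parse one OpenAI-style tool call and execute the matching function."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    return TOOLS[name](**args)

# Example tool call as an OpenAI-compatible endpoint would return it:
call = {
    "id": "call_0",
    "type": "function",
    "function": {"name": "add_two_numbers", "arguments": '{"a": 3, "b": 4}'},
}
result = dispatch_tool_call(call)
print(result)  # 7
```

The same dispatch loop extends to more dangerous tools (executing Python or shell commands) by adding entries to the registry, though those should be sandboxed.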

After launching Qwen3.5 via llama-server as shown above (or see the Tool Calling Guide for more details), we can then make some tool calls.

📊 Benchmarks

Unsloth GGUF Benchmarks

Qwen3.5-397B-A17B Benchmarks

Benjamin Marie (third-party) benchmarkedarrow-up-right Qwen3.5-397B-A17B using Unsloth GGUFs on a 750-prompt mixed suite (LiveCodeBench v6, MMLU Pro, GPQA, Math500), reporting both overall accuracy and relative error increase (how much more often the quantized model makes mistakes vs. the original).

Key results (accuracy; change vs. original; relative error increase):

  • Original weights: 81.3%

  • UD-Q4_K_XL: 80.5% (−0.8 points; +4.3% relative error increase)

  • UD-Q3_K_XL: 80.7% (−0.6 points; +3.5% relative error increase)

UD-Q4_K_XL and UD-Q3_K_XL stay extremely close to the original (well under a 1-point accuracy drop on this suite), suggesting you can sharply reduce the memory footprint (~500 GB less) with little to no practical loss on the tested tasks.

How to choose: Q3 scoring slightly higher than Q4 here is plausibly just normal run-to-run variance at this scale, so treat Q3 and Q4 as effectively similar quality on this benchmark:

  • Pick Q3 if you want the smallest footprint / best memory savings

  • Pick Q4 if you want a slightly more conservative option with similar results

All listed quants use our dynamic methodology. UD-IQ2_M follows the same dynamic methodology but a different conversion process from UD-Q2_K_XL: K_XL quants are usually faster than UD-IQ2_M despite being bigger, which is why UD-IQ2_M may perform better than UD-Q2_K_XL.

Official Qwen Benchmarks

Qwen3.5-35B-A3B, 27B and 122B-A10B Benchmarks

You can view the benchmarks in table format below:

Language

Knowledge

| Benchmark | GPT-5-mini 2025-08-07 | GPT-OSS-120B | Qwen3-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| MMLU-Pro | 83.7 | 80.8 | 84.4 | 86.7 | 86.1 | 85.3 |
| MMLU-Redux | 93.7 | 91.0 | 93.8 | 94.0 | 93.2 | 93.3 |
| C-Eval | 82.2 | 76.2 | 92.1 | 91.9 | 90.5 | 90.2 |
| SuperGPQA | 58.6 | 54.6 | 64.9 | 67.1 | 65.6 | 63.4 |

Instruction Following

| Benchmark | GPT-5-mini 2025-08-07 | GPT-OSS-120B | Qwen3-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| IFEval | 93.9 | 88.9 | 87.8 | 93.4 | 95.0 | 91.9 |
| IFBench | 75.4 | 69.0 | 51.7 | 76.1 | 76.5 | 70.2 |
| MultiChallenge | 59.0 | 45.3 | 50.2 | 61.5 | 60.8 | 60.0 |

Long Context

| Benchmark | GPT-5-mini 2025-08-07 | GPT-OSS-120B | Qwen3-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| AA-LCR | 68.0 | 50.7 | 60.0 | 66.9 | 66.1 | 58.5 |
| LongBench v2 | 56.8 | 48.2 | 54.8 | 60.2 | 60.6 | 59.0 |

STEM & Reasoning

| Benchmark | GPT-5-mini 2025-08-07 | GPT-OSS-120B | Qwen3-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| HLE w/ CoT | 19.4 | 14.9 | 18.2 | 25.3 | 24.3 | 22.4 |
| GPQA Diamond | 82.8 | 80.1 | 81.1 | 86.6 | 85.5 | 84.2 |
| HMMT Feb 25 | 89.2 | 90.0 | 85.1 | 91.4 | 92.0 | 89.0 |
| HMMT Nov 25 | 84.2 | 90.0 | 89.5 | 90.3 | 89.8 | 89.2 |

Coding

| Benchmark | GPT-5-mini 2025-08-07 | GPT-OSS-120B | Qwen3-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| SWE-bench Verified | 72.0 | 62.0 | -- | 72.0 | 72.4 | 69.2 |
| Terminal Bench 2 | 31.9 | 18.7 | -- | 49.4 | 41.6 | 40.5 |
| LiveCodeBench v6 | 80.5 | 82.7 | 75.1 | 78.9 | 80.7 | 74.6 |
| CodeForces | 2160 | 2157 | 2146 | 2100 | 1899 | 2028 |
| OJBench | 40.4 | 41.5 | 32.7 | 39.5 | 40.1 | 36.0 |
| FullStackBench en | 30.6 | 58.9 | 61.1 | 62.6 | 60.1 | 58.1 |
| FullStackBench zh | 35.2 | 60.4 | 63.1 | 58.7 | 57.4 | 55.0 |

General Agent

| Benchmark | GPT-5-mini 2025-08-07 | GPT-OSS-120B | Qwen3-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| BFCL-V4 | 55.5 | -- | 54.8 | 72.2 | 68.5 | 67.3 |
| TAU2-Bench | 69.8 | -- | 58.5 | 79.5 | 79.0 | 81.2 |
| VITA-Bench | 13.9 | -- | 31.6 | 33.6 | 41.9 | 31.9 |
| DeepPlanning | 17.9 | -- | 17.1 | 24.1 | 22.6 | 22.8 |

Search Agent

| Benchmark | GPT-5-mini 2025-08-07 | GPT-OSS-120B | Qwen3-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| HLE w/ tool | 35.8 | 19.0 | -- | 47.5 | 48.5 | 47.4 |
| Browsecomp | 48.1 | 41.1 | -- | 63.8 | 61.0 | 61.0 |
| Browsecomp-zh | 49.5 | 42.9 | -- | 69.9 | 62.1 | 69.5 |
| WideSearch | 47.2 | 40.4 | -- | 60.5 | 61.1 | 57.1 |
| Seal-0 | 34.2 | 45.1 | -- | 44.1 | 47.2 | 41.4 |

Multilingualism

| Benchmark | GPT-5-mini 2025-08-07 | GPT-OSS-120B | Qwen3-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| MMMLU | 86.2 | 78.2 | 83.4 | 86.7 | 85.9 | 85.2 |
| MMLU-ProX | 78.5 | 74.5 | 77.9 | 82.2 | 82.2 | 81.0 |
| NOVA-63 | 51.9 | 51.1 | 55.4 | 58.6 | 58.1 | 57.1 |
| INCLUDE | 81.8 | 74.0 | 81.0 | 82.8 | 81.6 | 79.7 |
| Global PIQA | 88.5 | 84.1 | 85.7 | 88.4 | 87.5 | 86.6 |
| PolyMATH | 67.3 | 54.0 | 60.1 | 68.9 | 71.2 | 64.4 |
| WMT24++ | 80.7 | 74.4 | 75.8 | 78.3 | 77.6 | 76.3 |
| MAXIFE | 85.3 | 83.7 | 83.2 | 87.9 | 88.0 | 86.6 |

Notes (Language)

  • CodeForces: evaluated on an internal query set.

  • TAU2-Bench: followed official setup except airline domain fixes per Claude Opus 4.5 system card.

  • Search Agent: context-folding (256k) with pruning of earlier tool responses after a threshold.

  • WideSearch: 256k context window without context management.

  • MMLU-ProX: averaged accuracy over 29 languages.

  • WMT24++: averaged over 55 languages using XCOMET-XXL.

  • MAXIFE: accuracy over English + multilingual original prompts (23 settings).

  • -- means not available / not applicable.

Vision Language

STEM and Puzzle

| Benchmark | GPT-5-mini 2025-08-07 | Claude-Sonnet-4.5 | Qwen3-VL-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| MMMU | 79.0 | 79.6 | 80.6 | 83.9 | 82.3 | 81.4 |
| MMMU-Pro | 67.3 | 68.4 | 69.3 | 76.9 | 75.0 | 75.1 |
| MathVision | 71.9 | 71.1 | 74.6 | 86.2 | 86.0 | 83.9 |
| Mathvista(mini) | 79.1 | 79.8 | 85.8 | 87.4 | 87.8 | 86.2 |
| DynaMath | 81.4 | 78.8 | 82.8 | 85.9 | 87.7 | 85.0 |
| ZEROBench | 3 | 4 | 4 | 9 | 10 | 8 |
| ZEROBench_sub | 27.3 | 26.3 | 28.4 | 36.2 | 36.2 | 34.1 |
| VlmsAreBlind | 75.8 | 85.5 | 79.5 | 96.7 | 96.9 | 97.0 |
| BabyVision | 20.9 / 34.5 | 18.6 / 34.5 | 22.2 / 34.5 | 40.2 / 34.5 | 44.6 / 34.8 | 38.4 / 29.6 |

General VQA

| Benchmark | GPT-5-mini 2025-08-07 | Claude-Sonnet-4.5 | Qwen3-VL-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| RealWorldQA | 79.0 | 70.3 | 81.3 | 85.1 | 83.7 | 84.1 |
| MMStar | 74.1 | 73.8 | 78.7 | 82.9 | 81.0 | 81.9 |
| MMBench EN-DEV-v1.1 | 86.8 | 88.3 | 89.7 | 92.8 | 92.6 | 91.5 |
| SimpleVQA | 56.8 | 57.6 | 61.3 | 61.7 | 56.0 | 58.3 |
| HallusionBench | 63.2 | 59.9 | 66.7 | 67.6 | 70.0 | 67.9 |

Text Recognition and Document Understanding

| Benchmark | GPT-5-mini 2025-08-07 | Claude-Sonnet-4.5 | Qwen3-VL-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| OmniDocBench1.5 | 77.0 | 85.8 | 84.5 | 89.8 | 88.9 | 89.3 |
| CharXiv(RQ) | 68.6 | 67.2 | 66.1 | 77.2 | 79.5 | 77.5 |
| MMLongBench-Doc | 50.3 | -- | 56.2 | 59.0 | 60.2 | 59.5 |
| CC-OCR | 70.8 | 68.1 | 81.5 | 81.8 | 81.0 | 80.7 |
| AI2D_TEST | 88.2 | 87.0 | 89.2 | 93.3 | 92.9 | 92.6 |
| OCRBench | 821 | 766 | 87.5 | 92.1 | 89.4 | 91.0 |

Spatial Intelligence

| Benchmark | GPT-5-mini 2025-08-07 | Claude-Sonnet-4.5 | Qwen3-VL-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| ERQA | 54.0 | 45.0 | 52.5 | 62.0 | 60.5 | 64.8 |
| CountBench | 91.0 | 90.0 | 93.7 | 97.0 | 97.8 | 97.8 |
| RefCOCO(avg) | -- | -- | 91.1 | 91.3 | 90.9 | 89.2 |
| ODInW13 | -- | -- | 43.2 | 44.5 | 41.1 | 42.6 |
| EmbSpatialBench | 80.7 | 71.8 | 84.3 | 83.9 | 84.5 | 83.1 |
| RefSpatialBench | 9.0 | 2.2 | 69.9 | 69.3 | 67.7 | 63.5 |
| LingoQA | 62.4 | 12.8 | 66.8 | 80.8 | 82.0 | 79.2 |
| Hypersim | -- | -- | 11.0 | 12.7 | 13.0 | 13.1 |
| SUNRGBD | -- | -- | 34.9 | 36.2 | 35.4 | 33.4 |
| Nuscene | -- | -- | 13.9 | 15.4 | 15.2 | 14.6 |

Video Understanding

| Benchmark | GPT-5-mini 2025-08-07 | Claude-Sonnet-4.5 | Qwen3-VL-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| VideoMME (w sub.) | 83.5 | 81.1 | 83.8 | 87.3 | 87.0 | 86.6 |
| VideoMME (w/o sub.) | 78.9 | 75.3 | 79.0 | 83.9 | 82.8 | 82.5 |
| VideoMMMU | 82.5 | 77.6 | 80.0 | 82.0 | 82.3 | 80.4 |
| MLVU | 83.3 | 72.8 | 83.8 | 87.3 | 85.9 | 85.6 |
| MVBench | -- | -- | 75.2 | 76.6 | 74.6 | 74.8 |
| LVBench | -- | -- | 63.6 | 74.4 | 73.6 | 71.4 |
| MMVU | 69.8 | 70.6 | 71.1 | 74.7 | 73.3 | 72.3 |

Visual Agent

| Benchmark | GPT-5-mini 2025-08-07 | Claude-Sonnet-4.5 | Qwen3-VL-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| ScreenSpot Pro | -- | 36.2 | 62.0 | 70.40 | 70.28 | 68.60 |
| OSWorld-Verified | -- | 61.4 | 38.1 | 58.01 | 56.15 | 54.49 |
| AndroidWorld | -- | -- | 63.7 | 66.4 | 64.2 | 71.1 |

Tool Calling

| Benchmark | GPT-5-mini 2025-08-07 | Claude-Sonnet-4.5 | Qwen3-VL-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| TIR-Bench | 24.6 / 42.5 | 27.6 / 42.5 | 29.8 / 42.5 | 53.2 / 42.5 | 59.8 / 42.3 | 55.5 / 38.0 |
| V* | 71.7 / 90.1 | 58.6 / 89.0 | 85.9 / 89.5 | 93.2 / 90.1 | 93.7 / 89.0 | 92.7 / 89.5 |

Medical VQA

| Benchmark | GPT-5-mini 2025-08-07 | Claude-Sonnet-4.5 | Qwen3-VL-235B-A22B | Qwen3.5-122B-A10B | Qwen3.5-27B | Qwen3.5-35B-A3B |
| --- | --- | --- | --- | --- | --- | --- |
| SLAKE | 70.5 | 73.6 | 54.7 | 81.6 | 80.0 | 78.7 |
| PMC-VQA | 36.3 | 55.9 | 41.2 | 63.3 | 62.4 | 62.0 |
| MedXpertQA-MM | 34.4 | 54.0 | 47.6 | 67.3 | 62.4 | 61.4 |

Notes (Vision Language)

  • MathVision: Qwen score uses a fixed prompt; other models use the higher of runs with/without \boxed{} formatting.

  • BabyVision: scores are reported as “with CI / without CI”.

  • TIR-Bench and V*: scores are reported as “with CI / without CI”.

  • -- means not available / not applicable.

Qwen3.5-397B-A17B Benchmarks

You can view the Qwen3.5-397B-A17B benchmarks in table format below:

Language Benchmarks

Knowledge

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| MMLU-Pro | 87.4 | 89.5 | 89.8 | 85.7 | 87.1 | 87.8 |
| MMLU-Redux | 95.0 | 95.6 | 95.9 | 92.8 | 94.5 | 94.9 |
| SuperGPQA | 67.9 | 70.6 | 74.0 | 67.3 | 69.2 | 70.4 |
| C-Eval | 90.5 | 92.2 | 93.4 | 93.7 | 94.0 | 93.0 |

Instruction Following

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| IFEval | 94.8 | 90.9 | 93.5 | 93.4 | 93.9 | 92.6 |
| IFBench | 75.4 | 58.0 | 70.4 | 70.9 | 70.2 | 76.5 |
| MultiChallenge | 57.9 | 54.2 | 64.2 | 63.3 | 62.7 | 67.6 |

Long Context

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| AA-LCR | 72.7 | 74.0 | 70.7 | 68.7 | 70.0 | 68.7 |
| LongBench v2 | 54.5 | 64.4 | 68.2 | 60.6 | 61.0 | 63.2 |

STEM

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| GPQA | 92.4 | 87.0 | 91.9 | 87.4 | 87.6 | 88.4 |
| HLE | 35.5 | 30.8 | 37.5 | 30.2 | 30.1 | 28.7 |
| HLE-Verified¹ | 43.3 | 38.8 | 48 | 37.6 | -- | 37.6 |

Reasoning

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| LiveCodeBench v6 | 87.7 | 84.8 | 90.7 | 85.9 | 85.0 | 83.6 |
| HMMT Feb 25 | 99.4 | 92.9 | 97.3 | 98.0 | 95.4 | 94.8 |
| HMMT Nov 25 | 100 | 93.3 | 93.3 | 94.7 | 91.1 | 92.7 |
| IMOAnswerBench | 86.3 | 84.0 | 83.3 | 83.9 | 81.8 | 80.9 |
| AIME26 | 96.7 | 93.3 | 90.6 | 93.3 | 93.3 | 91.3 |

General Agent

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| BFCL-V4 | 63.1 | 77.5 | 72.5 | 67.7 | 68.3 | 72.9 |
| TAU2-Bench | 87.1 | 91.6 | 85.4 | 84.6 | 77.0 | 86.7 |
| VITA-Bench | 38.2 | 56.3 | 51.6 | 40.9 | 41.9 | 49.7 |
| DeepPlanning | 44.6 | 33.9 | 23.3 | 28.7 | 14.5 | 34.3 |
| Tool Decathlon | 43.8 | 43.5 | 36.4 | 18.8 | 27.8 | 38.3 |
| MCP-Mark | 57.5 | 42.3 | 53.9 | 33.5 | 29.5 | 46.1 |

Search Agent³

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| HLE w/ tool | 45.5 | 43.4 | 45.8 | 49.8 | 50.2 | 48.3 |
| BrowseComp | 65.8 | 67.8 | 59.2 | 53.9 | --/74.9 | 69.0/78.6 |
| BrowseComp-zh | 76.1 | 62.4 | 66.8 | 60.9 | -- | 70.3 |
| WideSearch | 76.8 | 76.4 | 68.0 | 57.9 | 72.7 | 74.0 |
| Seal-0 | 45.0 | 47.7 | 45.5 | 46.9 | 57.4 | 46.9 |

Multilingualism

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| MMMLU | 89.5 | 90.1 | 90.6 | 84.4 | 86.0 | 88.5 |
| MMLU-ProX | 83.7 | 85.7 | 87.7 | 78.5 | 82.3 | 84.7 |
| NOVA-63 | 54.6 | 56.7 | 56.7 | 54.2 | 56.0 | 59.1 |
| INCLUDE | 87.5 | 86.2 | 90.5 | 82.3 | 83.3 | 85.6 |
| Global PIQA | 90.9 | 91.6 | 93.2 | 86.0 | 89.3 | 89.8 |
| PolyMATH | 62.5 | 79.0 | 81.6 | 64.7 | 43.1 | 73.3 |
| WMT24++ | 78.8 | 79.7 | 80.7 | 77.6 | 77.6 | 78.9 |
| MAXIFE | 88.4 | 79.2 | 87.5 | 84.0 | 72.8 | 88.2 |

Coding Agent

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-Max-Thinking | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| SWE-bench Verified | 80.0 | 80.9 | 76.2 | 75.3 | 76.8 | 76.4 |
| SWE-bench Multilingual | 72.0 | 77.5 | 65.0 | 66.7 | 73.0 | 72.0 |
| SecCodeBench | 68.7 | 68.6 | 62.4 | 57.5 | 61.3 | 68.3 |
| Terminal Bench 2 | 54.0 | 59.3 | 54.2 | 22.5 | 50.8 | 52.5 |

Notes

  • HLE-Verified: a verified and revised version of Humanity’s Last Exam (HLE), accompanied by a transparent, component-wise verification protocol and a fine-grained error taxonomy. We open-source the dataset at https://huggingface.co/datasets/skylenage/HLE-Verified.

  • TAU2-Bench: we follow the official setup except for the airline domain, where all models are evaluated by applying the fixes proposed in the Claude Opus 4.5 system card.

  • MCPMark: GitHub MCP server uses v0.30.3 from api.githubcopilot.com; Playwright tool responses are truncated at 32k tokens.

  • Search Agent: most Search Agents built on our model adopt a simple context-folding strategy (256k): once the cumulative Tool Response length reaches a preset threshold, earlier Tool Responses are pruned from the history to keep the context within limits.

  • BrowseComp: we tested two strategies: simple context-folding achieved a score of 69.0, while using the same discard-all strategy as DeepSeek-V3.2 and Kimi K2.5 achieved 78.6.

  • WideSearch: we use a 256k context window without any context management.

  • MMLU-ProX: we report the averaged accuracy on 29 languages.

  • WMT24++: a harder subset of WMT24 after difficulty labeling and rebalancing; we report the averaged scores on 55 languages using XCOMET-XXL.

  • MAXIFE: we report the accuracy on English + multilingual original prompts (totally 23 settings).

  • Empty cells (--) indicate scores not yet available or not applicable.

Vision Language Benchmarks

STEM and Puzzle

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-VL-235B-A22B | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| MMMU | 86.7 | 80.7 | 87.2 | 80.6 | 84.3 | 85.0 |
| MMMU-Pro | 79.5 | 70.6 | 81.0 | 69.3 | 78.5 | 79.0 |
| MathVision | 83.0 | 74.3 | 86.6 | 74.6 | 84.2 | 88.6 |
| Mathvista(mini) | 83.1 | 80.0 | 87.9 | 85.8 | 90.1 | 90.3 |
| We-Math | 79.0 | 70.0 | 86.9 | 74.8 | 84.7 | 87.9 |
| DynaMath | 86.8 | 79.7 | 85.1 | 82.8 | 84.4 | 86.3 |
| ZEROBench | 9 | 3 | 10 | 4 | 9 | 12 |
| ZEROBench_sub | 33.2 | 28.4 | 39.0 | 28.4 | 33.5 | 41.0 |
| BabyVision | 34.4 | 14.2 | 49.7 | 22.2 | 36.5 | 52.3/43.3 |

General VQA

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-VL-235B-A22B | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| RealWorldQA | 83.3 | 77.0 | 83.3 | 81.3 | 81.0 | 83.9 |
| MMStar | 77.1 | 73.2 | 83.1 | 78.7 | 80.5 | 83.8 |
| HallusionBench | 65.2 | 64.1 | 68.6 | 66.7 | 69.8 | 71.4 |
| MMBench (EN-DEV-v1.1) | 88.2 | 89.2 | 93.7 | 89.7 | 94.2 | 93.7 |
| SimpleVQA | 55.8 | 65.7 | 73.2 | 61.3 | 71.2 | 67.1 |

Text Recognition and Document Understanding

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-VL-235B-A22B | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| OmniDocBench1.5 | 85.7 | 87.7 | 88.5 | 84.5 | 88.8 | 90.8 |
| CharXiv(RQ) | 82.1 | 68.5 | 81.4 | 66.1 | 77.5 | 80.8 |
| MMLongBench-Doc | -- | 61.9 | 60.5 | 56.2 | 58.5 | 61.5 |
| CC-OCR | 70.3 | 76.9 | 79.0 | 81.5 | 79.7 | 82.0 |
| AI2D_TEST | 92.2 | 87.7 | 94.1 | 89.2 | 90.8 | 93.9 |
| OCRBench | 80.7 | 85.8 | 90.4 | 87.5 | 92.3 | 93.1 |

Spatial Intelligence

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-VL-235B-A22B | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| ERQA | 59.8 | 46.8 | 70.5 | 52.5 | -- | 67.5 |
| CountBench | 91.9 | 90.6 | 97.3 | 93.7 | 94.1 | 97.2 |
| RefCOCO(avg) | -- | -- | 84.1 | 91.1 | 87.8 | 92.3 |
| ODInW13 | -- | -- | 46.3 | 43.2 | -- | 47.0 |
| EmbSpatialBench | 81.3 | 75.7 | 61.2 | 84.3 | 77.4 | 84.5 |
| RefSpatialBench | -- | -- | 65.5 | 69.9 | -- | 73.6 |
| LingoQA | 68.8 | 78.8 | 72.8 | 66.8 | 68.2 | 81.6 |
| V* | 75.9 | 67.0 | 88.0 | 85.9 | 77.0 | 95.8/91.1 |
| Hypersim | -- | -- | -- | 11.0 | -- | 12.5 |
| SUNRGBD | -- | -- | -- | 34.9 | -- | 38.3 |
| Nuscene | -- | -- | -- | 13.9 | -- | 16.0 |

Video Understanding

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-VL-235B-A22B | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| VideoMME (w sub.) | 86 | 77.6 | 88.4 | 83.8 | 87.4 | 87.5 |
| VideoMME (w/o sub.) | 85.8 | 81.4 | 87.7 | 79.0 | 83.2 | 83.7 |
| VideoMMMU | 85.9 | 84.4 | 87.6 | 80.0 | 86.6 | 84.7 |
| MLVU (M-Avg) | 85.6 | 81.7 | 83.0 | 83.8 | 85.0 | 86.7 |
| MVBench | 78.1 | 67.2 | 74.1 | 75.2 | 73.5 | 77.6 |
| LVBench | 73.7 | 57.3 | 76.2 | 63.6 | 75.9 | 75.5 |
| MMVU | 80.8 | 77.3 | 77.5 | 71.1 | 80.4 | 75.4 |

Visual Agent

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-VL-235B-A22B | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| ScreenSpot Pro | -- | 45.7 | 72.7 | 62.0 | -- | 65.6 |
| OSWorld-Verified | 38.2 | 66.3 | -- | 38.1 | 63.3 | 62.2 |
| AndroidWorld | -- | -- | -- | 63.7 | -- | 66.8 |

Medical

| Benchmark | GPT5.2 | Claude 4.5 Opus | Gemini-3 Pro | Qwen3-VL-235B-A22B | K2.5-1T-A32B | Qwen3.5-397B-A17B |
| --- | --- | --- | --- | --- | --- | --- |
| VQA-RAD | 69.8 | 65.6 | 74.5 | 65.4 | 79.9 | 76.3 |
| SLAKE | 76.9 | 76.4 | 81.3 | 54.7 | 81.6 | 79.9 |
| OM-VQA | 72.9 | 75.5 | 80.3 | 65.4 | 87.4 | 85.1 |
| PMC-VQA | 58.9 | 59.9 | 62.3 | 41.2 | 63.3 | 64.2 |
| MedXpertQA-MM | 73.3 | 63.6 | 76.0 | 47.6 | 65.3 | 70.0 |

Notes

  • MathVision: our model’s score is evaluated using a fixed prompt, e.g., “Please reason step by step, and put your final answer within \boxed{}.” For other models, we report the higher score between runs with and without the \boxed{} formatting.

  • BabyVision: our model’s score is reported with CI (Code Interpreter) enabled; without CI, the result is 43.3.

  • V*: our model’s score is reported with CI (Code Interpreter) enabled; without CI, the result is 91.1.

  • Empty cells (--) indicate scores not yet available or not applicable.
