# Inference and Deployment

You can also run your fine-tuned models using [Unsloth's 2x faster inference](https://unsloth.ai/docs/fr/bases/inference-and-deployment/unsloth-inference).
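
For example, a fine-tuned model can be loaded and run directly through Unsloth's fast inference path. A minimal sketch, assuming a fine-tuned checkpoint saved under the placeholder name `lora_model`:

```python
from unsloth import FastLanguageModel

# Load the fine-tuned model; "lora_model" is a placeholder for your saved
# model or LoRA adapter directory.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="lora_model",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Switch the model into Unsloth's native 2x faster inference mode.
FastLanguageModel.for_inference(model)

inputs = tokenizer(
    ["Continue the sequence: 1, 1, 2, 3, 5, 8,"],  # placeholder prompt
    return_tensors="pt",
).to("cuda")

outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)
print(tokenizer.batch_decode(outputs))
```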

<table data-card-size="large" data-view="cards"><thead><tr><th></th><th data-hidden data-card-target data-type="content-ref"></th><th data-hidden data-type="content-ref"></th></tr></thead><tbody><tr><td><a href="../../nouveau/studio#run-models-locally">Unsloth Studio Chat</a></td><td><a href="../nouveau/studio/chat">chat</a></td><td></td></tr><tr><td><a href="inference-and-deployment/saving-to-gguf">llama.cpp - Saving to GGUF</a></td><td><a href="inference-and-deployment/saving-to-gguf">saving-to-gguf</a></td><td><a href="inference-and-deployment/saving-to-gguf">saving-to-gguf</a></td></tr><tr><td><a href="inference-and-deployment/vllm-guide">vLLM</a></td><td><a href="inference-and-deployment/vllm-guide">vllm-guide</a></td><td><a href="inference-and-deployment/vllm-guide">vllm-guide</a></td></tr><tr><td><a href="inference-and-deployment/saving-to-ollama">Ollama</a></td><td><a href="inference-and-deployment/saving-to-ollama">saving-to-ollama</a></td><td><a href="inference-and-deployment/saving-to-ollama">saving-to-ollama</a></td></tr><tr><td><a href="inference-and-deployment/lm-studio">LM Studio</a></td><td><a href="inference-and-deployment/lm-studio">lm-studio</a></td><td></td></tr><tr><td><a href="inference-and-deployment/sglang-guide">SGLang</a></td><td><a href="inference-and-deployment/sglang-guide">sglang-guide</a></td><td><a href="inference-and-deployment/vllm-guide/vllm-engine-arguments">vllm-engine-arguments</a></td></tr><tr><td><a href="inference-and-deployment/troubleshooting-inference">Troubleshooting</a></td><td><a href="inference-and-deployment/troubleshooting-inference">troubleshooting-inference</a></td><td><a href="inference-and-deployment/troubleshooting-inference">troubleshooting-inference</a></td></tr><tr><td><a href="inference-and-deployment/llama-server-and-openai-endpoint">llama-server and OpenAI endpoint</a></td><td><a href="inference-and-deployment/llama-server-and-openai-endpoint">llama-server-and-openai-endpoint</a></td><td></td></tr><tr><td><a href="tool-calling-guide-for-local-llms">Tool Calling</a></td><td><a href="tool-calling-guide-for-local-llms">tool-calling-guide-for-local-llms</a></td><td></td></tr><tr><td><a href="inference-and-deployment/deploy-llms-phone">Run LLMs on your phone</a></td><td><a href="inference-and-deployment/deploy-llms-phone">deploy-llms-phone</a></td><td></td></tr></tbody></table>
