# Inference and Deployment

You can also run your fine-tuned models using [Unsloth's 2x faster inference](https://unsloth.ai/docs/fr/bases/inference-and-deployment/unsloth-inference).
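
For example, a fine-tuned model can be loaded and run directly through Unsloth's fast inference path. A minimal sketch, assuming a fine-tuned checkpoint saved under the placeholder name `lora_model`:

```python
from unsloth import FastLanguageModel

# Load the fine-tuned model; "lora_model" is a placeholder for your saved
# model or LoRA adapter directory.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="lora_model",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Switch the model into Unsloth's native 2x faster inference mode.
FastLanguageModel.for_inference(model)

inputs = tokenizer(
    ["Continue the sequence: 1, 1, 2, 3, 5, 8,"],  # placeholder prompt
    return_tensors="pt",
).to("cuda")

outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)
print(tokenizer.batch_decode(outputs))
```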

<table data-card-size="large" data-view="cards"><thead><tr><th></th><th data-hidden data-card-target data-type="content-ref"></th><th data-hidden data-type="content-ref"></th></tr></thead><tbody><tr><td><a href="../../nouveau/studio#run-models-locally">Unsloth Studio Chat</a></td><td><a href="../nouveau/studio/chat">chat</a></td><td></td></tr><tr><td><a href="inference-and-deployment/saving-to-gguf">llama.cpp - Saving to GGUF</a></td><td><a href="inference-and-deployment/saving-to-gguf">saving-to-gguf</a></td><td><a href="inference-and-deployment/saving-to-gguf">saving-to-gguf</a></td></tr><tr><td><a href="inference-and-deployment/vllm-guide">vLLM</a></td><td><a href="inference-and-deployment/vllm-guide">vllm-guide</a></td><td><a href="inference-and-deployment/vllm-guide">vllm-guide</a></td></tr><tr><td><a href="inference-and-deployment/saving-to-ollama">Ollama</a></td><td><a href="inference-and-deployment/saving-to-ollama">saving-to-ollama</a></td><td><a href="inference-and-deployment/saving-to-ollama">saving-to-ollama</a></td></tr><tr><td><a href="inference-and-deployment/lm-studio">LM Studio</a></td><td><a href="inference-and-deployment/lm-studio">lm-studio</a></td><td></td></tr><tr><td><a href="inference-and-deployment/sglang-guide">SGLang</a></td><td><a href="inference-and-deployment/sglang-guide">sglang-guide</a></td><td><a href="inference-and-deployment/vllm-guide/vllm-engine-arguments">vllm-engine-arguments</a></td></tr><tr><td><a href="inference-and-deployment/troubleshooting-inference">Troubleshooting</a></td><td><a href="inference-and-deployment/troubleshooting-inference">troubleshooting-inference</a></td><td><a href="inference-and-deployment/troubleshooting-inference">troubleshooting-inference</a></td></tr><tr><td><a href="inference-and-deployment/llama-server-and-openai-endpoint">llama-server and OpenAI endpoint</a></td><td><a href="inference-and-deployment/llama-server-and-openai-endpoint">llama-server-and-openai-endpoint</a></td><td></td></tr><tr><td><a href="tool-calling-guide-for-local-llms">Tool Calling</a></td><td><a href="tool-calling-guide-for-local-llms">tool-calling-guide-for-local-llms</a></td><td></td></tr><tr><td><a href="inference-and-deployment/deploy-llms-phone">Run LLMs on your phone</a></td><td><a href="inference-and-deployment/deploy-llms-phone">deploy-llms-phone</a></td><td></td></tr></tbody></table>
