# Basics

- [How to Use Unsloth as an API Endpoint](https://unsloth.ai/docs/de/grundlagen/api.md)
- [Inference & Deployment](https://unsloth.ai/docs/de/grundlagen/inference-and-deployment.md): Learn how to save your fine-tuned model so you can run it in your preferred inference engine.
- [Saving to GGUF](https://unsloth.ai/docs/de/grundlagen/inference-and-deployment/saving-to-gguf.md)
- [Speculative Decoding](https://unsloth.ai/docs/de/grundlagen/inference-and-deployment/saving-to-gguf/speculative-decoding.md): Speculative decoding with llama-server, llama.cpp, vLLM, and more for 2x faster inference
- [vLLM Deployment & Inference Guide](https://unsloth.ai/docs/de/grundlagen/inference-and-deployment/vllm-guide.md): Guide to saving and deploying LLMs in vLLM to serve LLMs in production
- [vLLM Engine Arguments](https://unsloth.ai/docs/de/grundlagen/inference-and-deployment/vllm-guide/vllm-engine-arguments.md)
- [LoRA Hot-Swapping Guide](https://unsloth.ai/docs/de/grundlagen/inference-and-deployment/vllm-guide/lora-hot-swapping-guide.md)
- [Saving to Ollama](https://unsloth.ai/docs/de/grundlagen/inference-and-deployment/saving-to-ollama.md)
- [Deploying Models in LM Studio](https://unsloth.ai/docs/de/grundlagen/inference-and-deployment/lm-studio.md): Save models to GGUF so you can run and deploy them in LM Studio
- [How to Install the LM Studio CLI in a Linux Terminal](https://unsloth.ai/docs/de/grundlagen/inference-and-deployment/lm-studio/how-to-install-lm-studio-cli-in-linux-terminal.md): Installation guide for the LM Studio CLI without a UI in a terminal instance.
- [SGLang Deployment & Inference Guide](https://unsloth.ai/docs/de/grundlagen/inference-and-deployment/sglang-guide.md): Guide to saving and deploying LLMs in SGLang to serve LLMs in production
- [Unsloth Inference](https://unsloth.ai/docs/de/grundlagen/inference-and-deployment/unsloth-inference.md): Learn how to run your fine-tuned model with Unsloth's faster inference.
- [llama-server & OpenAI Endpoint Deployment Guide](https://unsloth.ai/docs/de/grundlagen/inference-and-deployment/llama-server-and-openai-endpoint.md): Deploying via llama-server with an OpenAI-compatible endpoint
- [How to Run and Deploy LLMs on Your iOS or Android Phone](https://unsloth.ai/docs/de/grundlagen/inference-and-deployment/deploy-llms-phone.md): Tutorial for fine-tuning your own LLM and deploying it on your Android phone or iPhone with ExecuTorch.
- [Troubleshooting Inference](https://unsloth.ai/docs/de/grundlagen/inference-and-deployment/troubleshooting-inference.md): If you have problems running or saving your model.
- [Deploying LLMs with Hugging Face Jobs](https://unsloth.ai/docs/de/grundlagen/inference-and-deployment/deploying-llms-with-hugging-face-jobs.md): Use Hugging Face Jobs and Skills to fine-tune LFM with Codex / Claude Code using a SKILL.
- [How to Run Local LLMs with Claude Code](https://unsloth.ai/docs/de/grundlagen/claude-code.md): Guide to using open models with Claude Code on your local device.
- [How to Run Local LLMs with OpenAI Codex](https://unsloth.ai/docs/de/grundlagen/codex.md): Use open models with OpenAI Codex locally on your device.
- [Multi-GPU Fine-Tuning with Unsloth](https://unsloth.ai/docs/de/grundlagen/multi-gpu-training-with-unsloth.md): Learn how to fine-tune LLMs on multiple GPUs and with parallelism using Unsloth.
- [Multi-GPU Fine-Tuning with Distributed Data Parallel (DDP)](https://unsloth.ai/docs/de/grundlagen/multi-gpu-training-with-unsloth/ddp.md): Learn how to use the Unsloth CLI to train on multiple GPUs with Distributed Data Parallel (DDP)!
- [Embedding Model Fine-Tuning Guide with Unsloth](https://unsloth.ai/docs/de/grundlagen/embedding-finetuning.md): Learn how to easily fine-tune embedding models with Unsloth.
- [Fine-Tune MoE Models 12x Faster with Unsloth](https://unsloth.ai/docs/de/grundlagen/faster-moe.md): Train MoE LLMs locally with the Unsloth guide.
- [Text-to-Speech (TTS) Fine-Tuning Guide](https://unsloth.ai/docs/de/grundlagen/text-to-speech-tts-fine-tuning.md): Learn how to fine-tune TTS and STT voice models with Unsloth.
- [Unsloth Dynamic 2.0 GGUFs](https://unsloth.ai/docs/de/grundlagen/unsloth-dynamic-2.0-ggufs.md): A big new upgrade to our Dynamic Quants!
- [Unsloth Dynamic GGUFs on Aider Polyglot](https://unsloth.ai/docs/de/grundlagen/unsloth-dynamic-2.0-ggufs/unsloth-dynamic-ggufs-on-aider-polyglot.md): Performance of Unsloth Dynamic GGUFs on the Aider Polyglot benchmarks
- [Tool Calling Guide for Local LLMs](https://unsloth.ai/docs/de/grundlagen/tool-calling-guide-for-local-llms.md)
- [Vision Fine-Tuning](https://unsloth.ai/docs/de/grundlagen/vision-fine-tuning.md): Learn how to fine-tune vision/multimodal LLMs with Unsloth
- [Troubleshooting & FAQs](https://unsloth.ai/docs/de/grundlagen/troubleshooting-and-faqs.md): Tips for resolving issues and frequently asked questions.
- [Hugging Face Hub, XET Debugging](https://unsloth.ai/docs/de/grundlagen/troubleshooting-and-faqs/hugging-face-hub-xet-debugging.md): Debugging and troubleshooting hanging, stuck, and slow downloads
- [Chat Templates](https://unsloth.ai/docs/de/grundlagen/chat-templates.md): Learn the basics and customization options of chat templates, including Conversational, ChatML, ShareGPT, and Alpaca formats, and more!
- [Unsloth Environment Flags](https://unsloth.ai/docs/de/grundlagen/unsloth-environment-flags.md): Advanced flags that may be useful if you see broken fine-tunes or want to turn things off.
- [Continued Pretraining](https://unsloth.ai/docs/de/grundlagen/continued-pretraining.md): Also known as continued fine-tuning. Unsloth lets you continually pretrain so a model can learn a new language.
- [Fine-Tuning from the Last Checkpoint](https://unsloth.ai/docs/de/grundlagen/finetuning-from-last-checkpoint.md): Checkpointing lets you save your fine-tuning progress so you can pause it and resume later.
- [Unsloth Benchmarks](https://unsloth.ai/docs/de/grundlagen/unsloth-benchmarks.md): Benchmarks recorded by Unsloth on NVIDIA GPUs.


---

# Agent Instructions: Querying This Documentation

If you need information that is not directly available on this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://unsloth.ai/docs/de/grundlagen.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present on the current page, when you need clarification or additional context, or when you want to retrieve related documentation sections.
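
The request described above can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper names `build_ask_url` and `ask_docs` are hypothetical, and the sketch assumes the endpoint accepts a standard URL-encoded `ask` value and returns text that can be decoded as UTF-8, which the page does not specify.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Page URL taken from the GET example above.
PAGE_URL = "https://unsloth.ai/docs/de/grundlagen.md"


def build_ask_url(question: str, page_url: str = PAGE_URL) -> str:
    """Return the page URL with the question URL-encoded in the `ask` parameter."""
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    # urlencode escapes spaces and reserved characters such as '?' and '&'.
    return f"{page_url}?{urlencode({'ask': question})}"


def ask_docs(question: str, timeout: float = 30.0) -> str:
    """Send the GET request and return the response body as text.

    Assumption: the body is text; its exact format is not documented here.
    """
    with urlopen(build_ask_url(question), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `ask_docs("How do I save a fine-tuned model to GGUF?")` would issue the request; `build_ask_url` alone is enough if you only need the URL for another HTTP client. Encoding the question matters because a raw `?` or `&` inside it would otherwise be read as part of the query string.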