# How to Run Local AI Models with Hermes Agent

This guide shows how to run open LLMs locally with **Hermes Agent** via [**Unsloth**](https://github.com/unslothai/unsloth).

Hermes Agent is an **open-source** autonomous AI agent that connects to a model endpoint, executes tasks, and improves over time through memory and learned skills. It works with any **local model** exposed through Unsloth’s **OpenAI-compatible API**, including DeepSeek, Qwen, Gemma, and more.

Hermes acts as the agent client, while Unsloth loads and serves models via a local API. After setup, every prompt sent through Hermes runs against your local model instead of a remote provider.

{% hint style="info" %}
In this tutorial, you’ll install Hermes and configure it to use `unsloth/Qwen3.6-27B-GGUF` served from Unsloth. Prefer a different model? Swap in any other model by loading it in Unsloth and updating the configuration.
{% endhint %}

### Setup Hermes Agent

**Prerequisites.** The installer verifies these and halts if any are missing. Install whatever is not already on your machine first:

* **OS:** Linux, macOS, or Windows via WSL.
* **uv:** Python package manager. Install with `curl -LsSf https://astral.sh/uv/install.sh | sh`.
* **Python 3.11+:** the installer can provision this via `uv` if it's missing.
* **Git:** to clone the Hermes repo.
* **Node.js 18+:** for Hermes's browser tools.
* **ripgrep** (`rg`): for fast file search.
* **ffmpeg:** for TTS/voice messages.

#### **1. Run the installer** in a terminal:

```bash
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
```

The installer will:

1. Detect your OS.
2. Verify every prerequisite listed above and print a ✓ or ✗ for each.
3. Clone Hermes into `~/.hermes/hermes-agent/` (over SSH if a GitHub SSH key is configured, otherwise HTTPS).
4. Create a Python 3.11 virtualenv at `~/.hermes/hermes-agent/venv/`.
5. Install Hermes and all Python dependencies.
6. Install Node.js dependencies for the browser tools.
7. Install Playwright's Chromium engine. **This step prompts for `sudo`** so Playwright can install shared libraries. Hermes itself does not require root.
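The prerequisite check in installer step 2 can be approximated by hand, which helps when deciding what to install before re-running the one-liner. The sketch below is not the installer's own logic: `missing_tools` is a hypothetical helper that only tests whether each command name resolves on your `PATH`. It does not check versions (for example Node.js 18+), and it skips Python because the installer can provision it via `uv`.

```bash
#!/usr/bin/env bash
# Sketch: report which prerequisite commands are missing from PATH.
# `missing_tools` is a hypothetical helper, not part of the Hermes installer.

missing_tools() {
  local tool
  for tool in "$@"; do
    # `command -v` succeeds only if the shell can resolve the name.
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Command names taken from the prerequisite list above.
missing="$(missing_tools uv git node rg ffmpeg)"
if [ -n "$missing" ]; then
  printf 'Missing prerequisites:\n%s\n' "$missing"
else
  echo "All checked prerequisites are on PATH."
fi
```

Anything the sketch prints as missing should be installed before you run (or re-run) the installer.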

#### **2. Reload your shell** so the `hermes` command is on your `PATH`:

{% code title="bash" %}
```bash
source ~/.bashrc
```
{% endcode %}

{% code title="zsh" %}
```bash
source ~/.zshrc
```
{% endcode %}

#### **3. Verify the install:**

```bash
hermes --version
```

If the command resolves, Hermes is installed. Everything lives under `~/.hermes/`:

| Path                                    | What it is                                     |
| --------------------------------------- | ---------------------------------------------- |
| `~/.hermes/config.yaml`                 | Main settings (model, provider, tools, TTS, …) |
| `~/.hermes/.env`                        | API keys and other secrets                     |
| `~/.hermes/hermes-agent/`               | The Hermes source + virtualenv                 |
| `~/.hermes/cron/`, `sessions/`, `logs/` | Runtime data                                   |
| `~/.hermes/skills/`                     | Installed skills (synced from the Skills Hub)  |

{% hint style="info" %}
Full install reference: [hermes-agent.nousresearch.com/docs/getting-started/installation](https://hermes-agent.nousresearch.com/docs/getting-started/installation). If the installer reports a missing prerequisite, install it and re-run the one-liner. The installer is idempotent.
{% endhint %}

### Installing Unsloth

### ⚡ Quickstart

After installing Hermes, install Unsloth Studio so that Hermes has a local model server to run inference against.

1. **Install or update Unsloth Studio.** Earlier versions don't expose the external API. See Installation.
2. **Launch Unsloth.** Note the port it starts on, usually `8000` or `8888`. You'll see it in the terminal output and in the browser URL (`http://localhost:PORT`).
3. **Load a model.** Click **New Chat**, pick or search for a model (GGUF), and wait for it to finish loading.
4. **Create an API key.** In Unsloth, click your **Unsloth** avatar in the bottom-left → **Settings** → **API** → type a key name → **Create**. Copy the `sk-unsloth-…` value that appears. Unsloth only shows it once.
5. **Point your client at Unsloth.** Use `http://localhost:PORT` as the base URL and your `sk-unsloth-…` key for auth. The Hermes-specific steps are in the integration section below.

### 🔑 Creating an API key

1. Open the sidebar and click your **Unsloth** avatar at the bottom-left.
2. Go to **Settings** → **API**.
3. Enter a friendly name (e.g. `claude-code-macbook`).
4. *(Optional)* Set an expiry.
5. Click **Create**.
6. **Copy the key immediately.** Unsloth stores only a hash, so you won't be able to view it again.
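Because the key is shown only once, it can be worth sanity-checking what you pasted before moving on. The sketch below relies only on the `sk-unsloth-` prefix mentioned in this guide. `looks_like_unsloth_key` and the `UNSLOTH_API_KEY` variable are hypothetical names chosen for illustration, and the check says nothing about whether the key is still valid on the server.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the value starts with the
# sk-unsloth- prefix and has at least one character after it.
looks_like_unsloth_key() {
  case "$1" in
    sk-unsloth-?*) return 0 ;;
    *)             return 1 ;;
  esac
}

# Placeholder value for illustration only; paste your real key instead.
UNSLOTH_API_KEY="sk-unsloth-example"

if looks_like_unsloth_key "$UNSLOTH_API_KEY"; then
  echo "Key format looks right."
else
  echo "This does not look like an Unsloth key." >&2
fi
```

A failed check usually means the value was truncated or something else was copied by mistake. In that case, create a new key rather than trying to recover the old one.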

All keys start with the `sk-unsloth-` prefix. Revoke a key from the same page at any time. Requests made with a revoked key fail with `401 Unauthorized`.

{% hint style="warning" %}
Treat your API key like a password. Anyone with the key and network access to your Unsloth instance can send requests to your loaded model.
{% endhint %}

### 🦥 Integrate Hermes with Unsloth API

Hermes sends each chat turn to a configured inference provider and connects to **OpenAI-compatible** endpoints. Configure the provider during installation or later in the setup wizard.

**1. Open the setup wizard:**

```bash
hermes setup
```

Pick **Model & Provider** from the "What would you like to do?" menu to configure only the inference endpoint, or **Full Setup** to walk through everything (TTS, tools, messaging gateway, agent settings).

**2. Select the Custom OpenAI-compatible endpoint** when Hermes prompts you for an inference provider.

**3. Fill in the prompts** as Hermes walks through them:

| Prompt                                | Value                                                      |
| ------------------------------------- | ---------------------------------------------------------- |
| **API base URL**                      | `http://localhost:8888/v1` *(your Unsloth port + `/v1`)*   |
| **API key**                           | Your `sk-unsloth-…` key                                    |
| **Detected model: … Use this model?** | `Y` *(Hermes auto-detects the model via `GET /v1/models`)* |
| **Context length in tokens**          | *(leave blank for auto-detect)*                            |
| **Display name**                      | Anything you like, e.g. `unsloth-api`                      |

Hermes verifies the endpoint against `/v1/models` and confirms the detected model before continuing.

**4. Accept defaults for the remaining prompts** (TTS, tools, messaging gateway, agent settings). You can reconfigure any of them later. Hermes writes everything to `~/.hermes/config.yaml` and `~/.hermes/.env`.

**5. Launch Hermes:**

```bash
hermes
```

The startup banner shows your Unsloth model name in the status bar (e.g. `unsloth/Qwen3.6-27B-GGUF`), and the prompt is ready for input.
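If the model name does not show up, the `/v1/models` check from step 3 can be reproduced by hand. This is a minimal sketch, not Hermes's own implementation: `unsloth_base_url` and `models_status` are hypothetical helpers, the default port `8888` is just the example value from the table above, and sending the key as an `Authorization: Bearer` header is an assumption based on the usual OpenAI API convention.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; names and defaults are illustrative.

# Build the base URL Hermes expects: the Unsloth port plus /v1.
unsloth_base_url() {
  local port="${1:-8888}"
  printf 'http://localhost:%s/v1\n' "$port"
}

# GET /v1/models with the key as a bearer token (an assumption based on
# the OpenAI API convention) and print only the HTTP status code.
models_status() {
  local port="$1" key="$2"
  curl -sS -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer ${key}" \
    "$(unsloth_base_url "$port")/models"
}

# Usage (needs a running Unsloth instance, so it is left commented out):
# models_status 8888 "$UNSLOTH_API_KEY"
```

A `200` suggests the base URL and key are usable. A `401` matches what this guide describes for revoked keys. A connection error from `curl` most likely means the port is wrong or Unsloth is not running.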

{% hint style="info" %}
To reconfigure just the model later, run `hermes setup model`. To edit the config file directly, run `hermes config edit`, which opens `~/.hermes/config.yaml` in your `$EDITOR`.
{% endhint %}