How to Run Local AI Models with Hermes Agent

Guide on using open LLMs with Hermes Agent locally.

This guide enables you to run open LLMs locally with Hermes Agent via Unsloth. Hermes Agent is an open-source autonomous AI agent that connects to a model endpoint, executes tasks, and improves over time through memory and learned skills.

It works with any local model exposed through Unsloth’s OpenAI-compatible API, including DeepSeek, Qwen, Gemma, and more. Hermes acts as the agent client, while Unsloth loads and serves models via a local API.

After setup, every prompt sent through Hermes will run against your local model instead of a remote provider.


In this tutorial, you’ll install Hermes and configure it to use unsloth/Qwen3.6-27B-GGUF served from Unsloth. Prefer a different model? Swap in any other model by loading it in Unsloth and updating the configuration.

Setup Hermes Agent

Prerequisites. The installer verifies these and halts if any are missing. Install what's not already on your machine first:

  • OS: Linux, macOS, or Windows via WSL.

  • uv: Python package manager. Install with curl -LsSf https://astral.sh/uv/install.sh | sh.

  • Python 3.11+: the installer can provision this via uv if it's missing.

  • Git: to clone the Hermes repo.

  • Node.js 18+: for Hermes's browser tools.

  • ripgrep (rg): for fast file search.

  • ffmpeg: for TTS/voice messages.
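As a rough pre-flight check before running the installer, the sketch below reports which of these tools already resolve on your PATH. The command names (uv, python3, git, node, rg, ffmpeg) are assumptions based on the list above, and the helper have_tool is a name made up for this example. It does not check versions, so Python 3.11+ and Node.js 18+ still need to be confirmed separately.

```shell
# Pre-flight sketch: report which prerequisite commands are on PATH.
# The command names below are assumed from the prerequisites list;
# adjust them if your system uses different names (e.g. python vs python3).

have_tool() {
  # Succeeds if the named command resolves on PATH.
  command -v "$1" >/dev/null 2>&1
}

for tool in uv python3 git node rg ffmpeg; do
  if have_tool "$tool"; then
    printf '✓ %s\n' "$tool"
  else
    printf '✗ %s\n' "$tool"
  fi
done
```

Anything marked ✗ here is worth installing first; the installer runs its own, authoritative check afterwards.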

1. Run the installer in a terminal:

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

The installer will:

  1. Detect your OS.

  2. Verify every prerequisite listed above and print a ✓ or ✗ for each.

  3. Clone Hermes into ~/.hermes/hermes-agent/ (over SSH if a GitHub SSH key is configured, otherwise HTTPS).

  4. Create a Python 3.11 virtualenv at ~/.hermes/hermes-agent/venv/.

  5. Install Hermes and all Python dependencies.

  6. Install Node.js dependencies for the browser tools.

  7. Install Playwright's Chromium engine. This step prompts for sudo so Playwright can install shared libraries. Hermes itself does not require root.

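2. Reload your shell so the hermes command is on your PATH, for example by opening a new terminal or by sourcing your shell's startup file (such as ~/.bashrc or ~/.zshrc).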

3. Verify the install:
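One generic way to check, assuming only that the installer exposes an executable named hermes on your PATH, is to ask the shell whether that name resolves. The helper resolve_cmd and the variable hermes_path are names made up for this sketch, not part of Hermes.

```shell
# Assumption: the installer exposes an executable named `hermes` on PATH.

resolve_cmd() {
  # Prints the resolved location of a command; returns non-zero if not found.
  command -v "$1"
}

if hermes_path="$(resolve_cmd hermes)"; then
  echo "hermes resolves to: $hermes_path"
else
  echo "hermes is not on PATH yet; reload your shell or re-run the installer"
fi
```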

If the command resolves, Hermes is installed. Everything lives under ~/.hermes/:

Path                                 What it is
~/.hermes/config.yaml                Main settings (model, provider, tools, TTS, …)
~/.hermes/.env                       API keys and other secrets
~/.hermes/hermes-agent/              The Hermes source + virtualenv
~/.hermes/cron/, sessions/, logs/    Runtime data
~/.hermes/skills/                    Installed skills (synced from the Skills Hub)
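To see which of these paths exist on your machine, a small sketch like the following can help. The variable hermes_dir and the helper path_status are names made up for this example. A "missing" result is not necessarily an error, since it is an assumption here that runtime directories such as cron, sessions, and logs may only appear after Hermes has been used.

```shell
# Sketch: report which of the documented paths exist under ~/.hermes.
# hermes_dir and path_status are illustrative names, not part of Hermes.
hermes_dir="$HOME/.hermes"

path_status() {
  # Prints "present" or "missing" for a path relative to hermes_dir.
  if [ -e "$hermes_dir/$1" ]; then
    echo present
  else
    echo missing
  fi
}

for entry in config.yaml .env hermes-agent skills cron sessions logs; do
  printf '%-14s %s\n' "$entry" "$(path_status "$entry")"
done
```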

Full install reference: hermes-agent.nousresearch.com/docs/getting-started/installation. If the installer reports a missing prerequisite, install it and re-run the one-liner. The installer is idempotent.

Installing Unsloth

⚡ Quickstart

After installing Hermes, install Unsloth Studio, which loads local models and serves them over the API that Hermes runs inference against.

  1. Install or update Unsloth Studio. Earlier versions don't expose the external API. See Installation.

  2. Launch Unsloth. Note the port it starts on, usually 8000 or 8888. You'll see it in the terminal output and in the browser URL (http://localhost:PORT).

  3. Load a model. Click New Chat, pick or search a model (GGUF), and wait for it to finish loading.

  4. Create an API key. In Unsloth, click your Unsloth avatar in the bottom-left → Settings → API → type a key name → Create. Copy the sk-unsloth-… value that appears. Unsloth only shows it once.

  5. Point your client at Unsloth. Use http://localhost:PORT as the base URL and your sk-unsloth-… key for auth. For Hermes, the base URL is your Unsloth port plus /v1; see the Hermes integration steps below.

🔑 Creating an API key

  1. Open the sidebar, click your Unsloth avatar at the bottom-left.

  2. Go to Settings → API.

  3. Enter a friendly name (e.g. claude-code-macbook).

  4. (Optional) Set an expiry.

  5. Click Create.

  6. Copy the key immediately. Unsloth stores only a hash, so you won't be able to view it again.

All keys start with the sk-unsloth- prefix. Revoke a key from the same page at any time. Requests made with a revoked key will fail with 401 Unauthorized.
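A quick way to sanity-check a key before wiring it into a client is sketched below. It assumes Unsloth is on port 8888 and accepts the key as an OpenAI-style Authorization: Bearer header, which this guide does not state explicitly. UNSLOTH_PORT, UNSLOTH_API_KEY, and has_unsloth_prefix are hypothetical names used only in this example, and sk-unsloth-REPLACE_ME is a placeholder for your real key.

```shell
# Assumptions: port 8888 and OpenAI-style Bearer auth.
# UNSLOTH_PORT and UNSLOTH_API_KEY are illustrative variable names.
UNSLOTH_PORT="${UNSLOTH_PORT:-8888}"
UNSLOTH_API_KEY="${UNSLOTH_API_KEY:-sk-unsloth-REPLACE_ME}"

has_unsloth_prefix() {
  # Succeeds only if the key starts with the documented sk-unsloth- prefix.
  case "$1" in
    sk-unsloth-*) return 0 ;;
    *) return 1 ;;
  esac
}

if has_unsloth_prefix "$UNSLOTH_API_KEY"; then
  # Print only the HTTP status code. A revoked key should show 401;
  # 000 means curl could not connect at all.
  curl -s -o /dev/null -w '%{http_code}\n' --max-time 5 \
    -H "Authorization: Bearer $UNSLOTH_API_KEY" \
    "http://localhost:${UNSLOTH_PORT}/v1/models" \
    || echo "could not reach Unsloth on port $UNSLOTH_PORT"
else
  echo "key does not start with sk-unsloth-"
fi
```

Keeping the key in an environment variable rather than pasting it into the command keeps it out of your shell history.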

🦥 Integrate Hermes with Unsloth API

Hermes sends each chat turn to a configured inference provider and connects to OpenAI-compatible endpoints. Configure the provider during installation or later in the setup wizard.

1. Open the setup wizard by running hermes setup.

Pick Model & Provider from the "What would you like to do?" menu to configure only the inference endpoint, or Full Setup to walk through everything (TTS, tools, messaging gateway, agent settings).

2. Select Custom OpenAI-compatible endpoint when Hermes prompts you for an inference provider.

3. Fill in the prompts as Hermes walks through them:

Prompt                               Value
API base URL                         http://localhost:8888/v1 (your Unsloth port + /v1)
API key                              Your sk-unsloth-… key
Detected model: … Use this model?    Y (Hermes auto-detects the model via GET /v1/models)
Context length in tokens             (leave blank for auto-detect)
Display name                         Anything you like, e.g. unsloth-api

Hermes verifies the endpoint against /v1/models and confirms the detected model before continuing.
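If detection fails, it can help to issue the same GET /v1/models request by hand and look at what the server returns. The sketch below assumes Bearer auth and an OpenAI-style JSON response in which each model appears under an id field. UNSLOTH_PORT, UNSLOTH_API_KEY, and api_base_url are hypothetical names used only here.

```shell
# Assumptions: Bearer auth and an OpenAI-style response body.
# UNSLOTH_PORT and UNSLOTH_API_KEY are illustrative variable names.

api_base_url() {
  # Builds the base URL Hermes asks for: the Unsloth port plus /v1.
  printf 'http://localhost:%s/v1\n' "$1"
}

base_url="$(api_base_url "${UNSLOTH_PORT:-8888}")"

# Fetch the raw model list; the model name Hermes detects should appear in it.
curl -s --max-time 5 \
  -H "Authorization: Bearer ${UNSLOTH_API_KEY:-sk-unsloth-REPLACE_ME}" \
  "$base_url/models" \
  || echo "could not reach $base_url"
```

An empty list usually means no model has finished loading in Unsloth yet; a connection error usually means the port is wrong.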

4. Accept defaults for the remaining prompts (TTS, tools, messaging gateway, agent settings). You can reconfigure any of them later. Hermes writes everything to ~/.hermes/config.yaml and ~/.hermes/.env.

5. Launch Hermes by running the hermes command.

The startup banner shows your Unsloth model name in the status bar (e.g. unsloth/Qwen3.6-27B-GGUF), and the prompt is ready for input.

To reconfigure just the model later, run hermes setup model. To edit the config file directly, hermes config edit opens ~/.hermes/config.yaml in your $EDITOR.
