How to Run Local AI Models with Hermes Agent
Guide on using open LLMs with Hermes Agent locally.
This guide enables you to run open LLMs locally with Hermes Agent via Unsloth. Hermes Agent is an open-source autonomous AI agent that connects to a model endpoint, executes tasks, and improves over time through memory and learned skills.
It works with any local model exposed through Unsloth’s OpenAI-compatible API, including: DeepSeek, Qwen, Gemma, and more. Hermes acts as the agent client, while Unsloth loads and serves models via a local API.
After setup, every prompt sent through Hermes will run against your local model instead of a remote provider.
Setup Hermes🦥 Use open models with Unsloth
In this tutorial, you’ll install Hermes and configure it to use unsloth/Qwen3.6-27B-GGUF served from Unsloth. Prefer a different model? Swap in any other model by loading it in Unsloth and updating the configuration.
Setup Hermes Agent
Prerequisites. The installer verifies these and halts if any are missing. Install what's not already on your machine first:
OS Linux, macOS, or Windows via WSL.
uv Python package manager. Install with
curl -LsSf https://astral.sh/uv/install.sh | sh.Python 3.11+ the installer can provision this via
uvif it's missing.Git to clone the Hermes repo.
Node.js 18+ for Hermes's browser tools.
ripgrep (
rg) for fast file search.ffmpeg for TTS/voice messages.
1. Run the installer in a terminal:
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bashThe installer will:
Detect your OS.
Verify every prerequisite listed above and print a ✓ or ✗ for each.
Clone Hermes into
~/.hermes/hermes-agent/(over SSH if a GitHub SSH key is configured, otherwise HTTPS).Create a Python 3.11 virtualenv at
~/.hermes/hermes-agent/venv/.Install Hermes and all Python dependencies.
Install Node.js dependencies for the browser tools.
Install Playwright's Chromium engine. This step prompts for
sudoso Playwright can install shared libraries. Hermes itself does not require root.

2. Reload your shell so the hermes command is on your PATH:
hermes command is on your PATH:3. Verify the install:
If the command resolves, Hermes is installed. Everything lives under ~/.hermes/:
~/.hermes/config.yaml
Main settings (model, provider, tools, TTS, …)
~/.hermes/.env
API keys and other secrets
~/.hermes/hermes-agent/
The Hermes source + virtualenv
~/.hermes/cron/, sessions/, logs/
Runtime data
~/.hermes/skills/
Installed skills (synced from the Skills Hub)
Full install reference: hermes-agent.nousresearch.com/docs/getting-started/installation. If the installer reports a missing prerequisite, install it and re-run the one-liner. The installer is idempotent.
Installing Unsloth
⚡ Quickstart
After installing OpenCode, we'll need install Unsloth Studio to enable OpenCode to serve and run inference of local models.
Install or update Unsloth Studio. Earlier versions don't expose the external API. See Installation.
Launch Unsloth. Note the port it starts on is usually
8000or8888. You'll see it in the terminal output and in the browser URL (http://localhost:PORT).Load a model. Click New Chat, pick or search a model (GGUF), and wait for it to finish loading.
Create an API key. In Unsloth, click your Unsloth avatar in the bottom-left → Settings → API → type a key name → Create. Copy the
sk-unsloth-…value that appears . Unsloth only shows it once.Point your client at Unsloth. Use
http://localhost:PORTas the base URL and yoursk-unsloth-…key for auth. Jump to the recipe for your tool below.
🔑 Creating an API key
Open the sidebar, click your Unsloth avatar at the bottom-left.
Go to Settings → API.
Enter a friendly name (e.g.
claude-code-macbook).(Optional) Set an expiry.
Click Create.
Copy the key immediately. Unsloth stores only a hash and you won't be able to view it again.

All keys start with the sk-unsloth- prefix. Revoke a key from the same page at any time. Requests made with a revoked key will fail with 401 Unauthorized.
Treat your API key like a password. Anyone with the key and network access to your Unsloth instance can send requests to your loaded model.
🦥 Integrate Hermes with Unsloth API
Hermes sends each chat turn to a configured inference provider and connects to OpenAI-compatible endpoints. Configure the provider during installation or later in the setup wizard.
1. Open the setup wizard:
Pick Model & Provider from the "What would you like to do?" menu to configure only the inference endpoint, or Full Setup to walk through everything (TTS, tools, messaging gateway, agent settings).

2. Select the Custom OpenAI-compatible endpoint when Hermes prompts you for an inference provider.

3. Fill in the prompts as Hermes walks through them:
API base URL
http://localhost:8888/v1 (your Unsloth port + /v1)
API key
Your sk-unsloth-… key
Detected model: … Use this model?
Y (Hermes auto-detects the model via GET /v1/models)
Context length in tokens
(leave blank for auto-detect)
Display name
Anything you like, e.g. unsloth-api
Hermes verifies the endpoint against /v1/models and confirms the detected model before continuing.

4. Accept defaults for the remaining prompts (TTS, tools, messaging gateway, agent settings) you can reconfigure any of them later. Hermes writes everything to ~/.hermes/config.yaml and ~/.hermes/.env.

5. Launch Hermes:
The startup banner shows your Unsloth model name in the status bar (e.g. unsloth/Qwen3.6-27B-GGUF), and the prompt is ready for input.

To reconfigure just the model later, run hermes setup model. To edit the config file directly, hermes config edit opens ~/.hermes/config.yaml in your $EDITOR.
Last updated
Was this helpful?

