AI Agents Artificial Intelligence Open Source

Nous Research Hermes Agent:
Setup and Tutorial Guide

Learn how to install and set up Hermes Agent, the open-source AI agent by Nous Research that remembers, learns, and grows smarter with every task.

OpenClaw launched with a lot of hype and security concerns — and with it, a wave of copycats claiming to solve its problems. The two biggest issues: security (due to repo size) and cost. Running OpenClaw with top-tier models like Claude Opus 4.6 (required to prevent prompt injection) can get expensive fast, especially when the agent needs to load large memory and skill contexts. Enter Hermes Agent.

The creators of Hermes Agent claim their agent is better than OpenClaw at using open-source models — that open-source models can be used effectively with the right harness. This tutorial examines those claims by walking through installation, local and online model use, and how to build a practical research agent.

What Is Hermes Agent?

Hermes Agent is an open-source OpenClaw alternative by Nous Research, the lab behind the Hermes family of models. After launch it became very popular, reaching over 30K GitHub stars. Unlike OpenClaw, Hermes Agent can create skills from experience, improve itself, and persist knowledge across sessions.

Key Capabilities

🔄

Closed Learning Loop

Creates skills automatically, improves them during use, and nudges itself to persist knowledge.

📦

Context Compression

Dual compression + Anthropic prompt caching prevents API failures and keeps costs low.

🧩

Skills System

Compatible with agentskills.io. Ships with bundled skills and saves its own as you use it.

📱

Multi-Platform Gateway

Telegram, Discord, Slack, Signal, and WhatsApp. All sessions share one SQLite database.

⚡

Subagents & Parallelism

Spawn isolated subagents for parallel workstreams via the delegate_task tool.

🔌

MCP Support

Connect to any MCP server for APIs, databases, or company systems without code changes.

🎓

RL Training Pipeline

Built-in Tinker-Atropos pipeline for training LLMs with GRPO + LoRA adapters.

🗄️

Persistent SQLite Memory

Every session stored with FTS5 full-text search. Retrieve memories from weeks ago.

Closed Learning Loop & Self-Improving Memory

The Hermes Agent has a closed learning loop:

Agent-curated memory with periodic nudges
Creates skills automatically after performing complex tasks
Improves skills as it uses them

Sessions are stored in a SQLite database with FTS5 full-text search, enabling the agent to retrieve memories from weeks ago even if they're not currently in memory. Hermes also uses Honcho memory — giving the agent a persistent understanding of users across sessions — in addition to memory.md and user.md files for communication style, goals, and preferences.

Context Compression & Caching

Hermes uses dual compression and Anthropic's prompt caching to manage context across long conversations. This also prevents API failures when context is too large — it prunes old results and summarizes conversations using an LLM.

Skills System

Like OpenClaw, Hermes supports skills compatible with agentskills.io. They follow a progressive disclosure pattern to minimize token use. All skills are stored at ~/.hermes/skills/, but you can also point the agent to external skills.

directory structure

~/.hermes/skills/
├── mlops/                    # Category directory
│   └── axolotl/
│       ├── SKILL.md          # Main instructions (required)
│       ├── references/       # Additional docs
│       ├── templates/        # Output formats
│       ├── scripts/          # Helper scripts callable from skill
│       └── assets/           # Supplementary files
├── devops/
│   └── deploy-k8s/          # Agent-created skill
│       ├── SKILL.md
│       └── references/
├── .hub/                     # Skills Hub state
│   ├── lock.json
│   ├── quarantine/
│   └── audit.log
└── .bundled_manifest         # Tracks seeded bundled skills

Multi-Platform Gateway

Hermes supports Telegram, Discord, Slack, Signal, and WhatsApp, with voice memo transcription. Since all sessions go to the same database, you can start a conversation in your terminal and continue it on Telegram.

Subagents & Parallel Workstreams

The delegate_task tool spawns multiple subagents with restricted toolsets and isolated terminal sessions. Each starts a fresh conversation — no shared history — so you must provide all context the subagent needs. Use cases include:

Research multiple topics simultaneously and collect summaries
Code review, fix, and refactor multiple files in parallel

MCP Support & Extended Tool Access

For any tool missing in Hermes, connect to MCPs without changing Hermes Agent code. You can connect to APIs, databases, or company systems by:

Installing MCP support on Hermes
Adding an MCP server
Whitelisting what tools the MCP can expose
Blacklisting dangerous activities (e.g., deleting customer records)

RL Training & Trajectory Generation

Hermes includes an integrated RL training pipeline built on Tinker-Atropos. This enables training LLMs on specific environments using GRPO (Group Relative Policy Optimization) with LoRA adapters — useful for building the next generation of tool-calling models.

How Hermes Compares to Standard AI Assistants

Feature	Hermes Agent	OpenClaw	Standard Assistants
Persistent memory (`memory.md`)	✓ Yes	✓ Yes	✗ No
SQLite session storage (FTS5)	✓ Yes	✗ No	✗ No
Self-improving skills	✓ Yes	~ Partial	✗ No
Fallback model support	✓ Yes	✓ Yes	✗ No
Open-source model harness	✓ Best-in-class	~ Basic	✗ No
Multi-platform gateway	✓ Yes	✓ Yes	~ Limited
VPS / cheap deployment	✓ $5 VPS	✓ Yes	✗ No
RL training pipeline	✓ Built-in	✗ No	✗ No

Supported Models & Endpoints

Hermes Agent is model agnostic. It works with any of the following:

Nous Portal

OpenRouter (100+ models)

Anthropic (Claude)

OpenAI (GPT / Codex)

Google Gemini

DeepSeek

Ollama (local models)

Custom endpoint

Prerequisites

OS: Linux, macOS, or WSL2
Python: 3.11+
Node.js: Required (installed automatically during setup)
API key: From any supported provider (Anthropic, OpenRouter, OpenAI, etc.)

Step-by-Step Tutorial: Building a Research Agent

Let's build a research agent that can search the web and send you a daily briefing on Telegram.

Install Hermes Agent

Open your terminal and run the one-line installer:

bash

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

💡

TipIf you have an existing OpenClaw installation, the setup wizard can migrate it automatically.

Get Your Telegram Interface Token

For the Telegram gateway, open Telegram and search for @BotFather. Send /newbot, follow the prompts, and copy the bot token (format: 123456789:AAH...).

Then use @userinfobot to get your Telegram user ID — this ensures your bot only responds to you.

Initialize with Setup Wizard

Run the setup wizard and choose Full Setup to configure everything including API keys and the Telegram bot:

bash

hermes setup

The wizard presents a menu with options: Model & Provider, Terminal Backend, Messaging Platforms (Gateway), Tools, and Agent Settings.

Once setup completes, verify everything with:

bash

hermes doctor

Configuration & Model Selection

You can change your model provider at any time by running:

bash

hermes model

This lets you select from 18 providers including OpenAI Codex, OpenRouter, Anthropic, Gemini, MiniMax, Kilo Code, and custom endpoints. The current model and active provider are shown at the top.

Set Up the Gateway

The gateway lets Hermes reach you on Telegram so you don't have to stay in the terminal:

bash

hermes gateway setup

After this, you should be able to send messages from Telegram and get responses. Hermes will use its terminal tool to query APIs, parse responses, and reply directly in your Telegram chat.

ℹ️

What you'll seeHermes Agent uses terminal to write and execute Python snippets inline — for example, querying the Yahoo Finance chart API and parsing the JSON response to answer stock price questions.

Leveling Up: The "Multi-Tool" Agent

To improve Hermes's web search capabilities, configure a dedicated search tool. This example uses FireCrawl (free API key available on their website):

bash

hermes config set FIRECRAWL_API_KEY your_fire-crawl_key

With FireCrawl configured, Hermes can plan and delegate tasks across multiple subagents in parallel. For example, asking it to summarize several tools at once will cause it to plan 3 tasks and delegate them automatically.

Multiple Agent Profiles

The other way to run multiple agents is to set up profiles. Each profile gets its own config, API keys, memory, sessions, gateway, and skills. For example:

A coding agent
A personal assistant
A research agent

Create one from scratch or clone your current settings:

bash

# Clone your default profile into a new "work" profile
hermes profile create work --clone

# Start chatting with the work profile
work chat

# Configure API keys and model for this profile
work setup

💡

Profile isolationEach profile gets its own .env file, so you can configure different Telegram bots, API keys, and agent personalities per profile.

Deployment Options: Running Beyond Your Laptop

You don't have to run Hermes on your daily machine. Options include:

Dedicated computer — a separate machine you leave running
VPS (Modal, Daytona) — serverless persistence; hibernates when idle, wakes on demand, costs nearly nothing between sessions
Docker container — recommended for any cloud/VPS deployment

Security Best Practices for VPS Deployment

Use the container backend
Set explicit allowlists for tool permissions
Use pairing codes instead of hardcoded user IDs
Store secrets securely with proper file permissions
Regularly update with hermes update
Never run the agent as root user
Set appropriate resource limits (CPU, disk, memory)
Monitor logs for unauthorized access attempts

Local & Private: Running Hermes Agent Offline

Run Hermes completely offline using Ollama. Here's how to set up qwen2.5-coder:32b as a local model:

bash — install and serve model via Ollama

# Pull the model and start serving
ollama pull qwen2.5-coder:32b
ollama serve   # Starts on port 11434

bash — point Hermes to Ollama

hermes model
# Select "Custom endpoint (self-hosted / VLLM / etc.)"
# Enter URL:        http://localhost:11434/v1
# Skip API key:     (Ollama doesn't need one)
# Enter model name: qwen2.5-coder:32b

Make sure to increase the context window — Hermes needs to load the system prompt, tools, and return a full response:

bash — increase context window (pick one method)

# Option 1: environment variable (recommended)
OLLAMA_CONTEXT_LENGTH=32768 ollama serve

# Option 2: systemd service override
sudo systemctl edit ollama.service
# Add: Environment="OLLAMA_CONTEXT_LENGTH=32768"
# Then: sudo systemctl daemon-reload && sudo systemctl restart ollama

# Option 3: bake into a custom model (persistent)
echo -e "FROM qwen2.5-coder:32b\nPARAMETER num_ctx 32768" > Modelfile
ollama create qwen2.5-coder-32k -f Modelfile

Common Pitfalls & Troubleshooting

Connection refused errors

Run hermes doctor first — it checks for missing provider config, broken env vars, and misconfigured paths. Re-run setup if you suspect a typo in your API key.

Context window limits hit

Type /compress to trigger manual compression. Or edit ~/.hermes/config.yaml and set compression.threshold: 0.50 with your preferred summary_model.

Skill not triggering or reusing

Verify the skill exists and its instructions are being loaded: hermes chat --toolsets skills -q "Use the X skill to do Y"

Gateway not receiving messages

Check status with hermes gateway status. If stopped, restart it: hermes gateway start

Model endpoint auth failures

Check ~/.hermes/.env for correct API keys. Run hermes model to confirm the selected model has a valid key configured.

⚠️

Cost noteRunning agents is not cheap regardless of the tool. While Hermes is optimized for open-source models, API costs still apply. Local model setup via Ollama is the only way to achieve truly unlimited usage — but requires capable hardware (GPU recommended for 32B models).