AI Agents Artificial Intelligence Open Source

Nous Research Hermes Agent:
Setup and Tutorial Guide

Learn how to install and set up Hermes Agent, the open-source AI agent by Nous Research that remembers, learns, and grows smarter with every task.

OpenClaw launched with a lot of hype and security concerns — and with it, a wave of copycats claiming to solve its problems. The two biggest issues: security (due to repo size) and cost. Running OpenClaw with top-tier models like Claude Opus 4.6 (required to prevent prompt injection) can get expensive fast, especially when the agent needs to load large memory and skill contexts. Enter Hermes Agent.

The creators of Hermes Agent claim their agent is better than OpenClaw at using open-source models — that open-source models can be used effectively with the right harness. This tutorial examines those claims by walking through installation, local and online model use, and how to build a practical research agent.


What Is Hermes Agent?

Hermes Agent is an open-source OpenClaw alternative by Nous Research, the lab behind the Hermes family of models. After launch it became very popular, reaching over 30K GitHub stars. Unlike OpenClaw, Hermes Agent can create skills from experience, improve itself, and persist knowledge across sessions.

Key Capabilities

🔄
Closed Learning Loop
Creates skills automatically, improves them during use, and nudges itself to persist knowledge.
📦
Context Compression
Dual compression + Anthropic prompt caching prevents API failures and keeps costs low.
🧩
Skills System
Compatible with agentskills.io. Ships with bundled skills and saves its own as you use it.
📱
Multi-Platform Gateway
Telegram, Discord, Slack, Signal, and WhatsApp. All sessions share one SQLite database.
Subagents & Parallelism
Spawn isolated subagents for parallel workstreams via the delegate_task tool.
🔌
MCP Support
Connect to any MCP server for APIs, databases, or company systems without code changes.
🎓
RL Training Pipeline
Built-in Tinker-Atropos pipeline for training LLMs with GRPO + LoRA adapters.
🗄️
Persistent SQLite Memory
Every session stored with FTS5 full-text search. Retrieve memories from weeks ago.

Closed Learning Loop & Self-Improving Memory

The Hermes Agent has a closed learning loop:

  • Agent-curated memory with periodic nudges
  • Creates skills automatically after performing complex tasks
  • Improves skills as it uses them

Sessions are stored in a SQLite database with FTS5 full-text search, enabling the agent to retrieve memories from weeks ago even if they're not currently in memory. Hermes also uses Honcho memory — giving the agent a persistent understanding of users across sessions — in addition to memory.md and user.md files for communication style, goals, and preferences.

Context Compression & Caching

Hermes uses dual compression and Anthropic's prompt caching to manage context across long conversations. This also prevents API failures when context is too large — it prunes old results and summarizes conversations using an LLM.

Skills System

Like OpenClaw, Hermes supports skills compatible with agentskills.io. They follow a progressive disclosure pattern to minimize token use. All skills are stored at ~/.hermes/skills/, but you can also point the agent to external skills.

directory structure
~/.hermes/skills/
├── mlops/                    # Category directory
│   └── axolotl/
│       ├── SKILL.md          # Main instructions (required)
│       ├── references/       # Additional docs
│       ├── templates/        # Output formats
│       ├── scripts/          # Helper scripts callable from skill
│       └── assets/           # Supplementary files
├── devops/
│   └── deploy-k8s/          # Agent-created skill
│       ├── SKILL.md
│       └── references/
├── .hub/                     # Skills Hub state
│   ├── lock.json
│   ├── quarantine/
│   └── audit.log
└── .bundled_manifest         # Tracks seeded bundled skills

Multi-Platform Gateway

Hermes supports Telegram, Discord, Slack, Signal, and WhatsApp, with voice memo transcription. Since all sessions go to the same database, you can start a conversation in your terminal and continue it on Telegram.

Subagents & Parallel Workstreams

The delegate_task tool spawns multiple subagents with restricted toolsets and isolated terminal sessions. Each starts a fresh conversation — no shared history — so you must provide all context the subagent needs. Use cases include:

  • Research multiple topics simultaneously and collect summaries
  • Code review, fix, and refactor multiple files in parallel

MCP Support & Extended Tool Access

For any tool missing in Hermes, connect to MCPs without changing Hermes Agent code. You can connect to APIs, databases, or company systems by:

  • Installing MCP support on Hermes
  • Adding an MCP server
  • Whitelisting what tools the MCP can expose
  • Blacklisting dangerous activities (e.g., deleting customer records)

RL Training & Trajectory Generation

Hermes includes an integrated RL training pipeline built on Tinker-Atropos. This enables training LLMs on specific environments using GRPO (Group Relative Policy Optimization) with LoRA adapters — useful for building the next generation of tool-calling models.


How Hermes Compares to Standard AI Assistants

Feature Hermes Agent OpenClaw Standard Assistants
Persistent memory (memory.md) ✓ Yes ✓ Yes ✗ No
SQLite session storage (FTS5) ✓ Yes ✗ No ✗ No
Self-improving skills ✓ Yes ~ Partial ✗ No
Fallback model support ✓ Yes ✓ Yes ✗ No
Open-source model harness ✓ Best-in-class ~ Basic ✗ No
Multi-platform gateway ✓ Yes ✓ Yes ~ Limited
VPS / cheap deployment ✓ $5 VPS ✓ Yes ✗ No
RL training pipeline ✓ Built-in ✗ No ✗ No

Supported Models & Endpoints

Hermes Agent is model agnostic. It works with any of the following:

Nous Portal
OpenRouter (100+ models)
Anthropic (Claude)
OpenAI (GPT / Codex)
Google Gemini
DeepSeek
Ollama (local models)
Custom endpoint

Prerequisites

  • OS: Linux, macOS, or WSL2
  • Python: 3.11+
  • Node.js: Required (installed automatically during setup)
  • API key: From any supported provider (Anthropic, OpenRouter, OpenAI, etc.)

Step-by-Step Tutorial: Building a Research Agent

Let's build a research agent that can search the web and send you a daily briefing on Telegram.

1
Install Hermes Agent

Open your terminal and run the one-line installer:

bash
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
💡
TipIf you have an existing OpenClaw installation, the setup wizard can migrate it automatically.
2
Get Your Telegram Interface Token

For the Telegram gateway, open Telegram and search for @BotFather. Send /newbot, follow the prompts, and copy the bot token (format: 123456789:AAH...).

Then use @userinfobot to get your Telegram user ID — this ensures your bot only responds to you.

3
Initialize with Setup Wizard

Run the setup wizard and choose Full Setup to configure everything including API keys and the Telegram bot:

bash
hermes setup

The wizard presents a menu with options: Model & Provider, Terminal Backend, Messaging Platforms (Gateway), Tools, and Agent Settings.

Once setup completes, verify everything with:

bash
hermes doctor
4
Configuration & Model Selection

You can change your model provider at any time by running:

bash
hermes model

This lets you select from 18 providers including OpenAI Codex, OpenRouter, Anthropic, Gemini, MiniMax, Kilo Code, and custom endpoints. The current model and active provider are shown at the top.

5
Set Up the Gateway

The gateway lets Hermes reach you on Telegram so you don't have to stay in the terminal:

bash
hermes gateway setup

After this, you should be able to send messages from Telegram and get responses. Hermes will use its terminal tool to query APIs, parse responses, and reply directly in your Telegram chat.

ℹ️
What you'll seeHermes Agent uses terminal to write and execute Python snippets inline — for example, querying the Yahoo Finance chart API and parsing the JSON response to answer stock price questions.

Leveling Up: The "Multi-Tool" Agent

To improve Hermes's web search capabilities, configure a dedicated search tool. This example uses FireCrawl (free API key available on their website):

bash
hermes config set FIRECRAWL_API_KEY your_fire-crawl_key

With FireCrawl configured, Hermes can plan and delegate tasks across multiple subagents in parallel. For example, asking it to summarize several tools at once will cause it to plan 3 tasks and delegate them automatically.

Multiple Agent Profiles

The other way to run multiple agents is to set up profiles. Each profile gets its own config, API keys, memory, sessions, gateway, and skills. For example:

  • A coding agent
  • A personal assistant
  • A research agent

Create one from scratch or clone your current settings:

bash
# Clone your default profile into a new "work" profile
hermes profile create work --clone

# Start chatting with the work profile
work chat

# Configure API keys and model for this profile
work setup
💡
Profile isolationEach profile gets its own .env file, so you can configure different Telegram bots, API keys, and agent personalities per profile.

Deployment Options: Running Beyond Your Laptop

You don't have to run Hermes on your daily machine. Options include:

  • Dedicated computer — a separate machine you leave running
  • VPS (Modal, Daytona) — serverless persistence; hibernates when idle, wakes on demand, costs nearly nothing between sessions
  • Docker container — recommended for any cloud/VPS deployment

Security Best Practices for VPS Deployment

  • Use the container backend
  • Set explicit allowlists for tool permissions
  • Use pairing codes instead of hardcoded user IDs
  • Store secrets securely with proper file permissions
  • Regularly update with hermes update
  • Never run the agent as root user
  • Set appropriate resource limits (CPU, disk, memory)
  • Monitor logs for unauthorized access attempts

Local & Private: Running Hermes Agent Offline

Run Hermes completely offline using Ollama. Here's how to set up qwen2.5-coder:32b as a local model:

bash — install and serve model via Ollama
# Pull the model and start serving
ollama pull qwen2.5-coder:32b
ollama serve   # Starts on port 11434
bash — point Hermes to Ollama
hermes model
# Select "Custom endpoint (self-hosted / VLLM / etc.)"
# Enter URL:        http://localhost:11434/v1
# Skip API key:     (Ollama doesn't need one)
# Enter model name: qwen2.5-coder:32b

Make sure to increase the context window — Hermes needs to load the system prompt, tools, and return a full response:

bash — increase context window (pick one method)
# Option 1: environment variable (recommended)
OLLAMA_CONTEXT_LENGTH=32768 ollama serve

# Option 2: systemd service override
sudo systemctl edit ollama.service
# Add: Environment="OLLAMA_CONTEXT_LENGTH=32768"
# Then: sudo systemctl daemon-reload && sudo systemctl restart ollama

# Option 3: bake into a custom model (persistent)
echo -e "FROM qwen2.5-coder:32b\nPARAMETER num_ctx 32768" > Modelfile
ollama create qwen2.5-coder-32k -f Modelfile

Common Pitfalls & Troubleshooting

Connection refused errors
Run hermes doctor first — it checks for missing provider config, broken env vars, and misconfigured paths. Re-run setup if you suspect a typo in your API key.
Context window limits hit
Type /compress to trigger manual compression. Or edit ~/.hermes/config.yaml and set compression.threshold: 0.50 with your preferred summary_model.
Skill not triggering or reusing
Verify the skill exists and its instructions are being loaded: hermes chat --toolsets skills -q "Use the X skill to do Y"
Gateway not receiving messages
Check status with hermes gateway status. If stopped, restart it: hermes gateway start
Model endpoint auth failures
Check ~/.hermes/.env for correct API keys. Run hermes model to confirm the selected model has a valid key configured.
⚠️
Cost noteRunning agents is not cheap regardless of the tool. While Hermes is optimized for open-source models, API costs still apply. Local model setup via Ollama is the only way to achieve truly unlimited usage — but requires capable hardware (GPU recommended for 32B models).