Nous Research Hermes Agent:
Setup and Tutorial Guide
Learn how to install and set up Hermes Agent, the open-source AI agent by Nous Research that remembers, learns, and grows smarter with every task.
OpenClaw launched with a lot of hype and security concerns — and with it, a wave of copycats claiming to solve its problems. The two biggest issues: security (due to repo size) and cost. Running OpenClaw with top-tier models like Claude Opus 4.6 (required to prevent prompt injection) can get expensive fast, especially when the agent needs to load large memory and skill contexts. Enter Hermes Agent.
The creators of Hermes Agent claim their agent is better than OpenClaw at using open-source models — that open-source models can be used effectively with the right harness. This tutorial examines those claims by walking through installation, local and online model use, and how to build a practical research agent.
What Is Hermes Agent?
Hermes Agent is an open-source OpenClaw alternative by Nous Research, the lab behind the Hermes family of models. After launch it became very popular, reaching over 30K GitHub stars. Unlike OpenClaw, Hermes Agent can create skills from experience, improve itself, and persist knowledge across sessions.
Key Capabilities
delegate_task tool.Closed Learning Loop & Self-Improving Memory
The Hermes Agent has a closed learning loop:
- Agent-curated memory with periodic nudges
- Creates skills automatically after performing complex tasks
- Improves skills as it uses them
Sessions are stored in a SQLite database with FTS5 full-text search, enabling the agent to retrieve memories from weeks ago even if they're not currently in memory. Hermes also uses Honcho memory — giving the agent a persistent understanding of users across sessions — in addition to memory.md and user.md files for communication style, goals, and preferences.
Context Compression & Caching
Hermes uses dual compression and Anthropic's prompt caching to manage context across long conversations. This also prevents API failures when context is too large — it prunes old results and summarizes conversations using an LLM.
Skills System
Like OpenClaw, Hermes supports skills compatible with agentskills.io. They follow a progressive disclosure pattern to minimize token use. All skills are stored at ~/.hermes/skills/, but you can also point the agent to external skills.
~/.hermes/skills/ ├── mlops/ # Category directory │ └── axolotl/ │ ├── SKILL.md # Main instructions (required) │ ├── references/ # Additional docs │ ├── templates/ # Output formats │ ├── scripts/ # Helper scripts callable from skill │ └── assets/ # Supplementary files ├── devops/ │ └── deploy-k8s/ # Agent-created skill │ ├── SKILL.md │ └── references/ ├── .hub/ # Skills Hub state │ ├── lock.json │ ├── quarantine/ │ └── audit.log └── .bundled_manifest # Tracks seeded bundled skills
Multi-Platform Gateway
Hermes supports Telegram, Discord, Slack, Signal, and WhatsApp, with voice memo transcription. Since all sessions go to the same database, you can start a conversation in your terminal and continue it on Telegram.
Subagents & Parallel Workstreams
The delegate_task tool spawns multiple subagents with restricted toolsets and isolated terminal sessions. Each starts a fresh conversation — no shared history — so you must provide all context the subagent needs. Use cases include:
- Research multiple topics simultaneously and collect summaries
- Code review, fix, and refactor multiple files in parallel
MCP Support & Extended Tool Access
For any tool missing in Hermes, connect to MCPs without changing Hermes Agent code. You can connect to APIs, databases, or company systems by:
- Installing MCP support on Hermes
- Adding an MCP server
- Whitelisting what tools the MCP can expose
- Blacklisting dangerous activities (e.g., deleting customer records)
RL Training & Trajectory Generation
Hermes includes an integrated RL training pipeline built on Tinker-Atropos. This enables training LLMs on specific environments using GRPO (Group Relative Policy Optimization) with LoRA adapters — useful for building the next generation of tool-calling models.
How Hermes Compares to Standard AI Assistants
| Feature | Hermes Agent | OpenClaw | Standard Assistants |
|---|---|---|---|
Persistent memory (memory.md) |
✓ Yes | ✓ Yes | ✗ No |
| SQLite session storage (FTS5) | ✓ Yes | ✗ No | ✗ No |
| Self-improving skills | ✓ Yes | ~ Partial | ✗ No |
| Fallback model support | ✓ Yes | ✓ Yes | ✗ No |
| Open-source model harness | ✓ Best-in-class | ~ Basic | ✗ No |
| Multi-platform gateway | ✓ Yes | ✓ Yes | ~ Limited |
| VPS / cheap deployment | ✓ $5 VPS | ✓ Yes | ✗ No |
| RL training pipeline | ✓ Built-in | ✗ No | ✗ No |
Supported Models & Endpoints
Hermes Agent is model agnostic. It works with any of the following:
Prerequisites
- OS: Linux, macOS, or WSL2
- Python: 3.11+
- Node.js: Required (installed automatically during setup)
- API key: From any supported provider (Anthropic, OpenRouter, OpenAI, etc.)
Step-by-Step Tutorial: Building a Research Agent
Let's build a research agent that can search the web and send you a daily briefing on Telegram.
Open your terminal and run the one-line installer:
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
For the Telegram gateway, open Telegram and search for @BotFather. Send /newbot, follow the prompts, and copy the bot token (format: 123456789:AAH...).
Then use @userinfobot to get your Telegram user ID — this ensures your bot only responds to you.
Run the setup wizard and choose Full Setup to configure everything including API keys and the Telegram bot:
hermes setup
The wizard presents a menu with options: Model & Provider, Terminal Backend, Messaging Platforms (Gateway), Tools, and Agent Settings.
Once setup completes, verify everything with:
hermes doctor
You can change your model provider at any time by running:
hermes model
This lets you select from 18 providers including OpenAI Codex, OpenRouter, Anthropic, Gemini, MiniMax, Kilo Code, and custom endpoints. The current model and active provider are shown at the top.
The gateway lets Hermes reach you on Telegram so you don't have to stay in the terminal:
hermes gateway setup
After this, you should be able to send messages from Telegram and get responses. Hermes will use its terminal tool to query APIs, parse responses, and reply directly in your Telegram chat.
terminal to write and execute Python snippets inline — for example, querying the Yahoo Finance chart API and parsing the JSON response to answer stock price questions.Leveling Up: The "Multi-Tool" Agent
To improve Hermes's web search capabilities, configure a dedicated search tool. This example uses FireCrawl (free API key available on their website):
hermes config set FIRECRAWL_API_KEY your_fire-crawl_key
With FireCrawl configured, Hermes can plan and delegate tasks across multiple subagents in parallel. For example, asking it to summarize several tools at once will cause it to plan 3 tasks and delegate them automatically.
Multiple Agent Profiles
The other way to run multiple agents is to set up profiles. Each profile gets its own config, API keys, memory, sessions, gateway, and skills. For example:
- A coding agent
- A personal assistant
- A research agent
Create one from scratch or clone your current settings:
# Clone your default profile into a new "work" profile hermes profile create work --clone # Start chatting with the work profile work chat # Configure API keys and model for this profile work setup
.env file, so you can configure different Telegram bots, API keys, and agent personalities per profile.Deployment Options: Running Beyond Your Laptop
You don't have to run Hermes on your daily machine. Options include:
- Dedicated computer — a separate machine you leave running
- VPS (Modal, Daytona) — serverless persistence; hibernates when idle, wakes on demand, costs nearly nothing between sessions
- Docker container — recommended for any cloud/VPS deployment
Security Best Practices for VPS Deployment
- Use the container backend
- Set explicit allowlists for tool permissions
- Use pairing codes instead of hardcoded user IDs
- Store secrets securely with proper file permissions
- Regularly update with
hermes update - Never run the agent as root user
- Set appropriate resource limits (CPU, disk, memory)
- Monitor logs for unauthorized access attempts
Local & Private: Running Hermes Agent Offline
Run Hermes completely offline using Ollama. Here's how to set up qwen2.5-coder:32b as a local model:
# Pull the model and start serving ollama pull qwen2.5-coder:32b ollama serve # Starts on port 11434
hermes model
# Select "Custom endpoint (self-hosted / VLLM / etc.)"
# Enter URL: http://localhost:11434/v1
# Skip API key: (Ollama doesn't need one)
# Enter model name: qwen2.5-coder:32b
Make sure to increase the context window — Hermes needs to load the system prompt, tools, and return a full response:
# Option 1: environment variable (recommended) OLLAMA_CONTEXT_LENGTH=32768 ollama serve # Option 2: systemd service override sudo systemctl edit ollama.service # Add: Environment="OLLAMA_CONTEXT_LENGTH=32768" # Then: sudo systemctl daemon-reload && sudo systemctl restart ollama # Option 3: bake into a custom model (persistent) echo -e "FROM qwen2.5-coder:32b\nPARAMETER num_ctx 32768" > Modelfile ollama create qwen2.5-coder-32k -f Modelfile
Common Pitfalls & Troubleshooting
hermes doctor first — it checks for missing provider config, broken env vars, and misconfigured paths. Re-run setup if you suspect a typo in your API key./compress to trigger manual compression. Or edit ~/.hermes/config.yaml and set compression.threshold: 0.50 with your preferred summary_model.hermes chat --toolsets skills -q "Use the X skill to do Y"hermes gateway status. If stopped, restart it: hermes gateway start~/.hermes/.env for correct API keys. Run hermes model to confirm the selected model has a valid key configured.