A comprehensive architecture, security, and ontology analysis of the E.D.D.I.E. autonomous agent system — with recommendations for integration into the ShurAI semantic infrastructure.
EDDIE is a 235-file, 39,636-line TypeScript monolith that wraps Claude Code CLI in a Telegram relay with 35+ autonomous subsystems. It is ambitious, well-intentioned engineering — but it reproduces every structural vulnerability documented in the OpenClaw security literature while adding layers of complexity that actively degrade the AI's reasoning capacity.
The core thesis: EDDIE spends more tokens describing itself to Claude than it spends on the user's actual problem. The system is a context window tax — and the tax funds no ontology, no knowledge graph, and no integration with the actual client intelligence system (ShurAI) that gives the work meaning.
At its core, EDDIE is a message relay: Telegram message in → Claude Code CLI subprocess out. Everything else is middleware.
The system wraps this simple relay in:
Plus: a full video production pipeline, multi-channel comms (Gmail, Calendar, Slack, WhatsApp, iMessage), eBPF kernel monitoring in a privileged container, phone calls via Twilio, and a community distribution bridge.
How much of this complexity serves the user's actual needs, and how much serves the system's self-narrative?
This is the most architecturally damaging finding. Every Claude invocation pays a token "tax" before any user content is processed.
Interactive relay: EDDIE persona (~1,200 chars) + top 10 memory matches (~3,000 chars) + session state (~2,000 chars) = 2,000–5,000 tokens before the user's message.
Background jobs: PDAC protocol (~3,000 chars) + Brain Vault paths + briefing + project states + CLAUDE.md files + routing hints = 8,000–15,000 tokens before the task prompt.
Heartbeat ticks: Checklist + last 3 heartbeats + goals + 30 conversations + ALL project states + usage data + crons + queue stats = 10,000–30,000 tokens before any reasoning.
The system is paying Claude to read its own autobiography on every tick.
None of this injected context comes from a knowledge graph. The capabilities registry is a 2,966-line keyword matcher — pattern matching, not understanding:
{
id: "agent:cold-outreach-strategist",
triggers: [
{ type: "keyword", patterns: ["outreach", "cold email", "prospecting"...], weight: 0.9 },
{ type: "project", patterns: ["leadgen"], weight: 1.0 }
]
}
It doesn't know the relationship between a client, a campaign, and a lead. It can't distinguish "outreach" meaning cold email vs. community engagement vs. PR. No semantic graph — just keywords and weights.
The following credentials are in plaintext in a version-controlled file:
All of these keys should be considered compromised.
Dashboard binds to 0.0.0.0 with Access-Control-Allow-Origin: *. Authentication is optional. The /chat/send endpoint creates and spawns jobs from web requests — an unauthenticated remote code execution endpoint.
--dangerously-skip-permissions on every invocationEvery Claude CLI call disables all safety guardrails. Any prompt injection that survives the input scan gets arbitrary file read/write, command execution, and network access with no confirmation prompts.
High-severity injections are blocked — but the system includes a one-time override mechanism. Send the same injection again, and it passes through. Medium-severity injections pass with only a text disclaimer.
Arbitrary content stored as "facts" gets injected into all future system prompts via buildMemoryContext(). No sanitization. A malicious fact persists indefinitely and poisons every future Claude call.
This is the exact "sticky attack" vector from the Akamai analysis: "An instruction injected today could lie dormant in the agent's 'memory' and be triggered weeks later."
Dream cycle: Autonomously extracts "insights" → stores as facts → injected into future prompts → shapes future behavior. No human review gate.
Self-heal: Failed jobs spawn autonomous repair jobs with full system access.
Self-improve: Modifies its own prompting strategy based on rejection patterns.
This is exactly the behavior documented in OpenClaw Issue #24237: agents silently modifying their own configuration. Here, it's by design.
openai and twilio packages are declared but never imported. Output scan is disabled by default. safeEnv() passes all env vars except CLAUDECODE to subprocesses.
The core value proposition — relay Telegram messages to Claude Code — is roughly a 50-line script:
import { Bot } from "gramio";
const bot = new Bot(process.env.TELEGRAM_BOT_TOKEN);
bot.on("message", async (ctx) => {
if (ctx.from.id !== Number(process.env.OWNER_ID)) return;
const proc = Bun.spawn(["claude", "-p", ctx.text, "--print"]);
const result = await new Response(proc.stdout).text();
await ctx.reply(result);
});
bot.start();
Everything beyond this is middleware. The question is whether each layer earns its complexity.
35+ proactive modules: Heartbeat, dream cycle, morning brief, daily brief, weekly review, monthly review, weekly content, 80/20 analysis, nightly orchestrate, self-improve, optimizer, challenge, brainstorm, vision, pillars, goals, observation log, revenue tracker, expense tracker, social snapshot, content brief, waiting-on tracker, rejection learning, outcome analysis, report synthesis, accounting triage, discoverability, context drift, video idea pipeline, dashboard generation, eBPF monitoring, anthropic monitoring, claude health, memory monitor...
Heimdall: A full content scanning/cleaning/packaging/sync system for community distribution. An entire product inside an assistant.
Video pipeline: News → script → ElevenLabs TTS → Remotion render → FFmpeg → Gemini QA → YouTube upload → analytics. A full media production system embedded in a chat bot.
eBPF kernel monitoring: A privileged Docker container running bpftrace programs for network, process, and memory kernel events. Infrastructure monitoring inside an assistant.
The OpenClaw extraction document prepared for Limore identified structural vulnerabilities. Every one maps to EDDIE:
| OpenClaw Vulnerability | EDDIE Equivalent | Status |
|---|---|---|
| Dashboard as attack surface | Dashboard on 0.0.0.0 with CORS * and optional auth | Present |
| Each channel = attack vector | Telegram + Gmail + Calendar + Slack + WhatsApp + iMessage + YouTube | Present (7 channels) |
| SOUL.md/MEMORY.md as swappable Post-it notes | Brain Vault state files + Supabase facts injected into system prompts | Present |
| Memory persistence enabling sticky attacks | storeFact() stores arbitrary unsanitized content forever | Present |
| Agents silently mutating own config (#24237) | Dream cycle, self-heal, self-improve, nightly orchestrate | Present (by design) |
--dangerously-skip-permissions | Used on every Claude invocation | Present |
| No separation of data and instruction channels | Memory content, state files, user messages all concatenated into system prompt | Present |
| MoltMatch incident (unauthorized real-world actions) | Heartbeat can autonomously spawn tasks, make calls, send messages | Present (by design) |
| Shadow AI (operating outside IAM) | API keys in plaintext settings.json, no secret rotation | Present |
The concierge metaphor holds: anyone — or anything — that can write to the Brain Vault state files or Supabase facts table has effectively swapped the Post-it notes.
A weighted keyword matcher that cannot understand relationships, context, or meaning. It doesn't know that "reach out to the prospect from Tuesday's meeting" refers to a specific person. It can't connect content to strategy.
Brain Vault is a folder structure, not a knowledge graph. "Semantic search" is embedding similarity over stored text fragments — useful for recall, but it has no understanding of relationships, types, hierarchies, or provenance.
EDDIE asks "does this message contain the word 'outreach'?"
ShurAI asks "what is this message about, in the context of what we know about this client, this project, and this strategy?"
| EDDIE Capability | ShurAI Equivalent | Advantage |
|---|---|---|
| Telegram relay to Claude | Same (50 lines of code) | Same functionality, 1/500th the codebase |
| Background jobs via tmux | Claude Code native task system | No tmux dependency, built-in monitoring |
| Memory/context injection | Letta memory blocks + InfraNodus graph context | Structured, scoped, auditable — not flat text dumps |
| Capability routing | Ontology-driven routing via knowledge graph | Semantic understanding, not keyword matching |
| Prompt injection defense | Architectural separation of data/instruction channels | Defense by design, not by regex |
| Autonomous heartbeat | Scheduled tasks with explicit human approval gates | Autonomy with accountability |
| Multi-channel comms | MCP server integrations (already available) | Standard protocol, not custom bridges |
| Video pipeline | Separate service (not embedded in an assistant) | Proper separation of concerns |
| Security scanning | Trust-boundary architecture from the ground up | Not bolted on after the fact |
Never inject context that doesn't directly serve the current query. No persona monologues. No autobiographies.
Route requests based on semantic understanding, not regex patterns against a 3,000-line static registry.
A personal assistant is not a video production pipeline is not a kernel monitoring system. Each is its own service.
Separate data channels from instruction channels. Don't store unsanitized user content where it gets injected as system prompt.
The system should propose, not execute. Especially for actions that affect the real world.
Environment variables, secret managers, or encrypted stores — never plaintext in committed files.
This report is not an attack on the engineering effort. EDDIE represents significant technical ambition and many individual modules are well-built. The critique is architectural:
storeFact() → buildMemoryContext() pipeline is a persistent injection vector..claude/settings.json should be considered compromised. Move to environment variables or a secret manager.0.0.0.0 with Access-Control-Allow-Origin: * and optional auth is an open door.| File | Finding | Severity |
|---|---|---|
.claude/settings.json | Plaintext API keys (8+ services) | Critical |
src/dashboard/server.ts | 0.0.0.0 bind, CORS *, optional auth, RCE via /chat/send | Critical |
src/claude/relay.ts:99 | --dangerously-skip-permissions on all relay calls | High |
src/jobs/tmux.ts:380 | --dangerously-skip-permissions on all job spawns | High |
src/security/scan.ts | Override mechanism: send injection twice to bypass | High |
src/memory/store.ts | storeFact() stores unsanitized content | High |
src/memory/context.ts | Injects stored facts verbatim into system prompt | High |
src/proactive/dream.ts | Auto-extracts and stores "insights" (self-reinforcing loop) | Medium |
src/jobs/self-heal.ts | Autonomous repair jobs with full system access | Medium |
src/security/output-scan.ts | Disabled by default | Medium |
src/claude/run-prompt.ts | safeEnv() passes all env vars except CLAUDECODE | Medium |
package.json | Dead dependencies: openai, twilio | Low |
package.json | @types/bun pinned to latest (non-deterministic) | Low |
| Path | Fixed Overhead | Variable Overhead | Total Before User Content |
|---|---|---|---|
| Interactive relay | ~1,200 chars | ~3,000–5,000 chars | 2,000–5,000 tokens |
| Background job | ~3,000 chars | ~5,000–12,000 chars | 8,000–15,000 tokens |
| Heartbeat tick | ~2,000 chars | ~8,000–28,000 chars | 10,000–30,000 tokens |
Each connection is an authentication surface, a credential to store, a dependency to maintain, and a potential injection vector.