Don't Give Your Agent
the Keys to the Kingdom

A builder's guide to secret management for AI agents — from "good enough" to "seriously locked down."

Reading time: 8 min
Tags: AI · Security · Agents

I'll start with a confession: my agents currently have direct access to their API keys. The keys sit on the VPS. The agent can see them. If something goes wrong — a prompt injection attack, a runaway subagent, a compromised dependency — those keys are exposed.

This is probably true for most people running agents in production today. It's not reckless; it's just where the tooling was six months ago. But the threat model has shifted. Agents are getting more capable, more autonomous, and more connected to real services. The blast radius of a compromised agent is no longer theoretical.

So I did a survey of what's actually available right now. Here's what I found.

The Problem, Plainly

When your agent runs a task — calling an API, pushing code, querying a database — it needs credentials to do so. Somewhere, somehow, those credentials have to be accessible to the process.

The naive approach (and the one most of us started with) is: put the keys in a .env file or environment variables on the server. The agent reads them, uses them, done.

The problem is that this treats the agent as fully trusted — the same as you, the developer. But agents aren't you. They execute instructions from many sources: your prompts, tool outputs, scraped web content, emails, documents. Any of those can contain adversarial instructions. And an agent that's been manipulated into doing something bad has full access to every credential it was given.

The question isn't whether you need to think about this. It's how much isolation you need, and how much complexity you're willing to take on.
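To make the baseline concrete: when credentials live in the agent's process environment, any code the agent is induced to run can read all of them at once. A minimal illustration in Python (the variable name is hypothetical):

```python
import os

# Simulate the naive setup: a real credential injected into the
# agent's process environment (the name is illustrative).
os.environ["PAYMENTS_API_KEY"] = "sk-live-example"

def injected_tool_call():
    """Stands in for any code the agent was manipulated into running.
    It needs no special privileges to dump every secret it can see."""
    return {k: v for k, v in os.environ.items() if "KEY" in k}

leaked = injected_tool_call()
print(sorted(leaked))
```

Nothing in the process boundary distinguishes your code from injected instructions; both read the same environment.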

The Spectrum: Five Real Options

Think of these as levels on a dial — from "low friction, low isolation" to "high friction, high isolation." None of them is universally right. The right answer depends on what your agents do and what they touch.

Level 1 — Keep secrets out of your code (fnox)

fnox GitHub repository

What it does: Secrets live in a managed store — local encrypted file, git-encrypted secrets, HashiCorp Vault, AWS SSM, and others. At runtime, fnox injects them into the process environment: fnox exec -- npm start. Your agent sees the real key in its environment, but it never sits in your repo or on disk in plaintext.

The threat it solves: Secrets leaking through version control, logs, or careless file access.

What it doesn't solve: A compromised agent can still read the environment variable it was handed.

Who it's for: The pragmatic baseline. If you're early-stage and your main concern is keeping secrets out of your codebase and away from accidental exposure, this is where to start. It's mature, well-documented, and low ops overhead.
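Setting fnox's own CLI aside, the underlying pattern is simple: resolve secrets at launch time and hand them to the child process's environment, so they never sit in the repo or in a plaintext file. A rough Python sketch of that pattern (the loader is a stand-in, not fnox's actual API):

```python
import os
import subprocess
import sys

def load_secrets():
    # Stand-in for a managed store (encrypted file, Vault, SSM, ...).
    # In a real setup these are decrypted at runtime, never committed.
    return {"API_KEY": "sk-live-example"}

def exec_with_secrets(cmd):
    """Run a command with secrets injected only into its environment,
    mirroring the `fnox exec -- npm start` pattern described above."""
    env = {**os.environ, **load_secrets()}
    return subprocess.run(cmd, env=env, capture_output=True, text=True)

result = exec_with_secrets(
    [sys.executable, "-c", "import os; print(os.environ['API_KEY'])"]
)
print(result.stdout.strip())
```

Note the limitation the article points out: the child process still holds the real key, so this protects against leakage, not against a manipulated agent.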

Level 2 — The agent never sees the real key (OneCLI / Gondolin)

OneCLI GitHub repository · Gondolin GitHub repository

What it does: A proxy gateway sits between your agent and the external service. The agent is given a fake or scoped token that is useless on its own. When it makes an API call, the call goes through the local gateway, which swaps in the real credential before forwarding the request upstream. The agent never holds the actual secret.

The threat it solves: Prompt injection and adversarial manipulation. An agent that's been tricked into exfiltrating its credentials has nothing to give away — because it was never given anything real.

What it doesn't solve: The gateway itself becomes a critical dependency. You're now running another service that needs to be maintained, monitored, and secured.

Who it's for: Anyone running agents that interact with third-party APIs and worrying about manipulation attacks. This is likely the upgrade path for setups like mine — it's a meaningful security step without requiring you to rethink your entire infrastructure.
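The core mechanic of such a gateway fits in a few lines: the agent presents a placeholder token, and only the gateway substitutes the real one before the request leaves the box. A simplified sketch of that swap (illustrative, not OneCLI's or Gondolin's actual code):

```python
# Lives only inside the gateway process; the agent never sees this map.
REAL_TOKENS = {"agent-placeholder-123": "sk-live-real-secret"}

def swap_credentials(headers: dict) -> dict:
    """Replace the agent's placeholder bearer token with the real one
    before forwarding the request upstream."""
    auth = headers.get("Authorization", "")
    scheme, _, token = auth.partition(" ")
    if scheme == "Bearer" and token in REAL_TOKENS:
        return {**headers, "Authorization": f"Bearer {REAL_TOKENS[token]}"}
    # Fail closed: anything without a known placeholder is dropped.
    raise PermissionError("unknown placeholder token, refusing to forward")

outbound = swap_credentials({"Authorization": "Bearer agent-placeholder-123"})
print(outbound["Authorization"])
```

If the agent is tricked into leaking its token, the attacker gets only the placeholder, which is worthless outside the gateway's network.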

Level 3 — Give the agent its own identity on the host (SandVault, macOS)

SandVault GitHub repository

What it does: Rather than protecting one secret at a time, SandVault gives the agent a separate OS user with its own home directory, access control lists, and optionally Apple's sandbox-exec restrictions. The agent can't touch your personal files, access external drives, or reach sensitive parts of the filesystem — not because you blocked specific paths, but because it's running as a different user entirely.

The threat it solves: An agent (or its dependencies) accessing files it has no business touching — your SSH keys, personal documents, other project secrets.

What it doesn't solve: Everything still runs on the same kernel. It's identity isolation, not execution isolation.

Who it's for: Mac-native builders who want lightweight sandboxing without spinning up containers. Good for local development environments where you're running agents alongside your personal machine.

Level 4 — Parallel agents without polluting each other (Coasts)

What it does: Coasts is built for a specific scenario: multiple agents working on multiple git worktrees in parallel, each needing its own full stack (web server, API, database, etc.). It clones your existing Docker setup per worktree, with coordinated port routing and branch alignment.

The threat it solves: Agents stepping on each other — shared state, port conflicts, one agent's work bleeding into another's environment.

What it doesn't solve: It's primarily about parallel runtimes, not host security. A container can still be compromised.

Who it's for: Teams or builders running multiple concurrent agents against complex multi-service stacks. If you're running a single agent on a VPS, this probably isn't your next step — but it matters once you scale horizontally.
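The port-coordination piece can be illustrated with a deterministic scheme: give each worktree a fixed offset so parallel stacks never collide. This is an illustrative sketch, not Coasts' actual routing logic:

```python
# Base ports for one stack; every additional worktree gets an offset.
BASE_PORTS = {"web": 3000, "api": 8000, "db": 5432}

def ports_for_worktree(index: int, stride: int = 100) -> dict:
    """Deterministically offset every service port per worktree so
    parallel agent stacks can run side by side without conflicts."""
    return {svc: port + index * stride for svc, port in BASE_PORTS.items()}

print(ports_for_worktree(0))  # first worktree keeps the base ports
print(ports_for_worktree(2))
```

The point is that the mapping is a pure function of the worktree index, so agents never have to negotiate ports with each other.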

Level 5 — Burn it down after every run (Shuru, microVMs)

Shuru GitHub repository

What it does: Shuru boots a fresh Linux microVM for each agent run using macOS's Virtualization.framework. The VM gets an ephemeral root filesystem, controlled mounts, port forwarding, and a built-in secrets proxy. When the run ends, the VM is destroyed. Nothing persists. Nothing escapes.

The threat it solves: Nearly everything at once. A fully compromised agent — one that has been manipulated into doing the worst possible thing — can't persist state, can't reach the host filesystem, and can't exfiltrate credentials that were never in its environment.

What it doesn't solve: It's the heaviest option on this list. Slower to spin up, more infrastructure to manage. For long-running agents or high-frequency tasks, the overhead adds up.

Who it's for: Agents executing untrusted or user-supplied code. Coding agents. Any scenario where the blast radius of a full compromise is unacceptable.
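The lifecycle is the point, independent of the VM technology: fresh workspace, controlled run, guaranteed teardown. The shape of an ephemeral run can be sketched in ordinary Python, with a temporary directory standing in for the microVM's root filesystem (an analogy only — it offers none of a VM's kernel isolation):

```python
import subprocess
import sys
import tempfile

def ephemeral_run(cmd: list) -> str:
    """Run a command in a workspace that is destroyed afterwards,
    mirroring the boot-run-destroy lifecycle of a per-run microVM."""
    with tempfile.TemporaryDirectory() as workspace:
        result = subprocess.run(
            cmd, cwd=workspace, capture_output=True, text=True
        )
        # Anything the run wrote lives only inside `workspace`...
        return result.stdout.strip()
    # ...and is gone once the `with` block exits.

out = ephemeral_run([sys.executable, "-c", "print('task done')"])
print(out)
```

The design choice to copy is the guarantee: cleanup happens on every exit path, not just the happy one.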

These Can Be Combined

The insight that changed how I think about this: you don't pick one and apply it everywhere. You match isolation level to actual risk.

A practical pattern: run your main orchestrator agent with a proxy gateway (Level 2) — it never has real credentials. Spin up microVMs (Level 5) only for subagents that execute untrusted code. Use fnox (Level 1) for internal tooling where you trust the code.

Tiered isolation, matched to the actual threat at each layer.
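That tiering can be written down as an explicit policy so nobody has to remember it per deployment. A sketch, with illustrative workload names and level labels:

```python
# Map each workload's trust level to the isolation it gets.
# Levels refer to the dial above; the names are illustrative.
ISOLATION_POLICY = {
    "internal-tooling": "level-1-env-injection",  # trusted code, fnox-style
    "orchestrator":     "level-2-proxy-gateway",  # never holds real keys
    "untrusted-code":   "level-5-microvm",        # burn after every run
}

def isolation_for(workload: str) -> str:
    """Fail closed: unknown workloads get the strongest isolation."""
    return ISOLATION_POLICY.get(workload, "level-5-microvm")

print(isolation_for("orchestrator"))
print(isolation_for("something-new"))  # defaults to the microVM tier
```

The default matters as much as the mapping: a workload you forgot to classify should land in the most isolated tier, not the least.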

Where Does That Leave My Setup?

My honest assessment: secrets sitting on the VPS with direct agent access is Level 0 — not on this list. It's where most of us started.

The pragmatic next step for most builders is Level 2 — the proxy gateway approach. It doesn't require rearchitecting anything. You keep your VPS, your existing stack. You just add a local gateway that intercepts outbound API calls and swaps tokens. The agent never touches a real key. Prompt injection attacks become dramatically less dangerous overnight.

If you're running coding agents or executing anything untrusted, Level 5 is the right long-term destination — but it's a bigger lift.

The key thing is to stop treating your agents as trusted the way you'd trust your own terminal session. They're not. They process information from the open web, from user inputs, from tool outputs you don't control. Their threat model is different.

Secure them accordingly.

Quick Reference

| Worry | Start here |
| --- | --- |
| Secrets leaking from repo or logs | fnox (Level 1) |
| Agent being manipulated to expose keys | OneCLI / Gondolin (Level 2) |
| Agent accessing your personal files | SandVault (Level 3) |
| Multiple agents polluting each other | Coasts (Level 4) |
| Fully untrusted or user-supplied code | Shuru (Level 5) |
Lior Goldenberg