AgentG8 · Free Resource

AI Agent Security Checklist

12 things to verify before connecting AI agents to your business systems — APIs, databases, and internal tools.

Most teams focus on making AI agents capable. The security and governance side gets added later — often after the first incident. This checklist covers what to verify before going to production, not after.

Define a registry of allowed tasks (Critical)
The AI should only be able to call explicitly registered actions. Open-ended tool access makes every endpoint part of the potential blast radius. Define the list first — nothing outside it should be reachable.
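In code, the registry can start as small as a lookup that fails closed. A minimal Python sketch; the task names, fields, and the `resolve_task` helper are illustrative, not a specific product's API:

```python
# Minimal task registry: the agent can only invoke actions listed here.
TASK_REGISTRY = {
    "get_order_status": {"method": "GET", "endpoint": "/orders/{id}"},
    "list_invoices":    {"method": "GET", "endpoint": "/invoices"},
}

def resolve_task(name: str) -> dict:
    """Fail closed: reject anything the agent asks for that isn't registered."""
    if name not in TASK_REGISTRY:
        raise PermissionError(f"Task '{name}' is not in the registry")
    return TASK_REGISTRY[name]
```

The important property is the default: an unregistered task raises, rather than falling through to a generic HTTP call.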
Separate read and write tasks (Critical)
Read operations and write operations carry very different risk profiles. Keep them as separate tasks in the registry so you can apply different approval rules to each.
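One way to encode this, sketched under the assumption that each registry entry carries a `kind` field (the field name and the policy are illustrative):

```python
# Tag each registered task as a read or a write; approval rules key off this.
TASKS = {
    "get_customer":    {"kind": "read"},
    "update_customer": {"kind": "write"},
}

def requires_approval(task_name: str) -> bool:
    # Illustrative policy: every write needs sign-off, reads run freely.
    return TASKS[task_name]["kind"] == "write"
```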
Scope task access by role or agent (Medium)
Not every agent should see every task. A customer support agent doesn't need access to billing tasks. Scope registry visibility the same way you'd scope database permissions.
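A sketch of per-agent scoping; the agent names and task sets here are hypothetical:

```python
# Scope which tasks each agent can even see, the same way you'd
# scope database permissions per service account.
AGENT_SCOPES = {
    "support_agent": {"get_order_status", "create_ticket"},
    "billing_agent": {"get_invoice", "issue_refund"},
}

def visible_tasks(agent: str, registry: set) -> set:
    # Unknown agents get an empty scope, not the full registry.
    return registry & AGENT_SCOPES.get(agent, set())
```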
Credentials never appear in the LLM context (Critical)
API keys in system prompts or tool definitions are exposed to logs, traces, and caches. Credentials should be resolved at execution time by a layer the model never sees.
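The separation looks like this in miniature: the model sees only a tool definition, and the executor injects the credential after the model has produced its call. `SECRETS` is a local stand-in for a real secrets backend, and both function names are illustrative:

```python
SECRETS = {"crm_api_key": "sk-demo-not-real"}  # stand-in for a secrets store

def build_tool_definition() -> dict:
    # What the LLM sees: no key, no secret reference, just the interface.
    return {"name": "get_customer", "params": {"id": "string"}}

def execute(tool_call: dict) -> dict:
    # Credential resolved here, in a layer the model never sees.
    headers = {"Authorization": f"Bearer {SECRETS['crm_api_key']}"}
    return {"headers": headers, "call": tool_call}
```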
Store secrets in a managed secrets store (Critical)
Use AWS Secrets Manager, HashiCorp Vault, or equivalent. The registry stores only a reference path — not the actual credential. The execution layer resolves it at runtime.
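A sketch of the reference-not-value pattern. The dict here is a local stand-in for the real store, and the reference path is made up; in production the resolver would call the secrets backend (e.g. Secrets Manager's get-secret-value) instead:

```python
# The registry entry stores only a reference path, never the credential.
TASK = {"name": "get_customer", "credential_ref": "prod/crm/api_key"}

SECRET_STORE = {"prod/crm/api_key": "sk-demo"}  # stand-in for Vault / Secrets Manager

def resolve_credential(ref: str) -> str:
    # In production: a runtime call to the secrets backend, not a dict lookup.
    return SECRET_STORE[ref]
```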
Validate every plan step against the registry before execution (Critical)
The AI may hallucinate a task that doesn't exist. Before any step runs, confirm that every task in the plan is registered and that the provided inputs match the declared schema.
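A minimal validator, assuming each registry entry declares a simple input schema (the schema format, using Python types, is illustrative; a real system might use JSON Schema):

```python
REGISTRY = {
    "send_invoice": {"schema": {"customer_id": str, "amount": float}},
}

def validate_step(step: dict) -> None:
    spec = REGISTRY.get(step["task"])
    if spec is None:
        # Catches hallucinated tasks before anything runs.
        raise ValueError(f"Unknown task: {step['task']}")
    for field, expected_type in spec["schema"].items():
        if not isinstance(step["inputs"].get(field), expected_type):
            raise ValueError(f"Bad or missing input: {field}")
```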
Log the full plan before execution starts (Medium)
If something goes wrong mid-execution, you need the original plan. Logging only the result leaves you debugging backward. Capture the complete plan as a structured record before step one runs.
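Capturing the plan as one structured record can be this simple; where the record goes (a log pipeline, a table) is up to you, and the field names are illustrative:

```python
import json
import time

def log_plan(plan: list) -> str:
    # One structured record for the complete plan, written before step one runs.
    record = {"ts": time.time(), "steps": plan, "status": "planned"}
    return json.dumps(record)  # in production: ship this to your log pipeline
```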
Flag high-risk tasks for approval before execution (Critical)
Sending emails, processing refunds, modifying billing, and deleting records should require explicit human sign-off. Define a risk tier for each task and route high-risk steps to an approver.
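A sketch of tier-based routing. The tiers are illustrative; the one deliberate choice worth copying is that unknown tasks default to the high-risk path, so the router fails closed:

```python
RISK_TIERS = {
    "get_order_status": "low",
    "issue_refund":     "high",
    "delete_record":    "high",
}

def route(task: str) -> str:
    # Unknown tasks are treated as high risk: fail closed, not open.
    tier = RISK_TIERS.get(task, "high")
    return "needs_approval" if tier == "high" else "auto_execute"
```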
Start with read-only tasks, add writes later (Medium)
The fastest path to safe production deployment: register read-only tasks first, validate the agent behaves correctly, then add write tasks incrementally as confidence grows.
Log inputs and outputs for every executed step (Medium)
Step-level logs are what let you pinpoint exactly where a plan went wrong. Aggregate result logs are not enough. Capture the exact inputs passed and the exact API response returned for each step.
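One way to guarantee this: wrap every execution so inputs and output are captured together in a single record. `call_api` here stands in for your real HTTP client:

```python
import json

def run_step(task: str, inputs: dict, call_api) -> dict:
    # Exact inputs in, exact response out, logged as one record per step.
    output = call_api(task, inputs)
    print(json.dumps({"task": task, "inputs": inputs, "output": output}))
    return output
```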
Alert on plan anomalies, not just errors (Medium)
AI agents fail silently. An unusual plan length, a task called too many times, or an approval rate that spikes can all signal prompt injection or unexpected inputs — before a hard error surfaces.
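Two of those signals, plan length and task repetition, can be checked before execution with a few lines. The thresholds here are illustrative; tune them against your own baseline:

```python
from collections import Counter

def plan_anomalies(plan: list, max_len: int = 10, max_repeat: int = 3) -> list:
    alerts = []
    if len(plan) > max_len:
        alerts.append(f"plan length {len(plan)} exceeds {max_len}")
    for task, count in Counter(plan).items():
        if count > max_repeat:
            alerts.append(f"task '{task}' called {count} times")
    return alerts
```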
Use a private worker for internal APIs (Low, if applicable)
If your APIs are behind a VPN or on-prem, don't open inbound firewall rules for the AI execution layer. Run a private worker inside your network that polls outbound for approved jobs. Credentials never leave your environment.
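The worker's core loop is small. This is a sketch only: `fetch_jobs`, `run`, and `report` are stand-ins for your queue client, and a real worker would loop forever with a sleep between outbound polls:

```python
def worker_loop(fetch_jobs, run, report, max_polls: int = 1):
    for _ in range(max_polls):       # in production: loop forever, sleep between polls
        for job in fetch_jobs():     # outbound HTTPS poll for approved jobs only
            result = run(job)        # executes inside your network, with local creds
            report(job["id"], result)  # only results leave; credentials never do
```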