
How to use SpecDD to limit agent risk

How-To | Security and risk | Intermediate | HOWTO-1161000


This guide shows you how to use spec-driven development to limit the risk of AI coding agent changes.

The goal is not to make agents automatically safe. The goal is to give humans and agents the same durable boundaries before editing starts: what the change is allowed to touch, what it must preserve, which access patterns are forbidden, and what evidence is required before the work can be considered complete.

Short answer

Use the nearest relevant local spec to define agent authority. Put writable scope in Can modify or Owns, readable context in Can read and References, required security behavior in Must, forbidden behavior in Must not, blocked dependencies or access in Forbids, and proof requirements in Done when. For security-sensitive work, ask for risk assessment and a plan before implementation.

When to use this guide

Use this guide when:

Principle

Agent risk drops when authority is narrow, local, and reviewable.

A broad prompt such as “fix billing” gives the agent too much room to infer. A local spec can say which files are in scope, which dependencies are allowed, which shortcuts are forbidden, and which checks must pass. SpecDD is especially useful here because the same files that guide the agent are also visible to human reviewers in Git.

Specs do not replace repository permissions, code review, tests, or security review. They make those controls easier to apply because the intended boundary is written down before the change happens.

Steps

1. Classify the request before editing

Before implementation, decide whether the request touches high-risk behavior.

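High-risk examples drawn from this guide include payment capture and billing, password reset and other authentication flows, secret handling, and audit logging.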

If the request is ambiguous and touches one of these areas, do not let implementation begin from a vague prompt. Ask for a risk assessment or a spec update first.

Good prompt:

Assess risk for the charge capture retry change.

Good follow-up prompt:

Plan the charge capture retry change.

Keep planning and implementation separate when the change is security-sensitive.
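The separation between assessment, planning, and implementation can be sketched as a small gate. This is a minimal illustration, not part of SpecDD: the `Phase` names, the `ChangeRequest` shape, and the `advance` function are all hypothetical, and the sketch assumes that changes not marked security-sensitive skip the gate.

```typescript
// Hypothetical workflow gate; the phase names are illustrative, not SpecDD terms.
type Phase = "requested" | "risk-assessed" | "planned" | "implemented";

interface ChangeRequest {
  title: string;
  securitySensitive: boolean;
  phase: Phase;
}

// The only allowed next phase for a security-sensitive change.
const nextPhase: Record<Phase, Phase | null> = {
  "requested": "risk-assessed",
  "risk-assessed": "planned",
  "planned": "implemented",
  "implemented": null,
};

function advance(request: ChangeRequest, target: Phase): ChangeRequest {
  // Assumption: changes that are not security-sensitive skip the gate.
  if (request.securitySensitive && nextPhase[request.phase] !== target) {
    throw new Error(`"${request.title}" cannot move from ${request.phase} to ${target}`);
  }
  return { ...request, phase: target };
}

const retry: ChangeRequest = {
  title: "charge capture retry",
  securitySensitive: true,
  phase: "requested",
};
const planned = advance(advance(retry, "risk-assessed"), "planned");
console.log(planned.phase); // planned
// advance(retry, "implemented") would throw: no assessment or plan exists yet.
```

A team could enforce the same ordering through its task workflow rather than code; the point is only that implementation is unreachable until the earlier phases exist.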

2. Define local write authority

Use the nearest relevant local .sdd spec to say what the agent may modify.

Spec: Charge Capture

Purpose:
  Capture authorized payments exactly once and record the capture result.

Owns:
  ./charge-capture.ts
  ./charge-capture.test.ts

Can modify:
  ./charge-capture.ts
  ./charge-capture.test.ts

Can modify is the clearest way to narrow writable scope. If Can modify is absent, Owns acts as the modification boundary.

Do not rely on the prompt alone for write scope. Prompts disappear. Specs remain reviewable.
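As a rough sketch of how a reviewer or a pre-merge check might apply this rule, the fallback from Can modify to Owns can be written as a small function. The `LocalSpec` shape and the function names are assumptions for illustration; the guide does not define a parsed form of a .sdd file, and `../refunds/refund.ts` is a made-up path.

```typescript
// Hypothetical in-memory form of a parsed .sdd spec; the field names are assumptions.
interface LocalSpec {
  owns: string[];
  canModify?: string[];
}

// Can modify wins when present; otherwise Owns acts as the modification boundary.
function writableScope(spec: LocalSpec): string[] {
  return spec.canModify !== undefined ? spec.canModify : spec.owns;
}

// Returns the changed files that fall outside the writable scope.
function outOfScope(spec: LocalSpec, changedFiles: string[]): string[] {
  const allowed = writableScope(spec);
  return changedFiles.filter((file) => !allowed.includes(file));
}

const chargeCaptureSpec: LocalSpec = {
  owns: ["./charge-capture.ts", "./charge-capture.test.ts"],
  canModify: ["./charge-capture.ts", "./charge-capture.test.ts"],
};

const offending = outOfScope(chargeCaptureSpec, [
  "./charge-capture.ts",
  "../refunds/refund.ts", // hypothetical file outside the spec
]);
console.log(offending); // [ '../refunds/refund.ts' ]
```

The sketch compares exact paths only. A real check would also need to resolve relative paths against the location of the spec file.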

3. Limit context without widening authority

Security-sensitive code often needs context from nearby modules, but context is not permission to edit those modules.

Can read:
  ../payment-gateway/payment-gateway.sdd
  ../audit/billing-audit.sdd

References:
  ../payment-gateway/payment-gateway.sdd
  ../audit/billing-audit.sdd

Use Can read and References to identify useful context. Keep the write boundary in Can modify or Owns.
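One way to picture the difference between context and authority is a lookup that never upgrades a readable path to a writable one. The `SpecAccess` shape and `accessFor` function are hypothetical, and treating unlisted paths as `"none"` is this sketch's conservative default rather than a documented SpecDD rule.

```typescript
type Access = "write" | "read" | "none";

// Hypothetical parsed form of the access-related sections of a spec.
interface SpecAccess {
  canModify: string[];
  canRead: string[];
  references: string[];
}

function accessFor(spec: SpecAccess, path: string): Access {
  if (spec.canModify.includes(path)) return "write";
  // Can read and References grant context only, never write authority.
  if (spec.canRead.includes(path) || spec.references.includes(path)) return "read";
  return "none"; // assumption: anything unlisted is out of bounds
}

const captureAccess: SpecAccess = {
  canModify: ["./charge-capture.ts", "./charge-capture.test.ts"],
  canRead: ["../payment-gateway/payment-gateway.sdd", "../audit/billing-audit.sdd"],
  references: ["../payment-gateway/payment-gateway.sdd", "../audit/billing-audit.sdd"],
};

console.log(accessFor(captureAccess, "../audit/billing-audit.sdd")); // read
console.log(accessFor(captureAccess, "./charge-capture.ts")); // write
```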

4. Write forbidden behavior and access rules

Use Must not for forbidden behavior:

Must not:
  Capture the same authorization more than once.
  Treat a client-provided amount as trusted after authorization.
  Skip audit logging for failed capture attempts.

Use Forbids for blocked dependencies, paths, libraries, tools, or access:

Forbids:
  Direct use of the payment provider SDK outside ../payment-gateway/*
  ../refunds/*
  Reading secrets from files in the repository.

Must not and Forbids take precedence over Tasks, Depends on, and any convenient implementation shortcut. If a task cannot be completed without violating one of them, stop: either the task needs review or the spec needs an explicit, approved change.
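Some Forbids entries can be checked mechanically. The sketch below flags two of the entries above: direct provider SDK imports outside `../payment-gateway/` and any use of `../refunds/`. The package name `payment-provider-sdk`, the `ChangedFile` shape, and the example file paths are assumptions; the guide names neither the SDK nor a tool that extracts imports.

```typescript
// Hypothetical summary of one changed file: its path and the modules it imports.
interface ChangedFile {
  path: string;
  imports: string[];
}

// Assumed package name for the provider SDK; the guide does not name one.
const PROVIDER_SDK = "payment-provider-sdk";
const GATEWAY_DIR = "../payment-gateway/";
const FORBIDDEN_DIR = "../refunds/";

function forbidsViolations(files: ChangedFile[]): string[] {
  const violations: string[] = [];
  for (const file of files) {
    // Touching or importing anything under the forbidden directory is a violation.
    const usesForbiddenDir =
      file.path.startsWith(FORBIDDEN_DIR) ||
      file.imports.some((name) => name.startsWith(FORBIDDEN_DIR));
    if (usesForbiddenDir) {
      violations.push(`${file.path}: uses ${FORBIDDEN_DIR}`);
    }
    // Only files inside the gateway directory may import the SDK directly.
    if (!file.path.startsWith(GATEWAY_DIR) && file.imports.includes(PROVIDER_SDK)) {
      violations.push(`${file.path}: direct provider SDK import`);
    }
  }
  return violations;
}

const found = forbidsViolations([
  { path: "./charge-capture.ts", imports: [PROVIDER_SDK] },
  { path: "../payment-gateway/client.ts", imports: [PROVIDER_SDK] },
  { path: "./charge-capture.test.ts", imports: ["../refunds/refund"] },
]);
console.log(found.length); // 2
```

Entries such as "Reading secrets from files in the repository" are behavioral and usually still need human review.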

5. Add completion and review evidence

Use Done when to define the proof reviewers should expect.

Done when:
  Duplicate capture attempts return the existing capture result without calling the provider again.
  Failed capture attempts record a billing audit event without logging card or secret data.
  The charge capture tests cover success, duplicate retry, provider failure, and denied authorization.
  The implementation does not import the payment provider SDK directly.

This gives the agent a finish line and gives reviewers concrete evidence to check.
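The first Done when line translates naturally into a test with a counting fake. The following is a minimal sketch under stated assumptions: `PaymentGateway`, `CaptureResult`, `ChargeCapture`, and `CountingGateway` are hypothetical names, the code is synchronous for brevity, and the in-memory map stands in for whatever durable store a real implementation would need. It also assumes that only successful captures are remembered, so a retry after a failure calls the provider again.

```typescript
// All names here are hypothetical; the guide does not define the gateway interface.
interface CaptureResult {
  authorizationId: string;
  status: "captured" | "failed";
}

interface PaymentGateway {
  capture(authorizationId: string): CaptureResult;
}

class ChargeCapture {
  private readonly captured = new Map<string, CaptureResult>();
  private readonly gateway: PaymentGateway;

  constructor(gateway: PaymentGateway) {
    this.gateway = gateway;
  }

  capture(authorizationId: string): CaptureResult {
    // A duplicate attempt returns the existing result without calling the provider.
    const existing = this.captured.get(authorizationId);
    if (existing !== undefined) return existing;

    const result = this.gateway.capture(authorizationId);
    // Assumption: only successful captures are remembered.
    if (result.status === "captured") this.captured.set(authorizationId, result);
    return result;
  }
}

// Test double that counts provider calls.
class CountingGateway implements PaymentGateway {
  calls = 0;
  capture(authorizationId: string): CaptureResult {
    this.calls += 1;
    return { authorizationId, status: "captured" };
  }
}

const gateway = new CountingGateway();
const service = new ChargeCapture(gateway);
const first = service.capture("auth-1");
const second = service.capture("auth-1");
console.log(gateway.calls, first === second); // 1 true
```

A reviewer can then look for a test of this shape in `./charge-capture.test.ts` instead of inferring idempotency from the implementation.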

6. Keep implementation prompts narrow

After risk assessment and planning, make the implementation request specific.

Implement the approved charge capture retry task.

Avoid combining several security-sensitive requests into one broad prompt. One spec or one task at a time is easier to review and less likely to cross boundaries.

7. Review the diff against the governing specs

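For the review, check that every changed file falls inside Can modify or Owns, that no Must not or Forbids rule is violated, and that the evidence listed in Done when is present.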

If a safe subset was completed but the risky part remains blocked, the task should stay open or use a blocked marker according to the team’s task workflow.

Example

Spec: Password Reset Request

Purpose:
  Start a password reset flow without revealing whether an account exists.

Owns:
  ./password-reset-request.ts
  ./password-reset-request.test.ts

Can modify:
  ./password-reset-request.ts
  ./password-reset-request.test.ts

Can read:
  ../mail/password-reset-mail.sdd
  ../audit/auth-audit.sdd

Must:
  Return the same public response for known and unknown email addresses.
  Create reset tokens only for active accounts.
  Record an audit event for each accepted reset request.

Must not:
  Reveal account existence through response text, response code, timing-sensitive branches, or audit output.
  Send reset email for inactive, locked, or missing accounts.
  Log reset tokens.

Forbids:
  Direct email sending outside ../mail/*
  Storing reset tokens in plaintext.

Done when:
  Known and unknown email requests return the same public response.
  Missing-account requests do not send email.
  Reset tokens are not logged in success or failure paths.
  Existing auth audit checks pass.

This spec narrows the agent’s work to one flow, states the high-risk boundary, and gives reviewers concrete checks.
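To make the central rule concrete, here is a rough sketch of a handler that returns one public response regardless of account state. Everything in it is hypothetical: the `ResetDeps` collaborators, the 202 status, and the response text are assumptions, and `sendResetMail` stands in for token creation and delivery through `../mail/*`. The sketch does not address timing differences or token storage, which the spec also constrains.

```typescript
// Hypothetical types and collaborators; only the rules come from the spec above.
type AccountStatus = "active" | "inactive" | "locked";

interface ResetDeps {
  findStatus(email: string): AccountStatus | undefined; // undefined means no account
  sendResetMail(email: string): void;
  audit(event: string): void;
}

// Assumed status code and wording; identical for known and unknown addresses.
const PUBLIC_RESPONSE = {
  status: 202,
  body: "If the account exists, a reset email has been sent.",
};

function requestPasswordReset(email: string, deps: ResetDeps) {
  // Mail (and token creation behind it) happens only for active accounts.
  if (deps.findStatus(email) === "active") {
    deps.sendResetMail(email);
  }
  // The audit event carries no account details, so it cannot reveal existence.
  deps.audit("password-reset-requested");
  return PUBLIC_RESPONSE;
}

// In-memory fakes for a quick check.
const sent: string[] = [];
const events: string[] = [];
const deps: ResetDeps = {
  findStatus: (email) => (email === "known@example.com" ? "active" : undefined),
  sendResetMail: (email) => { sent.push(email); },
  audit: (event) => { events.push(event); },
};

const known = requestPasswordReset("known@example.com", deps);
const unknown = requestPasswordReset("missing@example.com", deps);
console.log(JSON.stringify(known) === JSON.stringify(unknown), sent.length); // true 1
```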

SpecDD pattern

This pattern uses Owns, Can modify, Can read, References, Must, Must not, Forbids, and Done when.

Common mistakes

How to verify the result

Agent risk is better controlled when:
