Building GRC Agents with the Anthropic Agent SDK

Ethan Troy
Anthropic’s Agent SDK is “Claude Code as a library”—the same infrastructure powering one of the most capable coding assistants, now available for building your own agents. For GRC workflows, this means professional-grade agent infrastructure without building everything from scratch.

Why the Agent SDK?

I’ve written before about building agents from scratch—400 lines of Go to wire up an agent loop with basic tools. It works, and understanding that pattern is valuable.

But Anthropic’s Agent SDK offers something different: battle-tested infrastructure that handles the hard parts for you.

Built-in tools you don’t have to implement:

  • Read, Write, Edit for file operations
  • Glob, Grep for searching
  • Bash for system commands
  • WebSearch, WebFetch for external data

Subagents for delegation: Spawn specialized agents for different tasks—one for policy review, another for technical evidence analysis.

Structured output: Get JSON responses conforming to a schema, making integration with other systems straightforward.

The tradeoff is flexibility. Building from scratch gives you complete control. The SDK gives you velocity.

Core Concepts

The SDK exposes an async generator that yields messages as Claude works through a task:

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "Find all evidence files related to access control",
  options: {
    model: "sonnet",
    allowedTools: ["Glob", "Read", "Grep"],
    maxTurns: 50
  }
})) {
  switch (message.type) {
    case "system":
      if (message.subtype === "init") {
        console.log("Session ID:", message.session_id);
        console.log("Available tools:", message.tools);
      }
      break;

    case "assistant":
      for (const block of message.message.content) {
        if ("text" in block) {
          console.log("Claude:", block.text);
        } else if ("name" in block) {
          console.log("Tool call:", block.name);
        }
      }
      break;

    case "result":
      console.log("Status:", message.subtype);
      console.log("Cost:", message.total_cost_usd);
      break;
  }
}

Three message types:

  • system: Initialization info (session ID, available tools)
  • assistant: Claude’s responses and tool calls
  • result: Final status and cost

The agent loop, tool execution, and context management are all handled for you.
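
If you only need the final answer, a thin wrapper can drain the stream and keep just the result message. The sketch below assumes the same options shown above; runToResult is an illustrative helper name, not part of the SDK.

import { query } from "@anthropic-ai/claude-agent-sdk";

// Illustrative helper (not part of the SDK): run a prompt and
// return only the final "result" message from the stream.
async function runToResult(prompt: string) {
  let result;
  for await (const message of query({
    prompt,
    options: { model: "sonnet", allowedTools: ["Glob", "Read", "Grep"], maxTurns: 50 }
  })) {
    if (message.type === "result") {
      result = message; // final status, cost, and any structured output
    }
  }
  return result;
}

// Usage: const outcome = await runToResult("Summarize the evidence in ./evidence/");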

GRC Example: Evidence Reviewer Agent

Here’s a practical example—an agent that reviews evidence artifacts against control requirements and returns structured findings:

import { query } from "@anthropic-ai/claude-agent-sdk";

const evidenceReviewSchema = {
  type: "object",
  properties: {
    findings: {
      type: "array",
      items: {
        type: "object",
        properties: {
          severity: {
            type: "string",
            enum: ["low", "medium", "high", "critical"]
          },
          category: {
            type: "string",
            enum: ["missing", "incomplete", "outdated", "mismatch"]
          },
          controlId: { type: "string" },
          file: { type: "string" },
          description: { type: "string" },
          recommendation: { type: "string" }
        },
        required: ["severity", "category", "controlId", "description"]
      }
    },
    summary: { type: "string" },
    controlsCovered: { type: "number" },
    gapsIdentified: { type: "number" }
  },
  required: ["findings", "summary", "controlsCovered", "gapsIdentified"]
};

async function reviewEvidence(evidenceDir: string, controlFamily: string) {
  for await (const message of query({
    prompt: `Review the evidence files in ${evidenceDir} against ${controlFamily}
control requirements. Identify gaps, outdated artifacts, and mismatches between
evidence and stated control implementations.`,
    options: {
      model: "opus",
      allowedTools: ["Read", "Glob", "Grep"],
      permissionMode: "bypassPermissions",
      maxTurns: 50,
      outputFormat: {
        type: "json_schema",
        schema: evidenceReviewSchema
      }
    }
  })) {
    if (message.type === "result" && message.subtype === "success") {
      const review = message.structured_output;

      console.log(`\nEvidence Review: ${controlFamily}\n`);
      console.log(`Controls Covered: ${review.controlsCovered}`);
      console.log(`Gaps Identified: ${review.gapsIdentified}`);
      console.log(`Summary: ${review.summary}\n`);

      for (const finding of review.findings) {
        const icon = finding.severity === "critical" ? "🔴" :
                     finding.severity === "high" ? "🟠" :
                     finding.severity === "medium" ? "🟡" : "🟢";
        console.log(`${icon} [${finding.category.toUpperCase()}] ${finding.controlId}`);
        console.log(`   ${finding.description}`);
        if (finding.recommendation) {
          console.log(`   → ${finding.recommendation}`);
        }
        console.log();
      }
    }
  }
}

reviewEvidence("./evidence/access-control/", "AC");

The agent will:

  1. Use Glob to find evidence files in the directory
  2. Use Read to examine file contents
  3. Use Grep to search for control references
  4. Return structured JSON matching the schema

This structured output integrates directly with tracking systems, dashboards, or audit preparation workflows.
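
As one illustration of that integration, the sketch below writes the findings to a JSON Lines file that a ticketing system or dashboard could ingest. The Finding interface simply mirrors the schema above, and the output path and reviewedAt timestamp are illustrative additions.

import { writeFile } from "node:fs/promises";

// Mirrors evidenceReviewSchema above (illustrative; not a type exported by the SDK).
interface Finding {
  severity: "low" | "medium" | "high" | "critical";
  category: "missing" | "incomplete" | "outdated" | "mismatch";
  controlId: string;
  file?: string;
  description: string;
  recommendation?: string;
}

// Persist findings as JSON Lines so downstream tracking tools can ingest them.
async function exportFindings(findings: Finding[], outPath = "./review-findings.jsonl") {
  const lines = findings
    .map(f => JSON.stringify({ ...f, reviewedAt: new Date().toISOString() }))
    .join("\n");
  await writeFile(outPath, lines + "\n", "utf8");
}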

Subagents for Complex Reviews

For comprehensive assessments, delegate to specialized subagents:

import { query, AgentDefinition } from "@anthropic-ai/claude-agent-sdk";

async function comprehensiveEvidenceReview(evidenceDir: string) {
  for await (const message of query({
    prompt: `Perform a comprehensive evidence review of ${evidenceDir}.
Use the policy-reviewer for policy documents and procedure-analyzer for
technical implementation evidence.`,
    options: {
      model: "opus",
      allowedTools: ["Read", "Glob", "Grep", "Task"],
      permissionMode: "bypassPermissions",
      maxTurns: 50,
      agents: {
        "policy-reviewer": {
          description: "Policy document specialist",
          prompt: `You are a GRC policy analyst. Focus on:
- Policy completeness and coverage
- Version dates and review cycles
- Alignment with framework requirements
- Missing required sections`,
          tools: ["Read", "Grep", "Glob"],
          model: "sonnet"
        } as AgentDefinition,

        "procedure-analyzer": {
          description: "Technical evidence and procedure reviewer",
          prompt: `You are a technical evidence analyst. Examine:
- Screenshots and configuration exports
- Log samples and audit trails
- Technical procedure documentation
- Evidence freshness and relevance`,
          tools: ["Read", "Grep", "Glob"],
          model: "haiku"
        } as AgentDefinition
      }
    }
  })) {
    if (message.type === "assistant") {
      for (const block of message.message.content) {
        if ("text" in block) {
          console.log(block.text);
        } else if ("name" in block && block.name === "Task") {
          console.log(`\nDelegating to: ${(block.input as any).subagent_type}`);
        }
      }
    }
  }
}

comprehensiveEvidenceReview("./evidence/");

The orchestrating agent decides when to delegate. Policy documents go to the policy specialist. Technical artifacts go to the procedure analyzer. Results roll up into a comprehensive assessment.

Notice the model choices: sonnet for the policy reviewer (needs nuance), haiku for technical evidence (faster, cheaper for straightforward analysis).

Permission Handling for GRC Workflows

The SDK offers three permission modes:

options: {
  // Standard mode - prompts for each tool use
  permissionMode: "default",

  // Auto-approve edits (useful for generating reports)
  permissionMode: "acceptEdits",

  // No prompts - for automated pipelines
  permissionMode: "bypassPermissions"
}

For interactive evidence review, default makes sense—you want visibility into what the agent examines.

For batch processing (reviewing hundreds of artifacts overnight), bypassPermissions enables automation.
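
A batch run might look like the sketch below. It reuses the reviewEvidence function from the earlier example; the directories and control family codes are placeholders.

// Illustrative overnight batch: directories and family codes are placeholders.
const reviewTargets = [
  { dir: "./evidence/access-control/", family: "AC" },
  { dir: "./evidence/audit-logging/", family: "AU" },
  { dir: "./evidence/config-management/", family: "CM" }
];

async function nightlyReview() {
  for (const { dir, family } of reviewTargets) {
    // reviewEvidence already sets permissionMode: "bypassPermissions",
    // so nothing stops to ask for approval during the unattended run.
    await reviewEvidence(dir, family);
  }
}

nightlyReview();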

For custom control, implement a permission handler:

options: {
  canUseTool: async (toolName, input) => {
    // Check sensitive locations first, regardless of tool, so the deny
    // actually applies to reads. Different tools name the target field
    // differently, so check both file_path and path.
    const target = String(input.file_path ?? input.path ?? "");
    if (target.includes("/secrets/") || target.includes("/.env")) {
      return { behavior: "deny", message: "Cannot access sensitive files" };
    }

    // Allow read-only operations
    if (["Read", "Glob", "Grep"].includes(toolName)) {
      return { behavior: "allow", updatedInput: input };
    }

    // Everything else falls through to allow; tighten this for stricter pipelines
    return { behavior: "allow", updatedInput: input };
  }
}

Extending with MCP

For domain-specific functionality beyond built-in tools, the SDK supports MCP (Model Context Protocol) servers. If you need custom tools for framework-specific operations, check out my post on Building MCP Servers for Compliance.

The short version: MCP lets you expose structured APIs that agents can call—search a requirements database, query a GRC platform, or access proprietary data sources.
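
As a rough sketch of how that could plug into the SDK, the example below assumes the SDK's createSdkMcpServer and tool helpers for in-process MCP servers and its mcpServers option; the lookup_control tool and its canned response are hypothetical stand-ins for a real requirements database.

import { query, tool, createSdkMcpServer } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";

// Hypothetical in-process MCP server exposing a requirements lookup.
// Swap the canned response for a real requirements database or GRC platform API.
const grcServer = createSdkMcpServer({
  name: "grc-tools",
  version: "1.0.0",
  tools: [
    tool(
      "lookup_control",
      "Look up the requirement text for a control ID",
      { controlId: z.string() },
      async ({ controlId }) => ({
        content: [{ type: "text", text: `Requirement text for ${controlId} goes here` }]
      })
    )
  ]
});

for await (const message of query({
  prompt: "Check which controls referenced in ./evidence/ lack requirement coverage",
  options: {
    mcpServers: { "grc-tools": grcServer },
    allowedTools: ["Read", "Glob", "Grep", "mcp__grc-tools__lookup_control"]
  }
})) {
  if (message.type === "result") console.log(message.subtype);
}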

When to Use What

Approach             Best For
Build from scratch   Learning, complete control, minimal dependencies
Agent SDK            Production workflows, rapid development, complex orchestration
MCP servers          Domain-specific tools, integrating with existing systems

They’re complementary. Understanding how agents work from scratch (the agent loop pattern) makes you a better user of the SDK. MCP extends both approaches with custom functionality.
