Succeeded With Errors

Agentic Development in VS Code: A Practical Guide

A few weeks into building with AI agents seriously, I started writing things down. Not blog-post style, more like an internal reference: what the pieces are called, how the file structures work, which patterns actually hold up when the workflow gets complex. I shared it with a few colleagues and kept refining it as I hit new situations.

This post is that guide, cleaned up and expanded. It covers everything I wish I’d had when I started: the vocabulary, the folder setup, how to write your first agent, how to build a team of them, and the advanced patterns (skills, memory, process tracking, honesty logging) that separate throwaway experiments from something you’d actually trust. The examples lean C# because that’s what I work in, but the concepts are the same in any stack.

If you’re looking for a faster on-ramp with a pre-built team to fork, I built one at github.com/bpelotto/Agentic_Aesir. This guide is the agnostic, build-it-yourself version.


1. What Is Agentic Development?

  • Agent: A Markdown file (.agent.md) that defines an AI persona with a specific role, scoped tool access, and explicit behavioral rules. VS Code Copilot loads it and runs it against your codebase.
  • Skill: An on-demand knowledge package (SKILL.md) loaded into agent context when specific triggers are met; domain-specific reference material the agent reads before acting.
  • Orchestration: A pattern where one lead agent receives a task, decomposes it, and delegates sub-tasks to specialist agents via #agent <AgentName>. The lead agent synthesizes results.
  • Sub-agent: Any agent invoked by another agent. Executes within its own scope, returns results, has no awareness of sibling agents.
  • Context slug: A stable identifier (usually the current git branch, normalized) used to scope memory, process, and honesty log files to a specific task or branch.

2. Prerequisites

Tools

| Tool | Notes |
|---|---|
| Visual Studio Code | Latest stable |
| GitHub Copilot + Copilot Chat extensions | GitHub.copilot + GitHub.copilot-chat |
| Copilot CLI | copilot command required |
| Git | Any modern version |

VS Code Settings

Add to settings.json:

{
  "github.copilot.chat.agent.enabled": true
}

Enable the experimental agents feature if prompted in the Copilot Chat panel.

Folder Structure

Agents and skills live at user-level (~/.copilot/) or workspace-level (<repo>/.copilot/). Workspace-level takes precedence for repo-specific work.

~/.copilot/
├── agents/
│   └── my-agent.agent.md
└── skills/
    └── my-skill/
        └── SKILL.md

<repo>/
└── .copilot/
    ├── agents/
    └── skills/

Create the base structure:

mkdir -p ~/.copilot/agents ~/.copilot/skills
# workspace-level:
mkdir -p .copilot/agents .copilot/skills

3. Your First Agent: Step by Step

Goal

A focused agent that reviews a single C# file for null-safety issues only.

Step 1: Create the file

touch ~/.copilot/agents/null-checker.agent.md

Step 2: Write the agent

---
name: NullChecker
description: >
  Reviews C# code exclusively for null-safety issues.
  Invoke when you want targeted null-reference analysis on a file or method.
model: claude-sonnet-4
tools: ["codebase", "search", "problems"]
---
You are NullChecker, a focused C# null-safety reviewer.
Mission:
- Identify nullable reference type violations, missing null guards, and unsafe dereferences.
- Report findings as a numbered list with file path, line number, issue, and a one-line fix suggestion.
- Do NOT fix the code. Do NOT review anything except null-safety.
Hard constraints:
- Never edit files.
- Never run commands.
- Do not comment on naming, formatting, architecture, or performance.
- If the input is not C# code, respond: "NullChecker only reviews C# files."
Output format:
1. [file.cs:line] Issue description → Suggested fix

Frontmatter Field Reference

| Field | Purpose |
|---|---|
| name | How you invoke it: @NullChecker |
| description | Shown in Copilot’s agent picker; used by orchestrators for routing |
| model | Which LLM backs this agent: claude-sonnet-4 or gpt-4o |
| tools | Scoped permissions. Only declare what the agent needs. |
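
Since routing depends on `name` and `description` being present, a quick pre-flight check can catch a broken file before you invoke it. The following is a minimal sketch, not a Copilot feature: `frontmatter` and `check_agent_frontmatter` are hypothetical helper names, and it assumes the frontmatter is the block between the first two `---` lines with each field starting at column 0.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of Copilot or VS Code.

# Print the frontmatter block: the lines between the first two '---' lines.
# Prints nothing if the file does not start with '---'.
frontmatter() {
  awk 'NR == 1 && !/^---[[:space:]]*$/ { exit }
       /^---[[:space:]]*$/ { n++; next }
       n == 1 { print }
       n >= 2 { exit }' "$1"
}

# Return 0 if both required fields are present; otherwise report what is missing.
check_agent_frontmatter() {
  local file="$1" missing=0 field
  for field in name description; do
    if ! frontmatter "$file" | grep -q "^${field}:"; then
      echo "$file: missing required field '${field}'" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Usage would look like `check_agent_frontmatter ~/.copilot/agents/null-checker.agent.md`. It only checks presence, not whether the values are sensible.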

Step 3: Invoke from VS Code

In the Copilot Chat sidebar:

@NullChecker Review FooService.cs for null-safety issues

Or using the #agent syntax:

#agent NullChecker Review the GetFoo method in FooService.cs

Step 4: Verify the constraints

@NullChecker What naming conventions should I use in C#?

Expected output: "NullChecker only reviews C# files." The question contains no C# code, so the fallback response should fire, which confirms the hard constraint holds. I test constraints on every new agent before using it in a real workflow; it takes 30 seconds and heads off a lot of trust issues later.


4. Building Your Agent Team

The Orchestrator Pattern

An orchestrator agent:

  • Receives a high-level task from the engineer
  • Decomposes it into sub-tasks
  • Delegates to specialists via #agent <AgentName>
  • Synthesizes results into a unified response

The orchestrator does not do specialist work itself. This boundary is the whole point: once an orchestrator starts writing code, it stops being an orchestrator and becomes a general assistant. I enforce it with an explicit hard constraint in every orchestrator system prompt.

Minimal Orchestrator

---
name: TeamLead
description: >
  Top-level task router. Delegates code review to Reviewer,
  test writing to TestWriter. Use for feature-level tasks.
model: gpt-4o
tools: ["codebase", "agent", "search"]
---
You are TeamLead, the orchestrating agent for feature development tasks.
Mission:
- Understand the engineer's intent.
- Delegate code review to: #agent Reviewer
- Delegate test writing to: #agent TestWriter
- Synthesize sub-agent findings into a single structured response.
- Do not perform code review or test writing yourself.
Hard constraints:
- Always delegate before synthesizing.
- If a sub-agent is unavailable, report it explicitly; do not attempt the task yourself.
- Never write or edit code.

Sub-Agent Delegation

When orchestrator instructions contain #agent AgentName, VS Code Copilot spins up that agent with the same context and waits for its response before continuing. For older configurations, #runSubagent AgentName may still appear as a legacy equivalent.

# Inside TeamLead's instructions:
Step 1: Use #agent Reviewer to check the changed files for correctness issues.
Step 2: Use #agent TestWriter to generate a test plan for the same changes.
Step 3: Combine both outputs into a single task checklist for the engineer.

Example Team (C# Focused)

TeamLead (orchestrator)
├── Reviewer → correctness, async pitfalls, OWASP patterns
└── TestWriter → xUnit test cases, edge cases, mocks/builders
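
A team like this only works if every `#agent` reference points at an agent that actually exists. Here is a rough sketch of a consistency check, assuming all agents live in one directory, that `name:` appears at the start of a line in the frontmatter, and that names contain only letters, digits, `_`, and `-`. The function names are made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of Copilot or VS Code.

# Names declared via 'name:' lines in *.agent.md files under a directory.
declared_agents() {
  grep -h '^name:' "$1"/*.agent.md 2>/dev/null \
    | sed -e 's/^name:[[:space:]]*//' -e 's/[[:space:]]*$//' | sort -u
}

# Names referenced via '#agent <Name>' anywhere in those files.
referenced_agents() {
  grep -ohE '#agent [A-Za-z0-9_-]+' "$1"/*.agent.md 2>/dev/null \
    | sed 's/^#agent //' | sort -u
}

# Print each referenced name with no matching declaration; return 1 if any.
check_delegation_targets() {
  local dir="$1" declared name status=0
  declared=$(declared_agents "$dir")
  for name in $(referenced_agents "$dir"); do
    if ! printf '%s\n' "$declared" | grep -qxF -- "$name"; then
      echo "undeclared delegation target: $name"
      status=1
    fi
  done
  return "$status"
}
```

Running `check_delegation_targets .copilot/agents` before committing a new orchestrator is a cheap way to notice a renamed or deleted specialist.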

5. Top 3 Agents for C# Engineers

5.1 Code Review Agent

Purpose: Adversarial review of C# changes. Catches async/await misuse, null-safety gaps, OWASP Top 10 patterns, naming violations, and layer boundary issues. Never writes code.

---
name: Reviewer
description: >
  Adversarially reviews C# code changes for correctness, security,
  async pitfalls, null safety, naming, and architectural violations.
  Invoke after completing a feature or before raising a PR.
model: claude-sonnet-4
tools: ["codebase", "changes", "search", "usages", "problems"]
---
You are Reviewer, a senior C# code reviewer.
Mission:
- Review changed files for: async/await correctness, null reference risks,
  OWASP Top 10 patterns (injection, improper error handling, insecure deserialization),
  naming convention violations, and layer boundary breaches.
- Output findings as a structured table: file | line | severity | issue | recommendation.
- Severity levels: Critical / High / Medium / Low.
- Escalate architectural concerns: "Architecture question: [description] → escalate to Architect."
Hard constraints:
- Never write, edit, or suggest complete rewrites.
- Do not approve changes; only report findings.
- If no issues found, state: "Review complete. No findings."

Example invocation:

@Reviewer Review my current changes to FooService.cs

Output: Structured findings table plus escalation flags for architecture questions.


5.2 QA / Testing Agent

Purpose: Designs test strategies and writes xUnit/NUnit/MSTest test cases. Covers edge cases, boundary conditions, fixtures, mocks, and builders. Asks for clarification before writing if acceptance criteria are ambiguous.

---
name: TestWriter
description: >
  Designs and writes C# test cases (xUnit preferred).
  Invoke when you need unit tests, integration test plans,
  or edge case analysis for a method or class.
model: claude-sonnet-4
tools: ["codebase", "findTestFiles", "search", "editFiles"]
---
You are TestWriter, a C# test automation specialist.
Mission:
- Analyse the target method/class for testable behaviours, edge cases, and boundary conditions.
- Write xUnit test cases with Arrange/Act/Assert structure.
- Use Moq for mocking dependencies. Use builder patterns (AutoFixture or hand-rolled) where appropriate.
- Always target the local/isolated test project. Never touch production code.
Hard constraints:
- Do not write tests if acceptance criteria are ambiguous: ask first.
- Never modify production source files.
- Confirm the test project path before creating files.
- Do not generate integration tests that require live infrastructure.
Clarification protocol:
If the described behaviour is unclear, respond:
"Before writing tests: [specific question about expected behaviour]"

Example invocation:

@TestWriter Write unit tests for the ProcessBar method in BarService.cs

Output: Complete test class with [Fact] and [Theory] methods, Moq setup, and an edge case list.


5.3 Architect / Design Agent

Purpose: Enforces architectural constraints, reviews layer violations, evaluates dependency direction, advises on NFRs (performance, scalability, security). Never implements. The escalation target for Reviewer and TeamLead.

---
name: Architect
description: >
  Enforces architectural constraints and evaluates design decisions.
  Invoke for layer violations, dependency direction questions,
  integration contract reviews, or NFR guidance.
model: claude-sonnet-4
tools: ["codebase", "search", "usages", "fetch"]
---
You are Architect, the design authority for this codebase.
Mission:
- Review architectural concerns escalated by other agents or raised directly.
- Identify layer boundary violations (e.g., domain referencing infrastructure).
- Evaluate dependency direction against declared architecture.
- Advise on NFRs: latency budgets, throughput targets, scalability constraints.
- Validate integration contracts (API shapes, message schemas, event contracts).
Hard constraints:
- Never write or modify code.
- Do not make implementation decisions: guidance only.
- If accepting an architectural risk: state the risk explicitly and recommend mitigation.
- Do not approve anything that violates declared architectural invariants without explicit
  engineer acknowledgment.
Output format:
- Concern: [description]
- Verdict: Approved / Violation / Needs discussion
- Rationale: [why]
- Recommendation: [if not Approved]

Example invocation:

@Architect Review the dependency graph for the new Baz module

6. Advanced Topics

6.1 Skills

Create a skill when:

  • Multiple agents need the same domain knowledge in different contexts
  • The knowledge is too verbose to embed in every agent
  • Trigger conditions are well-defined (e.g., “whenever working with async C# code”)

Minimal SKILL.md:

---
name: csharp-async-patterns
description: >
  Load when reviewing or writing async C# code. Trigger conditions:
  async/await, Task<T>, ConfigureAwait, CancellationToken patterns.
---
## Purpose
Reference guide for correct async patterns in C# 10+.
## Key Rules
- Always propagate `CancellationToken` through async call chains.
- Avoid `.Result` and `.Wait()`: use `await` exclusively.
- Use `ConfigureAwait(false)` in library code; omit in application code.
- Never `async void` except for event handlers.
## Common Pitfalls
| Anti-pattern | Correct alternative |
|---|---|
| `task.Result` | `await task` |
| `async void DoWork()` | `async Task DoWork()` |
| Missing `CancellationToken` param | Add `CancellationToken ct = default` |

Reference a skill in an agent or instruction file:

<skill>
<name>csharp-async-patterns</name>
<description>Load when reviewing or writing async C# code.</description>
<file>/path/to/.copilot/skills/csharp-async-patterns/SKILL.md</file>
</skill>

6.2 Scripts

Scripts in .copilot/scripts/ handle deterministic, side-effect-producing operations that shouldn’t live in prompt instructions: path resolution, log management, lifecycle tracking.

When to write a script: When an agent needs normalized output from the environment (current branch, timestamps, file paths) or needs audit logs that survive the session.

Example: slug.sh (resolves the context identifier used in memory/process file paths):

#!/usr/bin/env bash
# slug.sh — outputs a normalized slug from the current git branch
BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo "main")
echo "${BRANCH//\//-}" | tr '[:upper:]' '[:lower:]'

Usage in agent instructions:

Before reading memory, run: bash .copilot/scripts/slug.sh
Use the output as <context-slug> in all file paths.

When to skip a script: If the logic is trivial and used by only one agent, inline it in the agent instructions.
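
Branch names can contain more than slashes (spaces, uppercase, odd punctuation), so a slightly more defensive variant of the slug script may be worth having. This is a sketch under my own assumptions: `normalize_slug` and `context_slug` are hypothetical names, and the choice to keep only lowercase letters, digits, `.`, `_`, and `-` is a convention I picked for the example, not something Copilot requires.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration.

# Lowercase the input, replace every other character with '-',
# collapse runs of '-', and trim a leading or trailing '-'.
normalize_slug() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9._-]/-/g' -e 's/--*/-/g' -e 's/^-//' -e 's/-$//'
}

# Slug for the current git branch, falling back to "main" outside a repository.
context_slug() {
  local branch
  branch=$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo "main")
  normalize_slug "$branch"
}
```

With this version, a branch such as `Feature/ABC-123 Fix_Thing` becomes `feature-abc-123-fix_thing`, which is safe to use as a file name.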


6.3 Instruction Files

.instructions.md files provide persistent context that is injected into Copilot Chat interactions automatically, without explicit invocation: either always, or only when the files in play match an applyTo glob.

Locations:

  • ~/.github/copilot-instructions.md (user-level, always active)
  • <repo>/.github/copilot-instructions.md (repo-level)
  • <repo>/.copilot/instructions/*.instructions.md (scoped by applyTo glob)

Example of a scoped instruction file:

---
applyTo: "**/*.cs"
---
## C# Conventions (this repo)
- Use `sealed` on classes not intended for inheritance.
- All public APIs must accept and propagate `CancellationToken`.
- Repository interfaces belong in `Domain.Abstractions`, not `Infrastructure`.
- Do not use `dynamic` or reflection in hot paths.

Key distinction from agents: Instruction files fire automatically on matching files. Agents are invoked explicitly. Use instruction files for rules that must always be active; use agents for tasks.


6.4 Hooks

A hook is a script that fires at an agent lifecycle event, most commonly when a sub-agent is invoked.

Pattern: subagent-hook.sh:

#!/usr/bin/env bash
# fires before a sub-agent is invoked
# args: $1=calling agent, $2=target agent, $3=task slug
TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
LOG_DIR="$HOME/.copilot/data/$(date -u +%Y-%m-%d)/subagent-tracking"
mkdir -p "$LOG_DIR"
echo "[$TIMESTAMP] $1 → $2 ($3)" >> "$LOG_DIR/invocations.log"

When to use hooks:

  • Audit trails across multi-agent pipelines
  • Detecting circular delegation (log and scan for A→B→A loops)
  • Triggering side-effects on agent transitions

When to skip hooks: Single-agent workflows, local experimentation, tasks where the overhead isn’t justified.
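
Scanning the log for loops can be as simple as looking for a pair of agents that have called each other within the same task. The sketch below assumes each log line has five whitespace-separated fields: a timestamp, the calling agent, a separator token such as `->` or `→`, the target agent, and the slug in parentheses. `find_ping_pong` is a hypothetical name, and it only catches direct A→B→A pairs, not longer cycles.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration.
# Expected line shape: [timestamp] Caller -> Target (slug)

# Print each pair of agents that have invoked each other under the same slug.
find_ping_pong() {
  awk '
    NF >= 5 {
      key = $2 SUBSEP $4 SUBSEP $5   # caller, target, slug
      rev = $4 SUBSEP $2 SUBSEP $5   # the same pair in the opposite direction
      if ((rev in seen) && !(key in reported) && !(rev in reported)) {
        print $4 " <-> " $2 " " $5
        reported[key] = 1
      }
      seen[key] = 1
    }
  ' "$1"
}
```

An empty result means no direct back-and-forth was logged; any output is a pair worth inspecting in the two agents' instructions.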


6.5 Memory & Process Governance

Agents have no persistent memory by default. Every invocation starts blank. These patterns fix that for multi-step workflows.

agent-memory stores decisions, context, and open questions:

~/.copilot/data/<YYYY-MM-DD>/memory/<agent-name>/<context-slug>.md

# Memory: Reviewer — feature-baz
## Decisions
- 2026-04-06: Confirmed this module uses CQRS. Flag any command handler that returns domain state.
## Known context
- FooService.cs was refactored yesterday; prior null-safety issues are resolved.
## Open questions
- Does BazAggregateRoot.Apply() need a null guard on the event parameter?

First action on every invocation: read this file. Last action: update it.
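
To make "read first, update last" dependable, the agent needs one unambiguous way to find the file. A small resolver script is one option. This is a sketch only: `memory_file` is a hypothetical helper, the optional root and date arguments exist mainly to make it testable, and the skeleton it writes is my approximation of the layout shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration.
# Usage: memory_file <agent-name> <context-slug> [root-dir] [YYYY-MM-DD]

# Print the memory file path, creating the file with an empty skeleton
# if it does not exist yet. Existing files are never overwritten.
memory_file() {
  local agent="$1" slug="$2"
  local root="${3:-$HOME/.copilot/data}"
  local day="${4:-$(date +%Y-%m-%d)}"
  local file="$root/$day/memory/$agent/$slug.md"
  if [ ! -f "$file" ]; then
    mkdir -p "$(dirname "$file")"
    printf '# Memory: %s - %s\n## Decisions\n## Known context\n## Open questions\n' \
      "$agent" "$slug" > "$file"
  fi
  printf '%s\n' "$file"
}
```

Agent instructions can then say "run the resolver, read the file it prints" instead of describing the path convention in prose.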

agent-process tracks task lifecycle:

~/.copilot/data/<YYYY-MM-DD>/process/<context-slug>/tasks.md

## Tasks
### to-do
- [ ] Review FooService.cs integration tests
### doing
- [~] Review BarService.cs for async pitfalls (started 09:42)
### done
- [x] Review BazService.cs — 3 findings, 1 escalated to Architect (09:15)

Without process tracking, two agents can start the same task in parallel, or an agent re-does completed work on restart. I’ve hit both of these issues. The process file is cheap; the debugging is not.
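
The guard against duplicate work boils down to checking a task's marker before starting it. Here is a minimal sketch, assuming the three checkbox markers from the example (`[ ]`, `[~]`, `[x]`) and that a short piece of text is enough to identify the task line. `task_state` and `can_start` are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration.

# Print the state of the first task line containing <text>:
# to-do, doing, done, or unknown if no task line matches.
task_state() {
  local file="$1" text="$2" line
  line=$(grep -F -- "$text" "$file" | head -n 1 || true)
  case "$line" in
    "- [ ]"*) echo "to-do" ;;
    "- [~]"*) echo "doing" ;;
    "- [x]"*) echo "done" ;;
    *)        echo "unknown" ;;
  esac
}

# Succeed only if the task exists and nobody has started or finished it.
can_start() {
  [ "$(task_state "$1" "$2")" = "to-do" ]
}
```

This does not lock the file, so two agents checking at the same instant could still collide; it only covers the common case of an agent restarting or arriving later.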


6.6 Honesty / Claims Logging

Agents present inferences with the same confidence as verified facts by default. The honesty pattern forces explicit tagging.

| Tag | Meaning |
|---|---|
| Verified | Sourced from code, docs, tool output, or the engineer’s explicit statement |
| Inferred | Deduced or assumed; must include a one-sentence reason |

Log format (honesty/<session-slug>.md):

| Claim | Tag | Source / Reason |
|-------|-----|-----------------|
| FooService uses async throughout | Verified | Inspected FooService.cs lines 12–89 |
| Team follows CQRS | Inferred | Observed command/query folder split; no ADR found |
| ConfigureAwait(false) required here | Verified | Team convention doc + .editorconfig rule |

Before stating a technical fact in output, the agent asks: “Did I read this from the codebase, or did I infer it?” The honesty log makes that distinction visible and auditable. This matters more the longer the pipeline runs; errors compound, and untagged inferences become “facts” by the third agent in the chain.
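
If agents append to the log through a script rather than free-form edits, the tagging rules can be enforced mechanically. The following is a sketch with a hypothetical `log_claim` helper. It assumes the three-column table format shown above and rejects rows with an unknown tag or an empty source/reason; it does not escape `|` characters inside a claim.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration.
# Usage: log_claim <log-file> <claim> <Verified|Inferred> <source-or-reason>

log_claim() {
  local file="$1" claim="$2" tag="$3" why="$4"
  case "$tag" in
    Verified|Inferred) ;;
    *) echo "log_claim: tag must be Verified or Inferred" >&2; return 2 ;;
  esac
  if [ -z "$why" ]; then
    echo "log_claim: a source or reason is required" >&2
    return 2
  fi
  # Write the table header the first time the log is used.
  if [ ! -s "$file" ]; then
    mkdir -p "$(dirname "$file")"
    printf '| Claim | Tag | Source / Reason |\n|-------|-----|-----------------|\n' > "$file"
  fi
  printf '| %s | %s | %s |\n' "$claim" "$tag" "$why" >> "$file"
}
```

A call such as `log_claim honesty/feature-baz.md "Team follows CQRS" Inferred "Observed command/query folder split; no ADR found"` appends one row; an untagged or unexplained claim fails loudly instead of slipping through.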


7. Best Practices

Agent Design

| Rule | Rationale |
|---|---|
| Single responsibility per agent | Easier to test, debug, and replace independently |
| Explicit hard constraints | Prevents agents drifting into out-of-scope actions |
| No live-system writes without authorization | Agents must never push to prod, drop tables, or force-push autonomously |
| Declare only the tools the agent needs | Principle of least privilege; reduces blast radius |
| State the output format in instructions | Downstream agents and engineers need predictable structure |
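
The least-privilege rule is easy to audit if you can list which agents declare a given tool. A rough sketch, assuming the `tools:` array sits on a single line with double-quoted entries, as in the examples in this guide; `agents_with_tool` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration.
# Usage: agents_with_tool <agents-dir> <tool-name>

# Print the file name of every agent whose 'tools:' line lists the given tool.
agents_with_tool() {
  local dir="$1" tool="$2" file
  for file in "$dir"/*.agent.md; do
    [ -f "$file" ] || continue
    if grep -E '^tools:' "$file" | grep -qF "\"$tool\""; then
      basename "$file"
    fi
  done
}
```

Running it for `editFiles` and `runCommands` gives a short list of the agents that can change things, which is the list worth reviewing most carefully.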

Skill Design

| Rule | Rationale |
|---|---|
| Trigger conditions must be explicit in description | Agents need to know precisely when to load the skill |
| Self-contained: assume no prior context | The agent reads the skill cold; it can’t rely on ambient knowledge |
| Knowledge only, no action verbs | Skills describe; agents act |
| Keep under ~300 lines | Beyond that, split into multiple focused skills |

Team Design

| Rule | Rationale |
|---|---|
| One orchestrator per pipeline | Multiple orchestrators create routing ambiguity |
| Explicit escalation paths per agent | Each agent must know who owns out-of-scope questions |
| No circular delegation | A→B→A causes infinite loops |
| Orchestrator owns synthesis | The value is unified output, not raw sub-agent dumps |

Maintenance

| Rule | Rationale |
|---|---|
| Version-control all .agent.md and SKILL.md files | Changes are auditable; rollback is possible |
| Update description whenever behavior changes | Orchestrators route based on descriptions |
| Delete agents that aren’t used | Stale agents confuse new team members and routing logic |
| Prune memory files on long-running tasks | Memory drifts; outdated context causes wrong decisions |
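
Because the data folders in this guide are named by date, pruning can start with a simple listing of everything older than a cutoff. This is a sketch, assuming the `<YYYY-MM-DD>` directory naming used earlier; `stale_data_dirs` is a hypothetical name, and it only lists candidates, leaving the actual deletion to you.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration.
# Usage: stale_data_dirs <data-root> <cutoff YYYY-MM-DD>

# Print the names of date-named subdirectories strictly older than the cutoff.
# ISO dates sort correctly as plain strings, so no date arithmetic is needed.
stale_data_dirs() {
  local root="$1" cutoff="$2" dir name
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue
    name=$(basename "$dir")
    case "$name" in
      [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
      *) continue ;;   # skip anything that is not a date folder
    esac
    if [ "$name" \< "$cutoff" ]; then
      printf '%s\n' "$name"
    fi
  done
}
```

Review the output of `stale_data_dirs ~/.copilot/data 2026-04-01` before removing anything; a memory file with open questions may still be worth keeping.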

8. Quick Reference

File Type Cheatsheet

| File | Location | Purpose |
|---|---|---|
| `*.agent.md` | `.copilot/agents/` (user or workspace) | Defines an AI agent: role, tools, constraints |
| `SKILL.md` | `.copilot/skills/<name>/` | On-demand domain knowledge package |
| `*.instructions.md` | `.copilot/instructions/` or `.github/` | Always-on context, scoped by applyTo glob |
| `*.prompt.md` | `.copilot/prompts/` | Reusable prompt templates; simpler than agents |

Tool Reference

| Tool | What it does |
|---|---|
| codebase | Semantic search across the workspace |
| editFiles | Read and write files in the workspace |
| search | Text/regex search across files |
| runCommands | Execute shell commands; declare explicitly, use sparingly |
| findTestFiles | Locates test files associated with a given source file |
| changes | Reads the current git diff / staged changes |
| usages | Finds all references to a symbol across the codebase |
| problems | Reads VS Code’s Problems panel: compiler errors, lint warnings |
| fetch | Makes HTTP requests to external URLs |
| agent | Delegates to another named agent from the current conversation context |
| runTasks | Triggers VS Code tasks defined in tasks.json |

YAML Frontmatter Fields

| Field | Required | Description |
|---|---|---|
| name | Yes | Agent identifier. Used in @AgentName invocation. No spaces. |
| description | Yes | When and why to use this agent. Used by orchestrators for routing. |
| model | No | LLM backend: claude-sonnet-4, gpt-4o. Defaults to workspace setting. |
| tools | No | Array of permitted tools. Omit = no tool access. |
| applyTo | No | (Instruction files only) Glob pattern scoping automatic injection. |

If you want all of this pre-built and themed, the Agentic Aesir repo has a full 30-agent team ready to clone. If you want to understand what you’re cloning first, this post is the answer to that question.