Developer
Implementation, TDD, bug fix.
The Developer is the persona that writes, fixes, and evolves code. In an AI-native SDLC, the Developer operates a stack of validated primitives, not a code editor.
Executive summary
The Developer turns an approved specification into working, tested, reviewed code that ships to production. In an AI-native SDLC, the Developer operates inside the Implementation phase with a fixed set of primitives: one implementation agent, four slash prompts, scoped instructions, schema-validated hooks, and a curated list of validated MCPs. The primary outputs are code changes, passing test suites, pull requests with traceable context, and updated documentation.
Role and responsibilities
Think of the Developer like a structural engineer on a construction site. The architect hands over drawings that satisfy the zoning constraints. The engineer does not rewrite the drawings, but they also do not execute them mechanically: they choose the concrete mix, the rebar layout, and the sequence of pours that make the structure stand up. In an AI-native SDLC the specification, architecture decisions, and acceptance criteria are upstream artifacts, and the Developer is accountable for translating them into code that survives production without rework.
Primary responsibilities:
- Implement features described in SPECIFICATION.md using the EARS requirements and Given-When-Then acceptance criteria
- Practice Test-Driven Development end-to-end, starting from the failing test suggested by the spec
- Fix bugs with the understand-reproduce-fix-verify loop, never jumping to the fix
- Review code from peers and AI agents with equal rigor
- Update the CODEMAP.md and developer docs whenever a public API changes
- Keep dependency hygiene, resolve vulnerabilities within SLA
- Operate the Implementer agent and the /implement, /fix-bug, /tdd, /refactor prompts
Jobs to be done
- As a Developer, I want to convert an approved spec into a merged pull request within one working day, so that the team keeps a daily delivery cadence.
- As a Developer, I want the AI agent to write the failing test first, so that every feature has a machine-verifiable acceptance criterion.
- As a Developer, I want to reproduce a production bug in a local test, so that the fix is protected against regression.
- As a Developer, I want to refactor without changing behavior, so that the code base stays coherent as it grows.
- As a Developer, I want the PR description to be auto-generated from my changes, so that reviewers have full context without asking me.
- As a Developer, I want to know which dependency upgrade will break which test, so that security patches land without manual triage.
Pain points before AI-native
- Spec drift. The feature in the ticket is not the feature that shipped. Without a machine-readable spec linked to tests, every sprint silently redefines scope.
- Copy-paste debugging. Bugs are fixed by pattern matching on the stack trace instead of by reproducing the root cause. The same class of bug returns every quarter.
- Review fatigue. Reviewers cannot hold the whole system in their head, so they rubber-stamp or nitpick, never both at the same time.
- Test gaps invisible until production. Coverage reports lie because they count lines, not branches or behaviors. Risk lives in the untested 15 percent.
- Documentation lag. The README.md and API docs describe last quarter’s architecture. New team members ramp slowly and ask senior engineers the same questions every week.
AI-native daily workflow
The Developer operates a fixed loop each day. The loop uses GitHub Copilot primitives inside Visual Studio Code and Claude Code at the terminal, plus a small catalog of validated MCPs for external context.
Morning setup
- Pull the latest main and rebase the feature branch.
- Open the repo in Visual Studio Code. GitHub Copilot Chat loads the AGENTS.md and the scoped .github/instructions/*.instructions.md.
- Run /audit-context from the Technical Lead kit (installed as a dependency) to confirm the context budget is under threshold.
- Read the active ticket, open the linked SPECIFICATION.md, confirm the EARS requirements and acceptance criteria.
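The last setup step, confirming the EARS requirements, can be partly mechanized. The sketch below is a minimal illustration, not part of any kit: it parses only the event-driven EARS pattern (WHEN ..., THE ... SHALL ...), and the names EarsRequirement and parseEventDrivenEars are hypothetical.

```typescript
// Hypothetical shape for one parsed event-driven EARS requirement.
interface EarsRequirement {
  trigger: string;  // text after WHEN
  system: string;   // text between THE and SHALL
  response: string; // text after SHALL
}

// Parses only the event-driven EARS pattern:
//   WHEN <trigger>, THE <system> SHALL <response>.
// Returns null for any line that does not fit, so a caller can flag it for review.
function parseEventDrivenEars(line: string): EarsRequirement | null {
  const match = /^WHEN\s+(.+?),\s+THE\s+(.+?)\s+SHALL\s+(.+?)\.?$/i.exec(line.trim());
  if (match === null) return null;
  return { trigger: match[1], system: match[2], response: match[3] };
}

const parsed = parseEventDrivenEars(
  "WHEN a user submits a rental claim with a valid contract ID, " +
    "THE system SHALL return the claim status within 300 ms."
);
console.log(parsed);
```

A line that returns null is not necessarily wrong; it may use another EARS pattern (ubiquitous, state-driven, unwanted behavior) that this sketch does not cover.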
Core work cycle
Each work cycle is a single unit of change, typically 1 to 4 hours of focused work.
- Spec to failing test. Invoke /tdd with the acceptance criteria. The Implementer agent writes a failing test that encodes the Given-When-Then and refuses to proceed until the test is committed.
- Implement. Invoke /implement. The agent writes the minimum code to pass the failing test. Copilot inline completions handle boilerplate; the developer owns decisions.
- Self-review. Run the test suite, lint, type-check. If any hook fires, fix before moving on. Hooks are zero-token governance.
- Refactor. Invoke /refactor to improve structure without changing behavior. The agent runs the test suite before and after to prove behavioral equivalence.
- Pull request. The PR description is composed from the commit messages and the linked spec. Copilot Code Review and the Quality Reviewer agent scan the diff.
Bug cycle
When a bug is reported, the Developer invokes /fix-bug, which runs the understand-reproduce-fix-verify loop:
- Understand. Read the error, the stack trace, and the related code. The agent summarizes the hypothesis.
- Reproduce. Write a failing test that reproduces the bug. No fix is allowed before this step.
- Fix. Minimum change to make the failing test pass.
- Verify. Run the full test suite, not just the new test. Confirm no regression.
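The ordering rule in the loop above ("no fix is allowed before this step") can be expressed as a small guard. This is a sketch under assumptions: BugFixLoop is a hypothetical tracker, not the actual /fix-bug implementation, and it only records progress and rejects out-of-order steps.

```typescript
// The four steps of the loop, in the only order the sketch allows.
const STEPS = ["understand", "reproduce", "fix", "verify"] as const;
type Step = (typeof STEPS)[number];

// Hypothetical tracker: records progress and rejects skipped steps.
class BugFixLoop {
  private completed = 0; // number of steps finished so far

  // Marks `step` as done, or throws if an earlier step was skipped.
  complete(step: Step): void {
    if (this.completed >= STEPS.length) {
      throw new Error("Loop already finished");
    }
    const expected = STEPS[this.completed];
    if (step !== expected) {
      throw new Error(`Cannot run "${step}" before "${expected}"`);
    }
    this.completed += 1;
  }

  // True once all four steps have been completed in order.
  get done(): boolean {
    return this.completed === STEPS.length;
  }
}

const loop = new BugFixLoop();
loop.complete("understand");
try {
  loop.complete("fix"); // skipping "reproduce" is rejected
} catch (error) {
  console.log((error as Error).message); // Cannot run "fix" before "reproduce"
}
```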
End of day
- Push the branch. GitHub Actions runs the CI pipeline.
- Update the ticket with the link to the PR and the tests that encode the acceptance criteria.
- If the feature touched a public API, verify the CODEMAP.md and the generated docs are up to date.
Recommended primitives
Agents
| Agent | File | Purpose |
|---|---|---|
| implementer | .github/agents/implementer.agent.md | Implementation, TDD, bug fixing with understand-reproduce-fix-verify |
The Implementer agent uses claude-sonnet-4-6 by default. Its tools are read, edit, search, grep, glob, and bash. Extended thinking is disabled because iterative implementation tasks tend to lose quality in long deliberation loops.
Prompts
| Command | File | Purpose |
|---|---|---|
| /implement | .github/prompts/implement.prompt.md | Implement a feature against a spec, minimum code to pass the test |
| /fix-bug | .github/prompts/fix-bug.prompt.md | Four-step bug fix loop, never skips reproduction |
| /tdd | .github/prompts/tdd.prompt.md | Write the failing test first, enforced |
| /refactor | .github/prompts/refactor.prompt.md | Behavior-preserving structural improvement |
Instructions
Scoped applyTo reduces token cost by approximately 68 percent compared to global instructions.
| Scope (applyTo) | File | Purpose |
|---|---|---|
| src/**/*.ts,src/**/*.tsx | .github/instructions/typescript.instructions.md | TypeScript conventions, strict mode, no any |
| tests/**/* | .github/instructions/tests.instructions.md | AAA pattern, meaningful names, no brittle snapshots |
| **/*.sql | .github/instructions/sql.instructions.md | Migrations are up-and-down, no schema drift |
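To make the scoping in the table concrete, the sketch below shows how a comma-separated applyTo value could select files. It is a deliberately simplified glob matcher written for illustration (only * and ** are supported); the matching rules Copilot actually applies may differ, and globToRegExp and appliesTo are hypothetical names.

```typescript
// Converts a glob with `*` and `**` into an anchored regular expression.
// Simplified on purpose: no braces, no `?` wildcard, no negation.
function globToRegExp(glob: string): RegExp {
  const source = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except *
    .replace(/\*\*\//g, "\u0000")          // placeholder for "**/": zero or more directories
    .replace(/\*\*/g, "\u0001")            // placeholder for a bare "**": anything
    .replace(/\*/g, "[^/]*")               // "*": anything inside one path segment
    .replace(/\u0000/g, "(?:.*/)?")
    .replace(/\u0001/g, ".*");
  return new RegExp(`^${source}$`);
}

// True when `path` matches any comma-separated pattern in `applyTo`.
function appliesTo(applyTo: string, path: string): boolean {
  return applyTo
    .split(",")
    .some((glob) => globToRegExp(glob.trim()).test(path));
}

console.log(appliesTo("src/**/*.ts,src/**/*.tsx", "src/claims/status.controller.ts")); // true
console.log(appliesTo("src/**/*.ts,src/**/*.tsx", "tests/claims/status.spec.ts"));     // false
console.log(appliesTo("**/*.sql", "db/migrations/001_init.sql"));                      // true
```

Only the instruction files whose pattern matches the file being edited would be loaded, which is the mechanism behind the token saving claimed above.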
Skills
Skills are lazy-loaded, so the developer can install many and pay tokens only for the ones that trigger.
- tdd-enforcer: refuses to write implementation code if the failing test is missing
- dep-risk-scan: calls the GitHub MCP to read Dependabot alerts and CodeQL results on every dependency upgrade
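The rule behind tdd-enforcer can be sketched as a gate over the latest test run. This is an illustration only: the TestResult shape and the tddGate function are assumptions, not the skill's real interface.

```typescript
// Hypothetical summary of one test, as a runner adapter might report it.
interface TestResult {
  name: string;
  status: "passed" | "failed";
}

type GateDecision =
  | { allowed: true; failingTests: string[] }
  | { allowed: false; reason: string };

// Allows implementation work only when at least one test is currently failing.
function tddGate(results: TestResult[]): GateDecision {
  const failingTests = results
    .filter((result) => result.status === "failed")
    .map((result) => result.name);
  if (failingTests.length === 0) {
    return { allowed: false, reason: "No failing test found; write the failing test first." };
  }
  return { allowed: true, failingTests };
}

console.log(tddGate([{ name: "returns claim status", status: "passed" }]));
console.log(tddGate([{ name: "returns status under 300 ms", status: "failed" }]));
```

A gate like this cannot tell whether the failing test is the right one for the spec; that judgment stays with the developer.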
Hooks
Hooks cost zero LLM tokens. They are the strongest governance layer.
- pre-commit: lint, type-check, secret scan
- post-commit: regenerate CODEMAP.md if public API surfaces changed
- pre-push: run the fast test lane
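As one example of why hooks need no LLM tokens, the secret-scan part of a pre-commit hook can be plain pattern matching. The sketch below is illustrative: the two patterns are a tiny sample rather than a maintained rule set, and scanForSecrets is a hypothetical helper, not a real tool's API.

```typescript
// Illustrative patterns only; a real scanner ships a much larger rule set.
const SECRET_PATTERNS: { name: string; pattern: RegExp }[] = [
  { name: "AWS access key ID", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  { name: "Private key header", pattern: /-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----/ },
];

interface Finding {
  file: string;
  line: number; // 1-based line number
  rule: string;
}

// Scans one file's content line by line and reports every rule that matches.
function scanForSecrets(file: string, content: string): Finding[] {
  const findings: Finding[] = [];
  content.split("\n").forEach((text, index) => {
    for (const { name, pattern } of SECRET_PATTERNS) {
      if (pattern.test(text)) {
        findings.push({ file, line: index + 1, rule: name });
      }
    }
  });
  return findings;
}

// The key below is a made-up placeholder, not a real credential.
const staged = 'const region = "eu-west-1";\nconst key = "AKIAEXAMPLEEXAMPLE12";';
console.log(scanForSecrets("src/config.ts", staged));
```

In a real hook, a non-empty findings list would translate into a non-zero exit code, which is what makes Git abort the commit.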
Validated MCPs
Every MCP below is registered in the MCP catalog. Do not reference any MCP that is not in the catalog.
| MCP | Status | Use in this persona |
|---|---|---|
| GitHub MCP Server | Official | Read the repo, manage PRs and issues, read Actions runs |
| Microsoft Learn Docs MCP | Official | Fetch current Microsoft documentation when implementing on Azure stacks |
| Azure MCP Server | Official (Microsoft) | Pull Application Insights and Azure Monitor errors into the fix-bug loop; query Azure resources during implementation |
| Azure DevOps MCP Server | Official (Microsoft) | Read the active work item from Azure Boards, update it after PR merge (when the team uses Azure DevOps instead of GitHub Issues) |
| Playwright MCP | Official (Microsoft) | Drive end-to-end tests against the running feature |
| Microsoft 365 Agents SDK MCP | Official (Microsoft) | Integrate the feature with Teams, Copilot, and Microsoft 365 surfaces when the product requires it |
Real examples
Scenario A: implement a new endpoint
Input: SPECIFICATION.md contains the EARS requirement WHEN a user submits a rental claim with a valid contract ID, THE system SHALL return the claim status within 300 ms.
Invocation: /tdd followed by /implement.
Expected output:
- A failing integration test tests/claims/returns-status-under-300ms.spec.ts that asserts response time and payload shape.
- A new route handler src/claims/status.controller.ts with minimal code to pass the test.
- A PR titled feat(claims): return claim status within 300 ms linked to the spec section and the new test.
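The test in Scenario A might look roughly like the sketch below. Everything named here is an assumption made so the example runs on its own: ClaimStatusResponse is a guessed payload shape, and getClaimStatus is a stand-in for the real route handler. In the red phase of TDD the stand-in would not exist yet, so the check would fail.

```typescript
// Hypothetical payload shape; the real contract comes from SPECIFICATION.md.
interface ClaimStatusResponse {
  claimId: string;
  status: "open" | "approved" | "rejected";
}

// Stand-in for the route handler so the sketch is self-contained.
async function getClaimStatus(contractId: string): Promise<ClaimStatusResponse> {
  return { claimId: `claim-for-${contractId}`, status: "open" };
}

// Given a valid contract ID, when the status is requested,
// then the payload has the expected shape and arrives within the budget.
async function claimStatusMeetsSpec(budgetMs: number): Promise<boolean> {
  const start = Date.now();
  const response = await getClaimStatus("contract-123");
  const elapsedMs = Date.now() - start;

  const statusOk =
    response.status === "open" ||
    response.status === "approved" ||
    response.status === "rejected";
  const shapeOk = typeof response.claimId === "string" && statusOk;

  return shapeOk && elapsedMs <= budgetMs;
}

claimStatusMeetsSpec(300).then((ok) => console.log(ok ? "PASS" : "FAIL"));
```

A single in-process call says little about production latency; a real integration test would hit the running endpoint and likely assert on a percentile over several requests.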
Scenario B: fix a production bug
Input: An Application Insights alert (via Azure Monitor) reports a null pointer in ContractService.findById triggered by concurrent requests.
Invocation: /fix-bug.
Expected output:
- A failing unit test tests/contracts/find-by-id-concurrency.spec.ts that reproduces the race condition.
- A fix in ContractService that introduces optimistic locking, with no other behavioral change.
- A PR titled fix(contracts): eliminate race in ContractService.findById linked to the Application Insights incident and the new test.
- A post-merge resolution of the Application Insights alert with the PR URL recorded in the incident timeline.
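Optimistic locking, the technique named in the fix, can be sketched with a version counter. The code below is a minimal in-memory illustration under assumptions: ContractStore and the Contract fields are hypothetical, and a real ContractService would enforce the version check inside the database write rather than in application memory.

```typescript
interface Contract {
  id: string;
  holder: string;
  version: number; // incremented on every successful write
}

// Hypothetical in-memory store used only to show the version check.
class ContractStore {
  private readonly contracts = new Map<string, Contract>();

  insert(contract: Contract): void {
    this.contracts.set(contract.id, { ...contract });
  }

  // Returns a copy, or undefined when the contract does not exist,
  // so callers must handle the missing case explicitly.
  findById(id: string): Contract | undefined {
    const found = this.contracts.get(id);
    return found ? { ...found } : undefined;
  }

  // Writes only if the caller read the latest version;
  // otherwise the caller has to re-read and retry.
  update(changed: Contract): Contract {
    const current = this.contracts.get(changed.id);
    if (current === undefined) {
      throw new Error(`Unknown contract ${changed.id}`);
    }
    if (current.version !== changed.version) {
      throw new Error(`Stale version ${changed.version}, current is ${current.version}`);
    }
    const next: Contract = { ...changed, version: current.version + 1 };
    this.contracts.set(next.id, next);
    return { ...next };
  }
}

const store = new ContractStore();
store.insert({ id: "c1", holder: "Ada", version: 1 });
const first = store.findById("c1")!;
const second = store.findById("c1")!;
store.update({ ...first, holder: "Grace" }); // succeeds, version becomes 2
try {
  store.update({ ...second, holder: "Linus" }); // rejected: read version 1 is stale
} catch (error) {
  console.log((error as Error).message); // Stale version 1, current is 2
}
```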
Scenario C: behavior-preserving refactor
Input: A monolithic orders.service.ts of 1,200 lines needs to be split into cohesive modules.
Invocation: /refactor.
Expected output:
- The full test suite runs green before the refactor.
- orders.service.ts is split into orders/pricing.ts, orders/validation.ts, orders/persistence.ts with identical public surface.
- The full test suite runs green after the refactor.
- A PR titled refactor(orders): split service into pricing, validation, persistence with a diff summary and a test parity report.
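A test parity report of the kind mentioned in Scenario C could be produced by comparing per-test outcomes before and after the refactor. The sketch below assumes a hypothetical TestRun map from test name to outcome; compareRuns is not an existing tool.

```typescript
type Outcome = "passed" | "failed" | "skipped";
type TestRun = Record<string, Outcome>; // test name -> outcome

interface ParityReport {
  equivalent: boolean; // true when no test changed outcome or disappeared
  changed: string[];   // tests whose outcome differs between runs
  missing: string[];   // tests present before but absent after
  added: string[];     // tests that exist only after the refactor
}

// Compares two runs of the same suite, taken before and after a refactor.
function compareRuns(before: TestRun, after: TestRun): ParityReport {
  const changed: string[] = [];
  const missing: string[] = [];
  for (const name of Object.keys(before)) {
    if (!(name in after)) {
      missing.push(name);
    } else if (before[name] !== after[name]) {
      changed.push(name);
    }
  }
  const added = Object.keys(after).filter((name) => !(name in before));
  return {
    equivalent: changed.length === 0 && missing.length === 0,
    changed,
    missing,
    added,
  };
}

const report = compareRuns(
  { "prices an order": "passed", "validates an order": "passed" },
  { "prices an order": "passed", "validates an order": "failed" }
);
console.log(report.equivalent, report.changed); // false [ 'validates an order' ]
```

Parity is evidence of unchanged behavior only as far as the suite covers that behavior, which is why the mutation score in the KPI table matters for refactors.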
Anti-patterns
- Skipping the failing test. Writing the implementation first and adding a test that happens to pass defeats TDD. Mitigation: the tdd-enforcer skill refuses to generate implementation code when no failing test exists.
- Trusting coverage percentage as the signal for quality. Line coverage is a vanity metric. Mitigation: track mutation score or branch coverage, and include negative-path assertions in every test file.
- Letting Copilot choose naming without context. Names hallucinated from patterns outside the repo produce inconsistent code. Mitigation: scope instructions with applyTo and teach Copilot the domain vocabulary.
- One-shot large refactors. Refactors that touch dozens of files at once cannot be reviewed safely. Mitigation: split into a sequence of small, behavior-preserving commits, each green on the test suite.
- Ignoring hooks. A pre-commit hook that fails is a gift, not a blocker. Mitigation: treat hook output as the first review; fix before committing.
KPIs and impact metrics
The Developer persona is evaluated with a mix of DORA, SPACE, and Agentic DevOps metrics.
| Metric | Baseline (manual) | Target (agentic) | Measurement |
|---|---|---|---|
| PR lead time | 3 days | < 1 day | GitHub API |
| Deployment frequency | Weekly | Multiple per day | GitHub Deployments |
| Change failure rate | 20 percent | < 5 percent | Application Insights or incidents post deploy |
| Mean time to restore | 4 hours | < 1 hour | Incident tracker |
| Test suite reliability | 85 percent | > 99 percent | Flake rate |
| Mutation score | Unknown | > 70 percent | Stryker, Pitest, or equivalent |
| Rework rate | 30 percent | < 10 percent | Percent of merged code rewritten within 30 days |
| Token efficiency | N/A | < 1M tokens per merged PR | Copilot usage report |
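Three of the percentages in the table reduce to simple ratios. The sketch below shows one plausible way to compute them; the function names are hypothetical, and the mutation score is deliberately simplified to killed mutants over all mutants, whereas tools such as Stryker apply finer-grained rules about which mutants count.

```typescript
// Percentage helper that returns 0 instead of dividing by zero.
function percent(part: number, whole: number): number {
  return whole === 0 ? 0 : (part * 100) / whole;
}

// Change failure rate: deployments that caused a failure, over all deployments.
function changeFailureRate(failedDeployments: number, totalDeployments: number): number {
  return percent(failedDeployments, totalDeployments);
}

// Simplified mutation score: mutants killed by the suite, over all mutants.
function mutationScore(killedMutants: number, totalMutants: number): number {
  return percent(killedMutants, totalMutants);
}

// Rework rate: merged lines rewritten within 30 days, over all merged lines.
function reworkRate(rewrittenLines: number, mergedLines: number): number {
  return percent(rewrittenLines, mergedLines);
}

console.log(changeFailureRate(1, 20)); // 5
console.log(mutationScore(70, 100));   // 70
console.log(reworkRate(300, 1000));    // 30
```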
Maturity in four levels
| Level | Name | Markers |
|---|---|---|
| L1 | Manual | Copy-paste from Stack Overflow, no standard prompt, no scoped instructions, no MCPs |
| L2 | Assisted | GitHub Copilot autocomplete only, no agent, AGENTS.md missing or generic |
| L3 | Augmented | One Implementer agent, four slash prompts, scoped instructions, one or two MCPs, TDD workflow |
| L4 | Agentic | Full primitives kit, hooks enforced, validated MCPs in the catalog only, PR narrative auto-generated, maturity scorecard above 80 percent |
Integration with other personas
Handoffs:
- From Technical Lead: routing table, scoped instructions, AGENTS.md, project baseline
- From Software Architect: CODEMAP.md, IMPLEMENTATION_PLAN.md with parallel markers, API contracts
- To QA Engineer: merged PR with passing tests, test matrix updated
- To Tech Writer: updated CODEMAP.md, new public API surfaces, changelog entry
- To SRE: deployment-ready artifact, feature flag configuration, runbook updates
Glossary
- Agent: a configured LLM role with tools, instructions, and a defined output shape. Lives in .github/agents/<name>.agent.md.
- Prompt: a reusable slash command that invokes an agent with a specific task. Lives in .github/prompts/<name>.prompt.md.
- Instructions: scoped guidance applied by pattern match on file paths via applyTo. Lives in .github/instructions/<name>.instructions.md.
- Skill: a lazy-loaded capability that activates on keyword match. Costs tokens only when triggered.
- Hook: a zero-token rule enforced at a specific lifecycle event (pre-commit, post-commit, pre-push, pre-merge).
- MCP: Model Context Protocol server that exposes external systems (GitHub, Azure, Azure DevOps, etc.) to the agent.
- EARS: Easy Approach to Requirements Syntax. Format used in SPECIFICATION.md.
- TDD: Test-Driven Development. Write the failing test first, then the minimum code to pass it.
- CODEMAP: A generated document that describes the program skeleton for the LLM and for humans.
References
- GitHub Copilot documentation — authoritative source for Copilot features, agent mode, and instructions
- Claude Code overview — Anthropic’s agentic CLI used for long-running tasks
- Spec-Kit open source reference — spec-driven development scaffolding
- Model Context Protocol specification — the protocol that binds agents to external systems
- Effective context engineering for AI agents, Anthropic — canonical guidance for token-efficient agent design
- DORA metrics research — the empirical foundation behind four key metrics for software delivery
- SPACE framework, Microsoft Research — developer productivity dimensions beyond velocity