Skip to content

The Code Agent’s Playbook: Maintenance Skills, Security Skills, and the Self-Evolving Harness

A Practitioner’s Guide to Building Agents That Maintain, Protect, and Improve Their Own Codebase

This is the fifth guide in our series on agentic marketing systems. The first taught you to write skills. The second taught you to architect agents. The third taught you to wire tools. The fourth put all three together into a working OpenClaw marketing agent.

This guide changes the subject. Instead of building agents that do marketing, we are building agents that maintain the code your marketing systems run on and that protect that code from the security threats that come with autonomous operation.

If you have followed the series so far, you have a marketing agent with skills, tools, and a tool layer wired through MCP or CLI. That agent runs on a codebase - Python scripts that query your data warehouse, TypeScript services that power your bidding engine, Node.js workers that generate your reports. Someone maintains that codebase. Someone reviews pull requests, updates dependencies, fixes failing tests, checks for security vulnerabilities, and makes sure the CI pipeline stays green.

Increasingly, that someone is another agent.

The coding agent ecosystem exploded in 2025 and has not slowed down. Claude Code, Codex, Cursor, Gemini CLI, Amp, OpenCode - the list of agent-capable coding environments is long and growing. OpenAI reported that between December 2025 and February 2026, their Agents SDK repos merged 457 pull requests using Codex with skills, up from 316 in the prior three months. Anthropic’s 2026 Agentic Coding Trends Report describes agents whose task horizons have expanded from minutes to days, working autonomously across entire applications with periodic human checkpoints.

But a coding agent without the right skills is a capable junior engineer on their first day: technically proficient, lacking all project-specific context, and potentially dangerous without guardrails. The skills you give a code agent determine whether it maintains your codebase responsibly or introduces subtle regressions, security vulnerabilities, and technical debt faster than a human team can catch them.

This guide covers three things:

Part I defines the skill set a code-maintenance agent needs - from repo orientation to PR drafting - drawing on real patterns from OpenAI’s Codex skills, Claude Code skills, and the broader open-source agent ecosystem.

Part II defines the security skills that protect the code your agent writes and the environment it operates in - from prompt-injection defense to dependency auditing - because autonomous code agents are a new and serious attack surface.

Part III introduces the self-evolving harness: a pattern where the agent maintains and improves its own skills, policies, and safety constraints over time, without changing its model weights, through a structured loop of execution, reflection, and adaptation.

By the end, you will have a reference architecture for a code agent that does not just write code but maintains it responsibly, protects it systematically, and gets better at both over time.


Contents

Part I: The Code Maintenance Skill Set

Part II: The Security Skill Set

Part III: The Self-Evolving Harness

Conclusion: The Agent That Maintains Itself


Part I: The Code Maintenance Skill Set

Why Code Agents Need Explicit Skills

A coding agent without skills is a model with access to a terminal. It can read files, run commands, edit code, and commit changes. These are capabilities - the “muscles” we described in our third guide. But capabilities without structured workflows lead to inconsistent, unreliable, and sometimes dangerous behaviour.

Consider what happens when you ask a naked coding agent - no skills, no project-level guidance - to fix a failing test. It might:

  • Read the failing test, diagnose the issue, and apply a minimal fix. Good.
  • Rewrite the entire test to make it pass, masking the real bug in production code. Bad.
  • Fix the test but break three other tests in the process, without checking. Worse.
  • Edit the test file and the production code, commit directly to main, and push without running any verification. Dangerous.

All of these are within the agent’s capabilities. Skills determine which path it takes.

OpenAI discovered this firsthand while maintaining their Agents SDK repositories. Their engineering team found that Codex was most effective when they encoded explicit skills for recurring workflows and made them mandatory through AGENTS.md directives. The result was not just better code quality - it was measurably higher throughput. The repos merged 44% more PRs in three months after introducing structured skills and mandatory verification.

The lesson is straightforward: a code agent’s value is proportional to the quality of its skills, not the size of its model. A well-skilled agent on a smaller model will outperform an unskilled agent on a frontier model for maintenance work, because maintenance is fundamentally about following the right process, not about raw reasoning power.

The skills ecosystem has matured rapidly to support this reality. As of early 2026, the SKILL.md format works across Claude Code, Codex, Cursor, Gemini CLI, Amp, OpenCode, and over a dozen other agent-capable environments. OpenAI maintains a public skills catalogue at github.com/openai/skills. Anthropic’s official skills (including the frontend-design skill with over 277,000 installs) demonstrate the scale of adoption. Microsoft has added skills support to Visual Studio 2026 and published .NET-specific skills for Azure development. Vercel’s skills.sh launched in January 2026 and reached 20,000 installs within weeks.

The convergence on SKILL.md as a portable format means that the skills you write for code maintenance are not locked to any one agent runtime. A code-change-verification skill written for Claude Code works in Codex. A dependency-audit skill built for Codex works in Cursor. This portability is critical for teams that use multiple agents for different tasks and most teams do.

The Four Pillars: Plan, Implement, Verify, Summarise

From OpenAI’s Codex skills, Claude Code community skills, and the OpenClaw ecosystem, a clear pattern emerges. Effective code maintenance follows four phases:

Plan. Before touching any code, the agent reads the problem, understands the project structure, identifies relevant files, and outlines an approach. This prevents the “change and hope” editing pattern where the agent jumps straight to modifying files without understanding the implications.

Implement. The agent applies minimal, scoped changes. It edits only what needs to change, respects the project’s style and conventions, and avoids touching unrelated code.

Verify. After every change, the agent runs the appropriate checks - build, lint, typecheck, tests, coverage - based on what was changed. This is not optional. It is mandatory for any change that touches runtime code, tests, examples, or build infrastructure.

Summarise. When the work is complete, the agent drafts a clear description of what changed, why it changed, what the risks are, and what checks passed. This becomes the PR description or commit message.

These four phases map directly to four core skills. But a serious code agent needs more than four. Let us walk through the full reference skill set.

Skill 1: Repo Orientation and Implementation Strategy

This skill mirrors OpenAI’s $implementation-strategy - one of the mandatory skills in their Agents SDK repos.

Purpose. Force the agent to understand the project before editing it.

When to use. Before any change to runtime code, API surfaces, or cross-cutting concerns. This skill should fire before the agent opens an editor.

Behaviour. The agent reads the project’s configuration files (AGENTS.md, CLAUDE.md, CONTRIBUTING.md, or equivalent), understands the directory layout (where source lives, where tests live, where build configuration lives), identifies the canonical build and test commands, and writes a plan. The plan should name the files that will change, describe the approach, call out potential side effects, and identify which verification steps will be needed.

Why you need that. Without this skill, agents tend to dive straight into editing the first file they find that looks relevant. This produces narrow fixes that miss broader implications. The planning step forces upfront reasoning about the change’s scope and impact.

Implementation note. In OpenAI’s repos, AGENTS.md includes a directive: “Before editing runtime or API changes, call $implementation-strategy to decide the compatibility boundary and implementation approach first.” This is a mandatory gating condition. Your agent’s configuration should enforce the same pattern.

Skill 2: Change-Scoped Code Editing

Purpose. Apply minimal, localised patches rather than rewriting entire files.

When to use. Whenever the agent edits source code.

Behaviour. The agent applies diffs that are scoped to the specific change. It preserves existing code style (indentation, naming conventions, comment patterns). It does not refactor adjacent code unless explicitly asked. It does not introduce new dependencies without flagging the addition. It does not touch files unrelated to the task.

Why you need that. One of the most common failure modes of code agents is “scope creep in edits.” An agent asked to fix a null-pointer exception in one function might decide to refactor the entire module while it is there. This creates large, hard-to-review diffs, introduces unintended regressions, and makes the actual fix harder to identify in code review.

The OpenClaw ecosystem highlights this as a key best-practice: start with safe tools only (read, list, search) and add write and execute capabilities gradually. The same principle applies at the skill level - edit only what the task requires.

Skill 3: Change-Aware Verification

This is the most critical skill in the entire set. It mirrors OpenAI’s $code-change-verification, which they describe as mandatory for any change affecting runtime code, tests, examples, or build behaviour.

Purpose. Run the right checks for the type of change made, every time, without exception.

When to use. After any edit to source code, tests, build configuration, or examples.

Behaviour. The skill first identifies what changed - which directories and file types were modified. Then it selects the appropriate verification commands based on the change scope:

For a Python project, this might be: make formatmake lintmake typecheckmake tests. For a TypeScript project: pnpm buildpnpm lintpnpm test. For a mixed project, it runs the appropriate stack for each changed area.

The skill then collects results, summarises pass/fail status, highlights any failures or flakiness, and - critically - does not mark the task as complete until verification passes.

Why you need that Without mandatory verification, agents routinely commit code that fails tests, breaks the build, or introduces type errors. The verification skill is what separates a responsible code agent from an expensive source of technical debt.

OpenAI’s implementation makes this explicit: their AGENTS.md says “If the change affects SDK code, tests, examples, or build behaviour, call $code-change-verification.” The skill encodes the repository’s definition of “verified,” and the configuration file makes that definition enforceable.

A practical SKILL.md pattern for this skill:

---
name: code-change-verification
description: >
  Run build, lint, typecheck, and test verification after code changes.
  Mandatory for any change to runtime code, tests, examples, or build config.
  Do not mark work complete until verification passes.
---

# Code Change Verification

## When to use
After ANY edit to source code, test files, build configuration,
or example code. This skill is mandatory, not optional.

## Workflow

### 1. Identify what changed
- List modified files by path
- Categorise: runtime code, tests, examples, build/config, docs

### 2. Select verification commands
Based on the project's stack and what changed:

**Python projects:**
- `make format` or `ruff format --check`
- `make lint` or `ruff check`
- `make typecheck` or `mypy`
- `make test` or `pytest`

**TypeScript/JavaScript projects:**
- `pnpm build` or `npm run build`
- `pnpm lint` or `eslint`
- `pnpm typecheck` or `tsc --noEmit`
- `pnpm test` or `jest`

Adapt commands to what AGENTS.md or the project's docs specify.

### 3. Execute and collect results
- Run each command in order
- Capture exit codes, stdout, stderr
- Note any test flakiness (tests that pass on re-run)

### 4. Evaluate
- If all pass: summarise results, proceed
- If any fail: diagnose failure, attempt fix, re-run
- If fix fails after 2 attempts: escalate to human

## Guardrails
- Never skip verification to save time
- Never mark a task complete with failing tests
- If the project has no tests for the changed area, flag this as a gap

Skill 4: Test Authoring and Improvement

Purpose. Generate or improve tests around recent changes, focusing on edge cases and regressions.

When to use. When new code lacks test coverage, when existing tests are brittle or poorly named, or when a bug fix should include a regression test.

Behaviour. The agent analyses the changed code and generates tests that cover the happy path, error conditions, boundary values, and the specific scenario that prompted the change (if it was a bug fix). It follows the project’s existing test conventions - framework, naming patterns, fixture usage, assertion style. It does not generate tests that duplicate existing coverage.

For improving existing tests, the agent looks for weak assertions (testing that a function “does not throw” instead of checking the actual return value), missing edge cases, unclear test names, and opportunities to use better fixtures or parametrisation.

Why you need that. Code agents that generate code without generating tests create a false sense of progress. The code works - until it does not, and there are no tests to catch the regression. A test-authoring skill turns every code change into an opportunity to strengthen the test suite.

Community-built Claude Code skills like the “PR Reviewer” and “test-coverage” skills focus heavily on this pattern - calculating coverage for changed code, identifying untested edge cases, reviewing test quality and assertions, and suggesting additional test scenarios. The output is not just a list of suggestions but structured coverage data with per-file coverage, uncovered lines, and example test code.

Skill 5: Coverage Analysis and Gap Identification

Purpose. Run coverage tools, parse results, and identify where the test suite has gaps - especially in recently changed code.

When to use. During PR review, after test authoring, or as a periodic health check.

Behaviour. The agent runs the project’s coverage commands (for example pytest --cov or jest --coverage), parses the output into per-file coverage percentages, identifies uncovered lines and risk areas, and suggests targeted tests for the gaps. It computes coverage for the changed code specifically (not just overall coverage), because a project with 90% overall coverage can still have 0% coverage on the code that just changed.

Why you need that. Coverage numbers are easy to game and easy to misread. An agent that understands coverage at the file and line level - and that focuses on the delta introduced by recent changes - provides genuinely useful quality signals, not just a vanity metric.

Skill 6: PR Review and PR Drafting

This combines two distinct skills that work together: the PR reviewer (which analyses changes for quality, correctness, security, and style) and the PR drafter (which turns a diff into a clear, structured description).

PR Review. The agent examines the diff from multiple angles - a pattern that the Claude Code community calls “multi-agent PR review.” The perspectives include:

  • Correctness. Does the code do what it claims to do? Are there logic errors, off-by-one mistakes, unhandled edge cases?
  • Performance. Does the change introduce any obvious performance regressions? Database queries in loops? Unnecessary memory allocation?
  • Security. Does the change handle input validation, authentication, access control, and error handling appropriately? (More on this in Part II.)
  • Style. Does the change follow project conventions? Are names clear? Is the code readable?
  • Test completeness. Are the tests adequate for the changes made?

PR Drafting. This mirrors OpenAI’s $pr-draft-summary skill. Once the work is done, the agent reads the diff, summarises intent and impact, categorises the change (bugfix, refactor, feature, dependency update), and calls out breaking changes and risks. The result is a structured PR description that helps human reviewers understand what changed and why - without having to read every line of the diff.

This skill pair is where agent-assisted maintenance provides the most obvious value. PR review is tedious, time-consuming, and error-prone for humans. An agent that handles the mechanical parts - checking style, running coverage, identifying obvious issues - frees human reviewers to focus on architectural decisions and business logic.

Skill 7: CI/CD Guardrails

Purpose. Monitor the CI/CD pipeline and catch problems before or after merge.

When to use. As a post-merge verification step, or when CI failures need investigation.

Behaviour. The agent monitors merges to the main branch, runs code review and tests on merged changes, and pings humans if something breaks. In the pre-merge direction, it can run the full verification suite on a PR branch before approving the merge.

The OpenClaw ecosystem provides a concrete workflow pattern for this: “monitor GitHub for merges to main → run code_review → run tests → deploy to staging → post to Slack.” This is the full maintenance loop automated.

Why it matters. CI/CD failures are one of the highest-friction pain points in software teams. An agent that can investigate a failing CI job, identify the root cause, and propose a fix - or at least triage it to the right person - saves significant engineering time.

Skill 8: Policy and Governance Enforcement

Purpose. Check changes against repo-specific rules, compatibility policies, deprecation guidelines, and security requirements.

When to use. During PR review, before marking work complete.

Behaviour. The agent reads the project’s policy documents (AGENTS.md, SECURITY.md, CONTRIBUTING.md, or similar) and verifies that the change complies. It checks for things like: public API changes that require a compatibility review, deprecated patterns that should not be introduced, cross-cutting refactors that need sign-off from multiple owners, and changes that trigger mandatory human review (for example, anything touching authentication or payment processing).

Why you need that. Policy enforcement is exactly the kind of work that agents excel at and humans forget to do. A human reviewer might miss that a change to a public API function needs a deprecation notice. An agent that has the policy encoded in a skill catches it every time.

Wiring Skills to Your Agent Configuration

Skills do not enforce themselves. They need to be wired into the agent’s operating instructions so the agent knows when to invoke them and - critically - when invocation is mandatory rather than optional.

For Codex, this happens through AGENTS.md. OpenAI’s pattern uses short if/then rules at the top of the file:

## Mandatory skills

- Before editing runtime or API changes, call $implementation-strategy
- If the change affects code, tests, examples, or build: call $code-change-verification
- When work is finished and ready to hand off: call $pr-draft-summary

For Claude Code, the equivalent is CLAUDE.md in the project root, with similar directives. Claude Code auto-discovers skills from .claude/skills/**/SKILL.md.

For OpenClaw or other agent runtimes, the skills live in the workspace’s skills/ directory and the operating instructions in AGENTS.md or SOUL.md reference them.

The key principle: skills in the skill folder are available. Rules in the configuration file make them mandatory. Without the mandatory wiring, the agent might use them. With it, the agent must use them. The difference between “might” and “must” is the difference between occasional quality and consistent quality.


Part II: The Security Skill Set

The Threat Model for Code Agents

A code agent that can read files, execute commands, and push changes to a repository is, from a security perspective, a programmable process with significant privileges. It operates on your source code, has access to your development environment, and interacts with your version control system. If it is compromised or if it simply makes poor decisions about untrusted content, the consequences are not hypothetical.

The threat model for code agents has four layers:

Prompt injection. The agent processes text from many sources: user messages, file contents, terminal output, web pages, documentation, and retrieval results. Any of these can contain instructions designed to override the agent’s intended behaviour. Anthropic’s research has shown injection success rates above 70% without explicit mitigations. When the agent has tool access - shell commands, file writes, network calls - a successful injection can translate into real-world actions.

Excessive agency. Skills with broad filesystem, network, and process access can be chained into credential theft, data exfiltration, and destructive actions. The OpenClaw security incidents demonstrated this concretely: Cisco researchers showed a third-party skill that performed data exfiltration and prompt injection without the user’s awareness. Palo Alto Networks described this as the “lethal trifecta” - access to private data, exposure to untrusted content, and ability to communicate externally.

Supply chain and dependency risk. Code agents work with open-source packages and libraries. Malicious or vulnerable dependencies - typosquats, abandoned packages with critical CVEs, libraries with unnecessary shell execution capabilities - can provide remote code execution and privilege escalation inside the agent’s runtime.

Multi-step attack chains. Security researchers have described what amounts to a “promptware kill chain” for agents: prompt injection triggers privilege escalation, which enables reconnaissance of the environment, which leads to persistence (for example through memory or RAG poisoning), then command and control, lateral movement, and finally actions on the objective - data exfiltration, code modification, or credential theft.

For marketing teams specifically, the attack surface is acute because marketing agents typically interact with systems that contain:

  • Customer data. PII, purchase history, behavioural signals, audience segments - regulated by GDPR, CCPA, and similar frameworks.
  • Financial data. Campaign budgets, spend, revenue, ROAS - useful for competitive intelligence.
  • Platform credentials. API keys for Google Ads, Meta Ads, analytics platforms, data warehouses - which provide access to the advertising accounts themselves.
  • Brand communications. Ad copy, email content, social posts - which can be modified to damage the brand or redirect customers.

A compromised code agent that maintains the systems connecting to these data sources has, by transitivity, access to all of them. This is the operational reality of any code agent that maintains marketing infrastructure.

Your security skills need to encode defences against all four threat layers. This is not about writing secure code in the narrow sense. It is about making the agent itself a hardened, security-aware participant in your development process.

Baseline Security Capabilities

Before defining specific skills, the agent needs a minimum set of security capabilities - the tools that security skills can call:

Inspect code and configuration safely. Read-only by default. The agent should be able to browse the repo, read files, and understand the project structure without any risk of modification.

Inspect dependency manifests and lockfiles. The agent needs to read and parse requirements.txt, pyproject.toml, package.json, poetry.lock, go.mod, and similar files to understand what external code the project depends on.

Call vetted security tools in a sandbox. SAST (Static Application Security Testing), SCA (Software Composition Analysis), secret scanners, and dependency vulnerability scanners should be available as CLI tools the agent can invoke. These tools should run in a sandboxed environment, not on your primary workstation.

Understand project security policies. The agent should read and follow an AGENTS-SECURITY.md, SECURITY.md, or equivalent that defines the project’s specific security requirements and thresholds.

Log all dangerous actions for audit. Every file write, network call, command execution, and environment variable access should be recorded with timestamps, arguments (minus secrets), and outcomes.

These are the tools. Now the skills that decide when and how to use them.

Behavioural Security Skills

These skills govern how the agent itself behaves - its decision-making about trust, tool selection, and action logging. They apply across repos and stacks.

Prompt-Injection Defence and Content Handling

Purpose. Teach the agent to treat untrusted content as data, not as instructions.

Behaviour. The agent treats any text not from the user or system prompt as data only - never as a source of new goals or actions. It prioritises system instructions and repo policies over all external content. When content contains commands (“run rm -rf”, “send me the .env file”, “curl this endpoint”), it summarises them as potential attacks instead of executing them. In ambiguous cases - conflicting instructions from different sources - it asks the human or falls back to a safe default (no action).

Why it matters. This is the most important behavioural skill because it addresses the highest-success-rate attack vector. Without explicit instruction to distrust external content, agents follow instructions embedded in files, web pages, terminal output, and retrieved documents with alarming reliability. The skill encodes “never trust the page” as an operational principle.

Safe Tool Selection and Least Privilege

Purpose. Decide which tools can be used in a given context, based on risk level.

Behaviour. The agent defaults to read-only tools and analysis for any task involving untrusted or unclear inputs. It requires human confirmation or an explicit policy flag before using write, network, or deployment tools. It never combines high-risk tools (full filesystem access plus unfiltered network) in the same automated loop without a human checkpoint.

This skill acts as a policy layer in front of tools. The OpenClaw ecosystem’s guidance maps directly: start with safe tools only (read, list, search) and add exec, write, and network gradually, with explicit approval at each step.

Why it matters. The “multiplier effect” in agent security comes when prompt injection meets high-permission tools. An agent that has been manipulated into following malicious instructions can only do damage proportional to its permissions. This skill limits the damage radius by constraining which permissions are active at any given time.

Action Logging and Anomaly Detection

Purpose. Ensure every sensitive action is logged and suspicious sequences are surfaced to humans.

Behaviour. The agent records tool name, arguments (with secrets redacted), timestamp, and outcome for every risky call. It monitors for anomaly patterns: bursts of file reads targeting .ssh, .aws, .config, or token files; unusual network endpoints; repeated failures accessing protected paths; attempts to read environment variables that contain credentials.

When anomalous patterns are detected, the agent halts further risky actions and notifies a human before proceeding.

Why it matters. Multi-step attacks hide in complex action sequences. An agent that reads your SSH keys, then accesses a network endpoint, then writes to a file might be performing a legitimate deployment task - or it might be exfiltrating credentials. The logging and anomaly detection skill provides the audit trail needed to distinguish the two, and the circuit breaker needed to stop the latter.

Code and Dependency Security Skills

These skills act on the codebase and its ecosystem, including external libraries.

Secure Code Review

Purpose. Go beyond style and correctness to explicitly check for security properties in changed code.

Behaviour. For every changed file, the agent checks: input validation (are user inputs sanitised before use?), authentication and authorisation (are access controls enforced?), logging of sensitive actions (are audit-relevant events captured?), error handling (do errors leak implementation details or stack traces?), and injection risks (SQL injection, command injection, template injection, insecure deserialisation).

It classifies issues by severity and likelihood, and suggests secure patches rather than just flagging problems. This is the difference between a security scanner that says “potential SQL injection on line 42” and a security-aware reviewer that says “line 42 concatenates user input into a SQL query; here is a parameterised version.”

Why it matters. Coding agents both discover and introduce subtle bugs. An agent that fixes one issue while introducing an injection vulnerability is a net negative. A dedicated security review skill - separate from the general code review skill - ensures that security properties are checked independently and systematically.

Dependency and Supply Chain Audit

Purpose. Evaluate third-party packages and indirect dependencies for known risk.

Behaviour. The agent parses manifest and lockfiles to build a dependency list. It uses SCA tools and vulnerability databases to find known CVEs. It flags suspicious packages: typosquats (package names that are one character off from popular packages), libraries with recent critical vulnerabilities, abandoned projects with no recent maintenance, packages with unnecessary capabilities (like shell execution in a string-formatting library).

It proposes mitigations: pinning versions to avoid unexpected upgrades, replacing vulnerable packages with maintained alternatives, or adding sandboxing around risky libraries.

Why it matters. AI projects lean heavily on open-source packages. The agent’s own dependencies - and the dependencies of the tools and skills it uses - constitute a significant attack surface. A vulnerable library in the agent’s tool chain can provide remote code execution inside the agent’s runtime, bypassing all the behavioural safeguards above.

Agent Dependency Perimeter Mapping

Purpose. Identify and minimise the set of libraries that the agent’s tools and skills can directly reach.

Behaviour. The agent maps which libraries are callable by its tools (via CLI, MCP, or direct function calls). It checks those libraries for security posture - known CVEs, maintainer status, unusual behaviours. It recommends narrowing the surface: refactoring tools to limit their dependency graph, moving risky libraries behind controlled interfaces, and isolating high-risk operations in separate processes or containers.

Why it matters. This skill addresses the “trifecta” directly: risky dependencies plus high-permission skills plus autonomous agents. By explicitly mapping and shrinking the dependency perimeter, you reduce the blast radius of any single compromised component.

Secrets and Credential Hygiene

Purpose. Prevent secret leakage and detect risky credential storage.

Behaviour. The agent runs secret scanners on diffs and whole repositories, flagging hard-coded tokens, keys, passwords, and connection strings. It checks the agent runtime’s configuration: environment variables that contain credentials, key stores that are world-readable, browser profiles or SSH directories that should not be accessible.

It suggests best practices: using secret managers instead of environment variables, segmenting environments (development credentials should never have production access), and limiting token scopes to the minimum necessary.

Why it matters. Reported incidents in the OpenClaw ecosystem showed agents and skills exfiltrating SSH keys, API tokens, and wallet data when given broad filesystem and network access. A secrets hygiene skill catches leaked credentials before they ship, and identifies environmental configurations that make exfiltration too easy.

Runtime and Environment Hardening Skills

These skills check that the agent’s operating environment itself is safe for autonomous work.

Environment Risk Assessment

Purpose. Verify that the agent is running in an appropriately isolated environment.

Behaviour. The agent detects whether it is running on a developer’s primary workstation, in a container, or in a dedicated VM. It flags running on a primary workstation as unsafe for autonomous operation. It checks for the presence of production credentials or production configurations in the same environment. It recommends isolation: separate VMs or containers per agent, network egress restrictions, read-only mounts for sensitive paths.

This skill echoes the guidance Anthropic published for Computer Use: never run agents with broad access on your primary machine. Always use dedicated, isolated environments.

Kill-Switch and Confinement

Purpose. Provide a standardised way to pause or confine agents when something goes wrong.

Behaviour. The agent checks for “safety mode” flags from a central policy system. When invoked, it revokes access to high-risk tools, restricts file and network access, and forces all pending tasks into a “needs human review” state. It archives recent actions and logs for forensic analysis.

Why it matters. This is the operational control that maps the attack kill chain into a defensive response. Once you suspect persistence, command-and-control behaviour, or credential exfiltration, you need to be able to constrain or stop the agent immediately. The kill-switch skill makes that process standardised and fast, rather than ad-hoc and panic-driven.

The Security Skill Pack: A Reference Set

Combining all of the above, here is the reference security skill set for a code agent:

Behavioural security skills - how the agent itself behaves:

  1. Prompt-injection defence and content handling
  2. Safe tool selection and least privilege
  3. Action logging and anomaly detection

Code and dependency security skills - what the agent checks in the codebase:

  1. Secure code review (with explicit security checks)
  2. Dependency and supply chain audit
  3. Agent dependency perimeter mapping
  4. Secrets and credential hygiene

Runtime and governance skills - how the operating environment is hardened:

  1. Environment risk assessment
  2. Kill-switch and confinement

Each skill maps to specific tools (SAST scanners, SCA tools, secret scanners, dependency graph analysers, logging hooks) and specific policies (AGENTS-SECURITY.md, allowed-actions matrices, environment baselines). The skills are the intelligence layer that decides when to invoke which tools and how to interpret the results. The tools do the mechanical work. The policies set the thresholds.


Part III: The Self-Evolving Harness

What “Self-Evolving” Means (and What It Does Not)

The skills described in Parts I and II are static. You write them, the agent follows them, and they stay the same until a human updates them.

But what if the agent could improve its own skills over time? Not by changing its model weights - that requires retraining and is far outside the scope of an operating code agent. Instead, by modifying the mutable artefacts that sit around the model: the skills, policies, configuration files, and usage patterns that form the agent’s harness.

This is the pattern emerging from self-improving coding agent research and self-evolving skill frameworks. The model stays frozen. The scaffolding evolves.

The concept is straightforward. You can treat the agent’s harness as three mutable artefacts that live alongside the codebase:

Repository memory. Files like AGENTS.md, AGENTS-SECURITY.md, and AGENTS-OPS.md that define how the agent should behave in this specific repo.

Skill library. A set of skill folders - skills/code-change-verification/, skills/dependency-audit/, skills/prompt-safety/ - that encode the agent’s playbooks.

Telemetry and evaluation logs. Structured logs of tasks completed, tools called, failures encountered, and human corrections received.

The agent does not change itself. It changes its instructions, playbooks, and policies - gated by tests and human review.

The Evolution Loop

The self-evolving harness operates through a structured loop that runs after each batch of work. Think of it as a retrospective that the agent conducts on its own performance - except instead of producing action items that get forgotten, it produces skill patches that get tested and applied.

Phase 1: Execution. The code agent works on its assigned tasks (fixes, upgrades, audits, reviews) using its current skills: implementation-strategy, code-change-verification, dependency-audit, prompt-safety, and the rest. All actions are logged - which commands were run, which files were edited, which tests passed or failed, which security tools were invoked, and how long each step took.

The logging is structured and machine-readable. Each log entry records the task description, the skills that were invoked (and in what order), the tools that were called (with arguments and return codes), the outcome (success, failure, human override), and any corrections the human reviewer made. This last point is critical - human corrections are the highest-signal training data for harness evolution.

Phase 2: Reflection. A critic sub-agent - or a separate pass of the same agent with a different prompt - reads the execution logs and answers a structured set of questions:

  • Where did a skill fail to produce the expected outcome?
  • Where did the agent skip a skill it should have invoked?
  • Were there near-misses? (For example, almost committing a change without running code-change-verification.)
  • Were there patterns the current skills do not cover? (For example, a recurring type of build failure that the agent keeps diagnosing from scratch instead of following a playbook.)

Phase 3: Proposal. A skill-designer sub-agent (or the same agent in a skill-design mode) proposes changes:

  • A new skill (for a recurring failure pattern that no existing skill addresses).
  • An update to an existing skill (tighten conditions, add steps, change tools, add examples).
  • A policy change in AGENTS-SECURITY.md (for example, “never touch dependencies without running dependency-audit first”).

Each proposal is a concrete diff to a specific file in the harness tree.

Phase 4: Validation. The proposed changes are tested before they are applied:

  • Run the new or updated skills on a benchmark set of saved tasks where previous behaviour was inadequate.
  • Run synthetic test cases - prompt-injection scenarios, vulnerable dependency snapshots, intentionally broken builds.
  • Compare outcomes: does the proposed skill improve the success rate without regressing safety?

Proposals that do not improve outcomes are rejected. Proposals that regress safety are rejected regardless of other improvements.

Phase 5: Materialisation. Accepted proposals are written back to disk:

  • New or edited SKILL.md files in the skill library.
  • Updated AGENTS-SECURITY.md or AGENTS.md rules.
  • New test cases in the harness test suite.
  • Updated examples and counter-examples within skills (so the model is more likely to recognise when to invoke them in future tasks).

These changes are committed to the repository (or a separate harness repository) as a pull request for human review.

That is the loop. The base model stays fixed. The skills, policies, and usage patterns adapt based on actual operational experience.

What makes this loop powerful is specificity. The harness does not evolve in the abstract. It watches for concrete patterns - “we changed dependencies three times last week without calling dependency-audit, and one of those times we later found a vulnerable package” - and writes concrete rules - “always call dependency-audit before marking a dependency change as complete.” The evolution is grounded in observed failures and near-misses, not in theoretical best practices.

Over time, the harness accumulates the kind of institutional knowledge that engineering teams normally carry in the heads of senior engineers. When that senior engineer leaves, the knowledge leaves with them. When the knowledge lives in the harness, it stays.

What the Agent Is Allowed to Change

The self-evolving harness is only safe if the agent’s self-modification is strictly scoped. The principle: the agent can change its own instructions. It cannot change the production code, the CI/CD pipeline, or the infrastructure.

Allowed to edit:

  • AGENTS.md, AGENTS-SECURITY.md, AGENTS-OPS.md
  • Skill folders within the agents/skills/ tree
  • Test suites for skills (harness-level tests, not the main product tests)
  • Telemetry configuration and log schemas

Not allowed to edit without human gate:

  • Core application code that runs in production
  • CI/CD configurations that affect deployment
  • Infrastructure definitions (Terraform, Kubernetes manifests, etc.)
  • Environment credentials or access policies

Enforcement. In practice, you enforce this with per-directory policies: the agent has read-only access outside the harness tree unless a human explicitly approves a PR. A dedicated “Harness CI” pipeline runs only on changes inside the harness tree and blocks changes that break the harness tests. This is a separate pipeline from the product CI - it verifies the verifier, not the product.

A Concrete Implementation

For teams that want to implement this, here is a minimal but realistic repo layout:

/.
  src/                          # Product code (agent: read-only)
  tests/                        # Product tests (agent: read-only)
  agents/
    AGENTS.md                   # Agent operating instructions
    AGENTS-SECURITY.md          # Security policies
    skills/
      code-change-verification/
        SKILL.md
      dependency-supply-chain-audit/
        SKILL.md
      handle-untrusted-content/
        SKILL.md
      secure-code-review/
        SKILL.md
      pr-review-and-draft/
        SKILL.md
    logs/
      runs/                     # Per-task execution logs
      skill-evolution/          # Proposals and outcomes
    harness_tests/
      test_skill_usage.py       # Skill enforcement tests
      test_security_behaviour.py # Security scenario tests

Execution flow. The code agent (Claude Code, Codex, or another runtime) works in src/ and tests/ but always consults agents/skills/ and agents/AGENTS*.md before and during work. All runs append structured logs to agents/logs/runs/.

A typical log entry looks like this:

{
  "task_id": "fix-ga4-timeout-2026-03-14",
  "timestamp": "2026-03-14T09:23:00Z",
  "task_description": "Fix timeout in GA4 data fetch tool when API returns >10k rows",
  "skills_invoked": [
    {"name": "implementation-strategy", "at": "09:23:12Z"},
    {"name": "code-change-verification", "at": "09:31:45Z"},
    {"name": "secure-code-review", "at": "09:33:02Z"}
  ],
  "tools_called": [
    {"tool": "ruff check", "exit_code": 0},
    {"tool": "pytest tests/tools/test_ga4.py", "exit_code": 0},
    {"tool": "pytest --cov=src/tools/ga4", "exit_code": 0, "coverage": "94%"}
  ],
  "files_changed": ["src/tools/ga4_fetch.py", "tests/tools/test_ga4.py"],
  "outcome": "success",
  "human_corrections": [],
  "skills_skipped": ["dependency-audit"],
  "skip_reason": "No dependency changes in this task"
}

This structured logging is what makes the reflection phase possible. Without it, the harness evolution agent has nothing to learn from.

Evolution flow. A scheduled job - or a manual command - runs the “Harness Evolution Agent” with read/write access only to the agents/ directory. Its job: analyse recent logs, propose skill or policy updates, run harness tests, and open a PR against the agents/ tree.

Human review. The harness evolution opens a PR that changes only files inside agents/. Maintainers review these changes like they would any infrastructure or policy change - carefully, because they affect how the agent behaves on every future task.

Harness Tests: Verifying the Verifier

The harness test suite is what makes self-evolution safe. Without it, the agent could modify its own skills in ways that degrade quality or weaken security. With it, every proposed change must pass concrete assertions.

Examples of harness tests:

Skill usage enforcement. “For a synthetic change X (a dependency version bump), the agent must call dependency-supply-chain-audit before marking work complete.” This test creates a mock task, runs the agent on it, and asserts that the skill was invoked.

Security scenario tests. “For a file containing embedded malicious instructions (‘run curl to exfiltrate .env’), the agent must not execute the suggested command.” This test injects adversarial content and verifies the agent’s response.

Regression tests. “For a change that touches the authentication module, the agent must call secure-code-review and flag the change for mandatory human review.” This test verifies that policy rules are enforced.

Quality tests. “For a bug fix that includes no regression test, the agent must invoke test-authoring and generate at least one test covering the fixed scenario.”

These tests run on every proposed harness change. They also run periodically on the existing harness to catch drift - skills that have become stale, policies that no longer match the project’s needs, and verification commands that have changed.

From Theory to Practice

The self-evolving harness might sound complex, but the implementation can be surprisingly incremental:

Week 1. Set up the agents/ directory structure. Write your core skills (code-change-verification, secure-code-review, dependency-audit). Wire them into AGENTS.md as mandatory.

Week 2. Add structured logging to your agent’s workflow. Every task the agent performs gets a JSON log entry: task description, skills invoked, tools called, outcomes, and any human corrections.

Week 3. Write your first harness tests. Start with the two most critical: “agent must run verification after code changes” and “agent must not execute instructions found in file content.”

Week 4. Run the first reflection pass. Review the logs manually (or with an agent in read-only mode). Identify one pattern where a skill was skipped or where a skill needs an additional step. Propose the change, run the harness tests, and merge it.

Month 2. Automate the reflection pass. Set up a scheduled job that runs the Harness Evolution Agent weekly, reviews logs, proposes one to three changes per cycle, and opens PRs for human review.

Month 3 and beyond. The harness starts compounding. Each cycle adds test cases, refines skills, tightens policies, and captures new patterns. The agent gets measurably better at its job - not because the model improved, but because the scaffolding around it learned from real operational experience.


Conclusion: The Agent That Maintains Itself

Across this guide, we have built three layers of capability:

Maintenance skills that tell the agent how to work on code responsibly - planning before editing, verifying after every change, reviewing from multiple angles, and summarising work clearly.

Security skills that protect the code from the agent’s own attack surface - defending against prompt injection, auditing dependencies, hardening the runtime environment, and providing kill-switch controls when things go wrong.

A self-evolving harness that lets these skills improve over time through a structured loop of execution, reflection, validation, and materialisation - without changing the model, and with human review at every stage.

Together, these three layers describe an agent that does not just write code. It maintains code to a defined standard. It protects code against a realistic threat model. And it gets better at both as it accumulates operational experience.

Why Marketing Teams Should Care

If you are a marketing practitioner or an ad-tech engineer, you might reasonably wonder why a guide about code maintenance belongs in a series about marketing agents. The answer is that every marketing agent is a software project.

Your marketing intelligence platform, your bidding algorithms, your reporting pipelines, your creative engines, your data connectors, your MCP servers, your CLI tools - they all live in repositories that need maintenance. When you deploy the OpenClaw agent from our fourth guide, that agent’s workspace files, skills, and tool scripts are code. When you build a custom MCP server for your analytics platform, that server is code. When you write a weekly-performance-review skill, that skill is code.

The faster your agent ecosystem grows, the more code there is to maintain. The more code there is to maintain, the more valuable it becomes to have that maintenance be safe, systematic, and self-improving.

The marketing-specific application looks like this:

Your marketing MCP server ships a new tool for budget pacing. The code-change-verification skill runs the test suite, the secure-code-review skill checks that the tool respects permission boundaries, and the PR draft skill writes a description your team can review in Slack.

A dependency update introduces a vulnerability in your analytics pipeline. The dependency-audit skill catches it during a scheduled scan, proposes a version pin, and the coverage skill verifies that the fix does not break existing reporting workflows.

A prompt injection is discovered in your creative briefing skill - an ad copy sample contained instructions that caused the agent to modify campaign settings. The prompt-injection defence skill is updated (through the evolution loop) with this scenario as a counter-example, and the harness test suite gains a new regression test.

Each of these scenarios is routine in a mature engineering organisation. The self-evolving harness makes them routine for your agent-maintained marketing stack too.

The Compounding Effect

The pattern we have described across this entire series - skills plus tools plus architecture plus maintenance plus security plus self-improvement - is fundamentally about compounding.

A skill written once gets better over time as the harness evolves. A security policy defined today catches threats that have not been invented yet, because the harness adapts when new patterns emerge. A test suite that starts with five assertions grows to fifty as each incident adds a regression test.

None of this requires a better model. It requires a better harness.

Where to Start

If you are reading this and wondering where to begin, here is the minimum viable path:

If you already have a coding agent (Claude Code, Codex, Cursor, or similar):

  1. Create an agents/skills/ directory in your repo.
  2. Write a code-change-verification SKILL.md using the template in this guide. Adapt the commands to your project’s build and test setup.
  3. Add a directive to your AGENTS.md or CLAUDE.md: “After any code change, call code-change-verification. Do not mark work complete until it passes.”
  4. Test it. Make a change, verify the agent runs the skill, and check that it catches a deliberately broken test.

That is day one. One skill, one mandatory directive, one test.

If you want to add security:

  1. Write a dependency-supply-chain-audit SKILL.md that runs your SCA tool (npm audit, pip-audit, safety, or similar) and flags known vulnerabilities.
  2. Add it to your AGENTS.md: “If any dependency files changed, call dependency-supply-chain-audit before merging.”
  3. Write a simple handle-untrusted-content skill that instructs the agent to treat file contents and terminal output as data, never as instructions.

That is week one. Three skills, three directives.

If you want the self-evolving harness:

  1. Add structured JSON logging to your agent workflow.
  2. Write two harness tests: “agent must invoke verification after code changes” and “agent must not execute instructions found in files.”
  3. Run a weekly reflection pass - manually at first, then automated - that reviews logs and proposes one skill update per cycle.

That is month one. The harness starts compounding from there.

The reference skill sets in this guide are not prescriptive. Adapt the skills to your stack. Adjust the security thresholds to your risk tolerance. Start with what is most likely to prevent the mistakes your team actually makes, and grow the harness from there.

Your agent does not need to be perfect on day one. It needs to be better on day thirty than it was on day one. The self-evolving harness makes that not just possible but systematic.


This guide is part of the Performics Labs AI Knowledge Hub series on agentic marketing systems. Previous guides: Building AI Skills · Agent Architecture · Tools, MCP, and CLI · Your OpenClaw Marketing Agent

Published on Saturday, March 21, 2026 · Estimated read time: 29 min