Use when verifying that completed work actually works. Auto-surface during /verify mode, post-implementation review, or before claiming a task is done. Teaches the discipline of testing outcomes vs implementation, the unit/integration/smoke gradient, and what "done" actually means.
Please help me install the "verification-discipline" skill from askskill: 1. Download https://raw.githubusercontent.com/microsoft/amplifier-bundle-skills/main/skills/verification-discipline/SKILL.md 2. Save it as ~/.claude/skills/verification-discipline/SKILL.md 3. Once it is installed, reload skills and tell me it is ready to use
Unit tests verify that code-as-written behaves as-written. Smoke and integration tests verify that the system achieves the intended outcome. Those are different questions. You need both.
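A toy illustration of the difference, as a minimal sketch: `emit_event` and `count_completed` are hypothetical functions invented for this example, not part of any real pipeline. Each one passes its own unit test, yet the outcome fails once they are wired together.

```python
def emit_event(name: str) -> dict:
    """Producer (hypothetical): as written, it stores the name under 'event'."""
    return {"event": name}


def count_completed(events: list[dict]) -> int:
    """Consumer (hypothetical): as written, it reads the name from 'type'."""
    return sum(1 for e in events if e.get("type") == "branch_completed")


# Unit level: each function does exactly what its author wrote.
assert emit_event("branch_completed") == {"event": "branch_completed"}
assert count_completed([{"type": "branch_completed"}]) == 1

# Outcome level: connect producer and consumer the way the system would.
produced = [emit_event("branch_completed")]
outcome_ok = count_completed(produced) == 1
print(outcome_ok)  # False: the key mismatch only shows up end to end
```

Both unit assertions hold, so a unit-only run reports green while the user-visible count is wrong.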
"All unit tests pass" is necessary. It is rarely sufficient. A finding from the field: four consecutive integration-blocking bugs, all of which passed unit tests, all of which would have been caught by a five-minute smoke test on a fresh environment. The bugs were not exotic — they were the cost of declaring "done" too early.
Treat verification as a ladder. Skip a rung and you discover its bugs in production.
| Tier | What it verifies | Example |
|---|---|---|
| 1. Unit | Code does what I wrote it to do | pytest tests/unit/ |
| 2. Integration | Component pairs interact correctly | pytest tests/integration/, real DB |
| 3. Smoke / E2E | System achieves the user-visible outcome | Fresh DTU launch, run real pipeline, observe artifacts |
| 4. Production-equivalent | Real environment, real load, real data | Staging deployment, canary, replay traces |
Each tier catches bugs the tier below it cannot. Each tier costs more time than the tier below it. The economic choice is not "skip the expensive tiers." The economic choice is "spend five minutes on tier 3 to avoid five hours of rollback."
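One way to make the ladder mechanical is to run the tiers in order and stop at the first failure. The sketch below assumes the pytest paths from the table; the `scripts/smoke.py` command and the names `LADDER`, `climb`, and `is_done` are hypothetical and chosen only for illustration.

```python
import subprocess
from typing import Callable, Sequence

# Tier commands, cheapest first. The smoke script path is an assumption.
LADDER: list[tuple[str, list[str]]] = [
    ("unit", ["pytest", "tests/unit/"]),
    ("integration", ["pytest", "tests/integration/"]),
    ("smoke", ["python", "scripts/smoke.py"]),
]


def run_command(cmd: Sequence[str]) -> bool:
    """Run one command and report whether it exited with status 0."""
    return subprocess.run(list(cmd)).returncode == 0


def climb(
    ladder: Sequence[tuple[str, Sequence[str]]],
    run: Callable[[Sequence[str]], bool] = run_command,
) -> list[str]:
    """Run tiers in order and return the names of the tiers that passed.

    Stops at the first failing tier, so a higher rung is never reported
    as verified while a lower rung is broken.
    """
    passed: list[str] = []
    for name, cmd in ladder:
        if not run(cmd):
            break
        passed.append(name)
    return passed


def is_done(ladder: Sequence[tuple[str, Sequence[str]]], passed: list[str]) -> bool:
    """Work counts as done only when every rung of the ladder passed."""
    return passed == [name for name, _ in ladder]
```

Calling `climb(LADDER)` returns the rungs that passed, and `is_done(LADDER, passed)` stays false until the smoke tier has also gone green. The `run` parameter is injectable so the ladder logic can itself be tested without launching real processes.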
Before claiming a task is done, satisfy this checklist:
`AGENTS.md` and `.github/PULL_REQUEST_TEMPLATE.md` are satisfied.

If any box is unchecked, the work is not done. Say so explicitly.
Tests-from-outcomes differs from classic TDD: TDD writes unit tests first, while tests-from-outcomes writes the outcome assertion first.
1. Before writing implementation, write down the user-observable outcome.
"After running this pipeline, events.jsonl contains a `branch_completed`
event for each branch and no `contract_violation` events."
2. Write a test asserting that outcome. The test runs the real pipeline,
inspects the real events.jsonl, checks the real conditions.
3. Implement code until the test passes.
Both patterns are valuable. Unit-level TDD verifies internal correctness. Outcome-level testing verifies that the system behaves as the user expects. Use both.
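The outcome quoted in step 1 could be asserted roughly as follows. This is a sketch under assumptions: the source does not give the event schema, so the `type` and `branch` field names are guesses, and `load_events`, `outcome_problems`, and `run_pipeline` are hypothetical names.

```python
import json
from pathlib import Path


def load_events(path: Path) -> list[dict]:
    """Parse a JSON Lines file into a list of event dicts, skipping blank lines."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def outcome_problems(events: list[dict], branches: list[str]) -> list[str]:
    """Return the reasons the outcome was not achieved (empty list = achieved).

    Assumes each event has a 'type' field and that branch_completed events
    also carry a 'branch' field; adjust to the real schema.
    """
    completed = {
        e.get("branch") for e in events if e.get("type") == "branch_completed"
    }
    problems = [
        f"no branch_completed event for {b!r}" for b in branches if b not in completed
    ]
    violations = sum(1 for e in events if e.get("type") == "contract_violation")
    if violations:
        problems.append(f"{violations} contract_violation event(s)")
    return problems


# In an outcome-level test (run_pipeline is a hypothetical entry point):
# def test_pipeline_outcome(tmp_path):
#     run_pipeline(workdir=tmp_path)
#     events = load_events(tmp_path / "events.jsonl")
#     assert outcome_problems(events, ["a", "b"]) == []
```

Returning a list of problems rather than a bare boolean means a failing run states which branch is missing or how many violations occurred.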
…