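Helps teams run incident response quickly: severity triage, status communication and coordination, and blameless postmortem writing.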
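Copy the install instructions and let the AI complete the setup automatically · Recommended for beginners
Please help me install the "incident-response" skill from askskill: 1. Download https://raw.githubusercontent.com/anthropics/knowledge-work-plugins/main/engineering/skills/incident-response/SKILL.md 2. Save it as ~/.claude/skills/incident-response/SKILL.md 3. Once installed, reload the skills and tell me it is ready to use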
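We have an incident: production is throwing errors on a large scale. Based on the current symptoms, do an initial severity triage and list the immediate mitigation steps, the roles to pull in, and the response cadence for the next 30 minutes.
A structured incident response plan covering severity level, mitigation priorities, role assignments, and a short-term action plan.
Based on the progress below, draft an incident status update: the problem has been traced to database connection pool exhaustion, service is partially restored, 15% of requests are still failing, and the next update is in 20 minutes. Produce both an internal update and a customer-facing version.
Two clear, consistent status updates stating the current state, scope of impact, actions taken, and time of the next update.
The incident is resolved. Write a blameless postmortem from the following information: start time 10:12, recovery time 11:03, root cause was a misconfiguration that led to a cache breakdown, impact was payment API timeouts. Output the timeline, impact assessment, root cause analysis, response process, improvement items, and suggested owners.
A complete blameless postmortem document that helps the team learn from the incident and follow up on improvement actions.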
If you see unfamiliar placeholders or need to check which tools are connected, see CONNECTORS.md.
Manage an incident from detection through postmortem.
/incident-response $ARGUMENTS
/incident-response new [description] # Start a new incident
/incident-response update [status] # Post a status update
/incident-response postmortem # Generate postmortem from incident data
If no mode is specified, ask what phase the incident is in.
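The mode selection described above can be pictured as a small dispatcher. This is only an illustrative sketch: the skill itself is prompt-driven, and `parse_arguments` and `MODES` are hypothetical names that do not appear in the skill.

```python
from typing import Optional, Tuple

# The three modes listed in the usage lines above.
MODES = ("new", "update", "postmortem")


def parse_arguments(arguments: str) -> Tuple[Optional[str], str]:
    """Split the raw argument string into (mode, remaining text).

    Returns (None, text) when the first word is not a known mode, which is
    the case where the skill should ask what phase the incident is in.
    """
    text = arguments.strip()
    parts = text.split(maxsplit=1)
    if not parts or parts[0].lower() not in MODES:
        return None, text
    rest = parts[1] if len(parts) > 1 else ""
    return parts[0].lower(), rest


if __name__ == "__main__":
    for raw in ("new checkout API returning 500s", "postmortem", "checkout is down"):
        mode, rest = parse_arguments(raw)
        if mode is None:
            print("No mode given: ask what phase the incident is in.")
        else:
            print(f"mode={mode!r} details={rest!r}")
```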
┌─────────────────────────────────────────────────────────────────┐
│ INCIDENT RESPONSE │
├─────────────────────────────────────────────────────────────────┤
│ Phase 1: TRIAGE │
│ ✓ Assess severity (SEV1-4) │
│ ✓ Identify affected systems and users │
│ ✓ Assign roles (IC, comms, responders) │
│ │
│ Phase 2: COMMUNICATE │
│ ✓ Draft internal status update │
│ ✓ Draft customer communication (if needed) │
│ ✓ Set up war room and cadence │
│ │
│ Phase 3: MITIGATE │
│ ✓ Document mitigation steps taken │
│ ✓ Track timeline of events │
│ ✓ Confirm resolution │
│ │
│ Phase 4: POSTMORTEM │
│ ✓ Blameless postmortem document │
│ ✓ Timeline reconstruction │
│ ✓ Root cause analysis (5 whys) │
│ ✓ Action items with owners │
└─────────────────────────────────────────────────────────────────┘
| Level | Criteria | Response Time |
|---|---|---|
| SEV1 | Service down, all users affected | Immediate, all-hands |
| SEV2 | Major feature degraded, many users affected | Within 15 min |
| SEV3 | Minor feature issue, some users affected | Within 1 hour |
| SEV4 | Cosmetic or low-impact issue | Next business day |
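The table above can be encoded as a lookup so that a triage result always carries its expected response time. The sketch below is a minimal illustration; the inputs `service_down` and `feature_impact` are hypothetical simplifications of the criteria column, and real triage needs human judgment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Severity:
    level: str
    criteria: str
    response_time: str


# Rows copied from the severity table.
SEVERITIES = {
    "SEV1": Severity("SEV1", "Service down, all users affected", "Immediate, all-hands"),
    "SEV2": Severity("SEV2", "Major feature degraded, many users affected", "Within 15 min"),
    "SEV3": Severity("SEV3", "Minor feature issue, some users affected", "Within 1 hour"),
    "SEV4": Severity("SEV4", "Cosmetic or low-impact issue", "Next business day"),
}


def classify(service_down: bool, feature_impact: str) -> Severity:
    """Map two coarse (assumed) signals onto a severity level.

    feature_impact is one of "major", "minor", or "cosmetic".
    """
    if service_down:
        return SEVERITIES["SEV1"]
    if feature_impact == "major":
        return SEVERITIES["SEV2"]
    if feature_impact == "minor":
        return SEVERITIES["SEV3"]
    return SEVERITIES["SEV4"]


if __name__ == "__main__":
    sev = classify(service_down=False, feature_impact="major")
    print(f"{sev.level}: respond {sev.response_time}")
```

Checking the outage condition first means a full outage is never downgraded by a milder feature-impact label.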
Provide clear, factual updates at regular cadence. Include: what's happening, who's affected, what we're doing, when the next update is.
## Incident Update: [Title]
**Severity:** SEV[1-4] | **Status:** Investigating | Identified | Monitoring | Resolved
**Impact:** [Who/what is affected]
**Last Updated:** [Timestamp]
### Current Status
[What we know now]
### Actions Taken
- [Action 1]
- [Action 2]
### Next Steps
- [What's happening next and ETA]
### Timeline
| Time | Event |
|------|-------|
| [HH:MM] | [Event] |
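As a rough illustration of filling in the update template above, the following sketch renders the fields into Markdown. The function name `render_status_update` is hypothetical. The sample values reuse the connection pool scenario from the example prompt, while the title, severity level, and timestamps are assumptions made up for the demo.

```python
from typing import List, Tuple

# The four status values named in the template.
VALID_STATUSES = ("Investigating", "Identified", "Monitoring", "Resolved")


def render_status_update(
    title: str,
    severity: int,
    status: str,
    impact: str,
    last_updated: str,
    current_status: str,
    actions_taken: List[str],
    next_steps: List[str],
    timeline: List[Tuple[str, str]],
) -> str:
    """Fill in the incident update template and return it as Markdown."""
    if severity not in (1, 2, 3, 4):
        raise ValueError("severity must be between 1 and 4")
    if status not in VALID_STATUSES:
        raise ValueError(f"status must be one of {VALID_STATUSES}")
    lines = [
        f"## Incident Update: {title}",
        f"**Severity:** SEV{severity} | **Status:** {status}",
        f"**Impact:** {impact}",
        f"**Last Updated:** {last_updated}",
        "### Current Status",
        current_status,
        "### Actions Taken",
        *[f"- {action}" for action in actions_taken],
        "### Next Steps",
        *[f"- {step}" for step in next_steps],
        "### Timeline",
        "| Time | Event |",
        "|------|-------|",
        *[f"| {time} | {event} |" for time, event in timeline],
    ]
    return "\n".join(lines)


update = render_status_update(
    title="Elevated request failures",  # hypothetical title
    severity=2,                         # assumed; the scenario does not state a level
    status="Identified",
    impact="About 15% of requests are still failing",
    last_updated="14:40 UTC",           # hypothetical timestamp
    current_status="Traced to database connection pool exhaustion; service partially restored.",
    actions_taken=["Identified database connection pool exhaustion as the cause"],
    next_steps=["Continue recovery; next update in 20 minutes"],
    timeline=[("14:40", "Cause identified; service partially restored")],
)
print(update)
```

Rejecting unknown status values keeps internal and customer-facing versions on the same four-state vocabulary.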
## Postmortem: [Incident Title]
**Date:** [Date] | **Duration:** [X hours] | **Severity:** SEV[X]
**Authors:** [Names] | **Status:** Draft
### Summary
[2-3 sentence plain-language summary]
### Impact
- [Users affected]
- [Duration of impact]
- [Business impact if quantifiable]
### Timeline
| Time (UTC) | Event |
|------------|-------|
| [HH:MM] | [Event] |
### Root Cause
[Detailed explanation of what caused the incident]
### 5 Whys
1. Why did [symptom]? → [Because...]
2. Why did [cause 1]? → [Because...]
3. Why did [cause 2]? → [Because...]
4. Why did [cause 3]? → [Because...]
5. Why did [cause 4]? → [Root cause]
### What Went Well
- [Things that worked]
### What Went Poorly
- [Things that didn't work]
### Action Items
…
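For the Duration field and the Timeline table of the postmortem, a small helper can do the arithmetic and ordering. This is a sketch under stated assumptions: times are zero-padded `HH:MM` strings on the same day, and the function names and the event descriptions in the demo are hypothetical. The start and recovery times come from the example prompt (10:12 and 11:03).

```python
from datetime import datetime
from typing import List, Tuple

TIME_FORMAT = "%H:%M"


def incident_duration_minutes(start: str, end: str) -> int:
    """Return whole minutes between two same-day HH:MM times."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    if delta.total_seconds() < 0:
        raise ValueError("end time is before start time (same-day times assumed)")
    return int(delta.total_seconds() // 60)


def timeline_table(events: List[Tuple[str, str]]) -> str:
    """Render (time, event) pairs as the postmortem timeline table, oldest first."""
    rows = ["| Time (UTC) | Event |", "|------------|-------|"]
    # Zero-padded HH:MM strings sort chronologically as plain text.
    for time, event in sorted(events):
        rows.append(f"| {time} | {event} |")
    return "\n".join(rows)


if __name__ == "__main__":
    print(incident_duration_minutes("10:12", "11:03"), "minutes")
    print(timeline_table([
        ("11:03", "Service recovered"),          # hypothetical wording
        ("10:12", "Payment API timeouts begin"),  # hypothetical wording
    ]))
```

With the example times, the incident lasted 51 minutes, which is the value that would go into the Duration field.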