Statistically analyzes data distributions, trends, anomalies, and significance tests to help you reach reliable conclusions.
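Copy the install prompt and let the AI complete the setup automatically · Recommended for beginners
Please install the "statistical-analysis" skill from askskill for me:
1. Download https://raw.githubusercontent.com/anthropics/knowledge-work-plugins/main/data/skills/statistical-analysis/SKILL.md
2. Save it as ~/.claude/skills/statistical-analysis/SKILL.md
3. Once it is installed, reload the skills and tell me it is ready to use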
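Please run a statistical analysis of this survey data: start with descriptive statistics, then test whether the difference in means between the treatment and control groups is significant. State which test should be used and what the p-value means, and explain the conclusion in plain language.
An analysis report containing descriptive statistics, significance test results, an explanation of the methods, and a plain-language conclusion.
Please analyze this monthly sales data: identify outliers and months with abnormal fluctuations, describe the anomaly detection method you used, and assess whether these anomalies could distort the overall trend assessment.
A list of outliers, the basis for detecting them, an interpretation of the trend, and a note on how they affect business decisions.
I have two columns of data, ad spend and conversion rate. Please compute their correlation, assess its strength and direction, state whether it is statistically significant, and remind me that correlation does not imply causation.
The correlation coefficient, significance results, an interpretation of the relationship, and a note on the limitations of the analysis.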
Descriptive statistics, trend analysis, outlier detection, hypothesis testing, and guidance on when to be cautious about statistical claims.
Choose the right measure of center based on the data:
| Situation | Use | Why |
|---|---|---|
| Symmetric distribution, no outliers | Mean | Most efficient estimator |
| Skewed distribution | Median | Robust to outliers |
| Categorical or ordinal data | Mode | Only option for non-numeric |
| Highly skewed with outliers (e.g., revenue per user) | Median + mean | Report both; the gap shows skew |
Always report mean and median together for business metrics. If they diverge substantially, the data is skewed and the mean alone is misleading.
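To make the mean-versus-median guidance concrete, here is a minimal sketch using only the Python standard library. The `revenue_per_user` values and the 0.25 gap cutoff are assumptions chosen for illustration, not thresholds taken from the skill.

```python
from statistics import mean, median

# Hypothetical revenue-per-user values; one heavy spender skews the data.
revenue_per_user = [10, 12, 11, 13, 9, 14, 12, 500]

avg = mean(revenue_per_user)
mid = median(revenue_per_user)

# Gap between mean and median, relative to the median.
# The 0.25 cutoff is an arbitrary illustration, not a standard threshold.
relative_gap = abs(avg - mid) / mid
is_skewed = relative_gap > 0.25

print(f"mean={avg:.2f} median={mid:.2f} gap={relative_gap:.0%} skewed={is_skewed}")
```

Here the single large value pulls the mean far above the median, which is the situation where reporting the mean alone would mislead.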
Report key percentiles to tell a richer story than mean alone:
p1: Bottom 1% (floor / minimum typical value)
p5: Low end of normal range
p25: First quartile
p50: Median (typical user)
p75: Third quartile
p90: Top 10% / power users
p95: High end of normal range
p99: Top 1% / extreme users
Example narrative: "The median session duration is 4.2 minutes, but the top 10% of users spend over 22 minutes per session, pulling the mean up to 7.8 minutes."
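The percentile list above could be computed along these lines. This is a sketch with the standard library's `statistics.quantiles`; the helper name `percentile_summary` and the `session_minutes` data are hypothetical, and the `"inclusive"` method is one of two interpolation choices the function offers.

```python
from statistics import quantiles

def percentile_summary(values, points=(1, 5, 25, 50, 75, 90, 95, 99)):
    """Return the requested percentiles as a dict such as {'p50': ...}."""
    # n=100 yields 99 cut points; cut point index p-1 is the p-th percentile.
    cuts = quantiles(values, n=100, method="inclusive")
    return {f"p{p}": cuts[p - 1] for p in points}

# Hypothetical session durations in minutes (placeholder data: 1..101).
session_minutes = list(range(1, 102))
summary = percentile_summary(session_minutes)

for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

With pandas, `Series.quantile` with a list of fractions gives an equivalent summary; the standard-library version is used here only to keep the sketch dependency-free.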
Characterize every numeric distribution you analyze:
Moving averages to smooth noise:
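```python
# Assumes df is a pandas DataFrame sorted by date with a numeric 'metric' column.
# 7-day moving average (good for daily data with weekly seasonality)
df['ma_7d'] = df['metric'].rolling(window=7, min_periods=1).mean()
# min_periods=1 yields a value from the first row on, so early averages use fewer points.

# 28-day moving average (smooths weekly AND monthly patterns)
df['ma_28d'] = df['metric'].rolling(window=28, min_periods=1).mean()
```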
Period-over-period comparison:
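One way to lay out a period-over-period comparison is sketched here in plain Python. The `monthly` values and the `period_over_period` helper are hypothetical; the same idea applies to week-over-week or year-over-year comparisons by changing what counts as the previous period.

```python
# Hypothetical monthly values, oldest first.
monthly = [("2024-01", 100.0), ("2024-02", 110.0), ("2024-03", 99.0)]

def period_over_period(series):
    """Return (label, absolute change, percent change) versus the prior period."""
    rows = []
    for (_, prev), (label, curr) in zip(series, series[1:]):
        change = curr - prev
        # Percent change is undefined when the prior period is zero.
        pct = change / prev if prev != 0 else None
        rows.append((label, change, pct))
    return rows

for label, change, pct in period_over_period(monthly):
    pct_text = "n/a" if pct is None else f"{pct:+.1%}"
    print(f"{label}: {change:+.1f} ({pct_text})")
```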
Growth rates:
Simple growth: (current - previous) / previous
CAGR: (ending / beginning) ^ (1 / years) - 1
Log growth: ln(current / previous) -- better for volatile series
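The three growth formulas above translate directly into code. The function names are illustrative, and the inputs are assumed to be positive numbers.

```python
import math

def simple_growth(current, previous):
    """(current - previous) / previous"""
    return (current - previous) / previous

def cagr(beginning, ending, years):
    """Compound annual growth rate: (ending / beginning) ^ (1 / years) - 1"""
    return (ending / beginning) ** (1 / years) - 1

def log_growth(current, previous):
    """ln(current / previous)"""
    return math.log(current / previous)

# Hypothetical values: 100 -> 110 -> 121 over two years.
print(f"simple: {simple_growth(110, 100):.1%}")
print(f"CAGR:   {cagr(100, 121, 2):.1%}")
print(f"log:    {log_growth(110, 100):.4f}")
```

One property that makes log growth convenient for volatile series is that it is additive: the log growth over two consecutive periods equals the sum of the two single-period log growths, which is not true of simple growth.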
Check for periodic patterns:
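One common check for periodicity is the autocorrelation at a suspected period, for example lag 7 for daily data. The sketch below is an illustration under that assumption, not necessarily the method the skill prescribes; the `daily` series is hypothetical.

```python
from statistics import mean

def autocorrelation(values, lag):
    """Sample autocorrelation of a series at the given lag."""
    mu = mean(values)
    denom = sum((v - mu) ** 2 for v in values)
    num = sum(
        (values[i] - mu) * (values[i + lag] - mu)
        for i in range(len(values) - lag)
    )
    return num / denom

# Hypothetical daily metric with a weekly cycle: the last two days of each week are lower.
daily = [100, 102, 101, 103, 99, 60, 55] * 4

for lag in (1, 3, 7):
    print(f"lag {lag}: {autocorrelation(daily, lag):+.2f}")
```

A value near the suspected period that stands out from neighboring lags suggests a repeating pattern worth accounting for before comparing periods.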
For business analysts (not data scientists), use straightforward methods:
Always communicate uncertainty. Provide a range, not a point estimate:
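As a rough illustration of reporting a range instead of a single number, the sketch below pairs a naive point forecast with a band based on historical variability. The helper `naive_forecast_range`, the `window` and `width` parameters, and the data are all assumptions; this is a simple heuristic, not a formal prediction interval.

```python
from statistics import mean, stdev

def naive_forecast_range(history, window=3, width=2.0):
    """Return (low, point, high) for the next period.

    point: mean of the last `window` observations.
    band:  `width` sample standard deviations of the full history.
    """
    point = mean(history[-window:])
    spread = stdev(history)
    return point - width * spread, point, point + width * spread

# Hypothetical monthly values.
history = [100, 104, 98, 102, 101, 103]
low, point, high = naive_forecast_range(history)

print(f"Next period: about {point:.0f} (likely between {low:.0f} and {high:.0f})")
```

Phrasing the result as a range makes the uncertainty visible to the reader, which a bare point estimate hides.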
…