Quickly profile a new dataset's structure, quality, and distributions to inform later analysis decisions.
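Copy the install instruction and let the AI complete the setup automatically · Recommended for beginners
Please help me install the "explore-data" skill from askskill: 1. Download https://raw.githubusercontent.com/anthropics/knowledge-work-plugins/main/data/skills/explore-data/SKILL.md 2. Save it as ~/.claude/skills/explore-data/SKILL.md 3. Once it is installed, reload the skills and tell me it is ready to use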
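Please analyze the basic structure of this data table, including row count, column count, field types, missing rate, duplicate records, and the main value distribution of each column, and point out any obvious data quality problems.
A data profile report covering structural statistics, missing-value and duplicate analysis, outlier flags, and a summary of quality risks.
Help me check which fields in this data contain suspicious values, such as extreme outliers, implausible dates, inconsistent formats, or misspelled category labels, and explain them in order of risk from high to low.
A field-by-field list of anomaly findings, stating the problem type, example values, and which fixes to prioritize.
Based on the fields in this data, help me decide which dimensions and metrics are suitable for analysis, and suggest a few business questions or chart directions to explore first.
An analysis plan with usable dimensions, core metrics, priority questions, and recommended visualization directions.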
If you see unfamiliar placeholders or need to check which tools are connected, see CONNECTORS.md.
Generate a comprehensive data profile for a table or uploaded file. Understand its shape, quality, and patterns before diving into analysis.
/explore-data <table_name or file>
If a data warehouse MCP server is connected:
If a file is provided (CSV, Excel, Parquet, JSON):
If neither:
Before analyzing any data, understand its structure:
Table-level questions:
Column classification — categorize each column as one of:
Run the following profiling checks:
Table-level metrics:
All columns:
Numeric columns (metrics):
min, max, mean, median (p50)
standard deviation
percentiles: p1, p5, p25, p75, p95, p99
zero count
negative count (if unexpected)
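The numeric checks above can be sketched in plain Python. This is a minimal illustration, not the skill's own implementation: the function names `percentile` and `profile_numeric` are hypothetical, nulls are assumed to arrive as `None`, percentiles use linear interpolation, and the standard deviation is the sample (n-1) form.

```python
import statistics


def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in 0..100) over a pre-sorted, non-empty list."""
    k = (len(sorted_vals) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def profile_numeric(values):
    """Profile one numeric column; None entries are counted as nulls (an assumption)."""
    present = sorted(v for v in values if v is not None)
    profile = {"count": len(present), "null_count": len(values) - len(present)}
    if not present:
        return profile
    profile.update({
        "min": present[0],
        "max": present[-1],
        "mean": statistics.fmean(present),
        "median": statistics.median(present),
        # Sample standard deviation; undefined for a single value, so report 0.0.
        "stdev": statistics.stdev(present) if len(present) > 1 else 0.0,
        "zero_count": sum(1 for v in present if v == 0),
        "negative_count": sum(1 for v in present if v < 0),
    })
    for p in (1, 5, 25, 75, 95, 99):
        profile[f"p{p}"] = percentile(present, p)
    return profile
```

Calling `profile_numeric([0, 1, 2, 3, 4, None, -1])` reports one null, one zero, and one negative value. Whether a negative value is "unexpected" depends on the column's meaning, so the sketch only counts them.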
String columns (dimensions, text):
min length, max length, avg length
empty string count
pattern analysis (do values follow a format?)
case consistency (all upper, all lower, mixed?)
leading/trailing whitespace count
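One possible way to run the string checks above, again as a hedged sketch: `profile_strings` and `shape_of` are hypothetical names, nulls are assumed to be `None`, and "pattern analysis" is approximated by reducing each value to a shape in which digits become `9` and ASCII letters become `A`.

```python
import re
from collections import Counter


def shape_of(value):
    """Reduce a string to its format shape, e.g. 'AB-123' -> 'AA-999'."""
    return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", value))


def profile_strings(values):
    """Profile one string column; None entries are treated as nulls (an assumption)."""
    present = [v for v in values if v is not None]
    lengths = [len(v) for v in present]
    # Only values that contain letters can tell us anything about casing.
    cased = [v for v in present if any(c.isalpha() for c in v)]
    if not cased:
        case = None
    elif all(v.isupper() for v in cased):
        case = "upper"
    elif all(v.islower() for v in cased):
        case = "lower"
    else:
        case = "mixed"
    return {
        "min_length": min(lengths) if lengths else None,
        "max_length": max(lengths) if lengths else None,
        "avg_length": sum(lengths) / len(lengths) if lengths else None,
        "empty_count": sum(1 for v in present if v == ""),
        "case": case,
        "whitespace_count": sum(1 for v in present if v != v.strip()),
        # Most frequent shapes first; a single dominant shape suggests a fixed format.
        "top_patterns": Counter(shape_of(v) for v in present).most_common(3),
    }
```

If one shape covers nearly all rows, the remaining shapes are good candidates for the format-inconsistency flags in the quality assessment.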
Date/timestamp columns:
min date, max date
null dates
future dates (if unexpected)
distribution by month/week
gaps in time series
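The date checks above might look like the following sketch. The name `profile_dates` is hypothetical, values are assumed to be `datetime.date` objects with `None` for nulls, "future" is measured against a `today` argument supplied by the caller, and a gap is assumed to mean a calendar month with no rows between the earliest and latest date.

```python
from collections import Counter
from datetime import date


def profile_dates(values, today):
    """Profile one date column; None entries are treated as nulls (an assumption)."""
    present = sorted(v for v in values if v is not None)
    profile = {"null_count": len(values) - len(present)}
    if not present:
        return profile
    by_month = Counter((d.year, d.month) for d in present)
    # Walk every calendar month from the first to the last date and
    # record the months that have no rows at all.
    missing_months = []
    year, month = present[0].year, present[0].month
    last = (present[-1].year, present[-1].month)
    while (year, month) <= last:
        if (year, month) not in by_month:
            missing_months.append((year, month))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    profile.update({
        "min_date": present[0],
        "max_date": present[-1],
        "future_count": sum(1 for d in present if d > today),
        "by_month": dict(by_month),
        "missing_months": missing_months,
    })
    return profile
```

Passing `today` explicitly keeps the result reproducible; a weekly distribution could be built the same way by keying on `d.isocalendar()[:2]`.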
Boolean columns:
true count, false count, null count
true rate
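A small sketch of the boolean checks above. The name `profile_boolean` is hypothetical, and the true rate is assumed to be computed over non-null values only; if the rate should include nulls in the denominator, divide by `len(values)` instead.

```python
def profile_boolean(values):
    """Profile one boolean column; None entries are treated as nulls (an assumption)."""
    true_count = sum(1 for v in values if v is True)
    false_count = sum(1 for v in values if v is False)
    null_count = sum(1 for v in values if v is None)
    non_null = true_count + false_count
    return {
        "true_count": true_count,
        "false_count": false_count,
        "null_count": null_count,
        # Rate over non-null values; None when the column has no non-null values.
        "true_rate": true_count / non_null if non_null else None,
    }
```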
Present the profile as a clean summary table, grouped by column type (dimensions, metrics, dates, IDs).
Apply the quality assessment framework below. Flag potential problems:
After profiling individual columns:
…