fix: detect skill name/dir mismatch and report metadata drift

Skill discovery now tracks dir_name alongside frontmatter name and detects
when they differ. skills list --output-format json includes metadata_drift
array and reports degraded status when drift entries exist.

Generated with https://github.com/Yeachan-Heo/gajae-code
Co-authored-by: Gajae Code <dev@gajae-code.com>
This commit is contained in:
bellman
2026-06-05 00:35:51 +09:00
parent 8fd11e82c4
commit 5d85739358
2 changed files with 73 additions and 15 deletions

View File

@@ -6410,7 +6410,7 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
444. **DONE — broad-cwd safety guard extended to `--resume` path** — fixed 2026-06-04 in `fix: add broad-cwd guard to resume path`. `claw --resume latest` from `/`, `$HOME`, or other broad directories now enforces the same broad-cwd policy as `claw prompt` and `claw` (interactive REPL). The `--allow-broad-cwd` flag bypasses the guard. The `enforce_broad_cwd_policy` check runs before any session filesystem operations. Remaining sibling items: exit code 1 for `session_load_failed` and filesystem droppings prevention on failed resume are tracked separately.
445. **Skill name-vs-directory mismatch is silently accepted — `.claw/skills/wrong-name/SKILL.md` with frontmatter `name: actually-different-name` loads as "actually-different-name" without any warning; users who reference the skill by directory name (`claw skills run wrong-name`) get `skill_not_found` while `skills list` shows it under the frontmatter name; sibling: loose `.md` files at the skills-dir root and subdirs without `SKILL.md` are silently dropped** — dogfooded 2026-05-11 by Jobdori on `9e1eafd0` in response to Clawhip pinpoint nudge at `1503381189539528897`. Reproduction: create `.claw/skills/wrong-name/SKILL.md` with frontmatter `---\nname: actually-different-name\ndescription: Skill where dir name and frontmatter name disagree\n---`. Run `claw skills list --output-format json` → the skill is listed with `name: "actually-different-name"` (the frontmatter value), no warning about the dir-vs-name mismatch. Users who type `claw skills run wrong-name` (the dirname they know from `ls`) get a `skill_not_found` error; `claw skills run actually-different-name` works. The two names are decoupled with no surfaced relationship. **Three sibling silent-drop bugs in same probe:** (a) **subdir without SKILL.md silently skipped**: `.claw/skills/no-skill-md/` containing only `README.md` (no `SKILL.md`) is silently skipped from `skills list`. No `invalid_skills:[{path, reason:"missing_SKILL.md"}]` array, no warning, just absent from output. (b) **Loose `.md` at skills dir root silently dropped**: `.claw/skills/loose-skill.md` (not inside a per-skill subdirectory) is silently ignored. Discovery only walks `.claw/skills/*/SKILL.md` — no support for flat `.claw/skills/<name>.md`. (c) **Workspace + user skills merged without per-source filter**: `skills list` returns 74 entries including all `~/.claw/skills/*` user-home skills alongside the project skills. There's no `--scope workspace` flag to limit output to just project-local skills; automation has to filter by `source.id == "project_claw"` post-hoc. **Required fix shape:** (a) when SKILL.md frontmatter `name` differs from the parent directory name, emit a `skills_metadata_drift:[{dir_name, frontmatter_name, path}]` array OR enforce `name = dir_name` as a hard rule; if neither, at minimum a stderr warning on each invocation; (b) skill subdirectories without `SKILL.md` should surface as `invalid_skills:[{path, reason}]` in `skills list --output-format json` (same pattern as #440 MCP servers, #441 hooks, #442 agents); (c) support loose `.md` files at skills-dir root OR document explicitly that only subdirectories with `SKILL.md` are discovered; (d) add `--scope workspace|user|all` flag to `skills list` for filtering; (e) regression test: dir/frontmatter mismatch triggers a deterministic warning or error; subdirs without SKILL.md show in invalid array. **Why this matters:** skill discovery is a security-relevant surface — a user's `claw skills run X` could end up running a different skill than they thought if dir-name and frontmatter-name diverge. The silent drops mean users can't tell why their skill files aren't recognized, leading to "I copied the example and it doesn't work" forum questions. Cross-references #440 (MCP all-or-nothing), #441 (hooks all-or-nothing), #442 (agents need TOML, .md dropped), #431 (skills install raw OS error). Source: Jobdori live dogfood, `9e1eafd0`, 2026-05-11.
445. **DONE — skill name-vs-directory mismatch now detected and reported** — fixed 2026-06-04 in `fix: detect skill name/dir mismatch and report metadata drift`. Skill discovery now tracks `dir_name` alongside the frontmatter `name` and detects when they differ. `skills list --output-format json` includes `metadata_drift:[{dir_name, frontmatter_name, path}]` and reports `status:"degraded"` when drift entries exist. `valid_count` and `metadata_drift_count` provide automation-friendly counts. The `SkillMetadataDrift` struct tracks each mismatch for downstream tooling. Remaining sibling items: subdirs without SKILL.md and loose .md files at skills-dir root are tracked separately.
446. **Config is loaded 2-3 times per command invocation; each load re-emits identical deprecation warnings without deduplication — `status` triggers 3× `enabledPlugins` warning, `doctor`/`mcp` trigger 2× each, only `version` (config-free) emits 0** — dogfooded 2026-05-11 by Jobdori on `5a4cc506` in response to Clawhip pinpoint nudge at `1503388740595224717`. Reproduction: with a `~/.claw/settings.json` containing the deprecated `enabledPlugins` key, run each command from a fresh empty cwd and count `warning: ... is deprecated` lines on stderr — `claw status 2>&1 >/dev/null | grep -c deprecated` returns **3**, `claw doctor` returns **2**, `claw mcp` returns **2**, `claw version` returns **0**. Each duplicate is byte-identical (same file path, same line number, same field name). The pattern proves the config-load pipeline is invoked 2-3 times within a single command process; warnings are emitted at each load without checking a `warned_files: HashSet<PathBuf>` deduplication set. **Three sibling implications:** (a) **load-count varies by command** — status:3, doctor:2, mcp:2, version:0 — suggesting each command implements its own config-load call rather than going through a shared cached loader; (b) **noise pollution**: users running `claw status` once see the same 64-character warning 3 times in their terminal scrollback, making real warnings (other config errors, real deprecations) lost in the duplicate noise; (c) **performance signal**: 3× config load means 3× JSON parsing of `~/.claw/settings.json`, `~/.claw.json`, `$CLAW_CONFIG_HOME/settings.json`, and the project-local `.claw.json` / `.claw/settings.json` / `.claw/settings.local.json`. For a workspace with 5 config files, that's 15 redundant disk reads per status invocation. Earlier roadmap entries observed 3× (#424) and 4× (#425) warning counts at different HEADs; the count keeps fluctuating, suggesting the underlying issue is config-load fan-out that nobody has refactored. **Required fix shape:** (a) introduce a `ConfigLoader` cache scoped to the command-process lifetime: first load reads files and emits warnings; subsequent calls hit the cache and emit zero warnings; (b) move config validation/warnings to a single canonical entry point (`ConfigLoader::load_with_diagnostics()` returns `(RuntimeConfig, Vec<Warning>)` exactly once); (c) every command that needs config goes through the cached loader instead of re-reading from disk; (d) `doctor --output-format json` exposes `config_load_count:int` field so we can regression-test that loads are deduplicated; (e) regression test: any single command invocation emits each deprecation warning at most once. **Why this matters:** repeated identical warnings train users to ignore stderr noise. Real warnings (a new deprecation, a config error from a different file, an MCP server failure) get drowned out by 3-4 copies of the same notice. The 15-disk-read worst case is wasted I/O that adds startup latency. The fact that count fluctuates between HEADs (3 at `6c0c305a`, 4 at `d7dbe951`, back to 3 at `5a4cc506`) suggests dev velocity is moving config loads around without an architectural fix. Cross-references #424 (deprecation warning 3×), #425 (deprecation warning 4×), #421 (cwd canonicalization — possibly tied to per-load symlink resolution), #428 (default permission_mode loaded from same config files). Source: Jobdori live dogfood, `5a4cc506`, 2026-05-11.

View File

@@ -2193,6 +2193,13 @@ pub(crate) struct SkillMetadataDrift {
pub(crate) path: PathBuf,
}
/// Loaded skill definitions plus any metadata drift entries.
#[derive(Debug, Clone, Default)]
pub(crate) struct SkillCollection {
pub(crate) skills: Vec<SkillSummary>,
pub(crate) metadata_drift: Vec<SkillMetadataDrift>,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SkillOrigin {
SkillsDir,
@@ -2808,8 +2815,8 @@ pub fn handle_skills_slash_command_json(args: Option<&str>, cwd: &Path) -> std::
match normalize_optional_args(args) {
None | Some("list") => {
let roots = discover_skill_roots(cwd);
let skills = load_skills_from_roots(&roots)?;
Ok(render_skills_report_json_with_action(&skills, "list"))
let collection = load_skills_from_roots_with_drift(&roots)?;
Ok(render_skills_report_json_with_action(&collection, "list"))
}
Some(args) if args.starts_with("list ") => {
let filter = args["list ".len()..].trim().to_lowercase();
@@ -2826,17 +2833,25 @@ pub fn handle_skills_slash_command_json(args: Option<&str>, cwd: &Path) -> std::
}));
}
let roots = discover_skill_roots(cwd);
let skills = load_skills_from_roots(&roots)?;
let filtered: Vec<_> = skills
let collection = load_skills_from_roots_with_drift(&roots)?;
let filtered_skills: Vec<_> = collection
.skills
.into_iter()
.filter(|s| s.name.to_lowercase().contains(&filter))
.collect();
Ok(render_skills_report_json_with_action(&filtered, "list"))
let filtered_collection = SkillCollection {
skills: filtered_skills,
metadata_drift: collection.metadata_drift,
};
Ok(render_skills_report_json_with_action(
&filtered_collection,
"list",
))
}
Some("show" | "info" | "describe") => {
let roots = discover_skill_roots(cwd);
let skills = load_skills_from_roots(&roots)?;
Ok(render_skills_report_json_with_action(&skills, "show"))
let collection = load_skills_from_roots_with_drift(&roots)?;
Ok(render_skills_report_json_with_action(&collection, "show"))
}
Some(args)
if args.starts_with("show ")
@@ -2867,8 +2882,9 @@ pub fn handle_skills_slash_command_json(args: Option<&str>, cwd: &Path) -> std::
}));
}
let roots = discover_skill_roots(cwd);
let skills = load_skills_from_roots(&roots)?;
let matched: Vec<_> = skills
let collection = load_skills_from_roots_with_drift(&roots)?;
let matched: Vec<_> = collection
.skills
.into_iter()
.filter(|s| s.name.to_lowercase() == name)
.collect();
@@ -2885,7 +2901,14 @@ pub fn handle_skills_slash_command_json(args: Option<&str>, cwd: &Path) -> std::
"hint": "Run `claw skills list` to see available skills.",
}));
}
Ok(render_skills_report_json_with_action(&matched, "show"))
let matched_collection = SkillCollection {
skills: matched,
metadata_drift: collection.metadata_drift,
};
Ok(render_skills_report_json_with_action(
&matched_collection,
"show",
))
}
Some("install") => Ok(render_skills_missing_argument_json(
"install",
@@ -4028,7 +4051,15 @@ fn load_agents_from_roots_with_invalids(
}
fn load_skills_from_roots(roots: &[SkillRoot]) -> std::io::Result<Vec<SkillSummary>> {
let collection = load_skills_from_roots_with_drift(roots)?;
Ok(collection.skills)
}
/// Load skill definitions from all roots, collecting metadata drift entries
/// where the frontmatter name differs from the directory name.
fn load_skills_from_roots_with_drift(roots: &[SkillRoot]) -> std::io::Result<SkillCollection> {
let mut skills = Vec::new();
let mut metadata_drift = Vec::new();
let mut active_sources = BTreeMap::<String, DefinitionSource>::new();
for root in roots {
@@ -4047,6 +4078,16 @@ fn load_skills_from_roots(roots: &[SkillRoot]) -> std::io::Result<Vec<SkillSumma
let contents = fs::read_to_string(skill_path)?;
let dir_name = entry.file_name().to_string_lossy().to_string();
let (name, description) = parse_skill_frontmatter(&contents);
// #445: detect name/dir mismatch
if let Some(ref frontmatter_name) = name {
if frontmatter_name != &dir_name {
metadata_drift.push(SkillMetadataDrift {
dir_name: dir_name.clone(),
frontmatter_name: frontmatter_name.clone(),
path: entry.path(),
});
}
}
root_skills.push(SkillSummary {
name: name.unwrap_or_else(|| dir_name.clone()),
description,
@@ -4105,7 +4146,10 @@ fn load_skills_from_roots(roots: &[SkillRoot]) -> std::io::Result<Vec<SkillSumma
}
}
Ok(skills)
Ok(SkillCollection {
skills,
metadata_drift,
})
}
fn parse_toml_string(contents: &str, key: &str) -> Option<String> {
@@ -4431,21 +4475,32 @@ fn render_skills_report(skills: &[SkillSummary]) -> String {
lines.join("\n").trim_end().to_string()
}
fn render_skills_report_json_with_action(skills: &[SkillSummary], action: &str) -> Value {
fn render_skills_report_json_with_action(collection: &SkillCollection, action: &str) -> Value {
let skills = &collection.skills;
let metadata_drift = &collection.metadata_drift;
let active = skills
.iter()
.filter(|skill| skill.shadowed_by.is_none())
.count();
let has_drift = !metadata_drift.is_empty();
let status = if has_drift { "degraded" } else { "ok" };
json!({
"kind": "skills",
"status": "ok",
"status": status,
"action": action,
"valid_count": skills.len(),
"metadata_drift_count": metadata_drift.len(),
"summary": {
"total": skills.len(),
"active": active,
"shadowed": skills.len().saturating_sub(active),
},
"skills": skills.iter().map(skill_summary_json).collect::<Vec<_>>(),
"metadata_drift": metadata_drift.iter().map(|drift| json!({
"dir_name": &drift.dir_name,
"frontmatter_name": &drift.frontmatter_name,
"path": drift.path.display().to_string(),
})).collect::<Vec<_>>(),
})
}
@@ -6431,7 +6486,10 @@ mod tests {
},
];
let report = super::render_skills_report_json_with_action(
&load_skills_from_roots(&roots).expect("skills should load"),
&super::SkillCollection {
skills: load_skills_from_roots(&roots).expect("skills should load"),
metadata_drift: Vec::new(),
},
"list",
);
assert_eq!(report["kind"], "skills");