fix: bound parent memory discovery

Generated with https://github.com/Yeachan-Heo/gajae-code
Co-authored-by: Gajae Code <dev@gajae-code.com>
2026-07-20 05:48:27 +08:00 · 2026-06-04 17:07:00 +09:00
parent 5b22bc0480
commit 10fe72498a
6 changed files with 264 additions and 25 deletions
@@ -6392,7 +6392,7 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
 438. **DONE — memory discovery loads `CLAUDE.md`, `CLAW.md`, and `AGENTS.md` with structured provenance** — fixed 2026-06-04 in `fix: load Claw and Agents memory files`. Project memory discovery now checks root instruction files in `CLAUDE.md`, `CLAW.md`, then `AGENTS.md` order for each discovered directory, preserves existing scoped `.claw/CLAUDE.md`, `.claude/CLAUDE.md`, `.claw/instructions.md`, and rules-directory imports, and exposes each loaded file's `path`, `source`, `chars`, and `contributes` in `status --output-format json` as `workspace.memory_files[]`. `system-prompt --output-format json` returns the same memory metadata alongside the rendered `message`/`sections`, and all non-duplicate loaded files contribute to the prompt so CLAUDE/CLAW/AGENTS markers are visible together. `claw doctor --output-format json` now includes a dedicated `memory` check with loaded memory metadata and `unloaded_memory_files[]` warnings for present `CLAW.md`/`AGENTS.md` candidates that were skipped (for example empty or duplicate-content variants). Docs in `USAGE.md` and `rust/README.md` describe the priority and JSON contracts. Regression coverage: `discovers_claude_claw_agents_and_dot_claude_instruction_files_together`, `memory_files_load_claude_claw_agents_and_surface_json_438`, and `memory_health_surfaces_loaded_and_unloaded_files_438`.


-439. **Memory file discovery walks ALL ancestor directories up to `$HOME` boundary, silently loading any `CLAUDE.md` it finds — `/tmp/CLAUDE.md` left from a previous test silently bleeds into every project under `/tmp/*/`; no `--no-parent-memory` flag, no `.no-claude-md-boundary` marker file to limit discovery scope** — dogfooded 2026-05-11 by Jobdori on `f4a96740` in response to Clawhip pinpoint nudge at `1503335892461293675`. Reproduction: create three nested `CLAUDE.md` files with unique markers — `/tmp/claw-nested-probe/CLAUDE.md` (`PARENT_CLAUDE`), `subproj/CLAUDE.md` (`CHILD_CLAUDE`), `subproj/deep/CLAUDE.md` (`DEEP_CLAUDE`). Run `claw system-prompt --output-format json` from `subproj/deep/nest/` (note: `nest` has no `CLAUDE.md`). The `message` field contains **all three markers** (PARENT + CHILD + DEEP) and `status --output-format json` reports `memory_file_count: 3`. Boundary tests: (a) `$HOME/CLAUDE.md` is NOT picked up from `/tmp/no-claude-dir` (discovery stops at `$HOME` boundary, good); (b) From `/tmp/deep` (no nested CLAUDE.md), `/tmp/CLAUDE.md` IS picked up (count: 1); (c) git-root is NOT a discovery boundary — running from a git subdir still walks above the git root. **Ambient-context-bleed footgun:** any stale `/tmp/CLAUDE.md` (or `/home/<user>/projects/CLAUDE.md`, or any ancestor-path CLAUDE.md left over from a previous experiment, copy-paste, or AI-generated example) silently bleeds into every workspace nested below it. The user has no signal in `status --output-format json` indicating which ancestor file is contributing — only the aggregate `memory_file_count`. **Three required fixes:** (a) **expose discovery list**: `status --output-format json` and `system-prompt --output-format json` must include `memory_files:[{path, source:"workspace"|"ancestor"|"parent_dir"|"home", chars, contributes:bool}]` so users can see what's leaking in; (b) **add `--no-parent-memory` flag** to limit discovery to cwd only (no ancestor walk), or add a boundary marker (`.claude-no-walk`, `.claw-root`, or honor `.git` as the boundary by default — most users expect repo-root scope); (c) **`doctor` warns** when ancestor `CLAUDE.md` files are loaded from outside the current git repo (suggests they may be unintentional). **Sibling discovery scope question:** discovery walks up to `$HOME` — but for a user with a project at `/Users/foo/work/proj`, that's `/Users/foo/work/CLAUDE.md` + `/Users/foo/CLAUDE.md` (if it exists) both load. The home boundary is exclusive, but the entire `/Users/foo` tree under home is in scope. **Why this matters:** test workspaces, scratch dirs, AI-generated example projects, and shared `/tmp` workdirs are full of stale `CLAUDE.md` files. The current discovery rule means every claw invocation can silently inherit context from arbitrary ancestor paths. Cross-references #438 (memory discovery only finds CLAUDE.md, not AGENTS.md or CLAW.md), #421 (cwd canonicalization leak — the canonicalized form determines which ancestor walk path is used). Source: Jobdori live dogfood, `f4a96740`, 2026-05-11.
+439. **DONE — memory discovery is git-root bounded and reports memory origins** — fixed 2026-06-04 in `fix: bound parent memory discovery`. Project memory discovery now walks only from the current directory up to the nearest git root when one exists, and otherwise stays cwd-local, so stale parent `CLAUDE.md` files outside the project no longer bleed into scratch workspaces. Loaded memory JSON in both `status --output-format json` and `system-prompt --output-format json` now includes `origin`, `scope_path`, and `outside_project` alongside `path`, `source`, `chars`, and `contributes`; origins distinguish `workspace`, `parent_dir`, `ancestor`, `home`, and `outside_project`. The `memory` doctor check warns if an outside-project memory file is ever loaded, while still listing loaded and skipped memory candidates structurally. Docs in `USAGE.md` and `rust/README.md` describe the git-root boundary and expanded JSON fields. Regression coverage: `discovery_stops_at_git_root_boundary_439`, `discovery_without_git_root_stays_cwd_local_439`, and `memory_discovery_stops_at_git_root_and_reports_origins_439`.


 440. **One invalid `mcpServers` entry blocks ALL OTHER valid MCP servers from loading — `mcp list --output-format json` returns `configured_servers: 0, servers: []` when even one server has a missing/invalid `command` field, despite other servers in the same config being well-formed; sibling: config parser halts on first invalid entry, never reports the remaining invalid entries** — dogfooded 2026-05-11 by Jobdori on `bd126905` in response to Clawhip pinpoint nudge at `1503343442904879156`. Reproduction: write `.claw.json` containing six `mcpServers` entries — one valid (`valid-server: {command:"/bin/echo", args:["hello"]}`) and five with progressive defects (missing-command, empty-command, null-command, wrong-type-command, extra-unknown-field). Run `claw mcp list --output-format json` → `{"action":"list","config_load_error":"/private/tmp/claw-mcp-probe/.claw.json: mcpServers.missing-command-server: missing string field command","configured_servers":0,"kind":"mcp","servers":[],"status":"degraded"}`. The error mentions only `missing-command-server` (the first invalid entry in JSON-object iteration order); the other four invalid entries are never surfaced. The valid `valid-server` entry is silently dropped because the parser bails on the first error. `status --output-format json` correctly propagates the same `config_load_error` and sets `status:"degraded"`, but no field tells automation which servers are valid vs broken — `servers:[]` is the only signal. **Three problems compounded:** (a) **all-or-nothing loading**: ROADMAP product principle #5 says "partial success is first-class," but mcp config loading is binary. One bad server kills the entire MCP plane; (b) **first-error-only reporting**: a `.claw.json` with five invalid entries surfaces only one error message — the user fixes that one and runs again, gets the next error, and so on. Five iterations needed to discover all errors; (c) **no per-server status**: even with the partial-success fix, the JSON envelope needs `servers:[{name, valid:bool, error?, command?, args?}]` so automation can see which entries are usable. **Required fix shape:** (a) the MCP config parser must collect ALL invalid entries into an `invalid_servers:[{name, error_field, reason}]` array and load all valid ones into `servers:[]`; do not abort on first error; (b) `configured_servers` reflects the count of *valid* loaded servers (not zero) when there are valid entries alongside invalid ones; (c) expose `total_configured:int` (count of entries in source `.claw.json`) AND `valid_count:int` (loaded), AND `invalid_count:int` (rejected) — three distinct counts; (d) `doctor --output-format json` adds an `mcp_validation` check that lists each invalid entry with its error message; (e) regression test: `.claw.json` with one valid + one invalid entry results in `configured_servers: 1, invalid_servers: [{name:"...", reason:"..."}]`. **Why this matters:** users iterate on MCP server lists during onboarding — one typo kills the entire plane, including servers they got working previously. The first-error-only reporting forces N iterations through N invalid entries instead of a single fix-everything-at-once pass. Cross-references #407 (config files no load_error per-file), #415 (config section merged_keys count only), #416 (plugins list prose), #428 (default permission mode), and Product Principle #5. Source: Jobdori live dogfood, `bd126905`, 2026-05-11.
@@ -51,7 +51,7 @@ cd rust
 ```

 **Note:** Diagnostic verbs (`doctor`, `status`, `sandbox`, `version`) support `--output-format json` for machine-readable output. Invalid suffix arguments (e.g., `--json`) are now rejected at parse time rather than falling through to prompt dispatch.
-`version --output-format json` reports structured build provenance including full `git_sha`, derived `git_sha_short`, `is_dirty`, `branch`, `commit_date`, `commit_timestamp`, `rustc_version`, runtime `executable_path`, and `binary_provenance`; JSON keeps the prose report in `human_readable` instead of duplicating it under `message`. `status --output-format json` exposes `workspace.memory_files[]` with `path`, `source`, `chars`, and `contributes` for every loaded project memory file.
+`version --output-format json` reports structured build provenance including full `git_sha`, derived `git_sha_short`, `is_dirty`, `branch`, `commit_date`, `commit_timestamp`, `rustc_version`, runtime `executable_path`, and `binary_provenance`; JSON keeps the prose report in `human_readable` instead of duplicating it under `message`. `status --output-format json` exposes `workspace.memory_files[]` with `path`, `source`, `origin`, `scope_path`, `outside_project`, `chars`, and `contributes` for every loaded project memory file.

 ### Initialize a repository

@@ -599,7 +599,7 @@ In addition to root instruction files such as `CLAUDE.md`, `CLAW.md`, `AGENTS.md
 - `<repo>/.claw/rules/` (`.md`, `.txt`, `.mdc`) for shared project rules.
 - `<repo>/.claw/rules.local/` for personal local rules; this path is gitignored.

-Root instruction-file priority is `CLAUDE.md`, then `CLAW.md`, then `AGENTS.md` for each discovered directory. All loaded files contribute to the system prompt and to `status --output-format json` as `workspace.memory_files:[{path, source, chars, contributes}]`; `claw doctor --output-format json` includes a `memory` check so automation can detect loaded and unexpected unloaded memory-file candidates without parsing prompt text.
+Root instruction-file priority is `CLAUDE.md`, then `CLAW.md`, then `AGENTS.md` for each discovered directory. Discovery is bounded to the current git root when one exists, otherwise to the current directory only, so stale parent files outside the project do not silently bleed into the prompt. All loaded files contribute to the system prompt and to `status --output-format json` as `workspace.memory_files:[{path, source, origin, scope_path, outside_project, chars, contributes}]`; `claw doctor --output-format json` includes a `memory` check so automation can detect loaded and unexpected unloaded memory-file candidates without parsing prompt text.

 By default, `claw` also imports detected rules from common AI coding tools such as Cursor (`.cursorrules`, `.cursor/rules/`), GitHub Copilot (`.github/copilot-instructions.md`), Windsurf, Plandex, and Crush. Control this with `rulesImport` in any settings file:

@@ -149,7 +149,7 @@ Top-level commands:
 `claw acp` is a local discoverability surface for editor-first users: it reports the current ACP/Zed status without starting the runtime. As of April 16, 2026, claw-code does **not** ship an ACP/Zed daemon or JSON-RPC entrypoint yet, and `claw acp serve` is only a status alias until the real protocol surface lands. Status queries exit 0 and expose the same machine-readable contract via `--output-format json`; malformed ACP invocations exit 1 with `kind: unsupported_acp_invocation`.
 `--output-format` accepts `text` or `json` in any casing. `CLAW_OUTPUT_FORMAT=json` selects JSON as the default for non-interactive commands, explicit flags override it, repeated flags warn on stderr, and status JSON exposes `format_source`, `format_raw`, and `format_overridden`. Help and doctor output also surface `CLAW_LOG` / `RUST_LOG` as the logging environment knobs.
 `claw version --output-format json` is the provenance probe for automation: it reports full `git_sha`, derived `git_sha_short`, `is_dirty`, `branch`, `commit_date`, `commit_timestamp`, `rustc_version`, runtime `executable_path`, and `binary_provenance`; the text report is available as `human_readable` instead of a duplicate `message` field.
-`status --output-format json` reports loaded project memory files under `workspace.memory_files[]` with each file's `path`, `source` (`claude_md`, `claw_md`, `agents_md`, or scoped/rule sources), `chars`, and `contributes`; `claw doctor --output-format json` includes a dedicated `memory` check. Root instruction-file priority is `CLAUDE.md`, then `CLAW.md`, then `AGENTS.md`, and all non-duplicate loaded files contribute to the rendered system prompt.
+`status --output-format json` reports loaded project memory files under `workspace.memory_files[]` with each file's `path`, `source` (`claude_md`, `claw_md`, `agents_md`, or scoped/rule sources), `origin`, `scope_path`, `outside_project`, `chars`, and `contributes`; `claw doctor --output-format json` includes a dedicated `memory` check. Root instruction-file priority is `CLAUDE.md`, then `CLAW.md`, then `AGENTS.md`, discovery is bounded to the current git root when present (otherwise cwd only), and all non-duplicate loaded files contribute to the rendered system prompt.
 Shorthand prompt mode honors the POSIX `--` end-of-flags separator, so `claw -- "-prompt-with-dash"` and unknown dash-prefixed non-flag text stay on the prompt path instead of being treated as CLI options.
 `claw dump-manifests` is self-contained: it emits the Rust resolver inventory for the selected workspace (commands, tools, agents, skills, and bootstrap phases) without requiring an upstream Claude Code TypeScript checkout. Use `--manifests-dir PATH` only to scope resolver discovery to another directory.

@@ -290,12 +290,7 @@ fn discover_instruction_files(
    cwd: &Path,
    rules_import: &RulesImportConfig,
 ) -> std::io::Result<Vec<ContextFile>> {
-    let mut directories = Vec::new();
-    let mut cursor = Some(cwd);
-    while let Some(dir) = cursor {
-        directories.push(dir.to_path_buf());
-        cursor = dir.parent();
-    }
+    let mut directories = instruction_discovery_dirs(cwd);
    directories.reverse();

    let mut files = Vec::new();
@@ -318,6 +313,32 @@ fn discover_instruction_files(
    Ok(dedupe_instruction_files(files))
 }

+fn instruction_discovery_dirs(cwd: &Path) -> Vec<PathBuf> {
+    let boundary = nearest_git_root(cwd).unwrap_or_else(|| cwd.to_path_buf());
+    let mut directories = Vec::new();
+    let mut cursor = Some(cwd);
+    while let Some(dir) = cursor {
+        directories.push(dir.to_path_buf());
+        if dir == boundary {
+            break;
+        }
+        cursor = dir.parent();
+    }
+    directories
+}
+
+fn nearest_git_root(cwd: &Path) -> Option<PathBuf> {
+    let mut cursor = Some(cwd);
+    while let Some(dir) = cursor {
+        let git_marker = dir.join(".git");
+        if git_marker.is_dir() || git_marker.is_file() {
+            return Some(dir.to_path_buf());
+        }
+        cursor = dir.parent();
+    }
+    None
+}
+
 fn push_context_file(files: &mut Vec<ContextFile>, path: PathBuf) -> std::io::Result<()> {
    if path.is_dir() {
        return Ok(());
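
Review note: the bounded walk introduced by `instruction_discovery_dirs` and `nearest_git_root` can be checked in isolation. The following is a minimal sketch, not the code from this commit. It assumes a hypothetical `is_root` predicate in place of the real `.git` filesystem probe, so the example runs without touching disk. The names `nearest_root` and `discovery_dirs` are illustrative only.

```rust
use std::path::{Path, PathBuf};

/// Nearest ancestor of `start` (including `start` itself) for which `is_root` holds.
/// `is_root` is a hypothetical stand-in for the `.git` marker check in the diff.
fn nearest_root(start: &Path, is_root: &dyn Fn(&Path) -> bool) -> Option<PathBuf> {
    start
        .ancestors()
        .find(|&dir| is_root(dir))
        .map(Path::to_path_buf)
}

/// Directories to scan, ordered from the boundary down to `start`.
/// Without a root marker the boundary is `start`, so only `start` is returned.
fn discovery_dirs(start: &Path, is_root: &dyn Fn(&Path) -> bool) -> Vec<PathBuf> {
    let boundary = nearest_root(start, is_root).unwrap_or_else(|| start.to_path_buf());
    let mut dirs = Vec::new();
    for dir in start.ancestors() {
        dirs.push(dir.to_path_buf());
        if dir == boundary.as_path() {
            break; // never walk above the project boundary
        }
    }
    dirs.reverse();
    dirs
}

fn main() {
    // Pretend `/work/repo` is the only directory containing a `.git` entry.
    let repo = Path::new("/work/repo");
    let has_git = |dir: &Path| dir == repo;
    let dirs = discovery_dirs(Path::new("/work/repo/sub/deep"), &has_git);
    assert_eq!(
        dirs,
        vec![
            PathBuf::from("/work/repo"),
            PathBuf::from("/work/repo/sub"),
            PathBuf::from("/work/repo/sub/deep"),
        ]
    );

    // No marker anywhere: discovery stays cwd-local.
    let no_git = |_: &Path| false;
    let dirs = discovery_dirs(Path::new("/tmp/scratch"), &no_git);
    assert_eq!(dirs, vec![PathBuf::from("/tmp/scratch")]);
}
```

Injecting the predicate is a choice made for this sketch only; the commit itself probes the filesystem directly.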
@@ -812,6 +833,7 @@ mod tests {
        let root = temp_dir();
        let nested = root.join("apps").join("api");
        fs::create_dir_all(nested.join(".claw")).expect("nested claw dir");
+        fs::create_dir(root.join(".git")).expect("git boundary");
        fs::write(root.join("CLAUDE.md"), "root instructions").expect("write root instructions");
        fs::write(root.join("CLAUDE.local.md"), "local instructions")
            .expect("write local instructions");
@@ -926,6 +948,7 @@ mod tests {
        let root = temp_dir();
        let nested = root.join("apps").join("api");
        fs::create_dir_all(&nested).expect("nested dir");
+        fs::create_dir(root.join(".git")).expect("git boundary");
        fs::write(root.join("CLAUDE.md"), "same rules\n\n").expect("write root");
        fs::write(nested.join("CLAUDE.md"), "same rules\n").expect("write nested");

@@ -938,6 +961,50 @@ mod tests {
        fs::remove_dir_all(root).expect("cleanup temp dir");
    }

+    #[test]
+    fn discovery_stops_at_git_root_boundary_439() {
+        let root = temp_dir();
+        let repo = root.join("repo");
+        let nested = repo.join("subproj").join("deep").join("nest");
+        fs::create_dir_all(&nested).expect("nested dir");
+        fs::create_dir(repo.join(".git")).expect("git boundary");
+        fs::write(root.join("CLAUDE.md"), "PARENT_CLAUDE").expect("write parent");
+        fs::write(repo.join("CLAUDE.md"), "REPO_CLAUDE").expect("write repo");
+        fs::write(repo.join("subproj").join("CLAUDE.md"), "CHILD_CLAUDE").expect("write child");
+        fs::write(
+            repo.join("subproj").join("deep").join("CLAUDE.md"),
+            "DEEP_CLAUDE",
+        )
+        .expect("write deep");
+
+        let context = ProjectContext::discover(&nested, "2026-03-31").expect("context should load");
+        let rendered = render_instruction_files(&context.instruction_files);
+
+        assert!(!rendered.contains("PARENT_CLAUDE"));
+        assert!(rendered.contains("REPO_CLAUDE"));
+        assert!(rendered.contains("CHILD_CLAUDE"));
+        assert!(rendered.contains("DEEP_CLAUDE"));
+        assert_eq!(context.instruction_files.len(), 3);
+        fs::remove_dir_all(root).expect("cleanup temp dir");
+    }
+
+    #[test]
+    fn discovery_without_git_root_stays_cwd_local_439() {
+        let root = temp_dir();
+        let nested = root.join("scratch");
+        fs::create_dir_all(&nested).expect("nested dir");
+        fs::write(root.join("CLAUDE.md"), "PARENT_CLAUDE").expect("write parent");
+        fs::write(nested.join("CLAUDE.md"), "SCRATCH_CLAUDE").expect("write scratch");
+
+        let context = ProjectContext::discover(&nested, "2026-03-31").expect("context should load");
+        let rendered = render_instruction_files(&context.instruction_files);
+
+        assert!(!rendered.contains("PARENT_CLAUDE"));
+        assert!(rendered.contains("SCRATCH_CLAUDE"));
+        assert_eq!(context.instruction_files.len(), 1);
+        fs::remove_dir_all(root).expect("cleanup temp dir");
+    }
+
    #[test]
    fn truncates_large_instruction_content_for_rendering() {
        let rendered = render_instruction_content(&"x".repeat(4500));
@@ -3493,6 +3493,11 @@ fn render_doctor_report(
        config.as_ref().ok(),
        config.as_ref().err().map(ToString::to_string).as_deref(),
    );
+    let memory_files = memory_file_summaries_for(
+        &cwd,
+        project_root.as_deref(),
+        &project_context.instruction_files,
+    );
    let context = StatusContext {
        cwd: cwd.clone(),
        session_path: None,
@@ -3502,10 +3507,11 @@ fn render_doctor_report(
            .map_or(0, |runtime_config| runtime_config.loaded_entries().len()),
        discovered_config_files: discovered_config.len(),
        memory_file_count: project_context.instruction_files.len(),
-        memory_files: memory_file_summaries(&project_context.instruction_files),
+        memory_files: memory_files.clone(),
        unloaded_memory_files: unloaded_memory_candidates(
            &cwd,
-            &memory_file_summaries(&project_context.instruction_files),
+            project_root.as_deref(),
+            &memory_files,
        ),
        project_root,
        git_branch,
@@ -4048,6 +4054,7 @@ fn check_workspace_health(context: &StatusContext) -> DiagnosticCheck {

 fn check_memory_health(context: &StatusContext) -> DiagnosticCheck {
    let has_unloaded = !context.unloaded_memory_files.is_empty();
+    let has_outside_project = context.memory_files.iter().any(|file| file.outside_project);
    let mut details = vec![format!("Loaded files     {}", context.memory_file_count)];
    details.extend(context.memory_files.iter().map(|file| {
        format!(
@@ -4064,18 +4071,22 @@ fn check_memory_health(context: &StatusContext) -> DiagnosticCheck {

    DiagnosticCheck::new(
        "Memory",
-        if has_unloaded {
+        if has_unloaded || has_outside_project {
            DiagnosticLevel::Warn
        } else {
            DiagnosticLevel::Ok
        },
-        if has_unloaded {
+        if has_outside_project {
+            "memory files outside the current git project are loaded".to_string()
+        } else if has_unloaded {
            "some workspace memory files exist but were not loaded".to_string()
        } else {
            format!("{} workspace memory files loaded", context.memory_file_count)
        },
    )
-    .with_hint(if has_unloaded {
+    .with_hint(if has_outside_project {
+        "Inspect workspace.memory_files in `claw status --output-format json`; move unintended ancestor instructions inside the git project or run from the intended workspace root."
+    } else if has_unloaded {
        "Move instructions into CLAUDE.md, CLAW.md, or AGENTS.md within the current workspace ancestry, or inspect workspace.memory_files in `claw status --output-format json`."
    } else {
        ""
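
Review note: the precedence in `check_memory_health` (an outside-project file outranks unloaded candidates, and either one yields a warning) can be restated as a small pure function. This is a hedged sketch: `Level` and `memory_verdict` are hypothetical names, not types from the codebase. Only the two summary strings are copied from the hunk above.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Level {
    Ok,
    Warn,
}

/// Hypothetical condensed form of the level/summary selection in the doctor check.
fn memory_verdict(loaded: usize, has_unloaded: bool, has_outside_project: bool) -> (Level, String) {
    let level = if has_unloaded || has_outside_project {
        Level::Warn
    } else {
        Level::Ok
    };
    // The outside-project message wins when both conditions are true.
    let summary = if has_outside_project {
        "memory files outside the current git project are loaded".to_string()
    } else if has_unloaded {
        "some workspace memory files exist but were not loaded".to_string()
    } else {
        format!("{loaded} workspace memory files loaded")
    };
    (level, summary)
}

fn main() {
    let (level, summary) = memory_verdict(3, true, true);
    assert_eq!(level, Level::Warn);
    assert!(summary.contains("outside the current git project"));

    let (level, summary) = memory_verdict(2, false, false);
    assert_eq!(level, Level::Ok);
    assert_eq!(summary, "2 workspace memory files loaded");
}
```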
@@ -4499,7 +4510,13 @@ fn print_system_prompt(
        "unknown",
        model_family_identity_for(model),
    )?;
-    let memory_files = memory_file_summaries(&project_context.instruction_files);
+    let (project_root, _) =
+        parse_git_status_metadata_for(&project_context.cwd, project_context.git_status.as_deref());
+    let memory_files = memory_file_summaries_for(
+        &project_context.cwd,
+        project_root.as_deref(),
+        &project_context.instruction_files,
+    );
    let message = sections.join(
        "

@@ -4759,6 +4776,9 @@ struct MemoryFileSummary {
    path: String,
    source: String,
    chars: usize,
+    origin: String,
+    scope_path: String,
+    outside_project: bool,
    contributes: bool,
 }

@@ -4768,34 +4788,103 @@ impl MemoryFileSummary {
            "path": self.path,
            "source": self.source,
            "chars": self.chars,
+            "origin": self.origin,
+            "scope_path": self.scope_path,
+            "outside_project": self.outside_project,
            "contributes": self.contributes,
        })
    }
 }

-fn memory_file_summaries(files: &[ContextFile]) -> Vec<MemoryFileSummary> {
+fn memory_file_summaries_for(
+    cwd: &Path,
+    project_root: Option<&Path>,
+    files: &[ContextFile],
+) -> Vec<MemoryFileSummary> {
+    let cwd = cwd.canonicalize().unwrap_or_else(|_| cwd.to_path_buf());
+    let project_root =
+        project_root.map(|path| path.canonicalize().unwrap_or_else(|_| path.to_path_buf()));
    files
        .iter()
-        .map(|file| MemoryFileSummary {
-            path: file.path.display().to_string(),
-            source: file.source().to_string(),
-            chars: file.char_count(),
-            contributes: true,
+        .map(|file| {
+            let path = file
+                .path
+                .canonicalize()
+                .unwrap_or_else(|_| file.path.clone());
+            let scope_path = memory_scope_path(&path);
+            let origin = memory_origin(&cwd, project_root.as_deref(), &scope_path);
+            let outside_project = project_root
+                .as_ref()
+                .is_some_and(|root| !path.starts_with(root));
+            MemoryFileSummary {
+                path: file.path.display().to_string(),
+                source: file.source().to_string(),
+                origin: origin.to_string(),
+                scope_path: scope_path.display().to_string(),
+                chars: file.char_count(),
+                outside_project,
+                contributes: true,
+            }
        })
        .collect()
 }

+fn memory_scope_path(path: &Path) -> PathBuf {
+    let Some(parent) = path.parent() else {
+        return PathBuf::from(".");
+    };
+    let parent_name = parent.file_name().and_then(|name| name.to_str());
+    if matches!(parent_name, Some(".claw" | ".claude")) {
+        return parent.parent().unwrap_or(parent).to_path_buf();
+    }
+    if matches!(parent_name, Some("rules" | "rules.local")) {
+        if let Some(grandparent) = parent.parent() {
+            if grandparent.file_name().and_then(|name| name.to_str()) == Some(".claw") {
+                return grandparent.parent().unwrap_or(grandparent).to_path_buf();
+            }
+        }
+    }
+    parent.to_path_buf()
+}
+
+fn memory_origin(cwd: &Path, project_root: Option<&Path>, scope_path: &Path) -> &'static str {
+    if scope_path == cwd {
+        return "workspace";
+    }
+    if project_root.is_some_and(|root| !scope_path.starts_with(root)) {
+        return "outside_project";
+    }
+    if let Some(home) = env::var_os("HOME").map(PathBuf::from) {
+        let home = home.canonicalize().unwrap_or(home);
+        if scope_path == home {
+            return "home";
+        }
+    }
+    if cwd.parent().is_some_and(|parent| parent == scope_path) {
+        return "parent_dir";
+    }
+    if cwd.starts_with(scope_path) {
+        return "ancestor";
+    }
+    "workspace"
+}
+
 fn memory_files_json(files: &[MemoryFileSummary]) -> Vec<serde_json::Value> {
    files.iter().map(MemoryFileSummary::json_value).collect()
 }

-fn unloaded_memory_candidates(cwd: &Path, files: &[MemoryFileSummary]) -> Vec<String> {
+fn unloaded_memory_candidates(
+    cwd: &Path,
+    project_root: Option<&Path>,
+    files: &[MemoryFileSummary],
+) -> Vec<String> {
    let mut loaded = files
        .iter()
        .map(|file| PathBuf::from(&file.path))
        .collect::<Vec<_>>();
    loaded.sort();

+    let boundary = project_root.unwrap_or(cwd);
    let mut missing = Vec::new();
    let mut cursor = Some(cwd);
    while let Some(dir) = cursor {
@@ -4805,6 +4894,9 @@ fn unloaded_memory_candidates(cwd: &Path, files: &[MemoryFileSummary]) -> Vec<St
                missing.push(candidate.display().to_string());
            }
        }
+        if dir == boundary {
+            break;
+        }
        cursor = dir.parent();
    }
    missing.sort();
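
Review note: the `origin` and `scope_path` values reported in `workspace.memory_files[]` follow from `memory_scope_path` and `memory_origin` above. The sketch below mirrors that classification order under two assumptions: paths are already canonical, and the home directory is passed in as a parameter rather than read from `$HOME`. That second change is made here only so the example is deterministic. The function names are illustrative.

```rust
use std::path::{Path, PathBuf};

fn dir_name(dir: &Path) -> Option<&str> {
    dir.file_name().and_then(|name| name.to_str())
}

/// Directory a memory file "belongs" to: scoped files under `.claw/`, `.claude/`,
/// or `.claw/rules*/` are attributed to the directory that owns that folder.
fn scope_path(file: &Path) -> PathBuf {
    let Some(parent) = file.parent() else {
        return PathBuf::from(".");
    };
    match dir_name(parent) {
        Some(".claw" | ".claude") => parent.parent().unwrap_or(parent).to_path_buf(),
        Some("rules" | "rules.local") => match parent.parent() {
            Some(claw) if dir_name(claw) == Some(".claw") => {
                claw.parent().unwrap_or(claw).to_path_buf()
            }
            _ => parent.to_path_buf(),
        },
        _ => parent.to_path_buf(),
    }
}

/// Classification in the same order as the diff: workspace, outside_project,
/// home, parent_dir, ancestor, then a workspace fallback.
fn origin(
    cwd: &Path,
    project_root: Option<&Path>,
    home: Option<&Path>, // assumption: injected instead of reading $HOME
    scope: &Path,
) -> &'static str {
    if scope == cwd {
        return "workspace";
    }
    if project_root.is_some_and(|root| !scope.starts_with(root)) {
        return "outside_project";
    }
    if home == Some(scope) {
        return "home";
    }
    if cwd.parent() == Some(scope) {
        return "parent_dir";
    }
    if cwd.starts_with(scope) {
        return "ancestor";
    }
    "workspace"
}

fn main() {
    let cwd = Path::new("/work/repo/sub/deep/nest");
    let root = Some(Path::new("/work/repo"));
    let classify = |file: &str| origin(cwd, root, None, &scope_path(Path::new(file)));

    assert_eq!(classify("/work/repo/CLAUDE.md"), "ancestor");
    assert_eq!(classify("/work/repo/sub/deep/CLAUDE.md"), "parent_dir");
    assert_eq!(classify("/work/repo/sub/deep/nest/.claw/rules/a.md"), "workspace");
    assert_eq!(classify("/work/CLAUDE.md"), "outside_project");
}
```

With the git-root boundary in place, the `outside_project` case should not occur for discovered files; the check reads as a guard that the doctor warning can key on.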
@@ -8888,16 +8980,22 @@ fn status_context(
        runtime_config.as_ref().ok(),
        config_load_error.as_deref(),
    );
+    let memory_files = memory_file_summaries_for(
+        &cwd,
+        project_root.as_deref(),
+        &project_context.instruction_files,
+    );
    Ok(StatusContext {
        cwd: cwd.clone(),
        session_path: session_path.map(Path::to_path_buf),
        loaded_config_files,
        discovered_config_files,
        memory_file_count: project_context.instruction_files.len(),
-        memory_files: memory_file_summaries(&project_context.instruction_files),
+        memory_files: memory_files.clone(),
        unloaded_memory_files: unloaded_memory_candidates(
            &cwd,
-            &memory_file_summaries(&project_context.instruction_files),
+            project_root.as_deref(),
+            &memory_files,
        ),
        project_root,
        git_branch,
@@ -16447,6 +16545,9 @@ mod tests {
                memory_files: vec![super::MemoryFileSummary {
                    path: "/tmp/project/CLAUDE.md".to_string(),
                    source: "claude_md".to_string(),
+                    origin: "workspace".to_string(),
+                    scope_path: "/tmp/project".to_string(),
+                    outside_project: false,
                    chars: 42,
                    contributes: true,
                }],
@@ -16649,6 +16750,9 @@ mod tests {
            memory_files: vec![super::MemoryFileSummary {
                path: "/tmp/project/CLAUDE.md".to_string(),
                source: "claude_md".to_string(),
+                origin: "workspace".to_string(),
+                scope_path: "/tmp/project".to_string(),
+                outside_project: false,
                chars: 12,
                contributes: true,
            }],
@@ -1309,6 +1309,15 @@ fn memory_files_load_claude_claw_agents_and_surface_json_438() {
    assert!(memory_files
        .iter()
        .all(|file| file["contributes"].as_bool() == Some(true)));
+    assert!(memory_files
+        .iter()
+        .all(|file| file["origin"].as_str() == Some("workspace")));
+    assert!(memory_files
+        .iter()
+        .all(|file| file["scope_path"].as_str().is_some()));
+    assert!(memory_files
+        .iter()
+        .all(|file| file["outside_project"].as_bool() == Some(false)));

    let prompt =
        assert_json_command_with_env(&root, &["--output-format", "json", "system-prompt"], &envs);
@@ -1335,6 +1344,65 @@ fn memory_files_load_claude_claw_agents_and_surface_json_438() {
        .is_empty());
 }

+#[test]
+fn memory_discovery_stops_at_git_root_and_reports_origins_439() {
+    let root = unique_temp_dir("memory-boundary-439");
+    let repo = root.join("repo");
+    let nested = repo.join("subproj").join("deep").join("nest");
+    let config_home = root.join("config-home");
+    let home = root.join("home");
+    fs::create_dir_all(&nested).expect("nested dir should exist");
+    fs::create_dir_all(&config_home).expect("config home should exist");
+    fs::create_dir_all(&home).expect("home should exist");
+    Command::new("git")
+        .args(["init", "-q"])
+        .current_dir(&repo)
+        .output()
+        .expect("git init should launch");
+    fs::write(root.join("CLAUDE.md"), "PARENT_CLAUDE").expect("write parent");
+    fs::write(repo.join("CLAUDE.md"), "REPO_CLAUDE").expect("write repo");
+    fs::write(repo.join("subproj").join("CLAUDE.md"), "CHILD_CLAUDE").expect("write child");
+    fs::write(
+        repo.join("subproj").join("deep").join("CLAUDE.md"),
+        "DEEP_CLAUDE",
+    )
+    .expect("write deep");
+    let envs = [
+        (
+            "CLAW_CONFIG_HOME",
+            config_home.to_str().expect("utf8 config home"),
+        ),
+        ("HOME", home.to_str().expect("utf8 home")),
+    ];
+
+    let status =
+        assert_json_command_with_env(&nested, &["--output-format", "json", "status"], &envs);
+    assert_eq!(status["workspace"]["memory_file_count"], 3);
+    let memory_files = status["workspace"]["memory_files"]
+        .as_array()
+        .expect("memory files");
+    let origins = memory_files
+        .iter()
+        .map(|file| file["origin"].as_str().expect("origin"))
+        .collect::<Vec<_>>();
+    assert_eq!(origins, vec!["ancestor", "ancestor", "parent_dir"]);
+    let serialized = serde_json::to_string(memory_files).expect("memory files serialize");
+    assert!(!serialized.contains("PARENT_CLAUDE"));
+    assert!(!serialized.contains(root.join("CLAUDE.md").to_str().expect("parent path")));
+
+    let prompt = assert_json_command_with_env(
+        &nested,
+        &["--output-format", "json", "system-prompt"],
+        &envs,
+    );
+    let message = prompt["message"].as_str().expect("prompt message");
+    assert!(!message.contains("PARENT_CLAUDE"));
+    assert!(message.contains("REPO_CLAUDE"));
+    assert!(message.contains("CHILD_CLAUDE"));
+    assert!(message.contains("DEEP_CLAUDE"));
+    assert_eq!(prompt["memory_files"][0]["origin"], "ancestor");
+}
+
 #[test]
 fn dump_manifests_and_init_emit_json_when_requested() {
    let root = unique_temp_dir("manifest-init-json");