Skip to content

fs.glob: return in cache-seen check aborts sibling processing, makes test-fs-glob.mjs flaky #62897

@bartlomieju

Description

@bartlomieju

Version

Reproduced on Node.js v25.x (current main), present since the glob implementation was added.

Platform

Linux (ext4/tmpfs) -- most reproducible there due to non-deterministic readdir ordering, but the bug exists on all platforms.

Subsystem

fs

What steps will reproduce the bug?

The bug is in lib/internal/fs/glob.js, in both #addSubpatterns and #iterateSubpatterns. The cache.seen check inside the children iteration loop uses return, which exits the entire method instead of skipping just the current child:

for (let i = 0; i < children.length; i++) {
  // ...
  for (const index of pattern.indexes) {
    if (this.#cache.seen(entryPath, pattern, index) || this.#cache.seen(entryPath, pattern, index + 1)) {
      return; // <-- exits the entire method, skipping ALL remaining children
    }

This can be triggered with */../ patterns. Minimal reproduction:

const { mkdirSync, mkdtempSync, writeFileSync, globSync, rmSync } = require('fs');
const { tmpdir } = require('os');
const { join } = require('path');

const tmp = mkdtempSync(join(tmpdir(), 'glob-bug-'));

// a/b/c/d  (nested dirs)
// a/c/d/c/ (child "c" of a/c/d triggers ".." back to a/c, creating a duplicate queue entry)
// a/x      (file)
// a/z      (file)
const a = join(tmp, 'a');
mkdirSync(join(a, 'b', 'c', 'd'), { recursive: true });
mkdirSync(join(a, 'c', 'd', 'c'), { recursive: true });
writeFileSync(join(a, 'x'), '');
writeFileSync(join(a, 'z'), '');

// Run many times -- result varies depending on readdir ordering
for (let i = 0; i < 100; i++) {
  const results = globSync('a/**/../*', { cwd: tmp }).sort();
  for (const expected of ['a/b', 'a/c', 'a/x', 'a/z']) {
    if (!results.includes(expected)) {
      console.error(`Iteration ${i}: missing "${expected}" from results:`, results);
      process.exit(1);
    }
  }
}
console.log('ok');

rmSync(tmp, { recursive: true, force: true });

On Linux with ext4/tmpfs, this frequently fails with a/c or other entries missing. On macOS (APFS), readdir ordering is more stable so it may require more iterations or a different directory structure.

How often does it reproduce? Is there a required condition?

Depends on readdir ordering, which varies by filesystem. On Linux CI (ext4/tmpfs) it reproduces frequently. The required condition is a directory tree deep enough that the .. handler in the GLOBSTAR branch queues a path that was already queued by a parent directory's processing.

What is the expected behavior? Why is that the expected behavior?

globSync('a/**/../*', { cwd }) should return all entries reachable via **/../* regardless of readdir ordering. The cache.seen check should skip only the already-processed child, not abort processing of all remaining siblings.

What do you see instead?

Depending on readdir ordering, some entries in the parent directory are missing from the results. The test-fs-glob.mjs test case for a/!(symlink)/**/../* is affected -- a/c and a/symlink can be missing.

Additional information

I discovered this in Deno, due to flakiness when running test-fs-glob.mjs.

Root cause

When a pattern contains **/../*:

  1. Processing directory a/c with GLOBSTAR queues three items via the .. handler and #addSubpattern: a/c with {*}, a with {*}, and a/c/d with {**}
  2. The queue is LIFO (ArrayPrototypePop), so a/c/d is popped first. Its child c (directory) triggers the .. handler again, which adds a/c with {*} to subpatterns a second time
  3. This second a/c {*} gets pushed onto the queue after a {*}, so LIFO pops it before a {*}
  4. a/c {*} is processed, adding * to cache["a/c"]
  5. When a {*} is finally processed and iterates children of a/, child c has entryPath = "a/c" which is now in the cache. cache.seen returns true and return exits the method, skipping all remaining siblings (e.g., a/x, a/z, a/symlink)

Which siblings are skipped depends on readdir ordering, hence the flakiness.

Suggested fix

Remove the cache.seen check from the children iteration loop entirely:

-      for (const index of pattern.indexes) {
-        if (this.#cache.seen(entryPath, pattern, index) || this.#cache.seen(entryPath, pattern, index + 1)) {
-          return;
-        }
+      for (const index of pattern.indexes) {

The cache.add call at the top of #addSubpatterns/#iterateSubpatterns already prevents reprocessing and infinite recursion.
The children-level check is a redundant optimization that incorrectly prevents entries from being added to results when they
were previously traversed from a different context (e.g., a/b was recursed into via its own queue entry, but still needs to
appear as a result when the parent a matches * against its children).

This needs to be applied in both #addSubpatterns (sync) and #iterateSubpatterns (async).

Fix PR in Deno: denoland/deno#33372

I used AI to generate this report and the fix in Deno.

Metadata

Metadata

Assignees

No one assigned

    Labels

    confirmed-bugIssues with confirmed bugs.fsIssues and PRs related to the fs subsystem / file system.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions