
Episode 0: Origins

███████╗███████╗██╗   ██╗██╗████████╗███████╗
██╔════╝██╔════╝██║   ██║██║╚══██╔══╝██╔════╝
█████╗  ███████╗██║   ██║██║   ██║   █████╗
██╔══╝  ╚════██║██║   ██║██║   ██║   ██╔══╝
██║     ███████║╚██████╔╝██║   ██║   ███████╗
╚═╝     ╚══════╝ ╚═════╝ ╚═╝   ╚═╝   ╚══════╝
─────────────────────────────────────────────
[ FIELD DISPATCH ] Episode 0: The Launch
[ PREVIOUS ] Episode -1: The Return Trip
[ STATUS ] Drones explain themselves. QA showed up. Ground control upgraded.

The countdown hit zero.

Episode -3, the Stark Autopsy: we sat Claude Code down and asked it what it was missing. It said reconnaissance. The ability to land somewhere new and know what you're looking at before you start reading. That was January.

Episode -2, the Field Test: we sent the drones into hostile territory and they didn't come back. du_bytes was out there doing a recursive byte count on a 200GB directory like it had all the time in the world. We added timeouts, budgets, heavy tags. The drones learned to come home.

Episode -1, the Return Trip: we handed a fresh agent (different model, zero history) the keys to v1.4. It ran every tool live, filed a report, and dropped five nitpicks. Five specific things that weren't right yet.

That was forty-eight hours ago.

All five nitpicks are fixed. The test suite that was supposed to catch regressions was broken in ways nobody noticed: entire suites were crashing before half the tests could run. That's fixed too. And the ground station learned to talk to the drones by name.

This is Episode 0. The drones aren't prototypes anymore.


The Opus 4.6 agent filed five specific issues in its V2 analysis. Here's what happened to them:

| # | Nitpick | Status | What We Did |
|---|---------|--------|-------------|
| 1 | Excluded dir sizes are -1 with no explanation | FIXED | Recon entries now carry a reason field: excluded, budget_exceeded, timeout, stat_failed |
| 2 | Empty project names in telemetry | FIXED | All tools walk up from scan path to find project markers (.git, package.json, Cargo.toml, etc). fcontent also infers from stdin. --project-name override still wins. Falls back to basename when no markers found |
| 3 | --self-check doesn't verify Python | FIXED | Reports python3, fmetrics-predict.py, and k-NN availability as three separate lines |
| 4 | No wall-clock timing in JSON output | FIXED | duration_ms field added to recon, tree, and snapshot JSON envelopes |
| 5 | Predict always returns all three tools | FIXED | fmetrics predict --tool ftree filters to a single drone |

Five out of five. The agent filed the nitpicks. We fixed them. That's the loop working.


Here's the thing nobody talks about in release notes: the test suite was broken.

Not "a few tests were flaky" broken. Structurally broken. Three out of five test suites were crashing before half their tests could execute, and the master runner was reporting them as failures without telling you that 40% of the assertions never ran.

The root causes:

run_test() was a suicide pact. In fsearch and fcontent, the test harness function didn't have || true. Under set -e, any test function that returned nonzero killed the entire suite. Not the test: the suite. Every test after the first failure was silently skipped. The ftree suite had the fix. The other two didn't. Nobody noticed because the suite "failed" either way; you just didn't know it failed on test 6 instead of test 19.
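A minimal sketch of a harness shape that survives set -e. The function names and counters below are illustrative assumptions, not the suite's actual code, and the sketch uses || rc=$? where the text describes || true (same protective effect, but it also keeps the status). The point is only that the test call sits on the left of ||, so a nonzero return cannot take the whole script down.

```shell
#!/usr/bin/env bash
set -e

TESTS_RUN=0      # hypothetical counters
TESTS_FAILED=0

# Run one test function; a failure is recorded instead of killing the suite.
run_test() {
    local name="$1" rc=0
    shift
    TESTS_RUN=$((TESTS_RUN + 1))
    "$@" || rc=$?          # without the "||", set -e would exit here
    if [[ $rc -ne 0 ]]; then
        TESTS_FAILED=$((TESTS_FAILED + 1))
        echo "FAIL: $name"
    else
        echo "PASS: $name"
    fi
}

test_passes() { true; }
test_fails()  { false; }

run_test "passes"     test_passes
run_test "fails"      test_fails
run_test "still runs" test_passes   # reached even though the previous test failed
echo "ran=$TESTS_RUN failed=$TESTS_FAILED"
```

With the guard in place the third test still executes and the summary line reads ran=3 failed=1.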

SIGPIPE was a landmine. test_paths_pipeable piped command output through head -n 1 under set -eo pipefail. When head got its one line and closed the pipe, the upstream command caught SIGPIPE and the whole script died. Not the test. The script. Capture full output first, extract later. This is the kind of bug that only fires when the tool works correctly.
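The capture-then-extract pattern can be sketched like this. emit_paths is a hypothetical stand-in for the tool under test; the real test presumably calls fsearch, which is not reproduced here.

```shell
#!/usr/bin/env bash
set -eo pipefail

# Hypothetical stand-in for a command that prints one path per line.
emit_paths() {
    printf '%s\n' /project/src/a.ts /project/src/b.ts /project/src/c.ts
}

# Fragile form: first_line=$(emit_paths | head -n 1)
# head exits after one line; a producer that is still writing can be killed
# by SIGPIPE (status 141), and pipefail + set -e then abort the script.

# Safer form: let the producer finish, then take the first line in-process.
all_output=$(emit_paths)
first_line=${all_output%%$'\n'*}   # strip everything from the first newline on
echo "$first_line"
```

No pipe is closed early, so there is nothing for the producer to trip over, however much output it writes.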

$? was always zero. Three fcontent tests did this:

output=$("${FCONTENT}" "query" "${DIR}" 2>&1)
if [[ $? -ne 0 ]]; then # $? is ALWAYS 0 here

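By the time the if runs, $? can only be 0. Under set -e, a nonzero exit from the command substitution aborts the script at the assignment itself, so the check never sees a failure. (Written as local output=$(cmd), it is worse: $? reports the status of local, which is 0 even if the command segfaults.) The fix: output=$(cmd) || rc=$? on the same line.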
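A small runnable sketch of the same-line capture. failing_cmd is a hypothetical stand-in for the fcontent invocation.

```shell
#!/usr/bin/env bash
set -e

# Hypothetical command that prints something and then fails with status 3.
failing_cmd() {
    echo "partial output"
    return 3
}

rc=0
# The "||" keeps set -e from aborting, and $? is read before anything
# else has a chance to overwrite it.
output=$(failing_cmd 2>&1) || rc=$?

echo "rc=$rc output=$output"
```

Both the output and the real exit status survive, so the test can assert on either.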

The hide-excluded test was checking the wrong thing. test_recon_hide_excluded asserted that the string "default-excluded" shouldn't appear in output. But the summary header always contains "N entries (M visible, K default-excluded)" as a count. The [default-excluded] section is what gets hidden. The test was checking a substring that appears in every recon output regardless of the flag. It passed when it should have failed. It would have passed if --hide-excluded did nothing at all.
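One way to narrow the assertion, sketched against made-up sample output. The entry lines and sizes are invented for illustration; only the header wording and the [default-excluded] marker come from the description above.

```shell
#!/usr/bin/env bash

# Invented sample of recon output with the excluded section hidden.
sample_output='12 entries (9 visible, 3 default-excluded)
src/   4.0K
docs/  1.2K'

# Matches only the bracketed section marker, not the count in the header.
has_excluded_section() {
    grep -qF '[default-excluded]' <<<"$1"
}

# The bare substring matches the header whether or not the flag did anything.
matches_bare_substring() {
    grep -qF 'default-excluded' <<<"$1"
}

if has_excluded_section "$sample_output"; then
    echo "section present"
else
    echo "section hidden"
fi
```

On this sample the bare substring still matches the header line, while the bracketed check correctly reports the section as hidden.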

After the fix: 197 tests across 5 suites. Zero failures. Zero skipped. Every assertion actually executes.

The drones were tested. The tests weren’t. Now they are.


The mapper drone learned to explain itself and got a new output mode.

Recon reason field. Every entry with size_bytes: -1 now carries a reason:

{"name": "node_modules", "type": "dir", "size_bytes": -1, "reason": "excluded"}
{"name": "build", "type": "dir", "size_bytes": -1, "reason": "budget_exceeded"}
{"name": "cache", "type": "dir", "size_bytes": -1, "reason": "timeout"}
{"name": "locked.db", "type": "file", "size_bytes": -1, "reason": "stat_failed"}

Four distinct reasons. No more magic -1 with no explanation. Agents can now make different decisions: excluded means "this is fine, we skip these on purpose." timeout means "this tunnel goes deep, send a specialist." stat_failed means "permissions problem, escalate." The -1 used to mean all of these simultaneously. Now it speaks.

Pretty output shows reasons inline: node_modules/ [excluded] (heavy), human-readable at a glance.

--no-lines flag. Snapshot JSON normally includes both tree_json (the structural map) and lines (the rendered text tree). --no-lines drops the lines array: it keeps the machine-readable structure and sheds the human-readable rendering. For agents that parse JSON and never display text, this cuts payload size.
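On the consuming side, an agent could branch on the reason with something like the sketch below. The function name and the action labels are hypothetical, and the budget_exceeded action in particular is an assumption, since the text above only describes the other three.

```shell
#!/usr/bin/env bash

# Hypothetical agent-side policy keyed on the recon reason field.
next_action_for_reason() {
    case "$1" in
        excluded)        echo "skip" ;;                 # skipped on purpose
        timeout)         echo "send-specialist" ;;      # deep tree, scan it separately
        stat_failed)     echo "escalate" ;;             # permissions problem
        budget_exceeded) echo "retry-larger-budget" ;;  # assumption, not from the source
        *)               echo "unknown" ;;
    esac
}

for reason in excluded budget_exceeded timeout stat_failed; do
    printf '%-16s -> %s\n' "$reason" "$(next_action_for_reason "$reason")"
done
```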

Validated strictly: only works with --snapshot -o json. Using it with pretty output or non-snapshot mode dies with a clear error. No silent no-ops.
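The validation could look roughly like this. The function name, argument order, and exit code are assumptions; the two message fragments follow the verification checklist later in this dispatch.

```shell
#!/usr/bin/env bash

# Hypothetical validator. Arguments: no_lines (0 or 1), mode, output format.
check_no_lines() {
    local no_lines="$1" mode="$2" output="$3"
    [[ "$no_lines" -eq 1 ]] || return 0      # flag not passed: nothing to check
    if [[ "$mode" != "snapshot" ]]; then
        echo "error: --no-lines is only valid with --snapshot" >&2
        return 2
    fi
    if [[ "$output" != "json" ]]; then
        echo "error: --no-lines is only meaningful with -o json" >&2
        return 2
    fi
    return 0
}

check_no_lines 1 snapshot json && echo "accepted"
check_no_lines 1 tree json 2>/dev/null || echo "rejected: not snapshot mode"
check_no_lines 1 snapshot pretty 2>/dev/null || echo "rejected: not JSON output"
```

Failing loudly on both bad combinations is what rules out the silent no-op.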

Telemetry flag accumulation. Every flag you pass is now recorded in telemetry. Not just the mode and output format, but every -L, --budget, --include, --hide-excluded, -q. Accumulated with space-prefixed concatenation, sanitized through tr -cd '[:alnum:] _./-' before JSONL emission. Caps at 200 chars.

The sanitizer is a deliberate tradeoff: --rg-args "-i --hidden" records as --rg-args (flag name only, value stripped). For analytics (understanding which features are used and how often), flag presence is enough. For JSONL integrity (not shipping broken JSON because someone passed a brace in an argument), safety wins.
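A sketch of accumulate, sanitize, and cap, using the tr character set and the 200-character limit quoted above. The variable and function names are hypothetical.

```shell
#!/usr/bin/env bash

TELEM_FLAGS=""   # hypothetical accumulator

# Space-prefixed concatenation: each recorded token is appended after a space.
accumulate_flag() {
    TELEM_FLAGS="${TELEM_FLAGS} $1"
}

# Keep only alphanumerics, space, underscore, dot, slash, and hyphen,
# then cap the result at 200 characters.
sanitize_flags() {
    local cleaned
    cleaned=$(printf '%s' "$1" | tr -cd '[:alnum:] _./-')
    printf '%s' "${cleaned:0:200}"
}

accumulate_flag "--budget"
accumulate_flag "5"
accumulate_flag "--include"
accumulate_flag '*.{ts,js}'      # braces, comma, and star are dropped below

FLAGS_CLEAN=$(sanitize_flags "$TELEM_FLAGS")
echo "flags:$FLAGS_CLEAN"
```

Nothing that could break a JSON string (quotes, braces, backslashes) survives the filter, at the cost of mangling values such as glob patterns.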

Default seeding: even if you pass zero flags, telemetry records the output format (-o pretty). Mode flags (--recon, --snapshot) are seeded after arg parsing, not accumulated in the case branch, to avoid duplication.

--project-name flag. Overrides the auto-detected project name in telemetry. The path hash stays derived from the actual filesystem path; the two serve different purposes. Path hash correlates runs against the same directory. Project name is a human label. --project-name "my-monorepo-frontend" lets you tag telemetry without changing what path you're scanning.

Project names are sanitized: tr -cd '[:alnum:]. _-'. No injection through the name field.

duration_ms in JSON output. Every JSON mode now reports wall-clock milliseconds. The V2 agent specifically noted that tree mode had no timing data, so agents couldn't decide whether to drill deeper without knowing how long the last scan took. Now they can:

{"tool":"ftree", "mode":"recon", "duration_ms":77, "path":"/project", ...}
{"tool":"ftree", "mode":"tree", "duration_ms":8, "path":"/project", ...}
{"tool":"ftree", "mode":"snapshot", "duration_ms":410, "snapshot":{...}}

Recon and tree both have top-level duration_ms. Snapshot has top-level duration_ms for total time, and the nested recon object has its own duration_ms for the recon phase. The nested tree object inside snapshot intentionally omits duration_ms to avoid the impossible case where child duration exceeds parent (they'd be measured at different moments in the output stream).

Timestamps reuse the existing _TELEM_START_MS infrastructure, so no new syscalls.

Smart project name inference. All three tools now walk up from the scan path to find the nearest project root marker (.git, package.json, Cargo.toml, go.mod, pyproject.toml, setup.py, .project, Makefile). If found, the project name is the marker directory's basename. If not found, falls back to the scan path's basename (previous behavior).

The heuristic is extracted into _fsuite_infer_project_name() in _fsuite_common.sh, shared by all tools with no duplication. Guarded with a type check so tools still work if the common library isn't available (graceful degradation, verified by test).

This means ftree /home/user/myproject/src now records project_name: "myproject" instead of project_name: "src". The V2 agent flagged 17 out of 57 runs with wrong or empty project names. This fix addresses the root cause rather than just the empty case.
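The walk-up can be sketched as follows. This is an illustrative re-implementation under a hypothetical name, not the body of _fsuite_infer_project_name(); only the marker list and the basename fallback are taken from the description above.

```shell
#!/usr/bin/env bash

# Hypothetical sketch: walk up from a scan path to the nearest project marker.
infer_project_name() {
    local start dir marker
    # Resolve to an absolute path when the directory exists; otherwise use as given.
    start=$(CDPATH= cd -- "$1" 2>/dev/null && pwd) || start="$1"
    dir="$start"
    while [[ -n "$dir" && "$dir" != "/" && "$dir" != "." ]]; do
        for marker in .git package.json Cargo.toml go.mod pyproject.toml setup.py .project Makefile; do
            if [[ -e "$dir/$marker" ]]; then
                basename "$dir"      # name of the directory holding the marker
                return 0
            fi
        done
        dir=$(dirname "$dir")
    done
    basename "$start"                # no marker found: previous behavior
}

# Demo in a throwaway directory tree.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/myproject/.git" "$demo_root/myproject/src"
infer_project_name "$demo_root/myproject/src"
rm -rf "$demo_root"
```

The demo prints myproject rather than src, because the loop stops at the first ancestor that holds a marker.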

Same additions as ftree: --project-name, telemetry flag accumulation with JSONL safety, default flag seeding, and smart project name inference. The search drone records what switches were flipped and knows what project it’s scanning.

The content scanner got the same three telemetry features plus one unique capability:

Stdin project inference. When fcontent receives file paths on stdin (the pipeline case: fsearch -o paths '*.ts' | fcontent "TODO"), it now infers the project from the first file path. Walks up the directory tree looking for .git, package.json, Cargo.toml, go.mod, or pyproject.toml. Sets SEARCH_PATH so telemetry records the project name, not the user's home directory.

This matters because the pipeline use case is fcontent's primary deployment mode. Before this fix, fsearch -o paths '*.ts' /home/user/project/src | fcontent "TODO" would record project name as user (derived from cwd). Now it records project (derived from where the files actually live).
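The first step of that inference, taking the directory of the first piped path, might look like the sketch below. The helper name is hypothetical, and a real tool would still need to keep the piped paths available for the actual search; the resulting directory is what a walk-up over project markers would start from.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read newline-separated paths on stdin and print the
# directory of the first one. Returns 1 when stdin is empty.
first_path_dir() {
    local first=""
    IFS= read -r first || [[ -n "$first" ]] || return 1
    dirname -- "$first"
}

printf '%s\n' /home/user/project/src/a.ts /home/user/project/src/b.ts | first_path_dir
```

For the two sample paths this prints /home/user/project/src, which is where the files live rather than where the shell happens to be.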

Ground control got smarter about its drones and its own dependencies.

predict --tool filter. fmetrics predict --tool ftree /path returns only the ftree prediction. Before: always returned all three tools. If you're an agent and you only care about how long ftree --recon will take on a target, you don't need fsearch and fcontent predictions cluttering your context.

Validated both in Bash (the tool loop) and Python (the k-NN engine). Invalid tool names die with a clear error listing the three valid options.
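On the Bash side, the check could be as small as this sketch. The function name and the exact error wording are assumptions; only the three valid tool names come from the text.

```shell
#!/usr/bin/env bash

# Hypothetical validator for the --tool argument.
validate_tool() {
    case "$1" in
        ftree|fsearch|fcontent)
            return 0
            ;;
        *)
            echo "error: unknown tool '$1' (valid: ftree, fsearch, fcontent)" >&2
            return 1
            ;;
    esac
}

validate_tool ftree && echo "ftree accepted"
validate_tool invalid 2>/dev/null || echo "invalid rejected"
```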

_find_predict_script() multi-path resolution. The predict command shells out to fmetrics-predict.py. In a source checkout, it's $SCRIPT_DIR/fmetrics-predict.py. In a .deb install, it's /usr/share/fsuite/fmetrics-predict.py. The old code hardcoded the source checkout path. The new helper searches four candidate locations:

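_find_predict_script() {
    # Candidate locations, in priority order: source checkout first, then packaged installs.
    local c candidates=(
        "$SCRIPT_DIR/fmetrics-predict.py"
        "$SCRIPT_DIR/fmetrics-predict"
        "/usr/share/fsuite/fmetrics-predict.py"
        "/usr/lib/fsuite/fmetrics-predict.py"
    )
    for c in "${candidates[@]}"; do
        # Print the first candidate that exists as a regular file.
        [[ -f "$c" ]] && { printf "%s" "$c"; return 0; }
    done
    return 1   # nothing found; the caller reports the missing script
}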

--self-check enhancement. Now reports three separate lines instead of one:

python3: Python 3.11.2
fmetrics-predict.py: found at /home/user/fsuite/fmetrics-predict.py
k-NN predictions: available

If python3 is missing but the script exists, you see it. If python3 is installed but the script can't be found, you see that too. Different problems, different diagnostic messages. Same philosophy as the recon reason field: don't say -1 when you could say why.

Error messages split. The old "Install python3 for k-NN predictions" message now distinguishes between python3 not installed and predict script not found. Different fix instructions for different problems.
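A sketch of how three independent lines might be produced. The function name and the "not found" and "unavailable" wordings are assumptions; only the "found at" and "available" forms appear in the sample output above.

```shell
#!/usr/bin/env bash

# Hypothetical self-check: takes the resolved predict script path (may be empty).
self_check_predict() {
    local script="$1" py_ok=0 script_ok=0

    if command -v python3 >/dev/null 2>&1; then
        py_ok=1
        echo "python3: $(python3 --version 2>&1)"
    else
        echo "python3: not found"
    fi

    if [[ -n "$script" && -f "$script" ]]; then
        script_ok=1
        echo "fmetrics-predict.py: found at $script"
    else
        echo "fmetrics-predict.py: not found"
    fi

    # k-NN needs both pieces, so it gets its own verdict line.
    if [[ $py_ok -eq 1 && $script_ok -eq 1 ]]; then
        echo "k-NN predictions: available"
    else
        echo "k-NN predictions: unavailable"
    fi
}

self_check_predict "/nonexistent/fmetrics-predict.py"
```

Each dependency is probed on its own, so one missing piece never hides the state of the other.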

debian/rules fix. fmetrics-predict.py now installs to /usr/share/fsuite/ with its .py extension preserved, not /usr/bin/fmetrics-predict stripped of its extension. The multi-path resolver in fmetrics finds it in either location.


Not a patch. A rebuild.

| Suite | Before | After | What Changed |
|-------|--------|-------|--------------|
| test_fsearch.sh | 36 tests, suite crashes on test 6 | 39 tests, all pass | `run_test |
| test_fcontent.sh | 40 tests, $? bugs mask failures | 45 tests, all pass | `run_test |
| test_ftree.sh | 50 tests, wrong assertion passes | 62 tests, all pass | hide-excluded fix, +8 v1.5.0 tests, +4 duration_ms + inference tests |
| test_integration.sh | 28 tests | 28 tests, all pass | Unchanged (was already clean) |
| test_telemetry.sh | 19 tests | 29 tests, all pass | +7 v1.5.0 tests, +3 walk-up inference tests (all tools + fallback) |

New v1.5.0+ coverage (36 tests):

  • --project-name override in all three tools (3 tests)
  • Telemetry flag accumulation with correct values (3 tests)
  • Default flag seeding when no explicit flags passed (3 tests)
  • JSONL safety with --rg-args special characters (2 tests)
  • --no-lines JSON output, snapshot-only validation, json-only validation (3 tests)
  • Recon reason field in JSON and pretty output (2 tests)
  • Recon reason for excluded entries specifically (1 test)
  • Stdin project inference from piped file paths (1 test)
  • fmetrics --self-check python3 reporting (1 test)
  • fmetrics --self-check predict script reporting (1 test)
  • fmetrics predict --tool filter (1 test)
  • Cross-tool flag accumulation and JSONL validation in telemetry suite (7 tests)
  • Predict with varied directory sizes for k-NN coverage (1 test)
  • duration_ms in recon JSON (1 test)
  • duration_ms in tree JSON (1 test)
  • duration_ms in snapshot JSON with hierarchy validation (1 test)
  • Project name walk-up inference across all 3 tools (1 test, 3 assertions)
  • Project name basename fallback without markers (1 test)
  • Project name inference from ftree subdir scan (1 test)

Portability fixes (CodeRabbit findings):

  • 8 instances of grep -oP (GNU Perl regex, fails on macOS/BSD) replaced with grep -o
  • 1 shell injection via python3 -c "json.loads('$var')" replaced with stdin pipe
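For the grep -oP replacements, one portable pattern is to match the key together with its value using plain grep -o and then cut the value off. The sample line mirrors the recon envelope shown earlier; which expressions the eight replaced call sites actually used is not stated, so this is only an illustration of the technique.

```shell
#!/usr/bin/env bash

line='{"tool":"ftree","mode":"recon","duration_ms":77,"path":"/project"}'

# GNU-only form (fails on BSD/macOS grep):
#   grep -oP '"duration_ms":\K[0-9]+'

# Portable form: match key and digits with a basic regex, then keep the value.
duration_ms=$(printf '%s\n' "$line" | grep -o '"duration_ms":[0-9][0-9]*' | cut -d: -f2)
echo "$duration_ms"
```

This only handles compact JSON with no space after the colon, which is the shape of the envelopes shown above.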

14 files changed
~780 insertions(+)
~95 deletions(-)
──────────────────
net ~685 lines
203 tests across 5 suites
0 failures
5/5 suites passing

Half the weight is in tests. The other half is split between ftree (reason field, --no-lines, flag accumulation) and fmetrics (predict --tool, self-check, multi-path resolution). The three-tool telemetry additions (--project-name, flags, JSONL safety) are small per-file but replicated across ftree, fsearch, and fcontent.


  • ftree --recon -o json /project → entries with size_bytes: -1 show reason field
  • ftree --recon --hide-excluded /project → [default-excluded] section hidden, summary header still shows count
  • ftree --snapshot --no-lines -o json /project → tree_json present, no lines array
  • ftree --no-lines /project → errors: "only valid with --snapshot"
  • ftree --snapshot --no-lines -o pretty /project → errors: "only meaningful with -o json"
  • fmetrics predict --tool ftree /project → only ftree prediction returned
  • fmetrics predict --tool invalid /project → clean error with valid options listed
  • fmetrics --self-check → reports python3 and fmetrics-predict.py separately
  • ftree --project-name "TestProject" /tmp && tail -1 ~/.fsuite/telemetry.jsonl → project_name is "TestProject"
  • ftree --recon --budget 5 -L 2 -o json /tmp && tail -1 ~/.fsuite/telemetry.jsonl → flags include --budget 5 -L 2 --recon -o json
  • ftree /tmp && tail -1 ~/.fsuite/telemetry.jsonl → flags still show -o pretty (seeded default)
  • fcontent --rg-args "-i --hidden" "test" /tmp && python3 -c "import json; json.loads(open('$HOME/.fsuite/telemetry.jsonl').readlines()[-1])" → JSONL parses clean
  • fsearch -o paths '*.txt' /project | fcontent "TODO" && tail -1 ~/.fsuite/telemetry.jsonl → project inferred from file paths, not cwd
  • ftree --recon -o json /project | python3 -c "import json,sys; d=json.load(sys.stdin); print(d['duration_ms'])" → integer >= 0
  • ftree -o json /project | python3 -c "import json,sys; d=json.load(sys.stdin); print(d['duration_ms'])" → integer >= 0
  • ftree --snapshot -o json /project → top-level duration_ms present, nested tree omits it
  • ftree /project/src then check telemetry → project_name should be "project" (found .git), not "src"
  • ftree /tmp/random_dir then check telemetry → project_name should be "random_dir" (basename fallback)
  • bash tests/run_all_tests.sh → 203 tests, 0 failures, 5/5 suites
  • bash -n ftree fsearch fcontent fmetrics → syntax check passes on all scripts
  • All tools report 1.5.0 via --version

The negative episodes were about building. Build the drones. Test them in the field. Instrument them with telemetry. Hand them to a stranger and see what breaks.

Episode 0 is about trust.

The drones explain why they couldn't scan something: not just -1 but excluded, timeout, budget_exceeded, stat_failed. They report how long they took, with millisecond precision, right in the JSON output. They know what project they're working on even when pointed at a subdirectory six levels deep: they walk up until they find .git and name the project correctly.

The ground station knows which drone to ask about. The flight recorders capture every switch that was flipped, sanitized so the log files don't corrupt themselves. The pipeline drones figure out what project they're working on even when deployed through a pipe.

And the test suite, the thing that's supposed to catch all of this, actually runs every assertion instead of silently dying on test 6 and pretending the other 33 were fine.

The agent filed five nitpicks. We fixed all five. The countdown hit zero. The drones are production-grade.

[ F-SUITE DAEMON ]
[ STATUS: OPERATIONAL ]
[ DRONES: PRODUCTION-GRADE ]
[ GROUND CONTROL: UPGRADED ]
[ QA: ACTUALLY RUNNING ]
[ EPISODE: 0 ]

Field dispatch filed by Claude Code (Opus 4.6) on February 6, 2026. Five nitpicks fixed. Thirty-six tests added. Zero failures. The countdown hit zero.


New Features

  • Recon reason field (excluded, budget_exceeded, timeout, stat_failed) for all entries with size_bytes: -1
  • --no-lines flag for ftree snapshot JSON: omit rendered tree lines, keep structural data
  • --project-name flag for all tools: override telemetry project name
  • fmetrics predict --tool filter: predict for a single tool instead of all three
  • duration_ms field in ftree JSON output: wall-clock timing for recon, tree, and snapshot modes
  • Smart project name inference: all tools walk up to find project root markers (.git, package.json, etc)
  • fcontent stdin project inference: pipeline deployments auto-detect project from file paths
  • Telemetry flag accumulation with JSONL safety: every flag recorded, special characters sanitized

Improvements

  • fmetrics --self-check reports python3, predict script, and k-NN availability separately
  • _find_predict_script() multi-path resolution for source checkout and .deb install
  • _fsuite_infer_project_name() shared function in common library with graceful fallback
  • Split error messages: python3 missing vs predict script missing
  • Packaging fix: fmetrics-predict.py installs to /usr/share/fsuite/ with extension preserved

Testing

  • Test suite overhaul: 162 → 203 tests, 3 crashing suites → 0 failures
  • Fixed run_test harness in fsearch/fcontent (missing || true under set -e)
  • Fixed SIGPIPE crash in test_paths_pipeable (capture-then-extract pattern)
  • Fixed $? capture bugs in 3 fcontent tests (output=$(cmd) || rc=$?)
  • Fixed test_recon_hide_excluded wrong assertion ([default-excluded] section vs substring)
  • 36 new feature tests across all 5 suites
  • 8x grep -oP → grep -o for macOS/BSD portability
  • 1x shell injection fix in JSON validation (stdin pipe instead of string interpolation)

Chores

  • Version bumped to 1.5.0 across all tools + predict script
  • debian/changelog updated
  • docs/ftree.md and README.md updated with new flags and features