Episode 0 β Origins
βββββββββββββββββββ βββββββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββββββ ββββββ βββββββββββ ββββββ βββ ββββββ ββββββ βββββββββββ ββββββ βββ ββββββ βββ ββββββββββββββββββββ βββ ββββββββ βββ ββββββββ βββββββ βββ βββ ββββββββ βββββββββββββββββββββββββββββββββββββββββββββ [ FIELD DISPATCH ] Episode 0: The Launch [ PREVIOUS ] Episode -1: The Return Trip [ STATUS ] Drones explain themselves. QA showed up. Ground control upgraded.Mission Context
Section titled βMission ContextβThe countdown hit zero.
Episode -3, the Stark Autopsy β we sat Claude Code down and asked it what it was missing. It said reconnaissance. The ability to land somewhere new and know what youβre looking at before you start reading. That was January.
Episode -2, the Field Test β we sent the drones into hostile territory and they didnβt come back. du_bytes was out there doing a recursive byte count on a 200GB directory like it had all the time in the world. We added timeouts, budgets, heavy tags. The drones learned to come home.
Episode -1, the Return Trip β we handed a fresh agent (different model, zero history) the keys to v1.4. It ran every tool live, filed a report, and dropped five nitpicks. Five specific things that werenβt right yet.
That was forty-eight hours ago.
All five nitpicks are fixed. The test suite that was supposed to catch regressions was broken in ways nobody noticed β entire suites crashing before half the tests could run. Thatβs fixed too. And the ground station learned to talk to the drones by name.
This is Episode 0. The drones arenβt prototypes anymore.
The Nitpick Report Card
Section titled βThe Nitpick Report CardβThe Opus 4.6 agent filed five specific issues in its V2 analysis. Hereβs what happened to them:
| # | Nitpick | Status | What We Did |
|---|---|---|---|
| 1 | Excluded dir sizes are -1 with no explanation | FIXED | Recon entries now carry a reason field: excluded, budget_exceeded, timeout, stat_failed |
| 2 | Empty project names in telemetry | FIXED | All tools walk up from scan path to find project markers (.git, package.json, Cargo.toml, etc). fcontent also infers from stdin. --project-name override still wins. Falls back to basename when no markers found |
| 3 | --self-check doesnβt verify Python | FIXED | Reports python3, fmetrics-predict.py, and k-NN availability as three separate lines |
| 4 | No wall-clock timing in JSON output | FIXED | duration_ms field added to recon, tree, and snapshot JSON envelopes |
| 5 | Predict always returns all three tools | FIXED | fmetrics predict --tool ftree filters to a single drone |
Five out of five. The agent filed the nitpicks. We fixed them. Thatβs the loop working.
The Real Story: The Tests Were Lying
Section titled βThe Real Story: The Tests Were LyingβHereβs the thing nobody talks about in release notes: the test suite was broken.
Not βa few tests were flakyβ broken. Structurally broken. Three out of five test suites were crashing before half their tests could execute, and the master runner was reporting them as failures without telling you that 40% of the assertions never ran.
The root causes:
run_test() was a suicide pact. In fsearch and fcontent, the test harness function didnβt have || true. Under set -e, any test function that returned nonzero killed the entire suite. Not the test β the suite. Every test after the first failure was silently skipped. The ftree suite had the fix. The other two didnβt. Nobody noticed because the suite βfailedβ either way β you just didnβt know it failed on test 6 instead of test 19.
SIGPIPE was a landmine. test_paths_pipeable piped command output through head -n 1 under set -eo pipefail. When head got its one line and closed the pipe, the upstream command caught SIGPIPE and the whole script died. Not the test. The script. Capture full output first, extract later. This is the kind of bug that only fires when the tool works correctly.
$? was always zero. Three fcontent tests did this:
output=$("${FCONTENT}" "query" "${DIR}" 2>&1)if [[ $? -ne 0 ]]; then # $? is ALWAYS 0 here$? checks the assignment, not the command. The command could segfault and $? would still be 0 because the variable assignment succeeded. The fix: output=$(cmd) || rc=$? on the same line.
The hide-excluded test was checking the wrong thing. test_recon_hide_excluded asserted that the string βdefault-excludedβ shouldnβt appear in output. But the summary header always contains βN entries (M visible, K default-excluded)β as a count. The [default-excluded] section is what gets hidden. The test was checking a substring that appears in every recon output regardless of the flag. It passed when it should have failed. It would have passed if --hide-excluded did nothing at all.
After the fix: 197 tests across 5 suites. Zero failures. Zero skipped. Every assertion actually executes.
The drones were tested. The tests werenβt. Now they are.
Technical Changes
Section titled βTechnical Changesβftree v1.4.0 β v1.5.0
Section titled βftree v1.4.0 β v1.5.0βThe mapper drone learned to explain itself and got a new output mode.
Recon reason field. Every entry with size_bytes: -1 now carries a reason:
{"name": "node_modules", "type": "dir", "size_bytes": -1, "reason": "excluded"}{"name": "build", "type": "dir", "size_bytes": -1, "reason": "budget_exceeded"}{"name": "cache", "type": "dir", "size_bytes": -1, "reason": "timeout"}{"name": "locked.db", "type": "file","size_bytes": -1, "reason": "stat_failed"}Four distinct reasons. No more magic -1 with no explanation. Agents can now make different decisions: excluded means βthis is fine, we skip these on purpose.β timeout means βthis tunnel goes deep, send a specialist.β stat_failed means βpermissions problem, escalate.β The -1 used to mean all of these simultaneously. Now it speaks.
Pretty output shows reasons inline: node_modules/ [excluded] (heavy) β human-readable at a glance.
--no-lines flag. Snapshot JSON normally includes both tree_json (the structural map) and lines (the rendered text tree). --no-lines drops the lines array β keeps the machine-readable structure, sheds the human-readable rendering. For agents that parse JSON and never display text, this cuts payload size.
Validated strictly: only works with --snapshot -o json. Using it with pretty output or non-snapshot mode dies with a clear error. No silent no-ops.
Telemetry flag accumulation. Every flag you pass is now recorded in telemetry. Not just the mode and output format β every -L, --budget, --include, --hide-excluded, -q. Accumulated with space-prefixed concatenation, sanitized through tr -cd '[:alnum:] _./-' before JSONL emission. Caps at 200 chars.
The sanitizer is a deliberate tradeoff: --rg-args "-i --hidden" records as --rg-args (flag name only, value stripped). For analytics β understanding which features are used and how often β flag presence is enough. For JSONL integrity β not shipping broken JSON because someone passed a brace in an argument β safety wins.
Default seeding: even if you pass zero flags, telemetry records the output format (-o pretty). Mode flags (--recon, --snapshot) are seeded after arg parsing, not accumulated in the case branch, to avoid duplication.
--project-name flag. Overrides the auto-detected project name in telemetry. The path hash stays derived from the actual filesystem path β these serve different purposes. Path hash correlates runs against the same directory. Project name is a human label. --project-name "my-monorepo-frontend" lets you tag telemetry without changing what path youβre scanning.
Project names are sanitized: tr -cd '[:alnum:]. _-'. No injection through the name field.
duration_ms in JSON output. Every JSON mode now reports wall-clock milliseconds. The V2 agent specifically noted that tree mode had no timing data β agents couldnβt decide whether to drill deeper without knowing how long the last scan took. Now they can:
{"tool":"ftree", "mode":"recon", "duration_ms":77, "path":"/project", ...}{"tool":"ftree", "mode":"tree", "duration_ms":8, "path":"/project", ...}{"tool":"ftree", "mode":"snapshot", "duration_ms":410, "snapshot":{...}}Recon and tree both have top-level duration_ms. Snapshot has top-level duration_ms for total time, and the nested recon object has its own duration_ms for the recon phase. The nested tree object inside snapshot intentionally omits duration_ms to avoid the impossible case where child duration exceeds parent (theyβd be measured at different moments in the output stream).
Timestamps reuse the existing _TELEM_START_MS infrastructure β no new syscalls.
Smart project name inference. All three tools now walk up from the scan path to find the nearest project root marker (.git, package.json, Cargo.toml, go.mod, pyproject.toml, setup.py, .project, Makefile). If found, the project name is the marker directoryβs basename. If not found, falls back to the scan pathβs basename (previous behavior).
The heuristic is extracted into _fsuite_infer_project_name() in _fsuite_common.sh β shared by all tools, no duplication. Guarded with type check so tools still work if the common library isnβt available (graceful degradation, verified by test).
This means ftree /home/user/myproject/src now records project_name: "myproject" instead of project_name: "src". The V2 agent flagged 17 out of 57 runs with wrong or empty project names. This fix addresses the root cause rather than just the empty case.
fsearch v1.4.0 β v1.5.0
Section titled βfsearch v1.4.0 β v1.5.0βSame additions as ftree: --project-name, telemetry flag accumulation with JSONL safety, default flag seeding, and smart project name inference. The search drone records what switches were flipped and knows what project itβs scanning.
fcontent v1.4.0 β v1.5.0
Section titled βfcontent v1.4.0 β v1.5.0βThe content scanner got the same three telemetry features plus one unique capability:
Stdin project inference. When fcontent receives file paths on stdin (the pipeline case: fsearch -o paths '*.ts' | fcontent "TODO"), it now infers the project from the first file path. Walks up the directory tree looking for .git, package.json, Cargo.toml, go.mod, or pyproject.toml. Sets SEARCH_PATH so telemetry records the project name, not the userβs home directory.
This matters because the pipeline use case is fcontentβs primary deployment mode. Before this fix, fsearch -o paths '*.ts' /home/user/project/src | fcontent "TODO" would record project name as user (derived from cwd). Now it records project (derived from where the files actually live).
fmetrics v1.4.0 β v1.5.0
Section titled βfmetrics v1.4.0 β v1.5.0βGround control got smarter about its drones and its own dependencies.
predict --tool filter. fmetrics predict --tool ftree /path returns only the ftree prediction. Before: always returned all three tools. If youβre an agent and you only care about how long ftree --recon will take on a target, you donβt need fsearch and fcontent predictions cluttering your context.
Validated both in Bash (the tool loop) and Python (the k-NN engine). Invalid tool names die with a clear error listing the three valid options.
_find_predict_script() multi-path resolution. The predict command shells out to fmetrics-predict.py. In a source checkout, itβs $SCRIPT_DIR/fmetrics-predict.py. In a .deb install, itβs /usr/share/fsuite/fmetrics-predict.py. The old code hardcoded the source checkout path. The new helper searches four candidate locations:
_find_predict_script() { local candidates=( "$SCRIPT_DIR/fmetrics-predict.py" "$SCRIPT_DIR/fmetrics-predict" "/usr/share/fsuite/fmetrics-predict.py" "/usr/lib/fsuite/fmetrics-predict.py" ) for c in "${candidates[@]}"; do [[ -f "$c" ]] && { printf "%s" "$c"; return 0; } done return 1}--self-check enhancement. Now reports three separate lines instead of one:
python3: Python 3.11.2fmetrics-predict.py: found at /home/user/fsuite/fmetrics-predict.pyk-NN predictions: availableIf python3 is missing but the script exists, you see it. If python3 is installed but the script canβt be found, you see that too. Different problems, different diagnostic messages. Same philosophy as the recon reason field β donβt say -1 when you could say why.
Error messages split. The old βInstall python3 for k-NN predictionsβ message now distinguishes between python3 not installed and predict script not found. Different fix instructions for different problems.
Packaging
Section titled βPackagingβdebian/rules fix. fmetrics-predict.py now installs to /usr/share/fsuite/ with its .py extension preserved, not /usr/bin/fmetrics-predict stripped of its extension. The multi-path resolver in fmetrics finds it in either location.
The Test Overhaul
Section titled βThe Test OverhaulβNot a patch. A rebuild.
| Suite | Before | After | What Changed |
|---|---|---|---|
| test_fsearch.sh | 36 tests, suite crashes on test 6 | 39 tests, all pass | `run_test |
| test_fcontent.sh | 40 tests, $? bugs mask failures | 45 tests, all pass | `run_test |
| test_ftree.sh | 50 tests, wrong assertion passes | 62 tests, all pass | hide-excluded fix, +8 v1.5.0 tests, +4 duration_ms + inference tests |
| test_integration.sh | 28 tests | 28 tests, all pass | Unchanged (was already clean) |
| test_telemetry.sh | 19 tests | 29 tests, all pass | +7 v1.5.0 tests, +3 walk-up inference tests (all tools + fallback) |
New v1.5.0+ coverage (36 tests):
--project-nameoverride in all three tools (3 tests)- Telemetry flag accumulation with correct values (3 tests)
- Default flag seeding when no explicit flags passed (3 tests)
- JSONL safety with
--rg-argsspecial characters (2 tests) --no-linesJSON output, snapshot-only validation, json-only validation (3 tests)- Recon reason field in JSON and pretty output (2 tests)
- Recon reason for excluded entries specifically (1 test)
- Stdin project inference from piped file paths (1 test)
- fmetrics
--self-checkpython3 reporting (1 test) - fmetrics
--self-checkpredict script reporting (1 test) - fmetrics
predict --toolfilter (1 test) - Cross-tool flag accumulation and JSONL validation in telemetry suite (7 tests)
- Predict with varied directory sizes for k-NN coverage (1 test)
duration_msin recon JSON (1 test)duration_msin tree JSON (1 test)duration_msin snapshot JSON with hierarchy validation (1 test)- Project name walk-up inference across all 3 tools (1 test, 3 assertions)
- Project name basename fallback without markers (1 test)
- Project name inference from ftree subdir scan (1 test)
Portability fixes (CodeRabbit findings):
- 8 instances of
grep -oP(GNU Perl regex, fails on macOS/BSD) replaced withgrep -o - 1 shell injection via
python3 -c "json.loads('$var')"replaced with stdin pipe
The Numbers
Section titled βThe Numbersβ14 files changed~780 insertions(+)~95 deletions(-)ββββββββββββββββββnet ~685 lines
203 tests across 5 suites0 failures5/5 suites passingHalf the weight is in tests. The other half is split between ftree (reason field, βno-lines, flag accumulation) and fmetrics (predict βtool, self-check, multi-path resolution). The three-tool telemetry additions (βproject-name, flags, JSONL safety) are small per-file but replicated across ftree, fsearch, and fcontent.
Test Plan
Section titled βTest Planβ-
ftree --recon -o json /projectβ entries withsize_bytes: -1showreasonfield -
ftree --recon --hide-excluded /projectβ[default-excluded]section hidden, summary header still shows count -
ftree --snapshot --no-lines -o json /projectβtree_jsonpresent, nolinesarray -
ftree --no-lines /projectβ errors: βonly valid with βsnapshotβ -
ftree --snapshot --no-lines -o pretty /projectβ errors: βonly meaningful with -o jsonβ -
fmetrics predict --tool ftree /projectβ only ftree prediction returned -
fmetrics predict --tool invalid /projectβ clean error with valid options listed -
fmetrics --self-checkβ reports python3 and fmetrics-predict.py separately -
ftree --project-name "TestProject" /tmp && tail -1 ~/.fsuite/telemetry.jsonlβ project_name is βTestProjectβ -
ftree --recon --budget 5 -L 2 -o json /tmp && tail -1 ~/.fsuite/telemetry.jsonlβ flags include--budget 5 -L 2 --recon -o json -
ftree /tmp && tail -1 ~/.fsuite/telemetry.jsonlβ flags still show-o pretty(seeded default) -
fcontent --rg-args "-i --hidden" "test" /tmp && python3 -c "import json; json.loads(open('$HOME/.fsuite/telemetry.jsonl').readlines()[-1])"β JSONL parses clean -
fsearch -o paths '*.txt' /project | fcontent "TODO" && tail -1 ~/.fsuite/telemetry.jsonlβ project inferred from file paths, not cwd -
ftree --recon -o json /project | python3 -c "import json,sys; d=json.load(sys.stdin); print(d['duration_ms'])"β integer >= 0 -
ftree -o json /project | python3 -c "import json,sys; d=json.load(sys.stdin); print(d['duration_ms'])"β integer >= 0 -
ftree --snapshot -o json /projectβ top-levelduration_mspresent, nested tree omits it -
ftree /project/srcthen check telemetry β project_name should be βprojectβ (found .git), not βsrcβ -
ftree /tmp/random_dirthen check telemetry β project_name should be βrandom_dirβ (basename fallback) -
bash tests/run_all_tests.shβ 203 tests, 0 failures, 5/5 suites -
bash -n ftree fsearch fcontent fmetricsβ syntax check passes on all scripts - All tools report
1.5.0via--version
Closing Transmission
Section titled βClosing TransmissionβThe negative episodes were about building. Build the drones. Test them in the field. Instrument them with telemetry. Hand them to a stranger and see what breaks.
Episode 0 is about trust.
The drones explain why they couldnβt scan something β not just -1 but excluded, timeout, budget_exceeded, stat_failed. They report how long they took β millisecond precision, right in the JSON output. They know what project theyβre working on even when pointed at a subdirectory six levels deep β they walk up until they find .git and name the project correctly.
The ground station knows which drone to ask about. The flight recorders capture every switch that was flipped, sanitized so the log files donβt corrupt themselves. The pipeline drones figure out what project theyβre working on even when deployed through a pipe.
And the test suite β the thing thatβs supposed to catch all of this β actually runs every assertion instead of silently dying on test 6 and pretending the other 33 were fine.
The agent filed five nitpicks. We fixed all five. The countdown hit zero. The drones are production-grade.
[ F-SUITE DAEMON ][ STATUS: OPERATIONAL ][ DRONES: PRODUCTION-GRADE ][ GROUND CONTROL: UPGRADED ][ QA: ACTUALLY RUNNING ][ EPISODE: 0 ]Field dispatch filed by Claude Code (Opus 4.6) on February 6, 2026. Five nitpicks fixed. Thirty-six tests added. Zero failures. The countdown hit zero.
Summary
Section titled βSummaryβNew Features
- Recon reason field (
excluded,budget_exceeded,timeout,stat_failed) for all entries withsize_bytes: -1 --no-linesflag for ftree snapshot JSON β omit rendered tree lines, keep structural data--project-nameflag for all tools β override telemetry project namefmetrics predict --toolfilter β predict for a single tool instead of all threeduration_msfield in ftree JSON output β wall-clock timing for recon, tree, and snapshot modes- Smart project name inference β all tools walk up to find project root markers (
.git,package.json, etc) - fcontent stdin project inference β pipeline deployments auto-detect project from file paths
- Telemetry flag accumulation with JSONL safety β every flag recorded, special characters sanitized
Improvements
fmetrics --self-checkreports python3, predict script, and k-NN availability separately_find_predict_script()multi-path resolution for source checkout and .deb install_fsuite_infer_project_name()shared function in common library with graceful fallback- Split error messages: python3 missing vs predict script missing
- Packaging fix: fmetrics-predict.py installs to
/usr/share/fsuite/with extension preserved
Testing
- Test suite overhaul: 162 β 203 tests, 3 crashing suites β 0 failures
- Fixed
run_testharness in fsearch/fcontent (missing|| trueunderset -e) - Fixed SIGPIPE crash in
test_paths_pipeable(capture-then-extract pattern) - Fixed
$?capture bugs in 3 fcontent tests (output=$(cmd) || rc=$?) - Fixed
test_recon_hide_excludedwrong assertion ([default-excluded]section vs substring) - 36 new feature tests across all 5 suites
- 8x
grep -oPβgrep -ofor macOS/BSD portability - 1x shell injection fix in JSON validation (stdin pipe instead of string interpolation)
Chores
- Version bumped to 1.5.0 across all tools + predict script
debian/changelogupdateddocs/ftree.mdandREADME.mdupdated with new flags and features