Search as code, shaped by how you actually search. Agents write code over your data, not tool calls — but where others re-derive that code every run, here each accepted trajectory crystallises into a replay-gated, typed call in your tenant's lib/. The interface isn't designed up front; it emerges through usage, per tenant.
/mnt/<dataset> appears as a typed VFS, scoped per tenant + intent.
scripts/answer.ts in visible TypeScript over df.db.*.
df.answer({…}) with evidence and lineage.
df.lib.* grows per tenant; provider reshapes serving from real intents.
An interface that emerges through agentic search. Five stages, one tenanted environment per dataset.
Scored by the benchmark's own evaluator on 126 long-horizon agentic-search tasks. n=126 · 21 families × 6 tiers · iter3-full · 20260512.
Quality of a vanilla agent. At 1 / 172 the cost. With crashes gone.
Hard tier (n=21): +7.9 pp over the vanilla ceiling.