Query plan explorer
Design for a cross-engine execution-plan viewer + analyzer: capture a query’s
EXPLAIN, normalize it, visualize it legibly, analyze it for problems,
and explain it in plain language — including teaching how to read the
metrics. AI narration is optional; the deterministic explanation always works.
Status: design (greenfield — no plan handling exists today beyond
explain analyze appearing in completion keywords).
Goals (from the ask)
- Capture and analyze execution plans.
- Explain the plan in natural language and identify likely bottlenecks.
- Teach how to read metrics such as cost, rows, loops, and elapsed time.
- Visualize the hot path with a tree/flame view.
- Normalize plan data across PG/MySQL/Oracle/SQL Server/ SQLite/Snowflake/Hive/Trino/DuckDB…
The core problem: every engine’s EXPLAIN is different
There is no portable plan format. The design’s spine is a per-engine capture + parser that lowers each native format into one normalized plan IR, after which visualization, analysis, and explanation are engine-agnostic.
| Engine | Capture | Native shape |
|---|---|---|
| PostgreSQL (+ CRDB/Yugabyte/Timescale/Neon) | EXPLAIN (FORMAT JSON[, ANALYZE, BUFFERS, VERBOSE]) <q> | JSON tree, est+actual |
| MySQL / MariaDB / TiDB | EXPLAIN FORMAT=JSON <q> · EXPLAIN ANALYZE <q> (8.0.18+) | JSON / tree text |
| SQL Server | SET SHOWPLAN_XML ON / SET STATISTICS XML ON | XML showplan |
| Oracle | EXPLAIN PLAN FOR <q> + DBMS_XPLAN.DISPLAY, or /*+ GATHER_PLAN_STATISTICS */ + DBMS_XPLAN.DISPLAY_CURSOR | row table / cursor stats |
| SQLite | EXPLAIN QUERY PLAN <q> | simplified tree |
| Snowflake | EXPLAIN USING JSON <q>; profile via GET_QUERY_OPERATOR_STATS() | JSON / operator stats |
| Trino / Presto / Hive | EXPLAIN (FORMAT JSON) <q> · EXPLAIN ANALYZE | JSON / text |
| DuckDB | EXPLAIN [ANALYZE] <q> (json) | JSON tree |
Each engine gets a PlanProvider that knows (a) the EXPLAIN SQL to issue and
(b) how to parse its output into the IR. This reuses irodori-sql’s
trait SqlDialect for identifier rendering and lives next to the existing
dialect/metamodel code.
Normalized plan IR
pub struct QueryPlan {
pub engine: DbEngine,
pub analyzed: bool, // EXPLAIN ANALYZE (actuals) vs estimate-only
pub root: PlanNode,
pub totals: PlanTotals, // total cost, total actual time, planning/exec ms
pub warnings: Vec<String>, // capture caveats (e.g. "estimates only")
}
pub struct PlanNode {
pub op: PlanOp, // canonical operator (below)
pub label: String, // engine's raw node name, preserved
pub relation: Option<String>, // table/index/CTE touched
pub est: NodeEstimate, // rows, cost (start/total), width
pub actual: Option<NodeActual>, // rows, loops, time_ms, returned (ANALYZE only)
pub io: Option<NodeIo>, // buffers/blocks read, spill bytes, network
pub detail: Vec<(String, String)>, // condition, filter, sort key, join type…
pub children: Vec<PlanNode>,
}
pub enum PlanOp { // engine names mapped to one vocabulary
TableScan, IndexScan, IndexOnlyScan, BitmapScan,
NestedLoop, HashJoin, MergeJoin,
Aggregate, GroupAggregate, HashAggregate, WindowAgg,
Sort, IncrementalSort, Limit, Materialize, Memoize,
Gather, Exchange, Subquery, CteScan, Result, ModifyTable, Other(String),
}
The IR carries both the engine’s raw label and the canonical PlanOp, so
visualization stays uniform while node cards can still show native terminology.
Analysis — deterministic heuristics first
A rule pass over the IR produces Finding { severity, node_path, title, explanation, suggestion }. These are engine-aware but deterministic (no AI),
so analysis is reliable and offline. Starter rules:
- Full scan of a large relation where a predicate could be indexed → suggest index on the filter/join column.
- Estimate vs actual skew (
actual_rows / est_rowsfar from 1, ANALYZE only) → planner mis-estimate; suggestANALYZE/stats refresh; explains why the plan shape may be wrong. - Hot subtree: node(s) accounting for the majority of total time/cost → “spend your effort here.”
- Sort/hash spilled to disk (buffers/spill bytes) → raise work_mem / add index to avoid the sort.
- Nested loop over a large outer/inner → likely should be hash/merge join.
- Row explosion across a join (output ≫ inputs) → missing/incorrect join key.
- Redundant materialize/sort, non-sargable predicate (function on indexed
column),
SELECT *width blowup.
Each finding maps to a node so the UI can highlight it on the tree.
Natural-language explanation — two layers
- Deterministic narrative (always on, offline). Template the IR + findings
into prose: “This query scans
orderssequentially (1.2M rows, ~78% of total time) because no index coversstatus, then hash-joins tocustomers. The planner expected 200 rows but got 48,000 — stale statistics. Biggest win: an index onorders(status).” This is the baseline and never depends on a model. - AI narration (opt-in). Feed the normalized plan + findings to a
GrammarModelprovider (the existingirodori-generatestack:LlamaSqlModelembedded,OllamaModel/OpenAiCompatModelover HTTP,CommandModelvia CLI) using unconstrained decoding for fluent prose and a tutoring “how would you fix this?” voice. Per the roadmap, AI is optional and audited — the deterministic narrative stands alone; AI only enriches it.
The IR (not raw EXPLAIN text) is what’s fed to the model, so the prompt is small, engine-neutral, and consistent across providers.
Metric glossary
A small, structured glossary keyed by (engine, metric) powers inline tooltips
and a “how to read this plan” panel:
- cost (PG): arbitrary units,
startup..total; only comparable within one plan, not across engines. - rows: estimate (and, with ANALYZE, actual); the est/actual ratio is the signal, not the absolute.
- loops (PG nested loop): a node’s time is
actual_time × loops. - actual time:
startup..totalms per loop. - buffers / blocks (PG
BUFFERS): cache vs disk reads — IO pressure. - width: bytes per row — wide rows = more memory/IO.
- SQL Server cost %, MySQL
filtered/rows, OracleCardinality/Bytes, etc.
This directly serves the metric glossary: the user learns what each number means and how to act on it, inline where the number appears.
Visualization — make the hot path pop
A new ResultMode = "plan", rendered in the result area alongside
grid/chart/graph/webgl/structure (features/results/components/ ResultBody.tsx dispatches on resultMode). A features/query-plan/ view with:
- Tree / flame view: collapsible operator tree; each node sized/heat-colored by its share of total time (or cost when estimate-only) so the expensive subtree is obvious at a glance.
- Node card: canonical op + raw label, relation, est vs actual rows with a skew badge, time/cost %, IO/spill, and the condition/filter/sort detail.
- Findings rail: the heuristic findings, each clicking-through to highlight its node; severity-colored.
- Explanation panel: the deterministic narrative, with an optional “Explain with AI” button (only if a provider is configured) and the glossary tooltips.
- Estimate ⇄ Actual toggle: estimate (plain
EXPLAIN, safe) vs actual (EXPLAIN ANALYZE, runs the query — see safety).
Safety — EXPLAIN ANALYZE executes the query
Plain EXPLAIN is read-only and always safe. EXPLAIN ANALYZE runs the
statement — dangerous for DML. Rules:
- Classify the statement with the existing read-only guard
(
../irodori-kit/irodori-serverguard::classify, already used for the headless API). - Offer ANALYZE freely for read-only queries.
- For writes: either refuse ANALYZE, or wrap in a transaction the engine can roll
back (
BEGIN; EXPLAIN ANALYZE …; ROLLBACK;on PG) with a loud confirm. Never silently execute a DML to profile it.
Where the code lives
- Query-plan module (new, engine-agnostic). The IR, per-engine EXPLAIN SQL +
parsers, the heuristic rules, and the deterministic narrative + glossary.
Depends on
irodori-sql(dialect) only; no DB driver. Start app-local underapps/desktop/src-tauri/src/query_plan/, then promote to a sibling shared repo only after the IR has a stable release/test boundary. - Capture command. Tauri
explain_query(connection, sql, { analyze }) -> QueryPlaninsrc-tauri/src/db/— issues the engine’s EXPLAIN via the existing pool and hands the output toirodori-plan. Headless:POST /v1/explain. - AI narration. Reuse
irodori-generateproviders via a thin free-textexplain_plan(plan, provider)path (sibling toai_generate_sql), opt-in. - UI.
features/query-plan/(tree/flame view, node cards, findings rail, explanation + glossary), wired asResultMode = "plan"; a “Explain plan” action/keybinding next to Run.
Phased plan
- IR + PostgreSQL provider (richest JSON, est+actual+buffers) end-to-end: capture → IR → tree view → 3–4 heuristics → deterministic narrative + glossary. This proves the whole pipe on one engine.
- More providers: SQLite (
EXPLAIN QUERY PLAN), MySQL JSON, DuckDB — fast wins; then SQL Server XML, Oracle DBMS_XPLAN, Snowflake/Trino. - Flame heat view + findings rail polish; estimate⇄actual toggle + ANALYZE safety guard.
- AI narration behind the existing provider config (opt-in), feeding the IR.
- Headless
/v1/explain+ a CLI verb for plan capture in automation.
Relationship to the data-diff design
Both new capabilities are engine-agnostic analysis crates over the same seams
(dialect, the db read path, the headless API, optional AI providers). irodori-diff
(data-verification-diff.md) and irodori-plan share
the pattern: lower heterogeneous engines into one IR, then analyze/present
uniformly. Worth keeping their crate conventions aligned.