Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Local SQL generation

Irodori Table can generate SQL from natural language with a lightweight local model, designed so correctness comes from a real SQL grammar rather than from model size. A tiny quantized model runs entirely on-device; every token it emits is forced through a schema-specialized SQL grammar, and the result is parsed back and validated against the schema. Deterministic completion stays the default — this is an opt-in, offline generation path.

Why it’s correct by construction

A small model is unreliable on its own. The design removes that risk:

  • Grammar-backed SQL structure — a real, dialect-aware SQL parser + AST + GBNF grammar (irodori-sql: ast.rs, parser.rs, grammar.rs).
  • Grammar-constrained decoding — the model can only sample tokens the GBNF grammar allows, so output is always syntactically valid.
  • Schema projection — table/column names become closed grammar terminals, so a hallucinated relation is literally unsamplable.
  • Planning before decoding — tables and foreign-key joins are resolved deterministically before the model runs, shrinking its job so a 0.5B model suffices.
  • Scoped grammar — on large schemas the GBNF is projected down to just the planned tables (+ their FK neighbors), so the grammar a big DB produces stays small: faster decode, lower memory, and the model can’t even mention an irrelevant table.
  • Validate + repair — output is parsed and every identifier is proven to exist; anything else is rejected, not returned.

Pipeline

NL prompt + connection schema
  → project   (GenSchema → GBNF grammar + SchemaIndex)
  → plan      (mentioned tables + FK joins + compact prompt)
  → decode    (llama.cpp, grammar-constrained, greedy)
  → verify    (parse + schema-validate + canonical re-render)
  → SQL inserted into the editor (never executed)

Crates

  • irodori-sqlast, parser, grammar (the shared syntax tree). Pure Rust, dialect-parameterized.
  • irodori-generateproject, plan, verify, runtime and the orchestrator. The llama feature adds the embedded runtime (llama.rs). Without it (or without a model) the engine returns unsupported/not found.
  • desktop (src-tauri/src/ai) — the ai_generate_sql, ai_engine_status, ai_set_provider, and ai_get_provider commands. The schema comes from the existing completion metadata cache; the dialect from the connection engine.

Runtime & model

  • Embedded llama.cpp via llama-cpp-2, CPU only, behind the llama cargo feature (the C++ build stays opt-in, like duckdb).
  • Default model: Qwen2.5-Coder-0.5B-Instruct Q4_K_M (~0.4 GB), strong at text-to-SQL at its size.
  • Lightness: mmapped weights, model loaded once and reused, a fresh context per request (KV cache freed between calls), small n_ctx, capped threads, and the grammar prunes the token space so fewer decode steps are needed.

Providers — connect any model (extensibility)

Everything talks to one trait, GrammarModel (runtime.rs), so backends are fully pluggable. The verify gate is what makes every provider safe: output is parsed and schema-validated regardless of source, so a backend that can’t honor the GBNF grammar just gets its mistakes rejected instead of prevented — never returned as invalid/hallucinated SQL.

Built-in providers:

ProviderCrate itemFeatureNotes
Embedded llama.cppLlamaSqlModelllamaGBNF-constrained, fully local/offline
OllamaOllamaModelhttpany local Ollama model (7B/14B/32B…)
OpenAI-compatible APIOpenAiCompatModelhttpOpenAI, Azure, OpenRouter, gateways, many self-hosted/Anthropic-compatible
External CLICommandModel(none)Claude Code, Codex, Copilot, or any command — reuses your subscription
Echo (tests)EchoModel(none)deterministic stand-in

The desktop selects a provider at runtime via ai_set_provider / ai_get_provider (AiProviderConfig { kind, model, endpoint, apiKey, program, args }), surfaced in the generate dialog’s “Model provider” section. The http and CLI providers are compiled in by default, so external/subscription models work without the heavy llama build; the embedded model stays opt-in. API keys are held in memory only (persist via the OS keychain through security_store_secret).

Adding another provider (e.g. Anthropic-native, a local server, a queue) is a new impl GrammarModel plus one AiProviderKind arm — no pipeline changes.

Build & use

# Standard build: AI generation compiled out, everything else unaffected.
cargo build -p irodori-table-desktop

# With local generation (compiles llama.cpp; set TMPDIR for the C++ build):
TMPDIR=.irodori-local/cc-tmp cargo build -p irodori-table-desktop --features llama

The embedded model is read only when it already exists in the app data dir under models/. The desktop UI does not start a background download job; preinstall the model for local generation, or use Ollama / API / CLI providers.

Safety

AI generation is opt-in, only ever inserts SQL into the editor (it never runs it), and references only objects that exist in the connected schema. This keeps the A5SQL-style read-only / AI-disabled posture intact.

Follow-ups

  • Optional admin tooling for preinstalling local models outside the desktop UI.
  • Expand the grammar/AST beyond SELECT (DML/DDL) over later iterations.
  • Optional idle-unload timer to drop the model from memory after inactivity.