From bbad46ddcfb460b2c4d731bf30274bed72eaccdf Mon Sep 17 00:00:00 2001
From: Evgeny Gamov
Date: Tue, 16 Jun 2026 18:45:11 +0500
Subject: [PATCH] docs: add esmole + monorepo design spec

---
 .../2026-06-16-esmole-monorepo-design.md      | 187 ++++++++++++++++++
 1 file changed, 187 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-06-16-esmole-monorepo-design.md

diff --git a/docs/superpowers/specs/2026-06-16-esmole-monorepo-design.md b/docs/superpowers/specs/2026-06-16-esmole-monorepo-design.md
new file mode 100644
index 0000000..8cedce8
--- /dev/null
+++ b/docs/superpowers/specs/2026-06-16-esmole-monorepo-design.md
@@ -0,0 +1,187 @@
+# esmole + monorepo design
+
+Date: 2026-06-16
+Status: approved, pending implementation plan
+
+## Goal
+
+Add Elasticsearch as a second MCP server (`esmole-mcp`) alongside the existing
+`dbmole-mcp` (PostgreSQL/MySQL). Restructure the repo into an npm-workspaces
+monorepo so both servers share infrastructure (connection store, manager cache,
+SSH tunnel, MCP plumbing) without forcing Elasticsearch into the SQL-shaped
+`Driver` interface.
+
+The agent sees two distinct MCP servers (two entries in its MCP config, each its
+own `bin`). They live in one repo and share a private `core` package.
+
+### Why not reuse the SQL `Driver` for ES
+
+`Driver` is relational (`query(sql)`, `listDatabases`, `describeTable` with
+PK/FK/indexes). Elasticsearch is a document/search store. Mapping `_search`→query,
+indices→tables, mappings→describe gives a crippled, dishonest contract. ES gets
+its own thin `Backend` abstraction instead; only the generic plumbing is shared.
+
+### Reference
+
+`../homelab/es-mcp` (Python/FastMCP) is the tool-surface baseline: a generic REST
+passthrough plus four helpers, all returning `{status, body}` and never raising
+on 4xx/5xx.
+esmole keeps that tool surface but swaps the obvious differences:
+stdio transport (not HTTP+bearer), multi-connection named connections + store +
+SSH tunnel inherited from dbmole (not single-connection from env).
+
+## Scope decisions
+
+- **ES versions:** 7.x and 8.x. Passthrough core is version-agnostic; helpers work
+  on both. No ESQL (`_query` is 8.11+, absent in 7.x).
+- **Use cases:** read/debug + full CRUD + cluster ops, all reachable through the
+  generic passthrough; helpers cover the common read paths.
+- **Improvements over the reference (all in scope):** output truncation / token
+  budget, per-connection `readonly` guard, mapping flatten (field:type list),
+  search projection (`_source` filter) + aggs-only mode.
+- **Restructure:** full workspaces immediately: move existing `src` into
+  `packages/dbmole-mcp`, extract `core`.
+
+## Architecture (Approach A: generic core + injected schema)
+
+`core` owns the hard, backend-agnostic machinery; each leaf package supplies a
+thin backend factory, its own connection schema, and its own tool set.
+
+### §1. Repo layout
+
+```
+dbmole-mcp/                 # repo root, private, workspaces: ["packages/*"]
+  package.json              # shared devDeps + scripts (lint/test/build all)
+  biome.json                # shared
+  tsconfig.base.json        # shared compiler options
+  packages/
+    core/                   # @dbmole/core, private, NOT published
+    dbmole-mcp/             # public npm, bin: dbmole-mcp
+    esmole-mcp/             # public npm, bin: esmole-mcp
+```
+
+`core` is **not published**. It is bundled into each leaf package via tsup
+(`noExternal: [/@dbmole\/core/]`) so the published packages are self-contained,
+with no inter-package version coupling and no third publish. Two public npm names
+(`dbmole-mcp` unchanged, `esmole-mcp` new); `core` exists only inside the repo.
+
+### §2. core public surface (the generic seam)
+
+The current `registry`/`store`/`sources` import `connectionConfigSchema` directly
+and hardcode `dbmole:` log prefixes and `DBMOLE_STORE` / `DBMOLE_CONNECTIONS`
+env-var names. Generalizing them means injecting the schema and the
+storePath/envVar/logPrefix as dependencies.
+
+- `createRegistry({ storePath, configPath, env, schema, logPrefix, envVar })`:
+  schema injected; no direct import of any concrete schema.
+- `createManager<B extends { dispose(): Promise<void> }>(registry,
+  { createBackend, createTunnel })`: generic over the backend; does not know
+  `Driver`.
+- `baseConnectionShape`: a zod raw shape **without** `type`. Each package spreads
+  it, adds its own `type` enum and engine-specific fields, then calls `.strict()`.
+- `openTunnel` / `Tunnel`: unchanged (pure TCP).
+- `withManaged`, `respond`: unchanged, already generic.
+- `registerConnectionTools(server, { manager, registry, schema, ping })`: generic
+  connection CRUD (list / add / remove / update / test_connection); `ping(backend)`
+  is the per-backend hook backing `test_connection`.
+- format primitives: `clampLimit`, `truncateRows`, `truncateJsonBudget`.
+
+### §3. Manager generalization
+
+The manager needs only `dispose()` from a backend. All the hard logic (cache,
+rotation, dispose-race handling, tunnel guards, retry-on-stale; `manager.ts:82-129`)
+moves verbatim into core. The only change: `defaultCreateDriver` becomes the
+injected `createBackend(target)`. `DriverTarget` → `BackendTarget { config, host,
+port }` (generic config type parameter). The `tunnel?.isClosed()` recheck stays.
+
+The SQL `Driver` interface stays in `dbmole-mcp`. ES implements its own `Backend`.
+Both satisfy `{ dispose(): Promise<void> }`, so both ride the same manager, and ES
+inherits multi-connection, named connections, SSH tunnel, runtime `add_connection`,
+and the store for free (an upgrade over the single-connection reference).
+
+### §4. Connection schema split
+
+- **base (core):** `name`, `host`, `port?`, `user` (required), `password?`,
+  `readonly`, `ssh`. These are exactly dbmole's current fields minus `database`, so
+  dbmole's behavior is unchanged. (`database` leaves the base because it is SQL-specific.)
+- **dbmole:** base + `type: enum(['postgres','mysql'])` + `database?`;
+  `defaultPort` 5432 / 3306. No override of base fields.
+- **esmole:** base with `user` overridden to optional, +
+  `type: enum(['elasticsearch'])` +
+  `scheme: enum(['http','https']).default('https')` +
+  `verifyTls: boolean.default(true)` + `apiKey?` (sent as an
+  `Authorization: ApiKey` header), plus a `.refine` requiring user/password **or** apiKey;
+  `defaultPort` 9200.
+- `registry.update`'s engine-switch port-drop (`registry.ts:130`) is already
+  generic (any `type` change without an explicit `port` drops the old port).
+
+### §5. esmole backend + tools
+
+**Backend** = an HTTP client (undici, keep-alive + `dispose()`) bound to
+`scheme://tunnelHost:tunnelPort`, with auth (basic or apiKey) and `verifyTls`.
+`request(method, path, { body, params })` → `{status, body}`, never throwing on
+4xx/5xx; body parsed as JSON when possible, else text. A `string` body is sent
+as-is (for NDJSON `_bulk`); an object or array is JSON-serialized.
+
+**readonly guard** (`es/guard.ts`, role analogous to `sqlGuard`): when the
+connection is `readonly`, allow only GET/HEAD plus POST to a read-suffix allowlist
+(`_search`, `_msearch`, `_count`, `_field_caps`, `_cat`, `_mapping`, `scroll`).
+Block all PUT/DELETE and any other POST. Allowlist (not blocklist) so unknown
+endpoints fail safe.
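The readonly guard described above could be sketched roughly as follows. This is a minimal illustration, not the contents of `es/guard.ts`: the function name `checkReadonly`, its signature, and the `GuardDecision` shape are hypothetical, and "read-suffix" is interpreted here as "the final path segment, ignoring the query string", which is one possible reading of the spec.

```typescript
// Suffixes named in the spec as safe to reach via POST on a readonly connection.
const READ_SUFFIXES: readonly string[] = [
  '_search', '_msearch', '_count', '_field_caps', '_cat', '_mapping', 'scroll',
];

// Hypothetical result shape; the spec does not define one.
interface GuardDecision {
  allowed: boolean;
  reason?: string;
}

// Hypothetical name and signature; only the policy itself comes from the spec.
function checkReadonly(method: string, path: string, readonly: boolean): GuardDecision {
  // Connections that are not readonly are never restricted by this guard.
  if (!readonly) return { allowed: true };

  const verb = method.toUpperCase();
  if (verb === 'GET' || verb === 'HEAD') return { allowed: true };

  if (verb === 'POST') {
    // Assumption: drop the query string and compare the final path segment only.
    const pathname = path.split('?')[0] ?? '';
    const segments = pathname.split('/').filter((s) => s.length > 0);
    const last = segments[segments.length - 1];
    if (last !== undefined && READ_SUFFIXES.indexOf(last) !== -1) {
      return { allowed: true };
    }
  }

  // Everything else (PUT, DELETE, other POSTs, unknown verbs) fails safe.
  return { allowed: false, reason: `readonly connection: ${verb} ${path} is not allowlisted` };
}
```

With this sketch, `checkReadonly('POST', '/logs/_search?size=0', true)` is allowed while `checkReadonly('POST', '/logs/_delete_by_query', true)` is rejected. A pure suffix match is coarse: a write whose last segment happens to equal an allowlisted suffix (for example a document ID named `_search`) would presumably pass, so a real guard may want to validate the whole path shape.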
+
+**Tools (5 ES-specific + connection CRUD from core):**
+
+| tool | wraps | improvement |
+|---|---|---|
+| `es_request` | generic passthrough | readonly guard + truncation |
+| `es_search` | `POST /{index}/_search` | `_source` projection + aggs-only (size:0) + truncation |
+| `es_list_indices` | `GET /_cat/indices` | none |
+| `es_get_mapping` | `GET /{index}/_mapping` | flatten to field:type list (default); `raw?` for nested JSON |
+| `es_cluster_health` | `GET /_cluster/health` | none |
+
+Index is always explicit; there is no default index. `es_request` is the primary
+tool and covers the entire ES REST surface; helpers are sugar for common reads.
+
+**truncation:** cap response by byte budget and hit count, set `truncated: true`,
+mirroring dbmole's row truncation.
+
+### §6. Distribution / entry / docker
+
+- Each leaf package has its own stdio entry and `bin`. `esmole-mcp` →
+  `dist/index.js`.
+- **Docker:** per-package Dockerfile, self-contained (core is bundled in). A root
+  multi-image build is optional/later.
+- **npm:** `dbmole-mcp` (unchanged), `esmole-mcp` (new), both public; `core`
+  private.
+
+### §7. Testing
+
+- Per-package vitest projects: unit (mocked IO) + integration (testcontainers).
+  esmole integration runs against ES 7.x **and** 8.x containers. Coverage ≥90%
+  lines/functions per package; thresholds never lowered.
+- Manager concurrency tests move into `core` alongside the manager.
+
+## Defaults
+
+- ES `readonly` default `false` (matches dbmole).
+- Auth: user/password primary, `apiKey` optional.
+- `scheme` default `https` (8.x-friendly); 7.x-over-http sets `http` explicitly.
+
+## Migration order (high level; detailed plan via writing-plans)
+
+1. Workspaces skeleton; `git mv src` into `packages/dbmole-mcp` (history
+   preserved); tests green.
+2. Extract `core` (config/manager/tunnel/respond/format/connection-tools), inject
+   the schema; dbmole depends on core; tests green.
+3. Scaffold esmole: schema → backend → guard → tools → entry, TDD.
+4. Docker + publish config.
+
+## Out of scope
+
+- ESQL (`_query`) helper: 8.x-only, deferred.
+- OpenSearch-specific testing: passthrough likely works, but helpers are not
+  validated against it.
+- Publishing `core` as a standalone package.
+- Cross-server unified config (each server keeps its own store namespace:
+  `ESMOLE_STORE` / `ESMOLE_CONNECTIONS` vs `DBMOLE_STORE` / `DBMOLE_CONNECTIONS`).
+
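Since each server keeps its own store namespace, the per-server values injected into `createRegistry` (§2) could be derived along the following lines. This is only a sketch under stated assumptions: the `ServerNamespace` shape and both helper functions are hypothetical, the `esmole:` log prefix is assumed by analogy with `dbmole:`, and the spec does not say how the `*_STORE` variable is interpreted, so treating it as a store-path override is a guess.

```typescript
// Only the env-var names and the `dbmole:` prefix come from the spec;
// the ServerNamespace shape and both helpers below are hypothetical.
interface ServerNamespace {
  logPrefix: string;
  storeEnvVar: string;
  connectionsEnvVar: string;
}

function namespaceFor(server: 'dbmole' | 'esmole'): ServerNamespace {
  const upper = server.toUpperCase();
  return {
    logPrefix: `${server}:`, // `esmole:` is assumed by analogy with `dbmole:`
    storeEnvVar: `${upper}_STORE`,
    connectionsEnvVar: `${upper}_CONNECTIONS`,
  };
}

// Assumption: a non-empty *_STORE value overrides the default store path.
function resolveStorePath(
  ns: ServerNamespace,
  env: Record<string, string | undefined>,
  fallback: string,
): string {
  const override = env[ns.storeEnvVar];
  return override !== undefined && override !== '' ? override : fallback;
}
```

Because each server only ever looks up its own variable names, setting `DBMOLE_STORE` has no effect on esmole and vice versa, which is the isolation the last bullet describes.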