# esmole + monorepo design

Date: 2026-06-16

Status: approved, pending implementation plan

## Goal

Add Elasticsearch as a second MCP server (`esmole-mcp`) alongside the existing `dbmole-mcp` (PostgreSQL/MySQL). Restructure the repo into an npm-workspaces monorepo so both servers share infrastructure (connection store, manager cache, SSH tunnel, MCP plumbing) without forcing Elasticsearch into the SQL-shaped `Driver` interface.

The agent sees two distinct MCP servers (two entries in its MCP config, each its own `bin`). They live in one repo and share a private `core` package.

### Why not reuse the SQL `Driver` for ES

`Driver` is relational (`query(sql)`, `listDatabases`, `describeTable` with PK/FK/indexes). Elasticsearch is a document/search store. Mapping `_search`→query, indices→tables, mappings→describe gives a crippled, dishonest contract. ES gets its own thin `Backend` abstraction instead; only the generic plumbing is shared.

### Reference

`../homelab/es-mcp` (Python/FastMCP) is the tool-surface baseline: a generic REST passthrough plus four helpers, all returning `{status, body}` and never raising on 4xx/5xx. esmole keeps that tool surface but swaps the obvious differences: stdio transport (not HTTP+bearer), multi-connection named connections + store + SSH tunnel inherited from dbmole (not single-connection from env).

## Scope decisions

- **ES versions:** 7.x and 8.x. Passthrough core is version-agnostic; helpers work on both. No ESQL (`_query` is 8.11+, absent in 7.x).
- **Use cases:** read/debug + full CRUD + cluster ops, all reachable through the generic passthrough; helpers cover the common read paths.
- **Improvements over the reference (all in scope):** output truncation / token budget, per-connection `readonly` guard, mapping flatten (field:type list), search projection (`_source` filter) + aggs-only mode.
- **Restructure:** full workspaces immediately: move existing `src` into `packages/dbmole-mcp`, extract `core`.
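To make the "mapping flatten (field:type list)" improvement listed above concrete, here is a minimal sketch. The `MappingNode` type and the `flattenMapping` name are hypothetical and not taken from the codebase; only the Elasticsearch mapping shape (`properties`, `type`, `fields`) is standard. The output format is an assumption about what a "field:type list" could look like.

```typescript
// Hypothetical shape for one node of an Elasticsearch mapping.
type MappingNode = {
  type?: string;
  properties?: Record<string, MappingNode>; // nested object fields
  fields?: Record<string, MappingNode>; // multi-fields such as `title.keyword`
};

// Walk `properties` recursively and emit one "path: type" entry per field.
function flattenMapping(node: MappingNode, prefix = ""): string[] {
  const out: string[] = [];
  for (const [name, child] of Object.entries(node.properties ?? {})) {
    const path = prefix ? `${prefix}.${name}` : name;
    // Object fields usually omit `type`; report them as "object".
    out.push(`${path}: ${child.type ?? "object"}`);
    for (const [sub, subNode] of Object.entries(child.fields ?? {})) {
      out.push(`${path}.${sub}: ${subNode.type ?? "object"}`);
    }
    out.push(...flattenMapping(child, path));
  }
  return out;
}

const mapping: MappingNode = {
  properties: {
    title: { type: "text", fields: { keyword: { type: "keyword" } } },
    user: { properties: { id: { type: "long" } } },
  },
};

const flat = flattenMapping(mapping);
// flat: ["title: text", "title.keyword: keyword", "user: object", "user.id: long"]
```

A flat list like this is typically far smaller than the nested mapping JSON, which is the point of making it the default output.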
## Architecture (Approach A: generic core + injected schema)

`core` owns the hard, backend-agnostic machinery; each leaf package supplies a thin backend factory, its own connection schema, and its own tool set.

### §1. Repo layout

```
dbmole-mcp/                 # repo root, private, workspaces: ["packages/*"]
  package.json              # shared devDeps + scripts (lint/test/build all)
  biome.json                # shared
  tsconfig.base.json        # shared compiler options
  packages/
    core/                   # @dbmole/core: private, NOT published
    dbmole-mcp/             # public npm, bin: dbmole-mcp
    esmole-mcp/             # public npm, bin: esmole-mcp
```

`core` is **not published**. It is bundled into each leaf package via tsup (`noExternal: [/@dbmole\/core/]`) so the published packages are self-contained, with no inter-package version coupling and no third publish. Two public npm names (`dbmole-mcp` unchanged, `esmole-mcp` new); `core` exists only inside the repo.

### §2. core public surface (the generic seam)

The current `registry`/`store`/`sources` import `connectionConfigSchema` directly and hardcode `dbmole:` log prefixes and `DBMOLE_STORE` / `DBMOLE_CONNECTIONS` env-var names. Generalization = inject the schema and the storePath/envVar/logPrefix as dependencies.

- `createRegistry({ storePath, configPath, env, schema, logPrefix, envVar })`: schema injected; no direct import of any concrete schema.
- `createManager(registry, { createBackend, createTunnel, resolvePort })`: generic over the backend type, which is constrained only to expose `dispose()`; it does not know `Driver`. `resolvePort(config)` is injected because the manager itself calls `defaultPort(config.type)` today (`manager.ts:67`); ES needs 9200, SQL 5432/3306.
- `baseConnectionShape`: a zod raw shape **without** `type`, **including** the `ssh` field (`sshConfigSchema` moves to core; the tunnel is already SQL-free except for the `SshConfig` type at `tunnel.ts:5`). Each package spreads it, adds its own `type` enum and engine-specific fields, then calls `.strict()`.
- `openTunnel` / `Tunnel`: unchanged (pure TCP).
- `respond`: unchanged, already generic.
- `withManaged(manager, name, fn, { isStaleError, formatError })`: generic, and **not** unchanged. Today it imports SQL `DriverDisposedError`, `ManagedConnection`, and `formatDbError(config.type, …)` (`managed.ts:25,34`). Core exports a backend-neutral stale-error class; stale detection and error formatting are injected by each package (or stale-retry moves into the manager behind that neutral error).
- `registerConnectionTools(server, { manager, registry, fullSchema, patchSchema, publicView, descriptions, ping, formatError })`: generic connection CRUD (list / add / remove / update / test_connection). The current tools bake in SQL patch fields, the SQL public view (`database`), SQL default-port rendering, SQL descriptions, `serverVersion()`, and SQL error formatting (`connections.ts:21,61,131`); all of these are package-owned and injected. Core only orchestrates registry + manager calls. `ping(backend)` backs `test_connection` (SQL `serverVersion()` vs ES `GET /`).
- format split: only truncation / token-budget helpers go to core (`clampLimit`, `truncateRows`, `truncateJsonBudget`). SQL-shaped `normalizeCell` / `formatDbError` (`format.ts:16,36`) stay in dbmole-mcp.

### §3. Manager generalization

The manager needs only `dispose()` from a backend. All the hard logic (cache, rotation, dispose-race handling, tunnel guards, retry-on-stale; `manager.ts:82-129`) moves verbatim into core.

Changes: `defaultCreateDriver` becomes the injected `createBackend(target)`; the internal `defaultPort(config.type)` call (`manager.ts:67`) becomes the injected `resolvePort(config)`. The `tunnel?.isClosed()` recheck stays.

`DriverTarget` → `BackendTarget { config, connectHost, connectPort, serverName }` (generic config type parameter). `connectHost`/`connectPort` are where the client actually dials: the tunnel's `127.0.0.1:localPort` when tunneled, else the real host/port.
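The injected seam described above can be sketched as follows. This is an illustration under stated assumptions, not the actual core code: `HasDispose`, `BaseConfig`, `ManagerDeps`, and the `Map` standing in for the registry are all hypothetical, and rotation, dispose-race handling, tunnels, and retry-on-stale are deliberately left out.

```typescript
// Hypothetical minimal contracts; the real core types may differ.
interface HasDispose {
  dispose(): Promise<void>;
}

interface BaseConfig {
  name: string;
  host: string;
  port?: number;
}

interface BackendTarget<C> {
  config: C;
  connectHost: string; // where the client dials
  connectPort: number;
  serverName: string; // original config.host
}

interface ManagerDeps<C, B extends HasDispose> {
  createBackend(target: BackendTarget<C>): Promise<B>;
  resolvePort(config: C): number;
}

// A Map stands in for the registry in this sketch.
function createManager<C extends BaseConfig, B extends HasDispose>(
  configs: Map<string, C>,
  deps: ManagerDeps<C, B>,
) {
  const cache = new Map<string, Promise<B>>();
  return {
    get(name: string): Promise<B> {
      const cached = cache.get(name);
      if (cached) return cached;
      const config = configs.get(name);
      if (!config) return Promise.reject(new Error(`unknown connection: ${name}`));
      const target: BackendTarget<C> = {
        config,
        connectHost: config.host, // no tunnel here, so dial the real host
        connectPort: config.port ?? deps.resolvePort(config),
        serverName: config.host,
      };
      const created = deps.createBackend(target);
      cache.set(name, created);
      return created;
    },
    async disposeAll(): Promise<void> {
      const backends = await Promise.all(Array.from(cache.values()));
      cache.clear();
      await Promise.all(backends.map((b) => b.dispose()));
    },
  };
}

// Usage: an ES-flavoured package injects its own port and backend factory.
const dialed: string[] = [];
const esManager = createManager<BaseConfig, HasDispose>(
  new Map<string, BaseConfig>([["logs", { name: "logs", host: "es.internal" }]]),
  {
    createBackend: async (target) => {
      dialed.push(`${target.connectHost}:${target.connectPort}`);
      return { dispose: async () => {} };
    },
    resolvePort: () => 9200,
  },
);
const first = esManager.get("logs");
const second = esManager.get("logs"); // served from the cache
```

The manager never mentions SQL or Elasticsearch; the only backend capability it relies on is `dispose()`.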
`serverName` is the original `config.host`, carried through for TLS SNI / certificate hostname verification. Without this split, HTTPS Elasticsearch over an SSH tunnel fails `verifyTls`, because the cert covers the real host, not `127.0.0.1` (`tunnel.ts:170`). SQL drivers ignore `serverName`; the ES client sets it as the TLS servername.

The SQL `Driver` interface stays in `dbmole-mcp`. ES implements its own `Backend`. Both satisfy `{ dispose(): Promise }`, so both ride the same manager: ES inherits multi-connection, named connections, SSH tunnel, runtime `add_connection`, and the store for free (an upgrade over the single-connection reference).

### §4. Connection schema split

- **base (core):** `name`, `host`, `port?`, `user` (required), `password?`, `readonly`, `ssh`: exactly dbmole's current fields minus `database`, so dbmole's behavior is unchanged. (`database` leaves the base because it is SQL-specific.)
- **dbmole:** base + `type: enum(['postgres','mysql'])` + `database?`; `defaultPort` 5432 / 3306. No override of base fields.
- **esmole:** base with `user` overridden to optional, + `type: enum(['elasticsearch'])` + `scheme: enum(['http','https']).default('https')` + `verifyTls: boolean.default(true)` + `apiKey?` (sent in an `Authorization: ApiKey` header), plus a `.refine` requiring user/password **or** apiKey; `defaultPort` 9200.
- `registry.update`'s engine-switch port-drop (`registry.ts:130`) is already generic (any `type` change without an explicit `port` drops the old port).

### §5. esmole backend + tools

**Backend** = an HTTP client (undici, keep-alive + `dispose()`) dialing `scheme://connectHost:connectPort` (the tunnel endpoint when tunneled), with auth (basic or apiKey) and `verifyTls`. When `verifyTls` is on and the connection is tunneled, the client sets the TLS servername to `BackendTarget.serverName` (the real ES host) so certificate hostname verification passes (see §3).
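The dial-address versus servername split can be sketched like this. `EsTarget`, `DialOptions`, and `dialOptions` are hypothetical names, and detecting a tunnel by comparing `connectHost` with `serverName` is an assumption made for the sketch. The `servername` and `rejectUnauthorized` field names mirror Node's `tls.connect` options; how they are handed to the HTTP client is left open.

```typescript
// Hypothetical target shape, following the BackendTarget fields from §3.
interface EsTarget {
  config: { host: string; scheme: "http" | "https"; verifyTls: boolean };
  connectHost: string; // tunnel endpoint (127.0.0.1) or the real host
  connectPort: number;
  serverName: string; // always the real ES host
}

interface DialOptions {
  origin: string;
  rejectUnauthorized: boolean;
  servername?: string;
}

function dialOptions(target: EsTarget): DialOptions {
  const { scheme, verifyTls } = target.config;
  const verify = scheme === "https" && verifyTls;
  const options: DialOptions = {
    origin: `${scheme}://${target.connectHost}:${target.connectPort}`,
    rejectUnauthorized: verify,
  };
  // Assumption: a dial address different from the real host means "tunneled".
  // Pin the servername so the certificate is checked against the real host.
  if (verify && target.connectHost !== target.serverName) {
    options.servername = target.serverName;
  }
  return options;
}

const tunneled = dialOptions({
  config: { host: "es.internal", scheme: "https", verifyTls: true },
  connectHost: "127.0.0.1",
  connectPort: 40123,
  serverName: "es.internal",
});
// tunneled.origin is "https://127.0.0.1:40123", tunneled.servername is "es.internal"

const direct = dialOptions({
  config: { host: "es.internal", scheme: "https", verifyTls: true },
  connectHost: "es.internal",
  connectPort: 9200,
  serverName: "es.internal",
});
// direct.servername stays undefined: the dialed host already matches the cert
```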
`request(method, path, { body, params })` → `{status, body}`, never throwing on 4xx/5xx; the response body is parsed as JSON when possible, else returned as text. A `string` request body is sent as-is (for NDJSON `_bulk`); an object or array is JSON-serialized.

**readonly guard** (`es/guard.ts`, role analogous to `sqlGuard`): a method+path boundary. When the connection is `readonly`, allow GET/HEAD plus POST to a read-suffix allowlist (`_search`, `_msearch`, `_count`, `_field_caps`, `_cat`, `_mapping`, `_search/scroll`, `_pit`) plus DELETE limited to `_pit` and `_search/scroll` (point-in-time / scroll cleanup, which is read-session teardown, not data mutation). Block all other PUT/DELETE and any other POST. `_sql` is blocked by absence from the allowlist (it can write). The guard is an allowlist, not a blocklist, so unknown endpoints fail safe. Script content inside a `_search` body is content-level, not method-level, and is out of scope for this guard.

**Tools (5 ES-specific + connection CRUD from core):**

| tool | wraps | improvement |
|---|---|---|
| `es_request` | generic passthrough | readonly guard + truncation |
| `es_search` | `POST /{index}/_search` | `_source` projection + aggs-only (size:0) + truncation |
| `es_list_indices` | `GET /_cat/indices` | none |
| `es_get_mapping` | `GET /{index}/_mapping` | flatten to field:type list (default); `raw?` for nested JSON |
| `es_cluster_health` | `GET /_cluster/health` | none |

Index is always explicit; there is no default index. `es_request` is the primary tool and covers the entire ES REST surface; helpers are sugar for common reads.

**truncation:** cap response by byte budget and hit count, set `truncated: true`, mirroring dbmole's row truncation.

### §6. Distribution / entry / docker

- Each leaf package has its own stdio entry and `bin`. `esmole-mcp` → `dist/index.js`.
- **Docker:** per-package Dockerfile, self-contained (core is bundled in). A root multi-image build is optional/later.
- **npm:** `dbmole-mcp` (unchanged), `esmole-mcp` (new), both public; `core` private.

### §7. Testing

- Per-package vitest projects: unit (mocked IO) + integration (testcontainers). esmole integration runs against ES 7.x **and** 8.x containers. Coverage ≥90% lines/functions per package; thresholds are never lowered.
- Manager concurrency tests move into `core` alongside the manager.
- Remap / alias test import paths **before** moving files, not after. Integration tests reference the old `src/...` paths; if files move first, the ≥90% gate breaks mid-migration.

## Defaults

- ES `readonly` default `false` (matches dbmole).
- Auth: user/password primary, `apiKey` optional.
- `scheme` default `https` (8.x-friendly); 7.x-over-http sets `http` explicitly.

## Migration order (high level; detailed plan via writing-plans)

1. Workspaces skeleton; `git mv src` into `packages/dbmole-mcp` (history preserved); tests green.
2. Extract `core` (config/manager/tunnel/respond/format/connection-tools), inject the schema; dbmole depends on core; tests green.
3. Scaffold esmole: schema → backend → guard → tools → entry, TDD.
4. Docker + publish config.

## Out of scope

- ESQL (`_query`) helper: 8.x-only, deferred.
- OpenSearch-specific testing: passthrough likely works, but helpers are not validated against it.
- Publishing `core` as a standalone package.
- Cross-server unified config (each server keeps its own store namespace: `ESMOLE_STORE` / `ESMOLE_CONNECTIONS` vs `DBMOLE_STORE` / `DBMOLE_CONNECTIONS`).
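As an illustration of the readonly guard from §5, the method+path allowlist could look roughly like the sketch below. Function and constant names are hypothetical, and the suffix-matching rule (strip the query string, then compare the path tail) is one possible reading of "read-suffix allowlist", not the decided implementation.

```typescript
// POST endpoints treated as reads, copied from the §5 allowlist.
const READ_POST_SUFFIXES = [
  "_search", "_msearch", "_count", "_field_caps",
  "_cat", "_mapping", "_search/scroll", "_pit",
];
// DELETE is allowed only for read-session teardown.
const READ_DELETE_SUFFIXES = ["_pit", "_search/scroll"];

// Drop the query string and trailing slashes, and ensure a leading slash.
function normalizePath(path: string): string {
  const q = path.indexOf("?");
  const bare = (q === -1 ? path : path.slice(0, q)).replace(/\/+$/, "");
  return bare.startsWith("/") ? bare : `/${bare}`;
}

function endsWithEndpoint(path: string, suffixes: string[]): boolean {
  const bare = normalizePath(path);
  return suffixes.some((suffix) => bare.endsWith(`/${suffix}`));
}

// Returns true when the request may run on a readonly connection.
function isAllowedWhenReadonly(method: string, path: string): boolean {
  switch (method.toUpperCase()) {
    case "GET":
    case "HEAD":
      return true;
    case "POST":
      return endsWithEndpoint(path, READ_POST_SUFFIXES);
    case "DELETE":
      return endsWithEndpoint(path, READ_DELETE_SUFFIXES);
    default:
      return false; // PUT, PATCH, and unknown methods fail safe
  }
}

const guardChecks = {
  search: isAllowedWhenReadonly("POST", "/logs/_search?size=0"), // true
  health: isAllowedWhenReadonly("get", "/_cluster/health"), // true
  closePit: isAllowedWhenReadonly("DELETE", "/_pit"), // true
  indexDoc: isAllowedWhenReadonly("POST", "/logs/_doc"), // false
  sql: isAllowedWhenReadonly("POST", "/_sql"), // false
  dropIndex: isAllowedWhenReadonly("DELETE", "/logs"), // false
  putMapping: isAllowedWhenReadonly("PUT", "/logs/_mapping"), // false
};
```

One point that may be worth verifying during implementation: Elasticsearch's REST spec appears to accept `POST /{index}/_mapping` as a mapping update as well as `PUT`, so the `_mapping` entry in the POST list might need to be restricted to GET.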