# esmole + monorepo design
Date: 2026-06-16
Status: approved, pending implementation plan
## Goal
Add Elasticsearch as a second MCP server (`esmole-mcp`) alongside the existing
`dbmole-mcp` (PostgreSQL/MySQL). Restructure the repo into an npm-workspaces
monorepo so both servers share infrastructure (connection store, manager cache,
SSH tunnel, MCP plumbing) without forcing Elasticsearch into the SQL-shaped
`Driver` interface.
The agent sees two distinct MCP servers (two entries in its MCP config, each with
its own `bin`). They live in one repo and share a private `core` package.
### Why not reuse the SQL `Driver` for ES
`Driver` is relational (`query(sql)`, `listDatabases`, `describeTable` with
PK/FK/indexes). Elasticsearch is a document/search store. Mapping `_search`→query,
indices→tables, mappings→describe gives a crippled, dishonest contract. ES gets
its own thin `Backend` abstraction instead; only the generic plumbing is shared.
### Reference
`../homelab/es-mcp` (Python/FastMCP) is the tool-surface baseline: a generic REST
passthrough plus four helpers, all returning `{status, body}` and never raising
on 4xx/5xx. esmole keeps that tool surface and changes two things: the transport
is stdio (not HTTP + bearer token), and connections are multiple and named, with
the store and SSH tunnel inherited from dbmole (not a single connection read from env).
## Scope decisions
- **ES versions:** 7.x and 8.x. Passthrough core is version-agnostic; helpers work
on both. No ES|QL (`_query` is 8.11+, absent in 7.x).
- **Use cases:** read/debug + full CRUD + cluster ops, all reachable through the
generic passthrough; helpers cover the common read paths.
- **Improvements over the reference (all in scope):** output truncation / token
budget, per-connection `readonly` guard, mapping flatten (field:type list),
search projection (`_source` filter) + aggs-only mode.
- **Restructure:** full workspaces immediately — move existing `src` into
`packages/dbmole-mcp`, extract `core`.
## Architecture (Approach A: generic core + injected schema)
`core` owns the hard, backend-agnostic machinery; each leaf package supplies a
thin backend factory, its own connection schema, and its own tool set.
### §1. Repo layout
```
dbmole-mcp/                 # repo root, private, workspaces: ["packages/*"]
  package.json              # shared devDeps + scripts (lint/test/build all)
  biome.json                # shared
  tsconfig.base.json        # shared compiler options
  packages/
    core/                   # @dbmole/core: private, NOT published
    dbmole-mcp/             # public npm, bin: dbmole-mcp
    esmole-mcp/             # public npm, bin: esmole-mcp
```
`core` is **not published**. It is bundled into each leaf package via tsup
(`noExternal: [/@dbmole\/core/]`) so the published packages are self-contained,
with no inter-package version coupling and no third publish. Two public npm names
(`dbmole-mcp` unchanged, `esmole-mcp` new); `core` exists only inside the repo.
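
The bundling rule above could look roughly like the following tsup config for a leaf package. This is a minimal sketch: the file path, `entry`, `format`, `platform`, and `clean` values are assumptions for illustration; only the `noExternal` pattern comes from this spec.

```typescript
// packages/esmole-mcp/tsup.config.ts (hypothetical path)
import { defineConfig } from "tsup";

export default defineConfig({
  entry: ["src/index.ts"], // assumed entry file; the spec only fixes the output, dist/index.js
  format: ["esm"],         // assumed module format
  platform: "node",
  clean: true,
  // Inline the private workspace package instead of leaving it as a runtime
  // import, so the published tarball never depends on an unpublished @dbmole/core.
  noExternal: [/@dbmole\/core/],
});
```

With this shape, `@dbmole/core` would typically sit in the leaf package's `devDependencies`, since nothing resolves it at install time.
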
### §2. core public surface (the generic seam)
The current `registry`/`store`/`sources` import `connectionConfigSchema` directly
and hardcode `dbmole:` log prefixes and `DBMOLE_STORE` / `DBMOLE_CONNECTIONS`
env-var names. Generalization = inject the schema and the
storePath/envVar/logPrefix as dependencies.
- `createRegistry({ storePath, configPath, env, schema, logPrefix, envVar })`:
schema injected; no direct import of any concrete schema.
- `createManager<TBackend extends { dispose(): Promise<void> }>(registry,
{ createBackend, createTunnel, resolvePort })` — generic over the backend; does
not know `Driver`. `resolvePort(config)` is injected because the manager itself
calls `defaultPort(config.type)` today (`manager.ts:67`); ES needs 9200, SQL
5432/3306.
- `baseConnectionShape` — a zod raw shape **without** `type`, **including** the
`ssh` field (`sshConfigSchema` moves to core; the tunnel is already SQL-free
except for the `SshConfig` type at `tunnel.ts:5`). Each package spreads it, adds
its own `type` enum and engine-specific fields, then calls `.strict()`.
- `openTunnel` / `Tunnel` — unchanged (pure TCP).
- `respond` — unchanged, already generic.
- `withManaged<TBackend, TConfig>(manager, name, fn, { isStaleError, formatError })`
— generic. It is **not** unchanged: today it imports SQL `DriverDisposedError`,
`ManagedConnection`, and `formatDbError(config.type, …)` (`managed.ts:25,34`).
Core exports a backend-neutral stale-error class; stale detection and error
formatting are injected by each package (or stale-retry moves into the manager
behind that neutral error).
- `registerConnectionTools(server, { manager, registry, fullSchema, patchSchema,
publicView, descriptions, ping, formatError })` — generic connection CRUD
(list / add / remove / update / test_connection). The current tools bake in SQL
patch fields, the SQL public view (`database`), SQL default-port rendering, SQL
descriptions, `serverVersion()`, and SQL error formatting
(`connections.ts:21,61,131`); all of these are package-owned and injected. Core
only orchestrates registry + manager calls. `ping(backend)` backs
`test_connection` (SQL `serverVersion()` vs ES `GET /`).
- format split: only truncation / token-budget helpers go to core (`clampLimit`,
`truncateRows`, `truncateJsonBudget`). SQL-shaped `normalizeCell` /
`formatDbError` (`format.ts:16,36`) stay in dbmole-mcp.
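
To make the injected seam concrete, here is a minimal sketch of a generic `withManaged` with a single stale retry. It is not the existing `managed.ts`: `BackendDisposedError`, `ManagerLike`, and `invalidate` are hypothetical names, and the retry-once policy is an assumption.

```typescript
// Backend-neutral stale error (hypothetical name for the class core would export).
class BackendDisposedError extends Error {}

interface Managed<TBackend, TConfig> {
  backend: TBackend;
  config: TConfig;
}

// Hypothetical slice of the manager; `invalidate` is an assumed method name.
interface ManagerLike<TBackend, TConfig> {
  get(name: string): Promise<Managed<TBackend, TConfig>>;
  invalidate(name: string): Promise<void>;
}

interface WithManagedOptions<TConfig> {
  isStaleError(err: unknown): boolean;
  formatError(err: unknown, config: TConfig): string;
}

async function withManaged<TBackend, TConfig, TResult>(
  manager: ManagerLike<TBackend, TConfig>,
  name: string,
  fn: (backend: TBackend, config: TConfig) => Promise<TResult>,
  opts: WithManagedOptions<TConfig>,
): Promise<TResult> {
  for (let attempt = 0; ; attempt++) {
    const { backend, config } = await manager.get(name);
    try {
      return await fn(backend, config);
    } catch (err) {
      // Retry once, and only when the owning package says the backend was stale.
      if (attempt === 0 && opts.isStaleError(err)) {
        await manager.invalidate(name);
        continue;
      }
      // Everything else is formatted by the package (SQL or ES), not by core.
      throw new Error(opts.formatError(err, config));
    }
  }
}
```

Core never names `Driver` or `formatDbError` here; dbmole would pass its SQL formatter and esmole its own.
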
### §3. Manager generalization
The manager needs only `dispose()` from a backend. All the hard logic — cache,
rotation, dispose-race handling, tunnel guards, retry-on-stale (`manager.ts:82-129`)
— moves verbatim into core. Changes: `defaultCreateDriver` becomes the injected
`createBackend(target)`; the internal `defaultPort(config.type)` call
(`manager.ts:67`) becomes the injected `resolvePort(config)`. The
`tunnel?.isClosed()` recheck stays.
`DriverTarget` → `BackendTarget { config, connectHost, connectPort, serverName }`
(generic config type parameter). `connectHost`/`connectPort` are where the client
actually dials — the tunnel's `127.0.0.1:localPort` when tunneled, else the real
host/port. `serverName` is the original `config.host`, carried through for TLS SNI
/ certificate hostname verification. Without this split, HTTPS Elasticsearch over
an SSH tunnel fails `verifyTls`, because the cert covers the real host, not
`127.0.0.1` (`tunnel.ts:170`). SQL drivers ignore `serverName`; the ES client sets
it as the TLS servername.
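
The dial-address versus server-name split can be sketched as a small pure function. `toBackendTarget`, `HostConfig`, and `TunnelEndpoint` are hypothetical names; the field names on `BackendTarget` follow this spec.

```typescript
interface HostConfig {
  host: string;
  port?: number;
}

// Hypothetical view of an open tunnel: only the local listening port matters here.
interface TunnelEndpoint {
  localPort: number;
}

interface BackendTarget<TConfig> {
  config: TConfig;
  connectHost: string; // where the client actually dials
  connectPort: number;
  serverName: string;  // original host, used for TLS SNI / hostname verification
}

function toBackendTarget<TConfig extends HostConfig>(
  config: TConfig,
  resolvePort: (config: TConfig) => number,
  tunnel?: TunnelEndpoint,
): BackendTarget<TConfig> {
  if (tunnel) {
    // Tunneled: dial the local end, but keep verifying against the real host.
    return {
      config,
      connectHost: "127.0.0.1",
      connectPort: tunnel.localPort,
      serverName: config.host,
    };
  }
  return {
    config,
    connectHost: config.host,
    connectPort: config.port ?? resolvePort(config),
    serverName: config.host,
  };
}
```

For a tunneled connection to `es.internal`, this yields `connectHost: "127.0.0.1"` with `serverName: "es.internal"`, which is the pair the ES client needs to pass certificate checks.
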
The SQL `Driver` interface stays in `dbmole-mcp`. ES implements its own `Backend`.
Both satisfy `{ dispose(): Promise<void> }`, so both ride the same manager — ES
inherits multi-connection, named connections, SSH tunnel, runtime `add_connection`,
and the store for free (an upgrade over the single-connection reference).
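
A heavily reduced sketch of a manager that only needs `dispose()` from its backend is shown below. It omits tunnels, rotation, and the dispose-race handling that the real manager carries; `lookup` stands in for the registry, and `disposeAll` and `DisposableBackend` are hypothetical names.

```typescript
interface DisposableBackend {
  dispose(): Promise<void>;
}

interface BaseConfig {
  name: string;
  host: string;
  port?: number;
}

interface BackendTarget<TConfig> {
  config: TConfig;
  connectHost: string;
  connectPort: number;
  serverName: string;
}

interface ManagerDeps<TBackend, TConfig> {
  createBackend(target: BackendTarget<TConfig>): Promise<TBackend>;
  resolvePort(config: TConfig): number;
}

function createManager<TBackend extends DisposableBackend, TConfig extends BaseConfig>(
  lookup: (name: string) => TConfig | undefined, // stands in for the registry
  deps: ManagerDeps<TBackend, TConfig>,
) {
  // Cache the promise, not the backend, so concurrent callers share one creation.
  const cache = new Map<string, Promise<TBackend>>();

  return {
    get(name: string): Promise<TBackend> {
      const cached = cache.get(name);
      if (cached) return cached;
      const config = lookup(name);
      if (!config) return Promise.reject(new Error(`unknown connection: ${name}`));
      const created = deps.createBackend({
        config,
        connectHost: config.host,
        connectPort: config.port ?? deps.resolvePort(config),
        serverName: config.host,
      });
      cache.set(name, created);
      // A failed creation must not stay cached, or every later call would fail too.
      created.catch(() => {
        if (cache.get(name) === created) cache.delete(name);
      });
      return created;
    },

    async disposeAll(): Promise<void> {
      const pending = Array.from(cache.values());
      cache.clear();
      for (const p of pending) {
        try {
          await (await p).dispose();
        } catch {
          // Creation or dispose failed; nothing further to release in this sketch.
        }
      }
    },
  };
}
```

Nothing in the sketch mentions SQL or HTTP: dbmole would pass a factory that returns a `Driver`, esmole one that returns its `Backend`.
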
### §4. Connection schema split
- **base (core):** `name`, `host`, `port?`, `user` (required), `password?`,
`readonly`, `ssh` — exactly dbmole's current fields minus `database`, so dbmole's
behavior is unchanged. (`database` leaves the base — it is SQL-specific.)
- **dbmole:** base + `type: enum(['postgres','mysql'])` + `database?`;
`defaultPort` 5432 / 3306. No override of base fields.
- **esmole:** base with `user` overridden to optional, +
`type: enum(['elasticsearch'])` +
`scheme: enum(['http','https']).default('https')` +
`verifyTls: boolean.default(true)` + `apiKey?` (sent as `Authorization: ApiKey
<value>`), plus a `.refine` requiring user/password **or** apiKey;
`defaultPort` 9200.
- `registry.update`'s engine-switch port-drop (`registry.ts:130`) is already
generic (any `type` change without an explicit `port` drops the old port).
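
The port-drop rule and the package-owned default port can be illustrated with a dependency-free sketch on the dbmole side. `applyUpdate` and `SqlConnection` are hypothetical names, not the code at `registry.ts:130`; the real schema is zod-based and carries more fields.

```typescript
type Engine = "postgres" | "mysql";

interface SqlConnection {
  name: string;
  type: Engine;
  host: string;
  port?: number;
}

const DEFAULT_PORTS: Record<Engine, number> = { postgres: 5432, mysql: 3306 };

// Package-owned port resolution, the kind of function injected as `resolvePort`.
function resolvePort(config: SqlConnection): number {
  return config.port ?? DEFAULT_PORTS[config.type];
}

function applyUpdate(existing: SqlConnection, patch: Partial<SqlConnection>): SqlConnection {
  const merged: SqlConnection = { ...existing, ...patch };
  const typeChanged = patch.type !== undefined && patch.type !== existing.type;
  // Switching engines without naming a port drops the old one, so the new
  // engine's default applies instead of a stale 5432 or 3306.
  if (typeChanged && patch.port === undefined) {
    delete merged.port;
  }
  return merged;
}
```

Because the rule only compares `type` and `port`, it needs no knowledge of which engines exist, which is why it can live in core unchanged.
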
### §5. esmole backend + tools
**Backend** = an HTTP client (undici, keep-alive + `dispose()`) dialing
`scheme://connectHost:connectPort` (the tunnel endpoint when tunneled), with auth
(basic or apiKey) and `verifyTls`. When `verifyTls` is on and the connection is
tunneled, the client sets the TLS servername to `BackendTarget.serverName` (the
real ES host) so certificate hostname verification passes (see §3).
`request(method, path, { body, params })` → `{status, body}`, never throwing on
4xx/5xx; the response body is parsed as JSON when possible, else returned as text.
A `string` request body is sent as-is (for NDJSON `_bulk`); an object or array is
JSON-serialized.
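
The request contract can be sketched without a network by putting the HTTP call behind a small transport function. `Transport`, `createRequest`, and the content-type choices are assumptions for illustration; the planned client is undici, and auth and TLS options are left out here.

```typescript
// Minimal transport seam (hypothetical) so the sketch runs without a network.
interface TransportResponse {
  status: number;
  text(): Promise<string>;
}

type Transport = (
  url: string,
  init: { method: string; headers: Record<string, string>; body?: string },
) => Promise<TransportResponse>;

interface RequestOptions {
  body?: unknown;
  params?: Record<string, string | number | boolean>;
}

interface EsResponse {
  status: number;
  body: unknown;
}

function createRequest(baseUrl: string, transport: Transport) {
  return async function request(
    method: string,
    path: string,
    opts: RequestOptions = {},
  ): Promise<EsResponse> {
    const query = Object.entries(opts.params ?? {})
      .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(String(v))}`)
      .join("&");
    const url =
      baseUrl + (path.startsWith("/") ? path : `/${path}`) + (query ? `?${query}` : "");

    const headers: Record<string, string> = {};
    let body: string | undefined;
    if (typeof opts.body === "string") {
      body = opts.body; // NDJSON for _bulk goes out untouched
      headers["content-type"] = "application/x-ndjson"; // assumed header choice
    } else if (opts.body !== undefined) {
      body = JSON.stringify(opts.body);
      headers["content-type"] = "application/json";
    }

    const res = await transport(url, { method, headers, body });
    const text = await res.text();
    let parsed: unknown = text;
    try {
      parsed = JSON.parse(text);
    } catch {
      // Not JSON (for example _cat output): keep the raw text.
    }
    // 4xx/5xx are data, not exceptions: the caller inspects `status`.
    return { status: res.status, body: parsed };
  };
}
```

Transport-level failures (refused connection, TLS error) still reject in this sketch; only HTTP error statuses are returned as values.
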
**readonly guard** (`es/guard.ts`, role analogous to `sqlGuard`): a method+path
boundary. When the connection is `readonly`, allow GET/HEAD plus POST to a
read-suffix allowlist (`_search`, `_msearch`, `_count`, `_field_caps`, `_cat`,
`_mapping`, `_search/scroll`, `_pit`) plus DELETE limited to `_pit` and
`_search/scroll` (point-in-time / scroll cleanup — read-session teardown, not data
mutation). Block all other PUT/DELETE and any other POST. `_sql` is blocked by
absence from the allowlist (it can write). Allowlist (not blocklist) so unknown
endpoints fail safe. Script content inside a `_search` body is content-level, not
method-level, and is out of scope for this guard.
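
One way to read the method+path boundary is sketched below. The allowlists are copied from the paragraph above; how a "suffix" is matched (first path segment starting with `_`, then exact or prefix match) is an assumption, as are the function names.

```typescript
// Endpoints copied from the spec's read allowlist.
const READ_ENDPOINTS = [
  "_search", "_msearch", "_count", "_field_caps", "_cat", "_mapping", "_search/scroll", "_pit",
];
// DELETE is only allowed for read-session teardown.
const DELETE_ENDPOINTS = ["_pit", "_search/scroll"];

// Assumption: the API endpoint starts at the first path segment beginning with "_".
function endpointOf(path: string): string {
  const [pathname = ""] = path.split("?");
  const segments = pathname.split("/").filter((s) => s.length > 0);
  const start = segments.findIndex((s) => s.startsWith("_"));
  return start === -1 ? "" : segments.slice(start).join("/");
}

function matches(endpoint: string, allowlist: string[]): boolean {
  return allowlist.some((e) => endpoint === e || endpoint.startsWith(`${e}/`));
}

function isAllowedWhenReadonly(method: string, path: string): boolean {
  const m = method.toUpperCase();
  if (m === "GET" || m === "HEAD") return true;
  const endpoint = endpointOf(path);
  if (m === "POST") return matches(endpoint, READ_ENDPOINTS);
  if (m === "DELETE") return matches(endpoint, DELETE_ENDPOINTS);
  // PUT, PATCH, and anything unknown fail safe.
  return false;
}
```

For example, `POST /logs/_search` passes, while `POST /_sql`, `POST /logs/_doc`, and `DELETE /logs` are rejected because nothing on the allowlist matches them.
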
**Tools (5 ES-specific + connection CRUD from core):**
| tool | wraps | improvement |
|---|---|---|
| `es_request` | generic passthrough | readonly guard + truncation |
| `es_search` | `POST /{index}/_search` | `_source` projection + aggs-only (size:0) + truncation |
| `es_list_indices` | `GET /_cat/indices` | — |
| `es_get_mapping` | `GET /{index}/_mapping` | flatten to field:type list (default); `raw?` for nested JSON |
| `es_cluster_health` | `GET /_cluster/health` | — |
Index is always explicit; there is no default index. `es_request` is the primary
tool and covers the entire ES REST surface; helpers are sugar for common reads.
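
The flatten step behind `es_get_mapping` could look roughly like this. It is a sketch under assumptions: it expects the typeless `GET /{index}/_mapping` response shape, lists containers without a `type` as `object`, and reports multi-fields as `parent.sub`; the function names are hypothetical.

```typescript
interface FieldMapping {
  type?: string;
  properties?: Record<string, FieldMapping>; // object / nested sub-fields
  fields?: Record<string, FieldMapping>;     // multi-fields, e.g. text + keyword
}

type MappingResponse = Record<
  string,
  { mappings?: { properties?: Record<string, FieldMapping> } }
>;

function flattenProperties(
  props: Record<string, FieldMapping>,
  prefix: string,
  out: Record<string, string>,
): void {
  for (const [name, field] of Object.entries(props)) {
    const path = prefix ? `${prefix}.${name}` : name;
    // Objects carry no explicit type in the mapping; report them as "object".
    out[path] = field.type ?? "object";
    if (field.properties) flattenProperties(field.properties, path, out);
    if (field.fields) flattenProperties(field.fields, path, out);
  }
}

// Returns one field:type map per index in the response.
function flattenMapping(response: MappingResponse): Record<string, Record<string, string>> {
  const result: Record<string, Record<string, string>> = {};
  for (const [index, entry] of Object.entries(response)) {
    const fields: Record<string, string> = {};
    flattenProperties(entry.mappings?.properties ?? {}, "", fields);
    result[index] = fields;
  }
  return result;
}
```

A flat `field: type` list is usually far smaller than the nested mapping JSON, which is the point of making it the default and keeping `raw?` as the escape hatch.
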
**truncation:** cap response by byte budget and hit count, set `truncated: true`,
mirroring dbmole's row truncation.
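
A minimal sketch of hit-count plus byte-budget truncation for a `_search` response follows. `truncateSearch` is a hypothetical name (the spec names `truncateJsonBudget` in core but not its signature), and the strategy of dropping trailing hits until the body fits is an assumption.

```typescript
interface SearchBody {
  hits?: { hits?: unknown[]; [key: string]: unknown };
  [key: string]: unknown;
}

interface TruncatedResult {
  body: SearchBody;
  truncated: boolean;
}

// UTF-8 size of a string, computed by code point so no runtime API is required.
function utf8Length(s: string): number {
  let bytes = 0;
  for (const ch of s) {
    const cp = ch.codePointAt(0) ?? 0;
    bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
  }
  return bytes;
}

function truncateSearch(body: SearchBody, maxHits: number, maxBytes: number): TruncatedResult {
  const allHits = body.hits?.hits ?? [];
  const kept = allHits.slice(0, Math.max(0, maxHits));
  const build = (): SearchBody => ({ ...body, hits: { ...body.hits, hits: kept } });

  // Drop trailing hits until the serialized body fits the byte budget.
  // Re-serializing per step is quadratic; acceptable for a sketch.
  while (kept.length > 0 && utf8Length(JSON.stringify(build())) > maxBytes) {
    kept.pop();
  }
  return { body: build(), truncated: kept.length < allHits.length };
}
```

This only trims `hits.hits`; an aggs-only response that exceeds the budget would need a separate rule, which the sketch does not attempt.
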
### §6. Distribution / entry / docker
- Each leaf package has its own stdio entry and `bin`. `esmole-mcp` →
`dist/index.js`.
- **Docker:** per-package Dockerfile, self-contained (core is bundled in). A root
multi-image build is optional/later.
- **npm:** `dbmole-mcp` (unchanged), `esmole-mcp` (new), both public; `core`
private.
### §7. Testing
- Per-package vitest projects: unit (mocked IO) + integration (testcontainers).
esmole integration runs against ES 7.x **and** 8.x containers. Coverage ≥90%
lines/functions per package — thresholds never lowered.
- Manager concurrency tests move into `core` alongside the manager.
- Remap / alias test import paths **before** moving files, not after. Integration
tests reference the old `src/...` paths; if files move first, the ≥90% gate
breaks mid-migration.
## Defaults
- ES `readonly` default `false` (matches dbmole).
- Auth: user/password primary, `apiKey` optional.
- `scheme` default `https` (8.x-friendly); 7.x-over-http sets `http` explicitly.
## Migration order (high level; detailed plan via writing-plans)
1. Workspaces skeleton; `git mv src` into `packages/dbmole-mcp` (history
preserved); tests green.
2. Extract `core` (config/manager/tunnel/respond/format/connection-tools), inject
the schema; dbmole depends on core; tests green.
3. Scaffold esmole: schema → backend → guard → tools → entry, TDD.
4. Docker + publish config.
## Out of scope
- ES|QL (`_query`) helper: 8.x-only, deferred.
- OpenSearch-specific testing — passthrough likely works, but helpers are not
validated against it.
- Publishing `core` as a standalone package.
- Cross-server unified config (each server keeps its own store namespace:
`ESMOLE_STORE` / `ESMOLE_CONNECTIONS` vs `DBMOLE_STORE` / `DBMOLE_CONNECTIONS`).