# esmole + monorepo design

Date: 2026-06-16
Status: approved, pending implementation plan

## Goal

Add Elasticsearch as a second MCP server (`esmole-mcp`) alongside the existing
`dbmole-mcp` (PostgreSQL/MySQL). Restructure the repo into an npm-workspaces
monorepo so both servers share infrastructure (connection store, manager cache,
SSH tunnel, MCP plumbing) without forcing Elasticsearch into the SQL-shaped
`Driver` interface.

The agent sees two distinct MCP servers (two entries in its MCP config, each its
own `bin`). They live in one repo and share a private `core` package.

### Why not reuse the SQL `Driver` for ES

`Driver` is relational (`query(sql)`, `listDatabases`, `describeTable` with
PK/FK/indexes). Elasticsearch is a document/search store. Mapping `_search` →
`query`, indices → tables, and mappings → `describeTable` would give a crippled,
dishonest contract, so ES gets its own thin `Backend` abstraction instead; only
the generic plumbing is shared.

### Reference

`../homelab/es-mcp` (Python/FastMCP) is the tool-surface baseline: a generic REST
passthrough plus four helpers, all returning `{status, body}` and never raising
on 4xx/5xx. esmole keeps that tool surface but changes two things: the transport
is stdio (not HTTP+bearer), and connections are multiple, named, persisted in a
store, and optionally SSH-tunnelled, all inherited from dbmole (not a single
connection read from env).

## Scope decisions

- **ES versions:** 7.x and 8.x. Passthrough core is version-agnostic; helpers work
  on both. No ES|QL (`_query` is 8.11+, absent in 7.x).
- **Use cases:** read/debug + full CRUD + cluster ops, all reachable through the
  generic passthrough; helpers cover the common read paths.
- **Improvements over the reference (all in scope):** output truncation / token
  budget, per-connection `readonly` guard, mapping flatten (field:type list),
  search projection (`_source` filter) + aggs-only mode.
- **Restructure:** full workspaces immediately: move existing `src` into
  `packages/dbmole-mcp`, extract `core`.

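To make the search projection and aggs-only improvements concrete, here is a minimal sketch of how a helper might assemble the `_search` request body. The names `SearchOptions`, `buildSearchBody`, `fields`, and `aggsOnly` are hypothetical; only the `_source` filter and the `size: 0` convention for aggregation-only requests come from Elasticsearch itself.

```typescript
// Hypothetical option names; not taken from the esmole source.
interface SearchOptions {
  query?: Record<string, unknown>;
  aggs?: Record<string, unknown>;
  fields?: string[]; // projection: which _source fields to return
  size?: number;
  aggsOnly?: boolean; // return aggregations only, no hits
}

function buildSearchBody(opts: SearchOptions): Record<string, unknown> {
  const body: Record<string, unknown> = {};
  if (opts.query) body.query = opts.query;
  if (opts.aggs) body.aggs = opts.aggs;
  if (opts.aggsOnly) {
    // size: 0 asks Elasticsearch for no hits, so a projection is irrelevant.
    body.size = 0;
  } else {
    if (opts.size !== undefined) body.size = opts.size;
    if (opts.fields && opts.fields.length > 0) body._source = opts.fields;
  }
  return body;
}

// Usage: project two fields out of each hit.
const projectedBody = buildSearchBody({
  query: { match_all: {} },
  fields: ["@timestamp", "message"],
  size: 5,
});
console.log(JSON.stringify(projectedBody));
```

In aggs-only mode the sketch ignores `fields` and `size`, on the assumption that a caller asking only for aggregations never wants hit documents back.
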
## Architecture (Approach A: generic core + injected schema)

`core` owns the hard, backend-agnostic machinery; each leaf package supplies a
thin backend factory, its own connection schema, and its own tool set.

### §1. Repo layout

```
dbmole-mcp/                 # repo root, private, workspaces: ["packages/*"]
  package.json              # shared devDeps + scripts (lint/test/build all)
  biome.json                # shared
  tsconfig.base.json        # shared compiler options
  packages/
    core/                   # @dbmole/core — private, NOT published
    dbmole-mcp/             # public npm, bin: dbmole-mcp
    esmole-mcp/             # public npm, bin: esmole-mcp
```

`core` is **not published**. It is bundled into each leaf package via tsup
(`noExternal: [/@dbmole\/core/]`) so the published packages are self-contained,
with no inter-package version coupling and no third publish. Two public npm names
(`dbmole-mcp` unchanged, `esmole-mcp` new); `core` exists only inside the repo.

### §2. core public surface (the generic seam)

The current `registry`/`store`/`sources` import `connectionConfigSchema` directly
and hardcode `dbmole:` log prefixes and `DBMOLE_STORE` / `DBMOLE_CONNECTIONS`
env-var names. Generalizing them means injecting the schema, `storePath`,
`envVar`, and `logPrefix` as dependencies.

- `createRegistry({ storePath, configPath, env, schema, logPrefix, envVar })` —
  schema injected; no direct import of any concrete schema.
- `createManager<TBackend extends { dispose(): Promise<void> }>(registry, { createBackend, createTunnel })` —
  generic over the backend; does not know `Driver`.
- `baseConnectionShape` — a zod raw shape **without** `type`. Each package spreads
  it, adds its own `type` enum and engine-specific fields, then calls `.strict()`.
- `openTunnel` / `Tunnel` — unchanged (pure TCP).
- `withManaged`, `respond` — unchanged, already generic.
- `registerConnectionTools(server, { manager, registry, schema, ping })` — generic
  connection CRUD (list / add / remove / update / test_connection); `ping(backend)`
  is the per-backend hook backing `test_connection`.
- format primitives: `clampLimit`, `truncateRows`, `truncateJsonBudget`.

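The schema-injection idea can be sketched without any real dependency. The version below covers only the `env`, `envVar`, `logPrefix`, and `schema` options; `storePath` and `configPath` are left out. It assumes the env var holds a JSON array of connections, which this document does not state, and it models the schema as anything with a throwing `parse` method (a zod schema has that shape). `hostSchema` and the `log` option are hypothetical, added only for the demonstration.

```typescript
// Structural stand-in for a zod schema: parse() returns T or throws.
interface Schema<T> {
  parse(input: unknown): T;
}

interface RegistryOptions<T extends { name: string }> {
  env: Record<string, string | undefined>;
  envVar: string; // e.g. "ESMOLE_CONNECTIONS"
  logPrefix: string; // e.g. "esmole:"
  schema: Schema<T>;
  log?: (line: string) => void; // hypothetical hook, defaults to stderr
}

function createRegistry<T extends { name: string }>(opts: RegistryOptions<T>) {
  const log = opts.log ?? ((line: string) => console.error(line));
  const connections = new Map<string, T>();

  // Assumption: the env var contains a JSON array of connection objects.
  const raw = opts.env[opts.envVar];
  let entries: unknown = [];
  if (raw !== undefined) {
    try {
      entries = JSON.parse(raw);
    } catch {
      log(`${opts.logPrefix} ${opts.envVar} is not valid JSON, ignoring it`);
    }
  }

  if (Array.isArray(entries)) {
    for (const entry of entries) {
      try {
        // The registry never imports a concrete schema; it only calls parse().
        const config = opts.schema.parse(entry);
        connections.set(config.name, config);
      } catch (err) {
        log(`${opts.logPrefix} skipping invalid connection: ${String(err)}`);
      }
    }
  }

  return {
    list: (): T[] => Array.from(connections.values()),
    get: (name: string): T | undefined => connections.get(name),
  };
}

// Hypothetical hand-written schema standing in for a package's zod schema.
const hostSchema: Schema<{ name: string; host: string }> = {
  parse(input: unknown) {
    const c = input as { name?: unknown; host?: unknown } | null;
    if (typeof c !== "object" || c === null || typeof c.name !== "string" || typeof c.host !== "string") {
      throw new Error("expected { name: string, host: string }");
    }
    return { name: c.name, host: c.host };
  },
};
```

With this shape, dbmole and esmole would each call `createRegistry` with their own schema, env-var name, and log prefix, and the registry code itself stays identical.
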
### §3. Manager generalization

The manager needs only `dispose()` from a backend. All the hard logic — cache,
rotation, dispose-race handling, tunnel guards, retry-on-stale (`manager.ts:82-129`)
— moves verbatim into core. The only change: `defaultCreateDriver` becomes the
injected `createBackend(target)`. `DriverTarget` →
`BackendTarget { config, host, port }` (generic config type parameter). The
`tunnel?.isClosed()` recheck stays.

The SQL `Driver` interface stays in `dbmole-mcp`. ES implements its own `Backend`.
Both satisfy `{ dispose(): Promise<void> }`, so both ride the same manager — ES
inherits multi-connection, named connections, SSH tunnel, runtime `add_connection`,
and the store for free (an upgrade over the single-connection reference).

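A reduced sketch of the generic manager shows why `dispose()` is the only thing it needs from a backend. This is not the logic in `manager.ts`: rotation, tunnels, and retry-on-stale are omitted, `port` is assumed to be already resolved, and `disposeAll` is a hypothetical method name. Only the per-name cache and the injected `createBackend` are shown.

```typescript
interface DisposableBackend {
  dispose(): Promise<void>;
}

interface BackendTarget<TConfig> {
  config: TConfig;
  host: string;
  port: number;
}

function createManager<
  TConfig extends { host: string; port: number },
  TBackend extends DisposableBackend,
>(
  registry: { get(name: string): TConfig | undefined },
  deps: { createBackend(target: BackendTarget<TConfig>): Promise<TBackend> },
) {
  // Cache the promise, not the backend, so concurrent callers share one creation.
  const cache = new Map<string, Promise<TBackend>>();

  return {
    get(name: string): Promise<TBackend> {
      const cached = cache.get(name);
      if (cached) return cached;

      const config = registry.get(name);
      if (!config) return Promise.reject(new Error(`unknown connection: ${name}`));

      const created = deps.createBackend({ config, host: config.host, port: config.port });
      cache.set(name, created);
      // Drop failed creations so the next call retries instead of replaying the error.
      created.catch(() => {
        if (cache.get(name) === created) cache.delete(name);
      });
      return created;
    },

    async disposeAll(): Promise<void> {
      const pending = Array.from(cache.values());
      cache.clear();
      // Backends that never finished creating have nothing to dispose.
      await Promise.all(pending.map((p) => p.then((b) => b.dispose(), () => undefined)));
    },
  };
}
```

Because `TBackend` is constrained only by `dispose()`, the same function would accept a SQL `Driver` factory in `dbmole-mcp` and an HTTP `Backend` factory in `esmole-mcp`.
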
### §4. Connection schema split

- **base (core):** `name`, `host`, `port?`, `user` (required), `password?`,
  `readonly`, `ssh` — exactly dbmole's current fields minus `database`, so dbmole's
  behavior is unchanged. (`database` leaves the base — it is SQL-specific.)
- **dbmole:** base + `type: enum(['postgres','mysql'])` + `database?`;
  `defaultPort` 5432 / 3306. No override of base fields.
- **esmole:** base with `user` overridden to optional, +
  `type: enum(['elasticsearch'])` +
  `scheme: enum(['http','https']).default('https')` +
  `verifyTls: boolean.default(true)` + `apiKey?` (sent as
  `Authorization: ApiKey <value>`), plus a `.refine` requiring user/password
  **or** apiKey; `defaultPort` 9200.
- `registry.update`'s engine-switch port-drop (`registry.ts:130`) is already
  generic (any `type` change without an explicit `port` drops the old port).

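The esmole rules above can be illustrated with a dependency-free stand-in for the zod schema, so the example runs without zod installed. It applies the stated defaults (`https`, `verifyTls: true`, port 9200, `readonly: false`) and enforces the credential rule. `resolveEsConnection` and the type names are hypothetical, `type` and `ssh` are left out, and reading "user/password or apiKey" as "both `user` and `password`, or `apiKey`" is an interpretation rather than something the document spells out.

```typescript
type Scheme = "http" | "https";

// Input as a user might write it; everything with a default is optional.
interface EsConnectionInput {
  name: string;
  host: string;
  port?: number;
  user?: string; // optional for esmole, required in the base shape
  password?: string;
  apiKey?: string;
  scheme?: Scheme;
  verifyTls?: boolean;
  readonly?: boolean;
}

// After resolution the defaulted fields are always present.
type EsConnection = EsConnectionInput &
  Required<Pick<EsConnectionInput, "port" | "scheme" | "verifyTls" | "readonly">>;

function resolveEsConnection(input: EsConnectionInput): EsConnection {
  // Assumed reading of the refine rule: basic auth needs both halves.
  const hasBasic = input.user !== undefined && input.password !== undefined;
  const hasApiKey = input.apiKey !== undefined;
  if (!hasBasic && !hasApiKey) {
    throw new Error(`connection "${input.name}": set user/password or apiKey`);
  }
  return {
    ...input,
    port: input.port ?? 9200,
    scheme: input.scheme ?? "https",
    verifyTls: input.verifyTls ?? true,
    readonly: input.readonly ?? false,
  };
}

// Usage: an API-key connection that relies on every default.
const resolvedEs = resolveEsConnection({ name: "prod", host: "es.internal", apiKey: "abc123" });
console.log(`${resolvedEs.scheme}://${resolvedEs.host}:${resolvedEs.port}`);
```

In the real packages this logic would presumably live in the zod schema itself (`.default(...)` plus `.refine(...)`), which also gives the `.strict()` rejection of unknown keys that this sketch does not attempt.
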
### §5. esmole backend + tools

**Backend** = an HTTP client (undici, keep-alive + `dispose()`) bound to
`scheme://tunnelHost:tunnelPort`, with auth (basic or apiKey) and `verifyTls`.
`request(method, path, { body, params })` → `{status, body}`, never throwing on
4xx/5xx; the response body is parsed as JSON when possible, else returned as
text. A `string` request body is sent as-is (for NDJSON `_bulk`); an object or
array is JSON-serialized.

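The two pure halves of `request` (preparing the outgoing call and turning a raw response into `{status, body}`) can be sketched without an HTTP client. `buildRequest` and `toResult` are hypothetical names, the undici call that would sit between them is omitted, and labelling every string body as `application/x-ndjson` is an assumption made for the `_bulk` case.

```typescript
type EsBody = string | Record<string, unknown> | unknown[];

interface EsResult {
  status: number;
  body: unknown;
}

function buildRequest(
  baseUrl: string,
  method: string,
  path: string,
  opts: { body?: EsBody; params?: Record<string, string> } = {},
) {
  const params = opts.params ?? {};
  const query = Object.keys(params)
    .map((k) => `${encodeURIComponent(k)}=${encodeURIComponent(params[k])}`)
    .join("&");
  // Normalize slashes so "base/" + "/path" does not produce "//".
  const url =
    baseUrl.replace(/\/+$/, "") + "/" + path.replace(/^\/+/, "") + (query ? `?${query}` : "");

  const headers: Record<string, string> = {};
  let body: string | undefined;
  if (typeof opts.body === "string") {
    body = opts.body; // sent as-is, e.g. NDJSON for _bulk (assumed content type)
    headers["content-type"] = "application/x-ndjson";
  } else if (opts.body !== undefined) {
    body = JSON.stringify(opts.body);
    headers["content-type"] = "application/json";
  }
  return { url, method: method.toUpperCase(), headers, body };
}

// The status code is plain data: a 404 or 500 is returned, never thrown.
function toResult(status: number, text: string): EsResult {
  try {
    return { status, body: JSON.parse(text) };
  } catch {
    return { status, body: text }; // e.g. the plain-text output of _cat endpoints
  }
}
```

Keeping these steps free of I/O means the unit tests mentioned in §7 could cover them without mocking a socket; only the thin transport call in between would need a mocked client.
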
**readonly guard** (`es/guard.ts`, role analogous to `sqlGuard`): when the
connection is `readonly`, allow only GET/HEAD plus POST to a read-suffix allowlist
(`_search`, `_msearch`, `_count`, `_field_caps`, `_cat`, `_mapping`, `scroll`).
Block all PUT/DELETE and any other POST. Allowlist (not blocklist) so unknown
endpoints fail safe.

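One possible shape for the guard is sketched below. It is deliberately stricter than a plain "last segment is in the list" test, which is an assumption on my part: a bare suffix match would also accept a path such as `POST /logs/_doc/scroll`, where `scroll` is a document id. The sketch therefore limits POST paths to two segments and accepts `scroll` only as `_search/scroll`. `isAllowedWhenReadonly` is a hypothetical name.

```typescript
// Allowlist copied from the design text above.
const READ_POST_ENDPOINTS = [
  "_search",
  "_msearch",
  "_count",
  "_field_caps",
  "_cat",
  "_mapping",
  "scroll",
];

function isAllowedWhenReadonly(method: string, path: string): boolean {
  const verb = method.toUpperCase();
  if (verb === "GET" || verb === "HEAD") return true;
  if (verb !== "POST") return false; // PUT, DELETE and anything else

  // Ignore the query string and empty segments from leading/trailing slashes.
  const segments = path.split("?")[0].split("/").filter((s) => s.length > 0);
  // Assumption: read endpoints look like /_search or /{index}/_search.
  if (segments.length === 0 || segments.length > 2) return false;

  const last = segments[segments.length - 1];
  if (READ_POST_ENDPOINTS.indexOf(last) === -1) return false; // unknown: fail safe
  if (last === "scroll") return segments.length === 2 && segments[0] === "_search";
  return true;
}

// Usage on a readonly connection.
console.log(isAllowedWhenReadonly("POST", "/logs/_search")); // true
console.log(isAllowedWhenReadonly("POST", "/logs/_doc")); // false
```

The trade-off is that some legitimate read requests with longer paths are rejected too; under an allowlist policy that is the intended failure direction.
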
**Tools (5 ES-specific + connection CRUD from core):**

| tool | wraps | improvement |
|---|---|---|
| `es_request` | generic passthrough | readonly guard + truncation |
| `es_search` | `POST /{index}/_search` | `_source` projection + aggs-only (size:0) + truncation |
| `es_list_indices` | `GET /_cat/indices` | — |
| `es_get_mapping` | `GET /{index}/_mapping` | flatten to field:type list (default); `raw?` for nested JSON |
| `es_cluster_health` | `GET /_cluster/health` | — |

Index is always explicit; there is no default index. `es_request` is the primary
tool and covers the entire ES REST surface; helpers are sugar for common reads.

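The mapping flatten behind `es_get_mapping` could look roughly like the sketch below. It walks the typeless `GET /{index}/_mapping` response shape (`{ index: { mappings: { properties } } }`), descends into `properties` and multi-field `fields`, and emits dotted `field:type` strings. The exact output format and the names `flattenMapping` and `flattenProperties` are assumptions; object fields without an explicit `type` are represented only by their children.

```typescript
interface MappingNode {
  type?: string;
  properties?: Record<string, MappingNode>; // object / nested sub-fields
  fields?: Record<string, MappingNode>; // multi-fields, e.g. text + keyword
}

function flattenProperties(properties: Record<string, MappingNode>, prefix = ""): string[] {
  const out: string[] = [];
  for (const name of Object.keys(properties)) {
    const node = properties[name];
    const path = prefix ? `${prefix}.${name}` : name;
    if (node.type !== undefined) out.push(`${path}:${node.type}`);
    if (node.properties) out.push(...flattenProperties(node.properties, path));
    if (node.fields) out.push(...flattenProperties(node.fields, path));
  }
  return out;
}

type MappingResponse = Record<string, { mappings?: { properties?: Record<string, MappingNode> } }>;

// One flat list per index in the response.
function flattenMapping(response: MappingResponse): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (const index of Object.keys(response)) {
    result[index] = flattenProperties(response[index].mappings?.properties ?? {});
  }
  return result;
}

// Usage with a small hand-written mapping.
const sampleMapping: MappingResponse = {
  logs: {
    mappings: {
      properties: {
        message: { type: "text", fields: { keyword: { type: "keyword" } } },
        host: { properties: { name: { type: "keyword" }, ip: { type: "ip" } } },
        tags: { type: "nested", properties: { k: { type: "keyword" } } },
      },
    },
  },
};
console.log(flattenMapping(sampleMapping).logs.join("\n"));
```

For the sample above the list is `message:text`, `message.keyword:keyword`, `host.name:keyword`, `host.ip:ip`, `tags:nested`, `tags.k:keyword`, which is far shorter than the nested JSON it replaces.
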
**truncation:** cap response by byte budget and hit count, set `truncated: true`,
mirroring dbmole's row truncation.

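A minimal sketch of that cap, assuming hits are kept in order until either limit would be exceeded. `truncateHits` is a hypothetical name (not the `truncateJsonBudget` primitive from core), and the size estimate counts UTF-16 code units of the serialized hit rather than true UTF-8 bytes.

```typescript
interface TruncatedHits<T> {
  hits: T[];
  truncated: boolean;
}

function truncateHits<T>(hits: T[], maxHits: number, maxBytes: number): TruncatedHits<T> {
  const kept: T[] = [];
  let used = 0;
  for (const hit of hits) {
    if (kept.length >= maxHits) break;
    // Approximation: string length in UTF-16 code units, not UTF-8 bytes.
    const size = (JSON.stringify(hit) ?? "").length;
    if (used + size > maxBytes) break;
    kept.push(hit);
    used += size;
  }
  // Anything dropped for either reason is reported to the caller.
  return { hits: kept, truncated: kept.length < hits.length };
}

// Usage: three small hits, capped at two.
const cappedHits = truncateHits([{ a: 1 }, { a: 2 }, { a: 3 }], 2, 1000);
console.log(cappedHits.hits.length, cappedHits.truncated); // 2 true
```

Stopping at the first hit that does not fit keeps the result a prefix of the original ordering, which matters for sorted search results.
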
### §6. Distribution / entry / docker

- Each leaf package has its own stdio entry and `bin`. `esmole-mcp` →
  `dist/index.js`.
- **Docker:** per-package Dockerfile, self-contained (core is bundled in). A root
  multi-image build is optional/later.
- **npm:** `dbmole-mcp` (unchanged), `esmole-mcp` (new), both public; `core`
  private.

### §7. Testing

- Per-package vitest projects: unit (mocked IO) + integration (testcontainers).
  esmole integration runs against ES 7.x **and** 8.x containers. Coverage ≥90%
  lines/functions per package — thresholds never lowered.
- Manager concurrency tests move into `core` alongside the manager.

## Defaults

- ES `readonly` default `false` (matches dbmole).
- Auth: user/password primary, `apiKey` optional.
- `scheme` default `https` (8.x-friendly); 7.x-over-http sets `http` explicitly.

## Migration order (high level; detailed plan via writing-plans)

1. Workspaces skeleton; `git mv src` into `packages/dbmole-mcp` (history
   preserved); tests green.
2. Extract `core` (config/manager/tunnel/respond/format/connection-tools), inject
   the schema; dbmole depends on core; tests green.
3. Scaffold esmole: schema → backend → guard → tools → entry, TDD.
4. Docker + publish config.

## Out of scope

- ES|QL (`_query`) helper — 8.x-only, deferred.
- OpenSearch-specific testing — passthrough likely works, but helpers are not
  validated against it.
- Publishing `core` as a standalone package.
- Cross-server unified config (each server keeps its own store namespace:
  `ESMOLE_STORE` / `ESMOLE_CONNECTIONS` vs `DBMOLE_STORE` / `DBMOLE_CONNECTIONS`).