Note: For high-level documentation, see the Documentation Index. This document provides implementation details.
Unified ETL (Extract, Transform, Load) pipeline for managing all resource types in midex. Replaces the fragmented content-registry and project-discovery systems with a clean, plugin-based architecture.
server/src/
├── index.ts                        # Public API
├── manager.ts                      # ResourceManager (orchestrator)
├── pipeline.ts                     # ETL pipeline runner
├── types.ts                        # Core interfaces
│
├── lib/                            # Pipeline components
│   ├── extractor.ts                # Extract: find & read from filesystem
│   ├── transformer.ts              # Transform: parse, validate, normalize
│   ├── loader.ts                   # Load: persist to database
│   ├── conflict.ts                 # Conflict resolution
│   ├── hash.ts                     # SHA-256 hashing
│   ├── validator.ts                # Zod validation wrapper
│   └── project-association.ts      # Project association manager
│
├── plugins/                        # Resource plugins
│   ├── content.ts                  # Agents/rules/workflows
│   ├── projects.ts                 # Project discovery/association
│   └── tool-configs/               # AI tool configurations
│       ├── README.md               # Comprehensive documentation
│       ├── plugin.ts               # Main orchestrator
│       ├── transformer.ts          # Config transformation + secret redaction
│       ├── utils.ts                # Shared utilities (tool detection, paths)
│       ├── types.ts                # Type definitions
│       ├── extractors/             # Tool-specific extractors
│       │   ├── claude-code.ts      # Claude Code CLI
│       │   ├── cursor.ts           # Cursor IDE
│       │   ├── windsurf.ts         # Windsurf (Codeium)
│       │   ├── vscode.ts           # VS Code + Copilot
│       │   └── intellij.ts         # IntelliJ (Junie)
│       └── __tests__/
│           └── validation.test.ts  # Comprehensive test suite
│
└── schemas/                        # Validation schemas
    ├── content-schemas.ts          # Content types
    └── tool-config-schemas.ts      # Tool configurations
Every resource type follows the same three-stage pattern:
- Extract: Find and read resources from filesystem or other sources
- Transform: Parse, validate, and normalize into structured format
- Load: Persist to database with conflict resolution
Each resource type is a plugin that implements the ResourcePlugin interface:
interface ResourcePlugin<T> {
  readonly name: string;
  readonly resourceType: string;
  extract(options: ExtractOptions): Promise<RawResource[]>;
  transform(raw: RawResource, options?: TransformOptions): Promise<TransformedResource<T>>;
  load(transformed: TransformedResource<T>, options: LoadOptions): Promise<void>;
  sync?(context: PipelineContext): Promise<SyncResult>;
}

Benefits:
- Self-contained: Each plugin encapsulates all logic for its resource type
- Reusable components: Extractor, transformer, loader shared across plugins
- Easy to extend: Add new plugins without modifying core system
- Type-safe: Full TypeScript support with generics
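As a rough illustration of how the three stages compose, the sketch below runs a plugin through extract, transform, and load. The type shapes and the `MiniPlugin` and `runPipeline` names are simplified, hypothetical stand-ins for the real definitions in types.ts and pipeline.ts. Counting errors per resource instead of aborting is an assumption of this sketch, not confirmed behavior.

```typescript
// Simplified stand-ins for the real types in types.ts (shapes are assumptions).
interface RawResource { path: string; content: string; hash: string; }
interface TransformedResource<T> { name: string; data: T; hash: string; }
interface SyncResult { added: number; errors: number; }

// Hypothetical, reduced version of the ResourcePlugin interface.
interface MiniPlugin<T> {
  readonly name: string;
  extract(): Promise<RawResource[]>;
  transform(raw: RawResource): Promise<TransformedResource<T>>;
  load(transformed: TransformedResource<T>): Promise<void>;
}

// Runs extract -> transform -> load. A failure in transform or load is counted
// for that resource only, so one invalid file does not abort the whole run.
async function runPipeline<T>(plugin: MiniPlugin<T>): Promise<SyncResult> {
  const result: SyncResult = { added: 0, errors: 0 };
  const rawResources = await plugin.extract();
  for (const raw of rawResources) {
    try {
      const transformed = await plugin.transform(raw);
      await plugin.load(transformed);
      result.added += 1;
    } catch {
      result.errors += 1;
    }
  }
  return result;
}
```

Keeping the runner unaware of what a resource contains is what lets new plugins be added without touching the core.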
import { ResourceManager } from '@/src';
import { initDatabase } from '@/database';

const db = await initDatabase();
const manager = await ResourceManager.init({
  database: db.connection,
  basePath: 'server/content',
});

// Sync all plugins (content + projects + tool-configs)
const results = await manager.syncAll();
console.log(`Content: +${results.content.added}, !${results.content.updated}`);
console.log(`Projects: +${results.projects.added}`);
console.log(`Tool Configs: +${results['tool-configs'].added}, !${results['tool-configs'].updated}`);

// Sync only content
const result = await manager.sync('content');
console.log(`Added: ${result.added}`);
console.log(`Errors: ${result.errors}`);

The MCP server (server/mcp/) uses the shared database tables populated by this pipeline:
- workflows: Read by MCP to list available workflows and retrieve phases
- agents: Read by MCP to get agent personas for workflow steps
The MCP server exposes these resources via a resources-first architecture:
- 7 resources (READ operations)
- 2 tools (WRITE operations: `workflow.start`, `workflow.next_step`)
Workflow state is tracked in dedicated v2 tables (workflow_executions_v2, workflow_steps_v2) managed by the MCP server.
// Query workflows
const workflows = await manager.query('workflow', {
  tags: ['security'],
  limit: 10,
});

// Get single resource
const agent = await manager.get('agent', 'supervisor');

import { ResourcePlugin } from '@/src';

class CustomPlugin implements ResourcePlugin {
  readonly name = 'custom';
  readonly resourceType = 'custom';

  async extract(options) { /* ... */ }
  async transform(raw, options) { /* ... */ }
  async load(transformed, options) { /* ... */ }
}

manager.registerPlugin(new CustomPlugin());
await manager.sync('custom');

Manages agents, rules, and workflows as unified content resources.
Extraction:
- Scans `server/content/agents/`, `server/content/rules/`, `server/content/workflows/`
- Reads markdown files with frontmatter
- Computes SHA-256 hash for change detection
Transformation:
- Parses frontmatter using `gray-matter`
- Validates against Zod schemas (`AgentFrontmatterSchema`, etc.)
- Normalizes data structure
Loading:
- Persists to `agents`, `rules`, `workflows` tables
- Upserts on conflict (by name)
- Stores tags as JSON arrays
Discovers and tracks projects across sessions.
Extraction:
- Scans parent directory for subdirectories
- Detects git repositories (`.git` directory)
- Reads `package.json` metadata when available
Transformation:
- Structures project data (name, path, isGitRepo, metadata)
Loading:
- Persists to `project_associations` table
- Updates `last_used_at` on access
- Upserts on conflict (by path)
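The discovery steps described for this plugin (scan subdirectories, check for `.git`, read `package.json`) could look roughly like the sketch below. `DiscoveredProject` and `discoverProjects` are hypothetical names, and the field layout simply mirrors the "name, path, isGitRepo, metadata" structure mentioned above. The real plugin may differ.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical shape; the real plugin's types may differ.
interface DiscoveredProject {
  name: string;
  path: string;
  isGitRepo: boolean;
  metadata: Record<string, unknown> | null;
}

// Lists the immediate subdirectories of `parentDir` and records whether each
// one contains a `.git` entry and a readable package.json.
function discoverProjects(parentDir: string): DiscoveredProject[] {
  const projects: DiscoveredProject[] = [];
  for (const entry of fs.readdirSync(parentDir, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const projectPath = path.join(parentDir, entry.name);
    const packageJsonPath = path.join(projectPath, "package.json");
    let metadata: Record<string, unknown> | null = null;
    if (fs.existsSync(packageJsonPath)) {
      try {
        metadata = JSON.parse(fs.readFileSync(packageJsonPath, "utf8"));
      } catch {
        metadata = null; // A malformed package.json is ignored in this sketch.
      }
    }
    projects.push({
      name: entry.name,
      path: projectPath,
      isGitRepo: fs.existsSync(path.join(projectPath, ".git")),
      metadata,
    });
  }
  return projects;
}
```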
Discovers and manages AI coding tool configurations across editors and IDEs.
Supported Tools:
- Claude Code (CLI) - MCP, hooks, slash commands, agent rules
- Cursor - MCP, settings, multi-file rules
- Windsurf (Codeium) - MCP, multi-file rules, ignore patterns
- VS Code (Copilot) - MCP, Copilot instructions, settings
- IntelliJ (Junie) - MCP (limited)
Key Features:
- Official patterns only: All configs based on vendor documentation
- Cross-platform: Windows, macOS, Linux support
- Dual-level: Project and user-level config extraction
- Security: Secret redaction with configurable patterns
- Incremental sync: Hash-based change detection
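To make the secret-redaction feature concrete, here is a minimal sketch that replaces values whose keys match a list of patterns. The pattern list, the `[REDACTED]` placeholder, and the `redactSecrets` name are all assumptions for illustration. The plugin's actual patterns and behavior are described in its own README.

```typescript
// Hypothetical default patterns; the real plugin's patterns may differ.
const DEFAULT_SECRET_PATTERNS: RegExp[] = [/token/i, /secret/i, /password/i, /api[_-]?key/i];

// Returns a deep copy of `value` in which every property whose key matches one
// of `patterns` is replaced by a placeholder. The input is not mutated.
function redactSecrets(value: unknown, patterns: RegExp[] = DEFAULT_SECRET_PATTERNS): unknown {
  if (Array.isArray(value)) return value.map((item) => redactSecrets(item, patterns));
  if (value === null || typeof value !== "object") return value;
  const redacted: Record<string, unknown> = {};
  for (const [key, child] of Object.entries(value)) {
    redacted[key] = patterns.some((pattern) => pattern.test(key))
      ? "[REDACTED]"
      : redactSecrets(child, patterns);
  }
  return redacted;
}
```

Matching on key names only is a simplification. A stricter variant might also scan string values for token-like content.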
Configuration Files Extracted:
Claude Code: .mcp.json, .claude/settings.json, .claude/commands/*.md, CLAUDE.md
Cursor: .cursor/mcp.json, .cursor/settings.json, .cursor/rules/*.mdc, .cursorrules
Windsurf: .windsurf/mcp.json, .windsurf/rules/*.md, .windsurfrules, .codeiumignore
VS Code: .github/copilot-instructions.md, *.instructions.md, .vscode/settings.json
IntelliJ: .junie/mcp/mcp.json
For detailed documentation, see: Tool Config Plugin README
Generic extractor for filesystem resources:
- Walks directory tree
- Filters by patterns (e.g., `*.md`)
- Excludes files starting with `_` or `.`
- Computes content hash
- Captures last modified time
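The walk described above can be sketched with Node's built-in `fs` and `crypto` modules. `ExtractedFile` and `extractFiles` are hypothetical names, the extension check stands in for real pattern matching, and applying the `_`/`.` exclusion to directories as well as files is an assumption of this sketch.

```typescript
import { createHash } from "node:crypto";
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical shape of an extracted file; the real RawResource may differ.
interface ExtractedFile {
  path: string;
  content: string;
  hash: string; // SHA-256 of the content, hex-encoded
  lastModified: Date;
}

// Recursively collects files under `dir` whose names end with `extension`,
// skipping entries whose names start with "_" or "." (assumed to apply to
// directories too).
function extractFiles(dir: string, extension = ".md"): ExtractedFile[] {
  const results: ExtractedFile[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (entry.name.startsWith("_") || entry.name.startsWith(".")) continue;
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      results.push(...extractFiles(fullPath, extension));
    } else if (entry.isFile() && entry.name.endsWith(extension)) {
      const content = fs.readFileSync(fullPath, "utf8");
      results.push({
        path: fullPath,
        content,
        hash: createHash("sha256").update(content).digest("hex"),
        lastModified: fs.statSync(fullPath).mtime,
      });
    }
  }
  return results;
}
```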
Specialized transformer for markdown resources:
- Parses frontmatter with `gray-matter`
- Validates frontmatter against Zod schema
- Builds final data structure via callback
- Optional strict/lenient validation modes
Generic loader for database persistence:
- Supports insert or upsert operations
- Dynamic column mapping from data
- Converts arrays/objects to JSON strings
- Uses prepared statements for performance
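One way to picture the dynamic column mapping and JSON conversion is a helper that turns a data object into a parameterized upsert statement. This is a sketch only: `buildUpsert` and `toColumnValue` are hypothetical names, and the `ON CONFLICT ... DO UPDATE` syntax assumes a SQLite-style (or PostgreSQL-style) database, which the text above does not confirm.

```typescript
type ColumnValue = string | number | null;

// Converts a JavaScript value into something a SQL parameter can hold.
function toColumnValue(value: unknown): ColumnValue {
  if (value === null || value === undefined) return null;
  if (typeof value === "string" || typeof value === "number") return value;
  if (typeof value === "boolean") return value ? 1 : 0;
  return JSON.stringify(value); // arrays and objects are stored as JSON text
}

// Builds a parameterized upsert. Table and column names are interpolated into
// the SQL text, so they must come from trusted code, never from user input.
function buildUpsert(
  table: string,
  data: Record<string, unknown>,
  conflictColumn: string,
): { sql: string; params: ColumnValue[] } {
  const columns = Object.keys(data);
  const placeholders = columns.map(() => "?").join(", ");
  const updates = columns
    .filter((column) => column !== conflictColumn)
    .map((column) => `${column} = excluded.${column}`)
    .join(", ");
  const conflictAction = updates ? `DO UPDATE SET ${updates}` : "DO NOTHING";
  const sql =
    `INSERT INTO ${table} (${columns.join(", ")}) VALUES (${placeholders}) ` +
    `ON CONFLICT(${conflictColumn}) ${conflictAction}`;
  return { sql, params: columns.map((column) => toColumnValue(data[column])) };
}
```

The resulting `sql` string is stable for a given set of columns, so it can be prepared once and reused with different `params`.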
Resolves conflicts between filesystem and database:
- Strategies: `keep-newest`, `keep-filesystem`, `keep-database`, `manual`
- Hash-based detection: Compares SHA-256 hashes to detect changes
- Timestamp-based resolution: Uses last modified time for `keep-newest`
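The four strategies can be read as a small decision function over two versions of a resource. The sketch below is illustrative: the `Resolution` values, the `ResourceVersion` shape, and the tie-break toward the filesystem are assumptions, not the behavior of conflict.ts.

```typescript
type ConflictStrategy = "keep-newest" | "keep-filesystem" | "keep-database" | "manual";
type Resolution = "unchanged" | "use-filesystem" | "use-database" | "needs-review";

// Hypothetical shape describing one side of a potential conflict.
interface ResourceVersion {
  hash: string; // SHA-256 of the content
  lastModified: Date;
}

function resolveConflict(
  filesystem: ResourceVersion,
  database: ResourceVersion,
  strategy: ConflictStrategy,
): Resolution {
  // Identical hashes mean identical content, so there is nothing to resolve.
  if (filesystem.hash === database.hash) return "unchanged";
  switch (strategy) {
    case "keep-filesystem":
      return "use-filesystem";
    case "keep-database":
      return "use-database";
    case "manual":
      return "needs-review";
    case "keep-newest":
      // Ties go to the filesystem in this sketch; the real tie-break is unknown.
      return filesystem.lastModified.getTime() >= database.lastModified.getTime()
        ? "use-filesystem"
        : "use-database";
  }
}
```

Checking the hash first means timestamps are consulted only when the content actually differs, which avoids spurious updates when a file is touched without being edited.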
All content types validated with strict Zod schemas:
{
  name: string (1-100 chars)
  description: string (max 500 chars)
  tags: string[] (max 20 items × 50 chars, default: [])
  version?: string (max 20 chars, optional)
}

{
  name: string (1-100 chars)
  description: string (max 500 chars)
  globs: string[] (max 50 items × 200 chars, default: [])
  alwaysApply: boolean (default: false)
  tags: string[] (max 20 items × 50 chars, default: [])
}

{
  name: string (1-200 chars)
  description: string (max 2000 chars)
  tags: string[] (max 20 items × 50 chars, default: [])
  keywords: string[] (max 50 items × 100 chars, default: [])
  complexityHint?: 'simple' | 'moderate' | 'high'
}

- Unified: Single system for all resource types
- Consistent: Same ETL pattern across all types
- Easy to extend: Just add a new plugin
- Loosely coupled: Plugin interface decouples types from core
- Type-safe: Full TypeScript with Zod validation
- Incremental sync: Hash-based change detection avoids unnecessary updates
- Prepared statements: Database operations use cached prepared statements
- Parallel extraction: Can extract multiple resource types concurrently
- Streaming: Large files handled efficiently with streaming readers
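Hash-based incremental sync comes down to comparing freshly computed hashes with the ones stored at the last sync. The following sketch shows that comparison in isolation. `planSync` and `SyncPlan` are hypothetical names, and how the stored hashes are loaded from the database is left out.

```typescript
interface SyncPlan {
  added: string[];
  updated: string[];
  unchanged: string[];
}

// Compares freshly computed content hashes with the hashes stored at the last
// sync, so that only new or changed resources need to be written.
function planSync(
  extracted: Array<{ name: string; hash: string }>,
  storedHashes: Map<string, string>,
): SyncPlan {
  const plan: SyncPlan = { added: [], updated: [], unchanged: [] };
  for (const { name, hash } of extracted) {
    const stored = storedHashes.get(name);
    if (stored === undefined) plan.added.push(name);
    else if (stored !== hash) plan.updated.push(name);
    else plan.unchanged.push(name);
  }
  return plan;
}
```

Resources that land in `unchanged` can skip the transform and load stages entirely.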
# Run all tests
npm test
# Run specific test
npx vitest src/manager.test.ts

- Watch mode: Filesystem watching for real-time sync
- Remote sources: Extract from APIs, S3, etc.
- Bidirectional sync: Full conflict resolution with manual intervention UI
- Batch operations: Bulk inserts for performance
class ResourceManager {
  static async init(options: ResourceManagerOptions): Promise<ResourceManager>
  registerPlugin(plugin: ResourcePlugin): void
  getPlugin(name: string): ResourcePlugin | undefined
  async syncAll(): Promise<Record<string, SyncResult>>
  async sync(pluginName: string): Promise<SyncResult>
  async query<T>(resourceType: string, options?: QueryOptions): Promise<T[]>
  async get<T>(resourceType: string, name: string): Promise<T | null>
  close(): void
}

class Pipeline {
  async run<T>(plugin: ResourcePlugin<T>, context: PipelineContext): Promise<SyncResult>
  async sync(plugin: ResourcePlugin, context: PipelineContext): Promise<SyncResult>
  async extract(plugin: ResourcePlugin, options: ExtractOptions): Promise<RawResource[]>
  async transform<T>(plugin: ResourcePlugin<T>, raw: RawResource, options?: TransformOptions): Promise<TransformedResource<T>>
  async load<T>(plugin: ResourcePlugin<T>, transformed: TransformedResource<T>, options: LoadOptions): Promise<void>
}