24 changes: 24 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,29 @@
# Changelog

## [2025-10-17T06:42:38-04:00 (America/New_York)]
### Changed
- Added `entity_label` to triplet CSV rows generated by `scripts/generate_synthetic_dataset.py` and refreshed ingestion
documentation (`docs/retrieval.md`, `README.md`, `docs/operations.md`, `docs/testing.md`, `SETUP.md`) plus planning collateral
(`PROJECT.md`, `PLAN.md`, `ROADMAP.md`, `SOT.md`, `ENVIRONMENT_NEEDS.md`, `NEEDED_FOR_TESTING.md`, `PLANNING_THOUGHTS.md`,
`ISSUES.md`, `TODO.md`, `RESUME_NOTES.md`) so synthetic dataset guidance stays accurate.

## [2025-10-16T22:44:21-04:00 (America/New_York)]
### Changed
- Simplified roadmap section headings in `ROADMAP.md` by removing week estimates from the horizon labels to
emphasise qualitative prioritisation.

## [2025-10-16T21:44:46-04:00 (America/New_York)]
### Added
- Documented a synthetic dataset ingestion workflow in `docs/retrieval.md` (including sample loader code) so benchmarking
runs can hydrate graph drivers without recomputing embeddings.

### Changed
- Expanded operations, setup, and environment guides (`docs/operations.md`, `SETUP.md`, `ENVIRONMENT_NEEDS.md`,
`NEEDED_FOR_TESTING.md`) with batching/verification tips for loading generated JSONL/CSV corpora.
- Updated core documentation and planning artifacts (`README.md`, `PROJECT.md`, `PLAN.md`, `ROADMAP.md`, `SOT.md`,
`RECOMMENDATIONS.md`, `PLANNING_THOUGHTS.md`, `ISSUES.md`, `RESUME_NOTES.md`, `TODO.md`) to reference the ingestion workflow
and capture the follow-up automation task.

## [2025-10-16T20:39:06-04:00 (America/New_York)]
### Added
- Added live integration coverage for Memgraph, Neo4j, and Redis via `meshmind/tests/test_integration_live.py` and configured
5 changes: 4 additions & 1 deletion ENVIRONMENT_NEEDS.md
@@ -24,7 +24,10 @@
consolidation heuristics and pagination under load. The new
`scripts/generate_synthetic_dataset.py` utility produces JSONL/CSV corpora
(defaults: 10k memories, 20k triplets, 384-dim embeddings) that can be copied to
shared storage for on-demand benchmarking.
shared storage for on-demand benchmarking. Triplet rows now embed `entity_label`,
so pairing the shared datasets with the ingestion workflow documented in
`docs/retrieval.md` lets operators seed environments quickly without recomputing
embeddings or rewriting CSV headers.
- Maintain outbound package download access to PyPI and vendor repositories; this
session confirmed package installation works when the network is open, and future
sessions need the same capability to refresh locks or install new optional
4 changes: 3 additions & 1 deletion ISSUES.md
@@ -35,7 +35,9 @@

- [ ] Validate the new Docker Compose stacks (root and `meshmind/tests/docker/`) on an environment with container support and document host requirements (ports, resources).
## Low Priority / Nice to Have
- [x] Align synthetic dataset triplet CSV headers with `Triplet` schema (added `entity_label`) and refresh ingestion docs.
- [x] Remove week-based horizon estimates from roadmap headings to avoid implying precise delivery dates in planning docs.
- [x] Offer alternative storage backends (in-memory driver, SQLite, etc.) for easier local development.
- [x] Provide an administrative dashboard or CLI commands for listing namespaces, counts, and maintenance statistics (CLI admin subcommands now expose predicates, telemetry, and graph checks).
- [ ] Publish onboarding guides and troubleshooting FAQs for contributors.
- [ ] Publish onboarding guides and troubleshooting FAQs for contributors (synthetic dataset ingestion docs landed in `docs/retrieval.md`, but a broader newcomer guide is still pending).
- [ ] Explore plugin registration for embeddings and retrieval strategies to reduce manual wiring.
2 changes: 1 addition & 1 deletion NEEDED_FOR_TESTING.md
@@ -69,7 +69,7 @@
external services are unavailable.
- Use `meshmind/testing` fakes (`FakeMemgraphDriver`, `FakeRedisBroker`, `FakeEmbeddingEncoder`, `FakeLLMClient`) in tests or demos to eliminate external infrastructure requirements. Integration suites marked with `@pytest.mark.integration` exercise live Memgraph/Neo4j/Redis instances and expect the docker stack to be running.
- Invoke `meshmind admin predicates` and `meshmind admin maintenance --max-attempts <n> --base-delay <seconds> --run <task>` during local runs to inspect predicate registries, telemetry, and tune maintenance retries without external services.
- Use the benchmarking utilities in `scripts/` (`evaluate_importance.py`, `consolidation_benchmark.py`, `benchmark_pagination.py`) to validate heuristics and driver performance offline before connecting to live infrastructure. Generate large corpora with `scripts/generate_synthetic_dataset.py` when you need ≥10k memories for stress tests.
- Use the benchmarking utilities in `scripts/` (`evaluate_importance.py`, `consolidation_benchmark.py`, `benchmark_pagination.py`) to validate heuristics and driver performance offline before connecting to live infrastructure. Generate large corpora with `scripts/generate_synthetic_dataset.py` when you need ≥10k memories for stress tests; triplet CSV rows now ship with `entity_label`, so the ingestion workflow in `docs/retrieval.md` can hydrate graph drivers without extra mutation.
- Seed demo data as needed using the `examples/extract_preprocess_store_example.py` script after configuring environment
variables.
- Create a `.env` file storing the environment variables above for consistent local configuration.
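As a quick sanity check before pointing benchmarks at a generated corpus, the JSONL shape can be verified with a short script. A minimal sketch: only the 384-dim embedding default comes from the generator's documented defaults, and the field names below are assumptions standing in for whatever `scripts/generate_synthetic_dataset.py` actually emits.

```python
import json
import random

# Stand-in for the JSONL layout emitted by scripts/generate_synthetic_dataset.py;
# field names here are illustrative assumptions, not the script's real schema.
def synth_memories(count, dim=384, seed=7):
    rng = random.Random(seed)
    for i in range(count):
        yield {
            "id": f"mem-{i:05d}",
            "text": f"synthetic memory {i}",
            "embedding": [rng.random() for _ in range(dim)],
        }

# Serialize a tiny corpus and verify row count plus embedding width,
# the two properties a truncated or mis-generated file usually breaks.
lines = [json.dumps(record) for record in synth_memories(count=25)]
first = json.loads(lines[0])
print(len(lines), len(first["embedding"]))  # 25 384
```

Running the same two checks against a real generated file (line count, embedding length of the first record) is cheap insurance before a long ingestion run.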
5 changes: 4 additions & 1 deletion PLAN.md
@@ -1,5 +1,7 @@
# Plan of Action

Roadmap milestones now reference qualitative horizons (Near/Mid/Long-Term) instead of week estimates to focus this plan on sequencing rather than timeboxing.

## Phase 1 – Stabilize Runtime Basics ✅
1. **Dependency Guards** – Implemented lazy driver factories, optional imports, and clear ImportErrors for missing packages.
2. **Default Encoder Registration** – Bootstraps register encoders/entities automatically and the CLI invokes them on startup.
@@ -20,7 +22,8 @@
2. **Maintenance Tasks** – Tasks emit telemetry, persist consolidation/compression results, and now retry conflicting writes with
configurable exponential backoff (`MAINTENANCE_MAX_ATTEMPTS`, `MAINTENANCE_BASE_DELAY_SECONDS`). Synthetic benchmark scripts,
the new `scripts/generate_synthetic_dataset.py`, and integration tests against live Memgraph/Neo4j validate behaviour on larger
workloads; next, replay production-like datasets to tune thresholds.
workloads. Fresh documentation in `docs/retrieval.md` and `docs/operations.md` now describes how to ingest those synthetic datasets
(with triplet CSVs that include `entity_label`) into the target backend; next, replay production-like datasets to tune thresholds.
3. **Importance Scoring Improvements** – Heuristic scoring is live, records distribution metrics via telemetry, and ships with
`scripts/evaluate_importance.py` for synthetic/offline evaluation. Next: incorporate real feedback loops or LLM-assisted
ranking to tune weights over time.
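The retry behaviour described in item 2 can be sketched as a small wrapper. The `MAINTENANCE_MAX_ATTEMPTS` and `MAINTENANCE_BASE_DELAY_SECONDS` names are the real settings; the wrapper itself is illustrative, not the actual task implementation.

```python
import random
import time

# Illustrative exponential backoff for conflicting maintenance writes;
# defaults mirror the role of MAINTENANCE_MAX_ATTEMPTS / MAINTENANCE_BASE_DELAY_SECONDS.
def with_backoff(op, max_attempts=3, base_delay=0.01):
    for attempt in range(max_attempts):
        try:
            return op()
        except RuntimeError:
            if attempt == max_attempts - 1:
                raise
            # Delay doubles per attempt, with a little jitter to spread retries.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))

attempts = []

def flaky_write():
    # Simulates a write that conflicts twice before succeeding.
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("conflicting write")
    return "ok"

print(with_backoff(flaky_write))  # ok
```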
2 changes: 1 addition & 1 deletion PLANNING_THOUGHTS.md
@@ -14,7 +14,7 @@
- **Pydantic Model Policy** – Follow the documented plan (target Pydantic 2.12+, refresh locks when 3.13 wheels land, record migration guidance) to avoid resurrecting compatibility shims.

## Upcoming Research
- Benchmark consolidation heuristics on synthetic datasets representing customer scale and capture telemetry snapshots (seed data via `scripts/generate_synthetic_dataset.py`).
- Benchmark consolidation heuristics on synthetic datasets representing customer scale and capture telemetry snapshots (seed data via `scripts/generate_synthetic_dataset.py`—whose triplet CSV now includes `entity_label`—and load it using the ingestion workflow documented in `docs/retrieval.md`).
- Compare graph query latency across in-memory, SQLite, Memgraph, and Neo4j drivers when using pagination and filtering.
- Evaluate rerank quality across LLM providers using a labelled evaluation set to determine optimal default models.
- Investigate options for secure secret storage (e.g., Vault, AWS Secrets Manager) to standardise API key management.
2 changes: 1 addition & 1 deletion PROJECT.md
@@ -78,7 +78,7 @@
- Docker Compose now provisions Memgraph, Neo4j, and Redis; integration-specific stacks (including the Celery worker) live under
`meshmind/tests/docker/`. `pytest -m integration` exercises live services once the stack is running. See `ENVIRONMENT_NEEDS.md`
and `SETUP.md` for enabling optional services locally.
- `scripts/generate_synthetic_dataset.py` produces large JSONL/CSV corpora (defaults: 10k memories, 20k triplets, 384-dim embeddings) to stress retrieval and consolidation flows prior to ingesting real datasets.
- `scripts/generate_synthetic_dataset.py` produces large JSONL/CSV corpora (defaults: 10k memories, 20k triplets, 384-dim embeddings) to stress retrieval and consolidation flows prior to ingesting real datasets. Triplet rows ship with `entity_label` so the ingestion workflow documented in `docs/retrieval.md` hydrates graph drivers without additional preprocessing.

## Roadmap Highlights
- Push graph-backed retrieval deeper into the drivers (vector similarity, structured filters) so the new server-side filtering/pagination evolves into full backend-native ranking.
6 changes: 5 additions & 1 deletion README.md
@@ -202,7 +202,11 @@ Tasks instantiate the driver lazily, emit structured logs/metrics, and persist c
## Benchmarking & Evaluation
- **Synthetic dataset generation** – `scripts/generate_synthetic_dataset.py` creates large JSONL/CSV corpora of
memories/triplets (defaults: 10k memories, 20k triplets, 384-dim embeddings) so you can stress retrieval, consolidation,
and integration flows before ingesting real data.
and integration flows before ingesting real data. Triplet rows now ship with `entity_label` to match
`meshmind.core.types.Triplet`.
- **Synthetic dataset ingestion** – Follow the workflow documented in `docs/retrieval.md` to load the generated JSONL/CSV
payloads into MeshMind via the Python client. The operations guide walks through batching tips and post-ingestion
verification so benchmark runs start from a consistent baseline.
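The batching tips mentioned above can be illustrated with a short loop. This is a hypothetical sketch: the docs do not show the MeshMind Python client's method names, so `store_batch` below is a placeholder for whatever the client actually exposes.

```python
# Hypothetical batched-ingestion loop; `store_batch` is a stand-in for the
# real MeshMind Python client call, which is not specified here.
def batched(items, size):
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the final partial batch

def store_batch(batch):
    # Placeholder: a real run would hand the batch to the MeshMind client.
    return len(batch)

records = ({"id": i} for i in range(250))
stored = [store_batch(chunk) for chunk in batched(records, size=100)]
print(stored)  # [100, 100, 50]
```

Keeping batch sizes fixed makes the post-ingestion count verification straightforward: the stored totals should sum to the number of generated records.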
- **Importance scoring** – `scripts/evaluate_importance.py` runs the heuristic against JSON or synthetic datasets and reports
descriptive statistics for quick regression checks.
- **Consolidation throughput** – `scripts/consolidation_benchmark.py` generates synthetic workloads to measure batch merging
4 changes: 3 additions & 1 deletion RECOMMENDATIONS.md
@@ -30,7 +30,9 @@

## Documentation & Onboarding
- Keep `README.md`, `SOT.md`, `docs/`, and onboarding guides synchronized with each release; document rerank, retrieval, and
registry flows with diagrams when possible.
registry flows with diagrams when possible. The new synthetic dataset ingestion workflow in `docs/retrieval.md` should be
incorporated into future onboarding materials.
- Keep roadmap horizons qualitative (Near/Mid/Long-Term) instead of week-based estimates so planning docs emphasise sequencing and flexibility.
- Maintain the troubleshooting section for optional tooling (ruff, pyright, typeguard, toml-sort, yamllint) now referenced in
the Makefile and expand it as new developer utilities are introduced. Keep `SETUP.md` synchronized when dependencies change.
- Provide walkthroughs for configuring LLM reranking, including sample prompts and response expectations.
6 changes: 4 additions & 2 deletions RESUME_NOTES.md
@@ -10,9 +10,11 @@

## Latest Changes

- Removed week-based estimates from roadmap section headings and refreshed planning docs (`PLAN.md`, `SOT.md`, `RECOMMENDATIONS.md`, `ISSUES.md`, `TODO.md`) to emphasise qualitative sequencing.
- Added live integration coverage (`meshmind/tests/test_integration_live.py`) for Memgraph, Neo4j, and Redis, introduced a pytest marker configuration, and documented the workflow across README/SETUP/docs.
- Generated a fresh `uv.lock`, pinned `.python-version` to 3.12, and updated install docs to standardise on `uv sync --all-extras`.
- Created `scripts/generate_synthetic_dataset.py` for large JSONL/CSV corpora and referenced it across benchmarking docs.
- Created `scripts/generate_synthetic_dataset.py` for large JSONL/CSV corpora, added `entity_label` to triplet CSV rows, and referenced it across benchmarking docs.
- Documented the synthetic dataset ingestion workflow across `docs/retrieval.md`, `docs/operations.md`, README, and supporting planning guides so benchmarks can load corpora without recomputing embeddings.
- Updated documentation and planning collateral (README.md, SETUP.md, docs/development.md, docs/testing.md, docs/operations.md, PROJECT.md, PLAN.md, RECOMMENDATIONS.md, ROADMAP.md, ENVIRONMENT_NEEDS.md, NEEDED_FOR_TESTING.md, SOT.md, PLANNING_THOUGHTS.md, DUMMIES.md, TODO.md, RESUME_NOTES.md) to reflect the integration workflow, dataset generation, and the new Pydantic policy.

## Environment State
@@ -26,5 +28,5 @@
1. Address remaining `TODO.md` priority items (backend-native vector similarity, Celery worker integration, grpcurl end-to-end tests) now that graph services are accessible locally.
2. Automate the integration suite in CI and capture resource requirements for shared infrastructure.
3. Prepare grpcurl-based smoke tests for `meshmind serve-grpc` and plan protobuf client packaging once integration coverage extends beyond the Python stub.
4. Feed findings from large synthetic datasets into retry/backoff defaults and document recommended values in `ENVIRONMENT_NEEDS.md`.
4. Feed findings from large synthetic datasets into retry/backoff defaults and document recommended values in `ENVIRONMENT_NEEDS.md`, validating the new ingestion workflow as part of those runs.
5. Continue tracking shim retirements in `DUMMIES.md` and follow the cleanup plan in `CLEANUP.md` so remaining fakes can be removed when infrastructure allows.
8 changes: 4 additions & 4 deletions ROADMAP.md
@@ -5,21 +5,21 @@
- Support multiple graph backends (in-memory, SQLite, Memgraph, Neo4j) with consistent telemetry, maintenance, and LLM orchestration knobs.
- Provide developers with reproducible tooling, comprehensive documentation, and automation scripts that keep local and CI environments aligned.

## Near-Term (0–2 Weeks)
## Near-Term
- Automate the new integration suite (`pytest -m integration`) in CI so Memgraph/Neo4j/Redis regressions fail fast.
- Finalize maintenance write policies by implementing retry/backoff semantics and measuring consolidation accuracy against representative datasets (now aided by `scripts/generate_synthetic_dataset.py`).
- Finalize maintenance write policies by implementing retry/backoff semantics and measuring consolidation accuracy against representative datasets (now aided by `scripts/generate_synthetic_dataset.py`, whose triplet CSV exposes `entity_label`, and the documented ingestion workflow in `docs/retrieval.md`).
- Publish ROADMAP and PLANNING_THOUGHTS artifacts, and seed the `research/` folder with competitive analysis to ground prioritization discussions.
- Expand automated smoke tests for REST `/memories/counts`, CLI `meshmind admin counts`, and provisioning scripts to ensure guardrails stay trustworthy.
- Capture outstanding shim retirement work (FastAPI tests now live; continue tracking FakeLLM/Fake drivers) in CLEANUP.md with precise acceptance criteria for each removal.

## Mid-Term (2–6 Weeks)
## Mid-Term
- Run load tests against SQLite and hosted graph backends to tune pagination defaults, consolidation heuristics, and token compression strategies.
- Implement backend-native vector similarity queries and schema indexes so embeddings never leave the database during scoring.
- Finalise the gRPC surface by building on the new asyncio server helpers—exercise the `meshmind serve-grpc` CLI entry point within Docker Compose, publish generated clients (Python + additional languages), and add integration smoke tests so external agents can integrate without the in-process stub.
- Instrument observability exports (Prometheus/OpenTelemetry) and wire dashboards/alerts for ingestion latency, queue depth, and error rates.
- Replace compatibility shims with official Pydantic/FastAPI packages once dependency constraints are lifted, and backfill validation coverage.

## Long-Term (6+ Weeks)
## Long-Term
- Build evaluation loops—analytics dashboards and LLM-assisted reviews—that continuously score memory importance heuristics and rerank quality.
- Introduce human-in-the-loop tooling for conflict resolution, allowing operators to approve merges or override automated maintenance plans.
- Explore federated deployments that synchronise multiple MeshMind instances, including replication strategies and eventual-consistency guarantees.
5 changes: 4 additions & 1 deletion SETUP.md
@@ -80,7 +80,10 @@ docker compose -f meshmind/tests/docker/memgraph.yml up -d
```

> Need synthetic load? Run `python scripts/generate_synthetic_dataset.py build/datasets/benchmark`
> to seed JSONL/CSV fixtures before loading them into Memgraph/Neo4j for stress tests.
> to seed JSONL/CSV fixtures before loading them into Memgraph/Neo4j for stress tests. Triplet rows
> now include `entity_label`, so the ingestion workflow in `docs/retrieval.md` can materialize
> `Triplet` models without mutating CSV fields. Follow the ingestion steps when copying fixtures so
> benchmarks reuse the same namespace/layout.
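As a rough illustration of the workflow the note describes, this sketch parses a triplet CSV row straight into a model-shaped record with no field mutation. The `entity_label` column is documented; the remaining column names, and the `TripletRow` dataclass standing in for `meshmind.core.types.Triplet`, are assumptions.

```python
import csv
import io
from dataclasses import dataclass

# `entity_label` is the documented column; subject/predicate/object are
# assumed names standing in for the generator's real CSV header.
SAMPLE = (
    "subject,predicate,object,entity_label\n"
    "mem-00001,mentions,Acme Corp,Organization\n"
)

@dataclass
class TripletRow:  # illustrative stand-in for meshmind.core.types.Triplet
    subject: str
    predicate: str
    object: str
    entity_label: str

def load_triplet_rows(handle):
    # Header names already match the model, so rows map across directly.
    return [TripletRow(**row) for row in csv.DictReader(handle)]

rows = load_triplet_rows(io.StringIO(SAMPLE))
print(rows[0].entity_label)  # Organization
```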

### 3.2 Cleaning up

3 changes: 2 additions & 1 deletion SOT.md
@@ -28,11 +28,12 @@ Supporting assets:
- `SETUP.md`: End-to-end provisioning instructions covering Python deps, environment variables, and Compose workflows.
- `run/install_setup.sh`, `run/maintenance_setup.sh`: Automation scripts for provisioning fresh environments and refreshing cached workspaces.
- `scripts/evaluate_importance.py`, `scripts/consolidation_benchmark.py`, `scripts/benchmark_pagination.py`: Evaluation and benchmarking tools for importance heuristics, consolidation throughput, and driver pagination performance.
- `scripts/generate_synthetic_dataset.py`: Produces large JSONL/CSV corpora (defaults: 10k memories, 20k triplets, 384-dim embeddings) for integration and benchmark scenarios.
- `scripts/generate_synthetic_dataset.py`: Produces large JSONL/CSV corpora (defaults: 10k memories, 20k triplets, 384-dim embeddings) for integration and benchmark scenarios. Triplet rows include `entity_label`, so the ingestion workflow in `docs/retrieval.md` stores the generated payloads without recomputing embeddings or mutating CSV fields.
- `.github/workflows/ci.yml`: GitHub Actions workflow running linting/formatting checks and pytest.
- `pyproject.toml`: Project metadata and dependency list (pins Python `>=3.11,<3.13`; see compatibility notes in `ISSUES.md`).
- Documentation (`PROJECT.md`, `PLAN.md`, `SOT.md`, `README.md`, etc.) describing the system and roadmap.
- Strategic context (`ROADMAP.md`, `PLANNING_THOUGHTS.md`, `research/overview.md`) summarising milestones, planning questions, and competitor analysis.
Roadmap horizons now use qualitative labels (Near/Mid/Long-Term) without week estimates to emphasise sequencing over exact timing.
- `DUMMIES.md`: Catalog of temporary shims (REST/gRPC stubs, Celery dummies, fake drivers) with removal guidance and a retired
section for historical compatibility layers.
