Releases: SoundMindsAI/relyloop

RelyLoop v0.1.3 — MVP1 backlog fully drained

29 May 10:50
6f3179c

Docs-only milestone release. The MVP1 actionable backlog is now fully drained — the 01_mvp1/ planned-features bucket is empty.

What landed

  • PR #310 — the two remaining deferred-by-design MVP1 folders were reclassified out of 01_mvp1/: chore_demo_reseed_stale_recovery_atomic_cas99_backlog/ (defense-in-depth; already Priority: Backlog) and infra_agent_sibling_worktree_isolation99_backlog/ (phases 1+2 shipped; only phase3 remains, defer-until-incident).
  • PR #311 — refreshed the compressed-context docs for the post-MVP1 reality:
    • state.md: updated Last-5-merges, rewrote In-flight/Queued (next stop is the 02_mvp2/ bucket), and marked infra_ci_smoke_makeup + chore_starlette_422_deprecation resolved.
    • CLAUDE.md: fixed a stale Next.js 14 → 16 reference in the Frontend Conventions stack.

Notes

  • No code or schema changes since v0.1.2. Alembic head remains 0020_studies_baseline_trial.
  • Next stop: MVP2 / v0.2 — "Three-Engine + Real Signals" (Apache Solr adapter + UBI judgments).

🤖 Generated with Claude Code

RelyLoop v0.1.1 — MVP1 alpha feature-complete

14 May 16:30
3a0ed58

Patch release on top of v0.1.0. The MVP1 dashboard reads Path to MVP1: 0, 36/36 scoped done (100%) — the MVP1 alpha is feature-complete. The two remaining held items are correctly classified for MVP2 and visible on the new MVP2 dashboard.

This release lands the post-launch polish that accumulated since the v0.1.0 alpha cut: one new user-visible feature, two new backend surfaces, a wave of CI / tooling hardening, and three idea drops that drained the backlog to zero actionable items.

What's new since v0.1.0

Features

  • feat_query_inline_crud (PR #101) — PATCH / DELETE / GET endpoints on /api/v1/query-sets/{id}/queries + inline editable table on /query-sets/[id] page. 12 stories, zero migrations.
  • feat_judgments_periodic_resume_sweep (PR #104) — in-worker resume_stuck_judgment_lists Arq cron that re-enqueues every judgment_lists.status='generating' row every 15 minutes via deterministic Arq _job_id dedup. Replaces the old boot-time-only sweep with continuous coverage.
  • chore_chat_last_message_preview (PR #117) — adds last_message_preview (truncated to 120 chars) and last_message_at to ConversationSummary. /chat list page renders the preview under the title and shows last-touch time instead of created-at. LATERAL JOIN against the existing messages_conversation_idx — no migration.
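
The periodic resume sweep above depends on the queue refusing a second job under an ID that is already queued. A minimal sketch of that idea, using an in-memory stand-in rather than Arq itself; `resume_job_id`, `FakeQueue`, `sweep`, and the `judgment-resume:` prefix are hypothetical names for illustration, not taken from the RelyLoop codebase:

```python
from dataclasses import dataclass, field


def resume_job_id(judgment_list_id: str) -> str:
    """Derive the same job ID for the same row on every sweep (hypothetical format)."""
    return f"judgment-resume:{judgment_list_id}"


@dataclass
class FakeQueue:
    """In-memory stand-in for a queue that dedups on job ID."""

    pending: dict[str, str] = field(default_factory=dict)

    def enqueue(self, job_id: str, payload: str) -> bool:
        # A job ID that is already queued is rejected, so repeated sweeps are no-ops.
        if job_id in self.pending:
            return False
        self.pending[job_id] = payload
        return True


def sweep(queue: FakeQueue, generating_ids: list[str]) -> int:
    """Re-enqueue every 'generating' row; return how many jobs were newly queued."""
    return sum(queue.enqueue(resume_job_id(i), i) for i in generating_ids)


queue = FakeQueue()
first = sweep(queue, ["a", "b"])        # both rows are queued
second = sweep(queue, ["a", "b", "c"])  # only "c" is new; "a" and "b" are deduplicated
```

In Arq, the equivalent check happens inside `enqueue_job(..., _job_id=...)`, which returns `None` when a job with that ID already exists, so a sweep that fires every 15 minutes can safely re-submit rows that are still being worked on.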

Tooling + infrastructure

  • Per-release dashboards + top-level roadmap roll-up (PR #119) — docs/00_overview/DASHBOARD.md + dashboard.html index over the canonical release matrix (MVP1 → MVP2 → MVP3 → MVP4 → GA v1 → v2+). Per-release dashboards auto-discovered via _mvpN folder suffix; bidirectional navigation between roadmap and per-release detail.
  • env-defense workflow + gitleaks (PR #94, PR #99) — .env* filename guard CI workflow + content-scan step, surfaced after a local .env corruption incident.
  • make backend-* sub-targets (PR #110) — make backend-fmt / backend-lint / backend-typecheck for Node-18 contributors who can't run the bundled make fmt (which gates on pnpm's Node ≥20.18 engine).
  • structlog test helpers (PR #114) — backend/tests/_log_helpers.py factors assert_log_level, find_log_events, and RecordingLogger after a two-CI-run debugging arc on PR #112.
  • Dashboard regen idempotency + relative-link rewriting (PR #108) — pre-commit hook no longer churns on no-op writes; one-liner links extracted from idea files resolve correctly when embedded in the dashboard.
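
Auto-discovery by `_mvpN` suffix can be pictured as a small grouping step. This is a sketch under assumptions: the regex, the `group_by_release` helper, and the choice of which folder names to scan are illustrative, and the actual dashboard generator may work differently.

```python
import re

# Hypothetical pattern: a folder belongs to release N when its name ends in "_mvpN".
MVP_SUFFIX = re.compile(r"_mvp(\d+)$")


def group_by_release(folder_names: list[str]) -> dict[int, list[str]]:
    """Group folder names by the release number encoded in their suffix."""
    groups: dict[int, list[str]] = {}
    for name in folder_names:
        match = MVP_SUFFIX.search(name)
        if match is None:
            continue  # no release suffix, so it is left off the per-release dashboards
        groups.setdefault(int(match.group(1)), []).append(name)
    return groups


releases = group_by_release(
    [
        "bug_chat_long_conversation_truncation_mvp2",
        "infra_arq_subprocess_test_mvp2",
        "chore_demo_recording_mvp3",
        "feat_query_inline_crud",
    ]
)
```

Anchoring the pattern at the end of the name keeps a folder such as `feat_query_inline_crud` out of every release group instead of guessing one for it.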

Bug fixes

  • bug_query_inline_crud_since_filter_uuidv7_ms_collision (PR #106) — 10ms sleep in _seed_set helpers to avoid UUIDv7 ms-collision flake.
  • chore_digest_worker_narrow_except (PR #112) — narrowed except Exception in the digest worker's optuna.importance.get_param_importances call so future dep-regressions like the PR #92 sklearn ImportError surface at ERROR level on day one instead of silently shipping empty importance maps for days.
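
The UUIDv7 flake comes from the ID layout: the top 48 bits are a Unix timestamp in milliseconds and the rest is random, so two IDs minted in the same millisecond have no guaranteed order. The sketch below builds UUIDv7 values by hand to show this; `uuid7_at` and `timestamp_ms` are hypothetical helpers, and how the `since` filter in the affected tests compares rows is not described in these notes.

```python
import os
import uuid


def uuid7_at(unix_ms: int) -> uuid.UUID:
    """Build a UUIDv7 for a given millisecond timestamp (RFC 9562 bit layout)."""
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    value = ((unix_ms & ((1 << 48) - 1)) << 80) | rand
    value &= ~(0xF << 76)
    value |= 0x7 << 76  # version nibble = 7
    value &= ~(0x3 << 62)
    value |= 0x2 << 62  # variant bits = binary 10
    return uuid.UUID(int=value)


def timestamp_ms(u: uuid.UUID) -> int:
    """Recover the embedded millisecond timestamp from a UUIDv7."""
    return u.int >> 80


same_ms_a = uuid7_at(1_700_000_000_000)
same_ms_b = uuid7_at(1_700_000_000_000)
later = uuid7_at(1_700_000_000_010)  # 10 ms later, mirroring the sleep in the fix
```

`same_ms_a` and `same_ms_b` share a timestamp, so their relative order depends only on the random bits, while `later` always sorts after both. A 10 ms pause between seeded rows therefore makes their order deterministic.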

Idea drops (won't-do)

  • chore_cluster_run_query_history (PR #103) — superseded by feat_chat_agent's run_query tool.
  • chore_studies_ui_shadcn_polish (PR #116) — ClusterFilterSelect precedent established native <select> as the project's standard for page-level filter/control surfaces; F1 inconsistency claim retired.
  • chore_demo_recording_mvp3 (PR #119) — single-maintainer alpha base rates make a 4–6 hour record-edit-upload-embed task unlikely to execute; tutorial-first-study.md serves the demo's discovery role; any pre-MVP4 recording would need re-shooting once MVP4 auth UI lands.

Held for MVP2

Two ideas explicitly held for MVP2 on the new MVP2 dashboard:

  • bug_chat_long_conversation_truncation_mvp2 — chat agent silently drops load-bearing context past 100-message cap. Latent bug; no operator has hit it. MVP2 timing aligns with Langfuse trace tooling for summarization prompt calibration.
  • infra_arq_subprocess_test_mvp2 — subprocess-driven Arq worker test for narrow Arq-version-regression guard. Trigger-locked at three conditions (arq pin bump, 3rd cron, MVP3 hardening opt-in).

Stack

Unchanged from v0.1.0:

  • Python 3.13 + FastAPI · Next.js 16 (React 19, TypeScript, App Router, Turbopack) · Tailwind 4 · Vitest 4
  • Postgres 16 + SQLAlchemy 2.0 async + Alembic
  • Redis 7 + Arq workers
  • Optuna with TPE sampler + RDBStorage · pytrec_eval
  • openai SDK pointed at any OpenAI-compatible endpoint via OPENAI_BASE_URL
  • ElasticAdapter handling both ES 8.11+/9.x and OpenSearch 2.x/3.x
  • Single-tenant, no auth, Docker Compose-only deployment

Try it

git clone https://github.com/SoundMindsAI/relyloop
cd relyloop
git checkout v0.1.1
make up
# Then follow docs/08_guides/tutorial-first-study.md

License: Apache 2.0. Status: alpha. Multi-tenant + SSO arrive at MVP4 / v0.4.

RelyLoop v0.1.0 — MVP1 alpha

13 May 10:59
d099536

What's in MVP1

The full Karpathy loop end-to-end on Elasticsearch and OpenSearch, single-tenant, no auth, Docker Compose:

  • Engine adapter — one SearchAdapter Protocol covering both ES 8.11+/9.x and OpenSearch 2.x/3.x. Cluster registration via UI or API.
  • Optuna optimizer — TPE sampler against a parametrized query template; up to N trials per study; per-trial budget guard; pytrec_eval metrics (ndcg@k, map, precision, recall, mrr, err).
  • LLM-as-judge — POST /api/v1/judgments/generate rates query-document pairs against a rubric. ~$0.01–$0.05 per query set with gpt-4o-mini. Provider-agnostic: works against any OpenAI-compatible endpoint (Ollama / LM Studio / vLLM / TGI) via OPENAI_BASE_URL.
  • Digest — LLM-generated narrative summary of each completed study, plus parameter-importance chart and recommended config.
  • GitHub PR worker — winning configs land as Pull Requests against a central search-config Git repo. Operator's CI deploys.
  • Chat agent — describe the problem in chat; the agent introspects the cluster, proposes a search-space, and queues the study after operator confirmation. 19-tool surface.
  • Operator tutorial + sample data — 1,000 curated Amazon ESCI products + 48 queries + canonical Jinja2 query template. docs/08_guides/tutorial-first-study.md walks git clone → Open PR in under 30 minutes on a fresh laptop.
  • CI smoke gate — every PR runs the full Karpathy loop end-to-end against a fresh stack with a budgeted OpenAI key. Same operator path as the tutorial; no degraded variants.
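
For readers new to the metrics listed above, here is a minimal pure-Python sketch of ndcg@k, the kind of score each trial is evaluated on. It assumes linear gain and a log2 rank discount; pytrec_eval's exact conventions may differ, and `dcg` and `ndcg_at_k` are illustrative helpers, not part of RelyLoop or pytrec_eval.

```python
import math


def dcg(gains: list[float]) -> float:
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg_at_k(ranked_gains: list[float], all_gains: list[float], k: int) -> float:
    """DCG of the top-k results, normalized by the best achievable top-k DCG."""
    ideal = dcg(sorted(all_gains, reverse=True)[:k])
    return dcg(ranked_gains[:k]) / ideal if ideal > 0 else 0.0


judged = [3, 2, 0]  # relevance grades for the three judged documents of one query

perfect = ndcg_at_k([3, 2, 0], judged, k=3)  # best document ranked first
swapped = ndcg_at_k([0, 2, 3], judged, k=3)  # best document ranked last
```

A ranking that puts the most relevant document first scores 1.0, and pushing it down the list lowers the score. That difference is what gives the optimizer a signal to climb when it varies the query template parameters.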

Full feature list: see docs/02_product/mvp1-user-stories.md.

Audience

Technical evaluators, Relevance Engineers, and search-platform teams considering an open-source query-tuning tool. Not yet production-deployable — see docs/01_architecture/deployment.md for the MVP1 → MVP3 → GA v1 deployment maturity ramp.

How to install

Follow the tutorial: docs/08_guides/tutorial-first-study.md.

Operators build images locally via make up. Pre-built GHCR images ship at MVP3 per the canonical release matrix; until then, make up triggers a local Docker build of relyloop/api and relyloop/ui on first run.

Known limitations

This is alpha. Three operator-visible issues are tracked but not blocking:

  • Long chat sessions silently drop context after 100 messages. The agent prompt-window cap is brute-force; smarter context management ships in MVP2. Tracked: bug_chat_long_conversation_truncation.
  • Query templates created via the API with declared params can't be used for LLM judgment generation. Workaround: use one template with declared_params={} for judgment generation, a separate template with declared params for the optimization study (this is what the tutorial does). Tracked: bug_judgment_template_default_params_contract.
  • Worker may need a manual restart after first-run make migrate. If you run make up and immediately fire a study before make migrate completes, the Arq worker dies on Optuna schema init and stays down. Workaround: run docker compose restart worker after make migrate. Tracked: bug_worker_optuna_init_race.

How to provide feedback

Roadmap

| Release | Theme | Adds |
|---|---|---|
| MVP1 / v0.1.0 (you are here) | "The Loop" | ES + OpenSearch adapter, OpenAI-compatible LLM, GitHub provider, single-tenant, no auth, Docker Compose, 80% coverage gate |
| MVP2 / v0.2 | "Observable" | Langfuse + ClickHouse + SigNoz; canonical event catalog; audit_log + immutability trigger; lineage columns; PII redaction; trace propagation |
| MVP3 / v0.3 | "Production Stacks" | Lucidworks Fusion adapter; multi-Git-provider abstraction (GitLab, Bitbucket); production install (TLS via Caddy + Let's Encrypt, managed Postgres/Redis); AWS managed OpenSearch |
| MVP4 / v0.4 | "Multi-tenant, Multi-LLM" | tenants + tenant_memberships + users + api_keys; tenant_id columns + backfill; SSO via reverse proxy; native non-OpenAI provider SDKs |
| GA v1 | "Production-ready" | LangGraph orchestrator + PostgresSaver; full RFC 7807 errors; Idempotency-Key; Helm chart; container scanning; image signing |

Canonical release matrix: docs/01_architecture/tech-stack.md.