feat(sync[parallel]) Sync N repos in parallel via --jobs #546

Draft
tony wants to merge 1 commit into master from feat-parallel-sync


tony commented Apr 25, 2026

Summary

  • Parallel vcspull sync execution: default min(8, CPU*2) workers, opt-out via --jobs 1.
  • Asyncio orchestrator (asyncio.Semaphore + asyncio.as_completed) over per-task daemon threads bridging libvcs's synchronous update_repo. Daemon threads avoid the default ThreadPoolExecutor atexit-join hang we documented for _sync_repo_with_watchdog.
  • Multi-slot indicator: N spinner rows in a fixed active region at the bottom of the terminal; permanent ✓ Synced ... lines scroll into scrollback above as repos finish (cargo / pueue trick). --jobs 1 keeps today's single-spinner UX bit-for-bit.
  • 3-line live-trail panel auto-disables when --jobs > 1 (a shared deque with N concurrent writers reads as noise); each slot's most-recent libvcs progress chunk becomes its row suffix instead.
  • JSON / NDJSON events emit in completion order, constant memory regardless of repo count.
  • --exit-on-error in parallel mode short-circuits queued tasks but lets in-flight tasks complete so their output is captured.
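A minimal sketch of the orchestration shape described above, not the code in this branch: a semaphore caps in-flight work, each blocking call runs on its own daemon thread, and results are consumed in completion order. `run_in_daemon_thread`, `sync_all`, and the `update_repo(name)` callable are hypothetical names for illustration; the real `_run_parallel_sync_loop_async` may differ.

```python
from __future__ import annotations

import asyncio
import threading
from collections.abc import Callable, Iterable
from typing import Any


def run_in_daemon_thread(
    loop: asyncio.AbstractEventLoop, fn: Callable[[], Any]
) -> asyncio.Future[Any]:
    """Run blocking ``fn`` on a daemon thread and expose it as a loop future."""
    future: asyncio.Future[Any] = loop.create_future()

    def settle(result: Any, exc: BaseException | None) -> None:
        # Runs on the loop thread. Skip if the awaiting task was cancelled.
        if future.done():
            return
        if exc is not None:
            future.set_exception(exc)
        else:
            future.set_result(result)

    def worker() -> None:
        result, error = None, None
        try:
            result = fn()
        except Exception as exc:
            error = exc
        try:
            loop.call_soon_threadsafe(settle, result, error)
        except RuntimeError:
            pass  # the loop was already closed; nothing is left to notify

    # daemon=True: a wedged worker cannot block interpreter shutdown.
    threading.Thread(target=worker, daemon=True).start()
    return future


async def sync_all(
    repos: Iterable[str], update_repo: Callable[[str], Any], jobs: int
) -> list[tuple[str, Any]]:
    """Sync every repo with at most ``jobs`` in flight; return completion order."""
    loop = asyncio.get_running_loop()
    semaphore = asyncio.Semaphore(jobs)

    async def sync_one(name: str) -> tuple[str, Any]:
        async with semaphore:  # queued tasks wait here until a slot frees up
            result = await run_in_daemon_thread(loop, lambda: update_repo(name))
            return name, result

    tasks = [asyncio.ensure_future(sync_one(name)) for name in repos]
    finished: list[tuple[str, Any]] = []
    for next_done in asyncio.as_completed(tasks):
        finished.append(await next_done)  # an emit-as-you-go hook would sit here
    return finished
```

Calling `asyncio.run(sync_all(names, update_repo, jobs=4))` with any blocking `update_repo` should keep no more than four calls running at once. The sketch collects results in a list for simplicity; a streaming emitter would write each result as it arrives instead.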

This is a follow-up to #fix-sync-hang-on-credential-prompts. Targeted at that branch so the panel + verbosity + watchdog work lands first.

Test plan

  • uv run --no-sync ruff check . --fix --show-fixes
  • uv run --no-sync ruff format .
  • uv run --no-sync mypy (no issues, 86 source files)
  • uv run --no-sync py.test --reruns 0 (1033 passed)
  • cd docs && uv run --no-sync sphinx-build -b dirhtml . _build/html (build succeeded)
  • Manual TTY check: vcspull sync --workspace ~/study/otel/ --all -- expect 4-8 spinner rows, permanents scroll above as repos finish, ~5x faster than --jobs 1.
  • Manual JSON streaming: vcspull sync --workspace ~/study/otel/ --all --json | head -- events arrive as-completed.
  • Manual Ctrl-C mid-batch: in-flight workers drain, partial summary prints, exit via SIG_DFL re-raise (echo $? -> 130 in bash).
  • Manual --exit-on-error with mixed good/bad repos: in-flight repos complete, queued ones short-circuit, summary + non-zero exit.
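The `--exit-on-error` behaviour exercised in the last manual check can be pictured with a small sketch. It assumes a shared stop event checked after a task acquires its slot; `run_with_stop` and the `work` coroutine are hypothetical, and the branch's actual mechanism may be wired differently.

```python
import asyncio
from collections.abc import Awaitable, Callable, Iterable


async def run_with_stop(
    names: Iterable[str],
    work: Callable[[str], Awaitable[bool]],
    jobs: int,
) -> dict[str, str]:
    """Run ``work`` per name; after the first failure, skip tasks not yet started."""
    semaphore = asyncio.Semaphore(jobs)
    stop = asyncio.Event()

    async def one(name: str) -> tuple[str, str]:
        async with semaphore:
            if stop.is_set():
                # Still queued when the failure happened: short-circuit.
                return name, "skipped"
            ok = await work(name)  # in-flight work always runs to completion
            if not ok:
                stop.set()
            return name, "ok" if ok else "failed"

    tasks = [asyncio.ensure_future(one(name)) for name in names]
    results: dict[str, str] = {}
    for next_done in asyncio.as_completed(tasks):
        name, status = await next_done
        results[name] = status
    return results
```

With two jobs and a first repo that fails quickly, the second repo (already running) still reports its own result, while the remaining queued repos come back as skipped.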

Caveats (documented in code comments)

  • Rate limits: GitHub starts to throttle at around 60 unauthenticated requests per IP per hour. The default cap of 8 stays polite; users can lower it via VCSPULL_JOBS=2 for big bursts.
  • Worktree sync runs sequentially in the as-completed loop after the main sync completes (the main sync was the parallel hot path).
  • Asyncio cancellation can't actually stop a running daemon thread, but libvcs's internal subprocess timeout (added in this branch's prior commits) bounds the work; threads die at process exit.
  • asyncio.to_thread deliberately not used -- its default ThreadPoolExecutor has the atexit-join footgun we already worked around in _sync_repo_with_watchdog.
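For reference, the default cap and the VCSPULL_JOBS override could resolve along these lines. This is a sketch under assumptions: the precedence shown (explicit flag, then environment variable, then default) and the fallback on invalid values are guesses, and `default_jobs` / `resolve_jobs` are illustrative names rather than the `_resolve_jobs` helper in this branch.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def default_jobs(cpu_count: int | None = None) -> int:
    """Return min(8, CPU*2), never less than 1."""
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, min(8, cpus * 2))


def resolve_jobs(
    cli_value: int | None, environ: Mapping[str, str] | None = None
) -> int:
    """Pick the worker count. Assumed precedence: flag > env var > default."""
    env = os.environ if environ is None else environ
    if cli_value is not None:
        return max(1, cli_value)
    raw = env.get("VCSPULL_JOBS")
    if raw is not None:
        try:
            return max(1, int(raw))
        except ValueError:
            pass  # assumption: an unparsable value falls through to the default
    return default_jobs()
```

Under these assumptions, `resolve_jobs(None, {"VCSPULL_JOBS": "2"})` yields 2, and a 16-core machine with no overrides gets the cap of 8.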

…(8, CPU*2))

why: vcspull sync is sequential today -- a 50-repo workspace of
already-up-to-date repos still pays N x ~0.7s of subprocess + network
overhead. The plan-build phase already runs concurrent status checks
under DEFAULT_PLAN_CONCURRENCY; generalise the execution phase to match
so the batch's wallclock scales with the slowest few repos rather than
the sum of all of them. ~5-10x wallclock speedup on real workspaces.

what:
- New --jobs N / -j N CLI flag + VCSPULL_JOBS env. Default
  min(8, CPU*2) -- same heuristic as DEFAULT_PLAN_CONCURRENCY but
  capped at 8 to stay polite to per-IP rate limits. Pass --jobs 1 to
  force the legacy serial UX.
- _run_parallel_sync_loop_async: asyncio.Semaphore(jobs) +
  asyncio.as_completed over per-task daemon threads bridging libvcs's
  synchronous update_repo into the loop. Daemon threads (instead of
  asyncio.to_thread) avoid the default ThreadPoolExecutor atexit-join
  footgun documented in _sync_repo_with_watchdog: a wedged libvcs
  subprocess at interpreter shutdown would otherwise hang the process.
- SyncStatusIndicator multi-slot mode: N spinner rows in a fixed
  active region; release_slot(final_line=...) queues the permanent
  line for the next render tick to scroll into scrollback above the
  active region (cargo / pueue trick: write above so a \n from the
  viewport bottom scrolls one row out). slots=1 keeps today's single-
  spinner UX bit-for-bit.
- The 3-line live-trail panel is disabled when --jobs > 1 (a shared
  deque with N concurrent writers reads as noise); each slot's most-
  recent libvcs progress message becomes the per-row suffix instead.
- JSON / NDJSON events emit in completion order via
  asyncio.as_completed -- matches the streaming model, constant memory.
- --exit-on-error in parallel mode sets a stop event so queued tasks
  short-circuit before starting, but in-flight tasks are allowed to
  complete so their output is captured. Mirrors the serial promise
  that the user sees results for repos that had already started.
- Shared per-result emission (_emit_repo_result, _emit_worktree_results)
  factored out so serial and parallel paths agree on summary keys,
  event shape, and permanent-line formatting.
- tests/conftest.py: autouse fixture pins VCSPULL_JOBS=1 in tests so
  pre-existing order-dependent --exit-on-error fixtures keep their
  serial ordering. New parallel-mode tests override the env var
  inside their own scope.
- Tests cover slot allocation, oversubscription, multi-row render,
  pending-permanents scroll-out, _resolve_jobs precedence, and a
  10-repos x 4-jobs / 20-repos x 3-jobs dispatcher pass that
  asserts the semaphore caps in-flight count.
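
The scroll-out trick from the SyncStatusIndicator item in the commit message can be sketched as a pure function that builds one render tick. It relies only on standard ANSI escapes (`CSI n A` moves the cursor up, `CSI J` erases to the end of the screen). `render_frame` is a hypothetical helper, and it assumes no row is wider than the terminal, so every row occupies exactly one line.

```python
CSI = "\x1b["


def render_frame(
    permanents: list[str], slot_rows: list[str], previous_rows: int
) -> tuple[str, int]:
    """Build the bytes for one tick; return them with the new active-row count.

    ``previous_rows`` is how many slot rows the last tick drew. The cursor is
    assumed to rest on the line just below them.
    """
    parts: list[str] = []
    if previous_rows:
        parts.append(f"{CSI}{previous_rows}A")  # jump to top of the active region
    parts.append("\r" + f"{CSI}J")  # wipe the old active region
    for line in permanents:
        # Written where the slots used to be; later newlines push these lines
        # up and, at the viewport bottom, out into scrollback.
        parts.append(line + "\n")
    for row in slot_rows:
        parts.append(row + "\n")
    return "".join(parts), len(slot_rows)
```

A caller would write the returned string to the terminal each tick and pass the returned row count back in as `previous_rows` on the next one.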
Base automatically changed from fix-sync-hang-on-credential-prompts to master April 25, 2026 22:38