Cap jixia concurrency to avoid OOM in load step by Gabrielebattimelli · Pull Request #14 · Kernel-Science/physlibsearch

Gabrielebattimelli · 2026-06-08T11:10:00Z

Why

With the toolchain fix (#13), jixia indexes correctly, but the "Load jixia data into PostgreSQL" step was cancelled twice, both times at exactly 85 modules (~12 min in), while processing the most Mathlib-heavy modules (Relativity/Tensors, PauliMatrices).

Signature: runner lost, post-steps skipped, no error logged, 0 jixia subprocess failures. That's a silent OOM kill: jixia_py defaults to CPUs + 4 parallel workers (8 on the runner), each loading ~2-3 GB of Mathlib, for roughly 16-24 GB at peak on a 16 GB runner.
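
The memory arithmetic behind that diagnosis can be sketched as below. This is a back-of-the-envelope estimate only, using the figures quoted above (8 workers, ~2-3 GB each, 16 GB runner). The helper name `peak_memory_gb` is hypothetical, and the estimate ignores memory used by the OS, PostgreSQL, and the Python process itself.

```python
def peak_memory_gb(workers: int, per_worker_gb: float) -> float:
    """Rough peak memory if every worker holds its Mathlib load at once."""
    return workers * per_worker_gb


RUNNER_GB = 16  # runner memory quoted in the PR description

# Compare the default (8 workers on the runner) with the proposed cap of 2.
for workers in (8, 2):
    low = peak_memory_gb(workers, 2.0)
    high = peak_memory_gb(workers, 3.0)
    verdict = "fits" if high < RUNNER_GB else "at or over the limit"
    print(f"{workers} workers: {low:.0f}-{high:.0f} GB ({verdict})")
```

Under these assumptions the default lands at 16-24 GB, while a cap of 2 lands at 4-6 GB.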

What

Make jixia worker count configurable via JIXIA_MAX_WORKERS.
Set JIXIA_MAX_WORKERS: '2' in the workflow so peak memory stays well under 16 GB.

Trade-off: the load step runs longer (fewer parallel workers) but stays within the 6-hour job limit. Local runs are unaffected (defaults to full parallelism when the env var is unset).
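
To illustrate the trade-off, here is a minimal sketch of how a worker cap bounds the number of tasks in flight. It uses Python's standard `ThreadPoolExecutor` as a stand-in; how jixia_py schedules its workers internally is not shown in this PR, so treat the pool, the helper name `observed_peak_concurrency`, and the dummy task as assumptions.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


def observed_peak_concurrency(max_workers: Optional[int], n_tasks: int = 12) -> int:
    """Run n_tasks dummy jobs and report the most that ran at the same time."""
    lock = threading.Lock()
    active = 0
    peak = 0

    def task(_: int) -> None:
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.05)  # stand-in for one module's processing time
        with lock:
            active -= 1

    # max_workers=None lets the executor pick its own default pool size.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(task, range(n_tasks)))
    return peak


if __name__ == "__main__":
    print("capped at 2:", observed_peak_concurrency(2))
    print("uncapped:", observed_peak_concurrency(None))
```

With the cap, no more than two tasks are ever active, so peak memory scales with 2 rather than with the CPU count; the cost is that the same tasks take more wall-clock time.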

The 'Load jixia data into PostgreSQL' step was OOM-killed deterministically ~85 modules in (runner lost, post-steps skipped, no error logged), while processing the most Mathlib-heavy modules. jixia_py defaults to CPUs+4 parallel workers; each loads ~2-3 GB of Mathlib, exceeding the 16 GB runner. Make the worker count configurable via JIXIA_MAX_WORKERS and set it to 2 in the weekly index workflow.

Copilot

Pull request overview

This PR aims to prevent CI out-of-memory kills during the “Load jixia data into PostgreSQL” step by making jixia’s parallelism configurable and capping it in the weekly indexing workflow.

Changes:

Read JIXIA_MAX_WORKERS in database/jixia_db.py and pass it to the jixia batch runner.
Set JIXIA_MAX_WORKERS: '2' in the weekly GitHub Actions workflow environment to reduce peak memory usage.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
`database/jixia_db.py`	Adds env-driven worker cap and forwards it into the jixia batch execution call.
`.github/workflows/weekly-index.yml`	Caps jixia worker concurrency in CI to avoid runner OOM during the load step.

+        max_workers_env = os.environ.get("JIXIA_MAX_WORKERS")
+        max_workers = int(max_workers_env) if max_workers_env else None


            results = project.batch_run_jixia(
                base_dir=d,
                prefixes=prefixes,
                plugins=["module", "declaration", "symbol"],
+                max_workers=max_workers,
            )
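
The merged code calls `int()` directly on the variable, so a non-numeric value would raise a bare `ValueError` and a value such as `0` would be forwarded unchanged. One possible hardening is sketched below. The helper name `read_max_workers` is hypothetical and not part of the PR, and rejecting values below 1 is an assumption about what the batch runner accepts.

```python
import os
from typing import Mapping, Optional


def read_max_workers(
    env: Mapping[str, str] = os.environ,
    name: str = "JIXIA_MAX_WORKERS",
) -> Optional[int]:
    """Return the worker cap from the environment, or None when unset or empty."""
    raw = env.get(name, "").strip()
    if not raw:
        # Unset or empty: keep the library's default parallelism.
        return None
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}") from None
    if value < 1:
        raise ValueError(f"{name} must be at least 1, got {value}")
    return value
```

The result could then be passed as `max_workers=read_max_workers()` in place of the two added lines, keeping the unset case identical to the merged behavior.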


Copilot AI review requested due to automatic review settings June 8, 2026 11:10

Copilot started reviewing on behalf of Gabrielebattimelli June 8, 2026 11:10

Gabrielebattimelli merged commit d079277 into main Jun 8, 2026
2 checks passed

Copilot AI reviewed Jun 8, 2026

Gabrielebattimelli merged 1 commit into main from add-agent-skill

2 participants
