feat(scheduler): add cost-based job admission #242

Open
worstell wants to merge 1 commit into main from eworstell/scheduler-job-cost

@worstell
Contributor

Summary

Replace the binary isCloneJob/MaxCloneConcurrency mechanism with a generic cost model. Strategies declare the cost of each job at submit time, and the scheduler tracks total active cost against a configurable budget (max-cost).

Design

The Submit and SubmitPeriodicJob interface methods now accept a cost int parameter. The scheduler admits a job only when activeCost + job.cost <= maxCost (with a safety valve: any job is admitted when nothing else is running, preventing permanent starvation from misconfiguration).
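
As a rough illustration of the admission rule described above, the check could look something like the following. This is a minimal sketch, not the PR's code: the `admit` function and its parameters are hypothetical names, and "nothing else is running" is assumed to mean zero active jobs.

```go
package main

import "fmt"

// admit reports whether a job of cost jobCost may start.
// Hypothetical helper: the name and parameters are illustrative only.
func admit(activeCost, activeJobs, jobCost, maxCost int) bool {
	// Safety valve: with nothing running, admit any job so that a job
	// whose cost exceeds maxCost cannot be starved forever.
	if activeJobs == 0 {
		return true
	}
	// Normal path: the job must fit in the remaining cost budget.
	return activeCost+jobCost <= maxCost
}

func main() {
	fmt.Println(admit(12, 3, 4, 16)) // fits exactly: 12+4 <= 16
	fmt.Println(admit(14, 4, 4, 16)) // over budget: 14+4 > 16
	fmt.Println(admit(0, 0, 20, 16)) // oversized, but scheduler is idle
}
```

The third call shows why the safety valve matters: without it, a job costing more than max-cost would never run.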

This removes isCloneJob — the scheduler no longer has knowledge of git-specific job types.

Cost constants (git strategy)

| Job type | Cost | Rationale                      |
|----------|------|--------------------------------|
| clone    | 4    | Heavy CPU/IO/network, minutes  |
| snapshot | 3    | Heavy CPU (zstd), moderate IO  |
| repack   | 2    | Heavy CPU, no network          |
| fetch    | 1    | Lightweight, seconds           |

Configuration

scheduler {
  max-cost = 16  # default: concurrency * 4
}

With the defaults (concurrency=4, max-cost=16), you can run up to 4 clones (cost 16), or 1 clone + 2 snapshots + 1 fetch (cost 11), etc. The worker count remains the hard parallelism cap, so at most 4 jobs run at once even when cost budget is left over.
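
To make the interaction between the two caps concrete, here is a small sketch that checks a mix of jobs against both the worker count and the cost budget. The `fits` helper and the constant names are hypothetical; only the cost values come from the table above. The safety valve is deliberately not modeled.

```go
package main

import "fmt"

// Cost values from the PR description; the constant names are illustrative.
const (
	costClone    = 4
	costSnapshot = 3
	costRepack   = 2
	costFetch    = 1
)

// fits reports whether a set of jobs could all run at the same time:
// one worker per job, and total cost within maxCost.
func fits(costs []int, workers, maxCost int) bool {
	if len(costs) > workers {
		return false // worker count is the hard parallelism cap
	}
	total := 0
	for _, c := range costs {
		total += c
	}
	return total <= maxCost
}

func main() {
	// Defaults: concurrency=4, max-cost=16.
	fmt.Println(fits([]int{costClone, costClone, costClone, costClone}, 4, 16))
	fmt.Println(fits([]int{costClone, costSnapshot, costSnapshot, costFetch}, 4, 16))
	// Five fetches cost only 5, but exceed the 4 workers.
	fmt.Println(fits([]int{costFetch, costFetch, costFetch, costFetch, costFetch}, 4, 16))
	// With 8 workers, five clones (cost 20) exceed the budget instead.
	fmt.Println(fits([]int{costClone, costClone, costClone, costClone, costClone}, 8, 16))
}
```

The first two mixes fit; the third is limited by workers and the fourth by cost.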

Breaking changes

  • max-clone-concurrency config replaced by max-cost
  • Scheduler.Submit and SubmitPeriodicJob signatures changed (added cost int)

MaxCloneConcurrency int `hcl:"max-clone-concurrency" help:"Maximum number of concurrent clone jobs. Remaining worker slots are reserved for fetch/repack/snapshot jobs. 0 means no limit." default:"0"`
SchedulerDB string `hcl:"scheduler-db" help:"Path to the scheduler state database." default:"${CACHEW_STATE}/scheduler.db"`
Concurrency int `hcl:"concurrency" help:"The maximum number of concurrent jobs to run (0 means number of cores)." default:"4"`
MaxCost int `hcl:"max-cost" help:"Maximum total cost of concurrently running jobs. Each job declares its own cost at submission. 0 means Concurrency * 4." default:"0"`
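
For reference, the "0 means ..." defaults in the help strings above might resolve roughly as follows. This is a sketch under assumptions: the `Config` type and `effectiveLimits` function are hypothetical, and it assumes the max-cost default is derived from the already-resolved worker count, which the diff does not confirm.

```go
package main

import (
	"fmt"
	"runtime"
)

// Config mirrors only the two fields relevant here (hypothetical type).
type Config struct {
	Concurrency int
	MaxCost     int
}

// effectiveLimits resolves zero values to their documented defaults.
// Assumption: MaxCost defaults to 4x the resolved worker count.
func effectiveLimits(c Config) (workers, maxCost int) {
	workers = c.Concurrency
	if workers <= 0 {
		workers = runtime.NumCPU() // "0 means number of cores"
	}
	maxCost = c.MaxCost
	if maxCost <= 0 {
		maxCost = workers * 4 // "0 means Concurrency * 4"
	}
	return workers, maxCost
}

func main() {
	w, m := effectiveLimits(Config{Concurrency: 4})
	fmt.Println(w, m) // default max-cost derived from concurrency
	w, m = effectiveLimits(Config{Concurrency: 4, MaxCost: 10})
	fmt.Println(w, m) // explicit max-cost is kept as-is
}
```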
Collaborator

I like this as a general scheduling improvement, but is this going to work as a replacement for the existing MaxCloneConcurrency? I feel like they're solving slightly different problems.

Contributor Author

@worstell Apr 1, 2026

i think it can, but needs to be reworked slightly

i don't think the submitter needs to care about cloning specifically, but more so that heavy jobs don't starve light ones, i.e. that workers are reserved for those light jobs. for example, if snapshot repacks are also heavy, max-clone-concurrency doesn't address that. but the cost model should handle them all uniformly to reserve space for light jobs

Contributor Author

eh nvm i really can't think of a good way to do this without a bunch of extra knobs. MaxCloneConcurrency seems simplest

Collaborator

I think we can come up with something that works. We have a bunch of constraints, and I think we need to think about it holistically. I'll have a ponder.

@worstell force-pushed the eworstell/scheduler-job-cost branch from 7c3bdae to 590636c April 1, 2026 18:37
Add a cost field to scheduler jobs so strategies declare resource weight
at submit time. The scheduler tracks total active cost against a
configurable budget (max-cost), providing a general mechanism to limit
total resource pressure from heavy background work.

Additionally, replace positional Submit parameters with a Job struct
that includes an explicit Clone bool, removing the isCloneJob string
matching. MaxCloneConcurrency is preserved as a direct, independent
admission check alongside cost.

Cost constants defined in the git strategy:
  clone=4, snapshot=3, repack=2, fetch=1

Two independent admission checks in takeNextJob:
  1. Cost budget: activeCost + job.Cost <= maxCost
  2. Clone limit: Clone jobs capped at MaxCloneConcurrency

Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d4a57-1477-707c-bb89-5543fddff0e7
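
The two admission checks listed in the commit message could be sketched as below. This is an illustration under assumptions, not the code in the branch: the `Job` fields beyond `Cost` and `Clone`, the `limits` and `state` types, and the `canAdmit` function are all hypothetical. It assumes a MaxCloneConcurrency of 0 means "no limit", as the existing help text says, and it leaves out the idle-scheduler safety valve from the original description.

```go
package main

import "fmt"

// Job carries the fields named in the commit message (Cost, Clone).
// ID is a hypothetical extra field for readability.
type Job struct {
	ID    string
	Cost  int
	Clone bool
}

// limits and state are illustrative containers, not the PR's types.
type limits struct {
	MaxCost             int
	MaxCloneConcurrency int // 0 means no clone limit
}

type state struct {
	activeCost   int
	activeClones int
}

// canAdmit applies both checks independently; a job must pass each one.
func canAdmit(j Job, s state, l limits) bool {
	// Check 1: cost budget.
	if s.activeCost+j.Cost > l.MaxCost {
		return false
	}
	// Check 2: clone limit, applied only to clone jobs.
	if j.Clone && l.MaxCloneConcurrency > 0 && s.activeClones >= l.MaxCloneConcurrency {
		return false
	}
	return true
}

func main() {
	l := limits{MaxCost: 16, MaxCloneConcurrency: 1}
	s := state{activeCost: 4, activeClones: 1} // one clone already running

	fmt.Println(canAdmit(Job{ID: "clone-b", Cost: 4, Clone: true}, s, l)) // blocked by clone limit
	fmt.Println(canAdmit(Job{ID: "fetch-a", Cost: 1}, s, l))              // admitted
}
```

Keeping the checks separate means the clone cap still reserves workers for light jobs even when plenty of cost budget remains.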
@worstell force-pushed the eworstell/scheduler-job-cost branch from 590636c to dc0fb39 April 1, 2026 18:49
@worstell marked this pull request as ready for review April 1, 2026 18:55
@worstell requested a review from a team as a code owner April 1, 2026 18:55
@worstell requested review from stuartwdouglas and removed request for a team April 1, 2026 18:55
@worstell changed the title from "feat(scheduler): replace clone concurrency limit with cost-based admission" to "feat(scheduler): add cost-based job admission" Apr 1, 2026
@worstell requested a review from alecthomas April 1, 2026 18:55