Commit a5f2a40 (1 parent a017c68)

docs: add WorkQueueOps and BatchOps design patterns (#25178)

3 files changed: 460 additions & 0 deletions

File tree

docs/astro.config.mjs — 4 additions & 0 deletions

```diff
@@ -126,8 +126,10 @@ export default defineConfig({
       '/patterns/siderepoops/': '/gh-aw/patterns/side-repo-ops/',
       '/patterns/specops/': '/gh-aw/patterns/spec-ops/',
       '/patterns/researchplanassignops/': '/gh-aw/patterns/research-plan-assign-ops/',
+      '/patterns/batchops/': '/gh-aw/patterns/batch-ops/',
       '/patterns/taskops/': '/gh-aw/patterns/task-ops/',
       '/patterns/trialops/': '/gh-aw/patterns/trial-ops/',
+      '/patterns/workqueueops/': '/gh-aw/patterns/workqueue-ops/',
     },
     integrations: [
       sitemap(),
@@ -270,6 +272,7 @@ export default defineConfig({
       {
         label: 'Design Patterns',
         items: [
+          { label: 'BatchOps', link: '/patterns/batch-ops/' },
           { label: 'CentralRepoOps', link: '/patterns/central-repo-ops/' },
           { label: 'ChatOps', link: '/patterns/chat-ops/' },
           { label: 'DailyOps', link: '/patterns/daily-ops/' },
@@ -286,6 +289,7 @@ export default defineConfig({
           { label: 'SpecOps', link: '/patterns/spec-ops/' },
           { label: 'TaskOps', link: '/patterns/task-ops/' },
           { label: 'TrialOps', link: '/patterns/trial-ops/' },
+          { label: 'WorkQueueOps', link: '/patterns/workqueue-ops/' },
         ],
       },
       {
```
Lines changed: 268 additions & 0 deletions (new file)

---
title: BatchOps
description: Process large volumes of work in parallel or chunked batches using matrix jobs, rate-limit-aware throttling, and result aggregation
sidebar:
  badge: { text: 'Batch processing', variant: 'caution' }
---

BatchOps is a pattern for processing large volumes of work items efficiently. Instead of iterating sequentially through hundreds of items in a single workflow run, BatchOps splits work into chunks, parallelizes where possible, handles partial failures gracefully, and aggregates results into a consolidated report.

## When to Use BatchOps vs Sequential Processing

| Scenario | Recommendation |
|----------|----------------|
| < 50 items, order matters | Sequential ([WorkQueueOps](/gh-aw/patterns/workqueue-ops/)) |
| 50–500 items, order doesn't matter | BatchOps with chunked processing |
| > 500 items, high parallelism safe | BatchOps with matrix fan-out |
| Items have dependencies on each other | Sequential (WorkQueueOps) |
| Items are fully independent | BatchOps (any strategy) |
| Strict rate limits or quotas | Rate-limit-aware batching |
## Batch Strategy 1: Chunked Processing

Split work into fixed-size pages using `GITHUB_RUN_NUMBER`. Each run processes one page, picking up the next slice on the next scheduled run. Items must have a stable sort key (creation date, issue number) so pagination is deterministic.

```aw wrap
---
on:
  schedule:
    - cron: "0 2 * * 1-5" # Weekdays at 2 AM
  workflow_dispatch:

tools:
  github:
    toolsets: [issues]
  bash:
    - "jq"
    - "date"

safe-outputs:
  add-labels:
    allowed: [stale, needs-triage, archived]
    max: 30
  add-comment:
    max: 30

steps:
  - name: compute-page
    id: compute-page
    run: |
      PAGE_SIZE=25
      # Use run number mod to cycle through pages; reset every 1000 runs
      PAGE=$(( (GITHUB_RUN_NUMBER % 1000) * PAGE_SIZE ))
      echo "page_offset=$PAGE" >> "$GITHUB_OUTPUT"
      echo "page_size=$PAGE_SIZE" >> "$GITHUB_OUTPUT"
---

# Chunked Issue Processor

This run covers offset ${{ steps.compute-page.outputs.page_offset }} with page size ${{ steps.compute-page.outputs.page_size }}.

1. List issues sorted by creation date (oldest first), skipping the first ${{ steps.compute-page.outputs.page_offset }} and taking ${{ steps.compute-page.outputs.page_size }}.
2. For each issue: add `stale` if last updated > 90 days ago with no recent comments; add `needs-triage` if it has no labels; post a stale warning comment if applicable.
3. Summarize: issues labeled, comments posted, any errors.
```
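The page arithmetic in the `compute-page` step can be sketched in plain Python (an illustrative model, not part of gh-aw; `run_number` stands in for `GITHUB_RUN_NUMBER`):

```python
def page_bounds(run_number: int, page_size: int = 25, max_pages: int = 1000):
    """Slice of the sorted item list a given run should cover.

    Mirrors the compute-page step: the run number cycles through
    max_pages pages, so the offset eventually wraps back to zero.
    """
    offset = (run_number % max_pages) * page_size
    return offset, offset + page_size

# Run 3 covers items 75..99; run 1000 wraps back to the first page.
```

Because the offset is a pure function of the run number, any run can be re-executed and will touch the same deterministic slice, provided the sort key is stable.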
## Batch Strategy 2: Fan-Out with Matrix

Use a GitHub Actions matrix to run multiple batch workers in parallel, each responsible for a non-overlapping shard. Use `fail-fast: false` so one shard failure doesn't cancel the others. Each shard runs as a separate job with its own `GITHUB_TOKEN`; note that `GITHUB_TOKEN` rate limits are enforced per repository, so heavily parallel shards still draw on shared quota. The shard list below is hardcoded, so keep the `total_shards` input in sync with the matrix.

```aw wrap
---
on:
  workflow_dispatch:
    inputs:
      total_shards:
        description: "Number of parallel workers"
        default: "4"
        required: false

jobs:
  batch:
    strategy:
      matrix:
        shard: [0, 1, 2, 3]
      fail-fast: false # Continue other shards even if one fails

tools:
  github:
    toolsets: [issues, pull_requests]

safe-outputs:
  add-labels:
    allowed: [reviewed, duplicate, wontfix]
    max: 50
---

# Matrix Batch Worker — Shard ${{ matrix.shard }} of ${{ inputs.total_shards }}

Process only issues where `(issue_number % ${{ inputs.total_shards }}) == ${{ matrix.shard }}` — this ensures no two shards process the same issue.

1. List all open issues (up to 500) and keep only those assigned to this shard.
2. For each issue: check for duplicates (similar title/content); add the label `reviewed`; if a duplicate is found, add `duplicate` and reference the original.
3. Report: issues in this shard, how many labeled, any failures.
```
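The modulo sharding rule partitions issues cleanly: every issue number lands in exactly one shard. A minimal sketch of that assignment (function names are illustrative, not part of gh-aw):

```python
def shard_of(issue_number: int, total_shards: int) -> int:
    """Map an issue number to exactly one shard; shards never overlap."""
    return issue_number % total_shards

def items_for_shard(issue_numbers, shard, total_shards):
    """The subset of issues a given matrix worker should process."""
    return [n for n in issue_numbers if shard_of(n, total_shards) == shard]

issues = list(range(101, 111))  # issue numbers 101..110
shards = [items_for_shard(issues, s, 4) for s in range(4)]
# The four shard lists are disjoint and together cover every issue.
```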
## Batch Strategy 3: Rate-Limit-Aware Batching

Throttle API calls by processing items in small sub-batches with explicit pauses. Slower than unbounded processing but dramatically reduces rate-limit errors. Use [Rate Limiting Controls](/gh-aw/reference/rate-limiting-controls/) for built-in throttling.

```aw wrap
---
on:
  workflow_dispatch:
    inputs:
      batch_size:
        description: "Items per sub-batch"
        default: "10"
      pause_seconds:
        description: "Seconds to pause between sub-batches"
        default: "30"

tools:
  github:
    toolsets: [repos, issues]
  bash:
    - "sleep"
    - "jq"

safe-outputs:
  add-comment:
    max: 100
  add-labels:
    allowed: [labeled-by-bot]
    max: 100
---

# Rate-Limited Batch Processor

Process all open issues in sub-batches of ${{ inputs.batch_size }}, pausing ${{ inputs.pause_seconds }} seconds between batches.

1. Fetch all open issue numbers (paginate if needed).
2. For each sub-batch: read each issue body, determine the correct label, add the label, then pause before the next sub-batch.
3. On HTTP 429: pause 60 seconds and retry once before marking the item as failed.
4. Report: total processed, failed, skipped.
```
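The control flow the prompt describes (sub-batches, pauses, a single retry on 429) can be sketched in plain Python rather than the agent's bash. `RateLimited` stands in for an HTTP 429 response and all names are illustrative:

```python
import time

class RateLimited(Exception):
    """Stand-in for an HTTP 429 response (illustrative)."""

def process_in_batches(items, handle, batch_size=10, pause_seconds=30,
                       sleep=time.sleep):
    """Process items in sub-batches, pausing between batches.

    On RateLimited, back off 60 seconds and retry once before
    recording the item as failed.
    """
    processed, failed = [], []
    for i in range(0, len(items), batch_size):
        for item in items[i:i + batch_size]:
            try:
                handle(item)
                processed.append(item)
            except RateLimited:
                sleep(60)               # back off on 429...
                try:
                    handle(item)        # ...and retry once
                    processed.append(item)
                except RateLimited:
                    failed.append(item)
        if i + batch_size < len(items):
            sleep(pause_seconds)        # pause between sub-batches
    return processed, failed
```

Injecting `sleep` makes the throttling testable without real delays; the same shape applies whether `handle` labels an issue or posts a comment.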
## Batch Strategy 4: Result Aggregation

Collect results from multiple batch workers or runs and aggregate them into a single summary issue. Use [cache-memory](/gh-aw/reference/cache-memory/) to store intermediate results when runs span multiple days.

```aw wrap
---
on:
  workflow_dispatch:
    inputs:
      report_issue:
        description: "Issue number to aggregate results into"
        required: true

tools:
  cache-memory: true
  github:
    toolsets: [issues, repos]
  bash:
    - "jq"

safe-outputs:
  add-comment:
    max: 1
  update-issue:
    body: true

steps:
  - name: collect-results
    run: |
      # Aggregate results from all result files written by previous batch runs
      RESULTS_DIR="/tmp/gh-aw/cache-memory/batch-results"
      if [ -d "$RESULTS_DIR" ]; then
        jq -s '
          {
            total_processed: (map(.processed) | add // 0),
            total_failed: (map(.failed) | add // 0),
            total_skipped: (map(.skipped) | add // 0),
            runs: length,
            errors: (map(.errors // []) | add // [])
          }
        ' "$RESULTS_DIR"/*.json > /tmp/gh-aw/cache-memory/aggregate.json
        cat /tmp/gh-aw/cache-memory/aggregate.json
      else
        echo '{"total_processed":0,"total_failed":0,"total_skipped":0,"runs":0,"errors":[]}' \
          > /tmp/gh-aw/cache-memory/aggregate.json
      fi
---

# Batch Result Aggregator

Aggregate results from previous batch runs stored in `/tmp/gh-aw/cache-memory/batch-results/` into issue #${{ inputs.report_issue }}.

1. Read `/tmp/gh-aw/cache-memory/aggregate.json` for totals and each individual result file for per-run breakdowns.
2. Update issue #${{ inputs.report_issue }} body with a Markdown table: a summary row (processed/failed/skipped) plus a per-run breakdown. List any errors requiring manual intervention.
3. Add a comment: "Batch complete ✅" if no failures, or "Batch complete with failures ⚠️" with a list of failed items.
4. For each failed item, create a sub-issue so it can be retried.
```
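The jq reduction above assumes each worker drops a small JSON result file with `processed`, `failed`, `skipped`, and optional `errors` fields. A Python equivalent of the same reduction (the field names are inferred from the jq, not a gh-aw schema):

```python
def aggregate(results):
    """Fold per-run result dicts into one summary, mirroring the jq step."""
    return {
        "total_processed": sum(r.get("processed", 0) for r in results),
        "total_failed": sum(r.get("failed", 0) for r in results),
        "total_skipped": sum(r.get("skipped", 0) for r in results),
        "runs": len(results),
        # flatten every run's error list into one list
        "errors": [e for r in results for e in r.get("errors", [])],
    }

runs = [
    {"processed": 20, "failed": 1, "skipped": 2, "errors": ["issue #12: 502"]},
    {"processed": 25, "failed": 0, "skipped": 0},
]
summary = aggregate(runs)
```

Missing fields default to zero or empty, matching the jq `// 0` and `// []` fallbacks, so a worker that crashed mid-run and wrote a partial file still aggregates cleanly.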
## Error Handling and Partial Failures

Batch workflows must be resilient to individual item failures.

**Retry pattern**: When using cache-memory queues, track a `retry_count` per failed item. Retry items where `retry_count < 3`; after three failures, move them to `permanently_failed` for human review. Increment the count and save the queue after each attempt.

**Failure isolation**:

- Use `fail-fast: false` in matrix jobs so one shard failure doesn't cancel others
- Write per-item results before moving to the next item
- Store errors with enough context to diagnose and retry
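The retry pattern above can be sketched as a small triage step over the queue (the item shape with `id` and `retry_count` is illustrative, not a gh-aw schema):

```python
def triage_failures(queue, max_retries=3):
    """Partition failed items: requeue those under the limit, park the rest.

    Bumps retry_count on items queued for another attempt, matching the
    "increment the count and save the queue" rule above.
    """
    to_retry, permanently_failed = [], []
    for item in queue:
        count = item.get("retry_count", 0)
        if count < max_retries:
            item["retry_count"] = count + 1
            to_retry.append(item)
        else:
            permanently_failed.append(item)
    return to_retry, permanently_failed
```

After triage, the updated queue (retriable items with bumped counts plus the `permanently_failed` list) would be written back to cache-memory so the next run picks up where this one left off.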
## Real-World Example: Updating Labels Across 100+ Issues

This example processes a label migration (rename `bug` to `type:bug`) across all open and closed issues.

```aw wrap
---
on:
  workflow_dispatch:
    inputs:
      dry_run:
        description: "Preview changes without applying them"
        default: "true"

tools:
  github:
    toolsets: [issues]
  bash:
    - "jq"

safe-outputs:
  add-labels:
    allowed: [type:bug]
    max: 200
  remove-labels:
    allowed: [bug]
    max: 200
  add-comment:
    max: 1

concurrency:
  group: label-migration
  cancel-in-progress: false
---

# Label Migration: `bug` → `type:bug`

Migrate all issues with the label `bug` to use `type:bug`. List all issues (open and closed) with the label `bug`, paginating to retrieve all of them.

- If `${{ inputs.dry_run }}` is `true`: report how many issues would be updated and add a preview comment. Make no changes.
- If `${{ inputs.dry_run }}` is `false`: for each issue, add `type:bug` then remove `bug`. Process in sub-batches of 20 with 15-second pauses. Track successes and failures.

Add a final comment with totals and a search link to verify no `bug` labels remain.
```
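The dry-run branch and the apply branch can share one plan-building step; a sketch of that idea (the `plan_migration` helper and item shapes are hypothetical, not gh-aw APIs):

```python
def plan_migration(issues, old="bug", new="type:bug"):
    """Build a per-issue action plan; a dry run reports the plan unapplied."""
    plan = []
    for issue in issues:
        labels = set(issue["labels"])
        if old in labels:
            plan.append({
                "number": issue["number"],
                "add": [] if new in labels else [new],  # idempotent on re-runs
                "remove": [old],
            })
    return plan
```

Because the plan is computed before any mutation, a dry run and a real run report the same counts, and partially migrated issues (already carrying `type:bug`) only get the `remove` half.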
## Related Pages

- [WorkQueueOps](/gh-aw/patterns/workqueue-ops/) — Sequential queue processing with issue checklists, sub-issues, cache-memory, and Discussions
- [TaskOps](/gh-aw/patterns/task-ops/) — Research → Plan → Assign for developer-supervised work
- [Cache Memory](/gh-aw/reference/cache-memory/) — Persistent state storage across workflow runs
- [Repo Memory](/gh-aw/reference/repo-memory/) — Git-committed persistent state
- [Rate Limiting Controls](/gh-aw/reference/rate-limiting-controls/) — Built-in throttling for API-heavy workflows
- [Concurrency](/gh-aw/reference/concurrency/) — Prevent overlapping batch runs
