Commit de6a802

PR5: -recursive/-depth with restructured queue lifecycle
- -recursive and -depth <n> flags for recursive enumeration of discovered subdomains. scan.Run is restructured around a dispatcher that tracks outstanding work and closes the queue only when it drains to zero, so resolved subdomains can safely enqueue depth-capped children. A centralized visited set provides loop and duplicate protection; the progress total expands as new work is discovered.
- Finalize CHANGELOG: [Unreleased] -> [0.6.0] - 2026-06-03.
- ARCHITECTURE updated to its final 0.6.0 shape (dispatcher queue lifecycle, recursive enumeration, rate limiter, record-aware output).

Co-authored-by: Cursor <cursoragent@cursor.com>
1 parent d6e41a8 commit de6a802

7 files changed

Lines changed: 254 additions & 50 deletions

CHANGELOG.md

Lines changed: 5 additions & 1 deletion
@@ -5,13 +5,17 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-## [Unreleased]
+## [0.6.0] - 2026-06-03
 
 ### Added
 - Resolved records are now captured during scans. `internal/dns` exposes `Resolve` and a `Record{Type, Value}` type; `scan.Event` carries `Records` for each resolved subdomain (A/AAAA today, extensible to CNAME and more).
 - `-format text|json|csv` flag (default `text`, byte-for-byte identical to prior output). JSON emits a buffered array of `{"subdomain", "records"}` objects; CSV streams `subdomain,type,value` rows with a header. The `-o` output file honors the selected format. Output formats are CLI-only for now (TUI-pending).
 - `-rate <qps>` flag (default 0 = unlimited) caps total DNS queries per second across the worker pool via a shared stdlib ticker gate inside `scan.Run`. The limiter respects context cancellation so `Ctrl+C` stays responsive.
 - `-type A,AAAA,CNAME` flag (default `A,AAAA`, preserving prior behavior) performs per-type DNS lookups and filters results to the requested types. The resolved record type is carried in the existing `Record` shape, so the JSON/CSV schema is unchanged.
+- `-recursive` and `-depth <n>` flags for recursive enumeration of discovered subdomains. `scan.Run` was restructured around a dispatcher that tracks outstanding work and closes the queue only when it drains to zero, so resolved subdomains can safely enqueue depth-capped children (the previous close-after-feed shape would have panicked on a send to a closed channel). A centralized visited set provides loop and duplicate protection, and the progress total expands as new work is discovered.
+
+### Changed
+- Internal: the scan engine's worker queue lifecycle moved from a feed-then-close channel to a dispatcher-owned queue with a pending-work counter.
 
 ## [0.5.1] - 2026-06-03
 
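
The changelog entry above mentions that the previous close-after-feed shape would have panicked on a send to a closed channel. As a minimal, standalone illustration of that Go runtime behavior (not code from this repository; `sendAfterClose` and the domain string are made up for the example):

```go
package main

import "fmt"

// sendAfterClose reports whether a send on an already-closed channel panics.
// It is a hypothetical helper written only for this illustration.
func sendAfterClose() (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	queue := make(chan string, 1)
	close(queue) // mimics closing the queue right after the initial feed
	// A worker that discovers a child later would try to send here,
	// and the Go runtime panics with "send on closed channel".
	queue <- "dev.api.example.com"
	return false
}

func main() {
	fmt.Println("send after close panicked:", sendAfterClose())
}
```

This is why a queue that can receive late work needs a single owner that decides when closing is safe, which is the role the dispatcher takes in this commit.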

README.md

Lines changed: 3 additions & 0 deletions
@@ -80,6 +80,7 @@ Or launch the interactive terminal UI with no flags:
 | Output Formats | Emit results as `text`, `json` (array of subdomain plus typed records), or `csv` via `-format` |
 | Rate Limiting | Cap total DNS queries per second across the worker pool with `-rate` (context-aware) |
 | Record Types | Look up and filter by `A`, `AAAA`, or `CNAME` records with `-type` |
+| Recursive Enumeration | Enumerate subdomains of discovered subdomains with `-recursive` and a `-depth` cap, with loop and duplicate protection |
 | Interactive TUI | Form-based config and live-scrolling results via `-tui`; session values persisted |
 
 <br>
@@ -204,6 +205,8 @@ make help # list all targets
 | `-format <fmt>` | `text` | Output format: `text`, `json`, or `csv` |
 | `-rate <qps>` | `0` | Max DNS queries per second across all workers (0 = unlimited) |
 | `-type <list>` | `A,AAAA` | Comma-separated record types to look up: `A`, `AAAA`, `CNAME` |
+| `-recursive` | `false` | Recursively enumerate subdomains of discovered subdomains |
+| `-depth <n>` | `1` | Max recursion depth when `-recursive` is set (1 = no recursion) |
 | `-v` | `false` | Verbose output: IPs, timings, per-query detail (stderr) |
 | `-progress` | `true` | Live progress line on stderr |
 | `-simulate` | `false` | Simulation mode: no real DNS queries |
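
The two new flag rows can be sketched with the standard `flag` package. The project's actual `main.go` is not part of this diff, so `recursionOpts` and `parseRecursionFlags` are hypothetical names; only the flag names, defaults, and the "values below 1 mean depth 1" clamp (mirroring `maxDepth` in `runner.go`) come from the commit.

```go
package main

import (
	"flag"
	"fmt"
)

// recursionOpts is a hypothetical holder for the two flags added in this commit.
type recursionOpts struct {
	Recursive bool
	Depth     int
}

// parseRecursionFlags parses -recursive and -depth with the defaults from the
// README table. Depth values below 1 are clamped to 1, as scan.Run does.
func parseRecursionFlags(args []string) (recursionOpts, error) {
	var opts recursionOpts
	fs := flag.NewFlagSet("subenum", flag.ContinueOnError)
	fs.BoolVar(&opts.Recursive, "recursive", false, "recursively enumerate subdomains of discovered subdomains")
	fs.IntVar(&opts.Depth, "depth", 1, "max recursion depth when -recursive is set (1 = no recursion)")
	if err := fs.Parse(args); err != nil {
		return opts, err
	}
	if opts.Depth < 1 {
		opts.Depth = 1
	}
	return opts, nil
}

func main() {
	opts, err := parseRecursionFlags([]string{"-recursive", "-depth", "2", "example.com"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("recursive=%v depth=%d\n", opts.Recursive, opts.Depth)
}
```

Note that, per the `cfg.Recursive && j.depth < maxDepth` check in `processJob`, a `-depth` value has no effect unless `-recursive` is also set.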

docs/ARCHITECTURE.md

Lines changed: 6 additions & 5 deletions
@@ -87,20 +87,21 @@ internal/tui/config.go - Session persistence (load/save ~/.config/sube
 * **Implementation**: The worker pool logic lives in `internal/scan/runner.go` as `scan.Run(ctx, cfg, events)`. Both the CLI (`run()` in `main.go`) and the TUI (`internal/tui`) call this function.
 * **`scan.Config`**: A struct carrying all scan parameters (domain, entries slice, concurrency, timeout, DNS server, simulate flag, etc.).
 * **`scan.Event` / `scan.EventKind`**: Typed events emitted on a `chan<- scan.Event`: `EventResult`, `EventProgress`, `EventWildcard`, `EventError`, `EventDone`.
-* **`subdomains := make(chan string)`**: An internal channel acts as a work queue. Entries from the pre-loaded wordlist slice are fed into it.
+* **Dispatcher and work queue**: A dispatcher goroutine owns an internal `jobs` channel, the queue of pending `job{domain, depth}` items, a visited set, and a pending-work counter. It seeds the queue from the wordlist slice and feeds workers. Workers submit newly discovered children back to the dispatcher over an `enqueue` channel and signal each finished job over a `completed` channel. The dispatcher closes `jobs` only when the pending counter reaches zero (or the context is cancelled). This lifecycle lets resolved subdomains enqueue children safely (recursive mode) without risking a send on a closed channel.
 * **`var wg sync.WaitGroup`**: A `sync.WaitGroup` waits for all worker goroutines to finish.
-* **Worker Goroutines Loop**: `cfg.Concurrency` goroutines are launched. Each reads prefixes from the channel, constructs the full domain, and calls `dns.ResolveDomainWithRetry()` (or `dns.SimulateResolution()` in simulate mode).
-* **Progress ticker**: A separate goroutine fires every second and emits `EventProgress` events so callers can update their display.
+* **Worker Goroutines Loop**: `cfg.Concurrency` goroutines are launched. Each reads a job from `jobs`, constructs nothing further (the job already holds the full domain), and calls `dns.ResolveDomainWithRetry()` (or `dns.SimulateResolve()` in simulate mode).
+* **Recursive enumeration** (optional): when `cfg.Recursive` is set and a job at depth `d < cfg.Depth` resolves, the worker enqueues one child per wordlist entry at depth `d+1`. The dispatcher's visited set deduplicates domains (loop and duplicate protection), and the progress total grows as new work is admitted.
+* **Progress ticker**: A separate goroutine fires every second and emits `EventProgress` events so callers can update their display. The total is read atomically since recursion can expand it mid-scan.
 * **Rate limiter** (optional): when `cfg.Rate > 0`, a shared `time.Ticker` gate paces total DNS queries per second across the whole pool. Each worker waits on the gate before issuing a query, selecting on `ctx.Done()` so cancellation stays responsive. `0` means unlimited.
-* **Closing the Channel**: After all entries are sent, the channel is closed, signalling workers to exit. `wg.Wait()` blocks until all workers are done, then `EventDone` is emitted.
+* **Completion**: `wg.Wait()` blocks until all workers exit (after the dispatcher closes `jobs`), then the progress ticker is stopped and `EventDone` is emitted.
 * **Interactions**: `scan.Run` is the single entry point for scanning used by both the CLI output pipeline and the Bubble Tea TUI. It decouples the scan engine from any specific display layer.
 
 ### 2.5. Output Formatting (`internal/output`)
 
 * **Purpose**: Thread-safe output that keeps stdout pipe-clean. Resolved subdomains go to stdout; everything else (progress, verbose diagnostics, errors) goes to stderr.
 * **Implementation**:
     * `output.Writer` struct with mutex-protected methods:
-        * `Result(domain, records)` - in `text` format prints `Found: <domain>` to stdout (and the output file if configured); in `json` format buffers `{"subdomain", "records"}` objects and writes a single array at completion; in `csv` format streams `subdomain,type,value` rows with a header. The format is selected with `-format text|json|csv` (default `text`, which is byte-for-byte identical to prior behavior). Output formats are CLI-only for now (TUI-pending).
+        * `Result(domain, records)` - in `text` format prints `Found: <domain>` to stdout (and the output file if configured); in `json` format buffers `{"subdomain", "records"}` objects and writes a single array at completion; in `csv` format streams `subdomain,type,value` rows with a header. The format is selected with `-format text|json|csv` (default `text`, which is byte-for-byte identical to prior behavior). The JSON array is buffered because it is a single document and does not stream; JSONL would be the streaming-friendly alternative if needed. Output formats are CLI-only for now (TUI-pending).
         * `Progress(pct, processed, total, found)` - writes a carriage-return progress line to stderr.
         * `Info(format, args...)` - writes an informational line to stderr.
         * `Error(format, args...)` - writes an error line to stderr.

examples/advanced_usage.md

Lines changed: 14 additions & 0 deletions
@@ -151,6 +151,20 @@ Simulation mode with verbose output shows fake IPs and timings:
151151
./subenum -simulate -hit-rate 25 -v -w examples/sample_wordlist.txt example.com
152152
```
153153

154+
## Recursive Enumeration
155+
156+
Use `-recursive` with a `-depth` cap to enumerate subdomains of discovered subdomains. Each resolved subdomain is re-scanned with the same wordlist, up to the depth limit. A visited set provides loop and duplicate protection, and the progress total grows as new work is discovered:
157+
158+
```bash
159+
./subenum -w wordlist.txt -recursive -depth 2 example.com
160+
```
161+
162+
Combine with simulation mode to see how the work tree expands without any network I/O:
163+
164+
```bash
165+
./subenum -simulate -hit-rate 100 -recursive -depth 3 -w examples/sample_wordlist.txt example.com
166+
```
167+
154168
## Record Types
155169

156170
By default `subenum` looks up `A` and `AAAA` records. Use `-type` to choose which record types to query and treat as a hit. A subdomain counts as found if any requested type resolves:
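
For the Recursive Enumeration section added in this hunk, it may help to estimate how far the progress total can grow. Under the assumptions that the wordlist has N unique entries and that every candidate resolves, depth 1 contributes N candidates, depth 2 contributes N² more, and so on, giving an upper bound of N + N² + ... + N^depth. The helper below is a hypothetical calculator for that bound, not part of `subenum`:

```go
package main

import "fmt"

// maxCandidates returns the upper bound N + N^2 + ... + N^depth on the number
// of domains tested, assuming every candidate resolves and the wordlist has
// no duplicate entries. Depth values below 1 are treated as 1.
func maxCandidates(words, depth int) int {
	if depth < 1 {
		depth = 1
	}
	total, level := 0, 1
	for d := 1; d <= depth; d++ {
		level *= words // candidates at depth d
		total += level
	}
	return total
}

func main() {
	fmt.Println(maxCandidates(100, 1)) // 100
	fmt.Println(maxCandidates(100, 2)) // 10100
	fmt.Println(maxCandidates(100, 3)) // 1010100
}
```

Real scans usually stay far below this bound, since children are only enqueued for subdomains that actually resolve, but it is a useful reason to keep `-depth` small with large wordlists.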

internal/scan/runner.go

Lines changed: 128 additions & 44 deletions
@@ -21,8 +21,17 @@ type Config struct {
 	Attempts int
 	Force bool
 	Verbose bool
-	Rate int
-	Types []string
+	Rate int // max DNS queries per second across all workers (0 = unlimited)
+	Types []string // record types to look up (A, AAAA, CNAME); empty = A,AAAA
+	Recursive bool // enumerate subdomains of discovered subdomains
+	Depth int // max recursion depth (1 = no recursion)
+}
+
+// job is a single unit of work: a fully qualified domain to test and its depth
+// in the recursion tree (initial entries are depth 1).
+type job struct {
+	domain string
+	depth int
 }
 
 // EventKind categorises a scan event.
@@ -40,7 +49,7 @@ const (
 type Event struct {
 	Kind EventKind
 	Domain string // EventResult: the resolved subdomain
-	Records []dns.Record // EventResult: the resolved records
+	Records []dns.Record // EventResult: the resolved records (A/AAAA/CNAME)
 	Processed int64 // EventProgress
 	Total int64 // EventProgress
 	Found int64 // EventProgress / EventDone
@@ -53,8 +62,13 @@ type Event struct {
 func Run(ctx context.Context, cfg Config, events chan<- Event) {
 	defer close(events)
 
-	total := int64(len(cfg.Entries))
-	var processed, found int64
+	var total, processed, found int64
+	atomic.StoreInt64(&total, int64(len(cfg.Entries)))
+
+	maxDepth := cfg.Depth
+	if maxDepth < 1 {
+		maxDepth = 1
+	}
 
 	// Wildcard detection (skip in simulation mode).
 	if !cfg.Simulate {
@@ -73,9 +87,17 @@ func Run(ctx context.Context, cfg Config, events chan<- Event) {
 		}
 	}
 
-	subdomains := make(chan string)
 	var wg sync.WaitGroup
 
+	// Work queue channels. The dispatcher owns the lifecycle: it tracks
+	// outstanding work and closes jobs only once every enqueued job has
+	// completed. This lets workers safely enqueue depth-capped children after
+	// the initial feed, which the old "close right after feeding" shape could
+	// not do without risking a send on a closed channel.
+	jobs := make(chan job)
+	enqueue := make(chan job)
+	completed := make(chan struct{})
+
 	// Optional rate limiter: a shared ticker gate paces total queries per second
 	// across the whole worker pool. nil means unlimited.
 	var limiter <-chan time.Time
@@ -104,7 +126,7 @@ func Run(ctx context.Context, cfg Config, events chan<- Event) {
 				p := atomic.LoadInt64(&processed)
 				f := atomic.LoadInt64(&found)
 				select {
-				case events <- Event{Kind: EventProgress, Processed: p, Total: total, Found: f}:
+				case events <- Event{Kind: EventProgress, Processed: p, Total: atomic.LoadInt64(&total), Found: f}:
 				case <-tickerDone:
 					return
 				case <-ctx.Done():
@@ -118,52 +140,72 @@ func Run(ctx context.Context, cfg Config, events chan<- Event) {
 		}
 	}()
 
+	// Dispatcher: owns the queue, the visited set (loop/dup protection), and the
+	// pending-work counter. It closes jobs when pending reaches zero (all work
+	// done) or when the context is cancelled.
+	go func() {
+		visited := make(map[string]bool, len(cfg.Entries))
+		queue := make([]job, 0, len(cfg.Entries))
+		for _, entry := range cfg.Entries {
+			d := entry + "." + cfg.Domain
+			if !visited[d] {
+				visited[d] = true
+				queue = append(queue, job{domain: d, depth: 1})
+			}
+		}
+		pending := len(queue)
+		atomic.StoreInt64(&total, int64(pending))
+		if pending == 0 {
+			close(jobs)
+			return
+		}
+		for {
+			var out chan job
+			var next job
+			if len(queue) > 0 {
+				out = jobs
+				next = queue[0]
+			}
+			select {
+			case <-ctx.Done():
+				close(jobs)
+				return
+			case j := <-enqueue:
+				// Children candidates arrive here; dedup centrally so workers
+				// need no shared lock. Only new domains add to pending/total.
+				if !visited[j.domain] {
+					visited[j.domain] = true
+					queue = append(queue, j)
+					pending++
+					atomic.AddInt64(&total, 1)
+				}
+			case out <- next:
+				queue = queue[1:]
+			case <-completed:
+				pending--
+				if pending == 0 {
+					close(jobs)
+					return
+				}
+			}
+		}
+	}()
+
 	// Worker pool.
 	for i := 0; i < cfg.Concurrency; i++ {
 		wg.Add(1)
 		go func() {
 			defer wg.Done()
-			for prefix := range subdomains {
-				if ctx.Err() != nil {
-					atomic.AddInt64(&processed, 1)
-					continue
-				}
-				if limiter != nil {
-					select {
-					case <-limiter:
-					case <-ctx.Done():
-						atomic.AddInt64(&processed, 1)
-						continue
-					}
-				}
-				fullDomain := prefix + "." + cfg.Domain
-				var resolved bool
-				var records []dns.Record
-				if cfg.Simulate {
-					records, resolved = dns.SimulateResolve(fullDomain, cfg.HitRate, cfg.Verbose, cfg.Types)
-				} else {
-					records, resolved = dns.ResolveDomainWithRetry(ctx, fullDomain, cfg.Timeout, cfg.DNSServer, cfg.Verbose, cfg.Attempts, cfg.Types)
-				}
-				if resolved {
-					atomic.AddInt64(&found, 1)
-					events <- Event{Kind: EventResult, Domain: fullDomain, Records: records}
+			for j := range jobs {
+				processJob(ctx, cfg, j, maxDepth, limiter, events, enqueue, &processed, &found)
+				select {
+				case completed <- struct{}{}:
+				case <-ctx.Done():
 				}
-				atomic.AddInt64(&processed, 1)
 			}
 		}()
 	}
 
-	// Feed entries into the worker pool.
-	for _, entry := range cfg.Entries {
-		select {
-		case <-ctx.Done():
-			goto drain
-		case subdomains <- entry:
-		}
-	}
-
-drain:
-	close(subdomains)
 	wg.Wait()
 	// Stop the ticker goroutine and wait for it to fully exit before emitting
 	// EventDone, so the deferred close(events) can never race an in-flight
@@ -174,7 +216,49 @@ drain:
 	events <- Event{
 		Kind: EventDone,
 		Processed: atomic.LoadInt64(&processed),
-		Total: total,
+		Total: atomic.LoadInt64(&total),
 		Found: atomic.LoadInt64(&found),
 	}
 }
+
+// processJob resolves a single job and, on success, optionally enqueues
+// depth-capped children for recursive enumeration.
+func processJob(ctx context.Context, cfg Config, j job, maxDepth int, limiter <-chan time.Time, events chan<- Event, enqueue chan<- job, processed, found *int64) {
+	defer atomic.AddInt64(processed, 1)
+
+	if ctx.Err() != nil {
+		return
+	}
+	if limiter != nil {
+		select {
+		case <-limiter:
+		case <-ctx.Done():
+			return
+		}
+	}
+
+	var resolved bool
+	var records []dns.Record
+	if cfg.Simulate {
+		records, resolved = dns.SimulateResolve(j.domain, cfg.HitRate, cfg.Verbose, cfg.Types)
+	} else {
+		records, resolved = dns.ResolveDomainWithRetry(ctx, j.domain, cfg.Timeout, cfg.DNSServer, cfg.Verbose, cfg.Attempts, cfg.Types)
+	}
+	if !resolved {
+		return
+	}
+
+	atomic.AddInt64(found, 1)
+	events <- Event{Kind: EventResult, Domain: j.domain, Records: records}
+
+	if cfg.Recursive && j.depth < maxDepth {
+		for _, entry := range cfg.Entries {
+			child := job{domain: entry + "." + j.domain, depth: j.depth + 1}
+			select {
+			case enqueue <- child:
+			case <-ctx.Done():
+				return
+			}
+		}
+	}
+}
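
One detail of the dispatcher loop in `runner.go` that is easy to miss is the `var out chan job` idiom: `out` stays nil while the queue is empty, and a send on a nil channel never proceeds, so that `select` case is effectively switched off. The standalone sketch below shows the same idiom with a hypothetical `offer` helper and an `int` payload; the `default` branch exists only so the example returns instead of blocking.

```go
package main

import "fmt"

// offer tries to hand the head of queue to jobs. When the queue is empty,
// out stays nil, which disables the send case in the select.
func offer(queue []int, jobs chan int) (sent bool) {
	var out chan int // nil: a send on a nil channel can never proceed
	var next int
	if len(queue) > 0 {
		out = jobs
		next = queue[0]
	}
	select {
	case out <- next:
		return true
	default:
		return false
	}
}

func main() {
	jobs := make(chan int, 1) // buffered so a send can succeed without a receiver

	fmt.Println(offer(nil, jobs), len(jobs))      // false 0
	fmt.Println(offer([]int{7}, jobs), len(jobs)) // true 1
	fmt.Println(<-jobs)                           // 7
}
```

In the dispatcher this lets a single `select` wait on `enqueue`, `completed`, and `ctx.Done()` at all times, while only offering work to workers when there is something queued.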
