diff --git a/CHANGELOG.md b/CHANGELOG.md index 69f8400..448c3f2 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,13 +5,17 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). -## [Unreleased] +## [0.6.0] - 2026-06-03 ### Added - Resolved records are now captured during scans. `internal/dns` exposes `Resolve` and a `Record{Type, Value}` type; `scan.Event` carries `Records` for each resolved subdomain (A/AAAA today, extensible to CNAME and more). - `-format text|json|csv` flag (default `text`, byte-for-byte identical to prior output). JSON emits a buffered array of `{"subdomain", "records"}` objects; CSV streams `subdomain,type,value` rows with a header. The `-o` output file honors the selected format. Output formats are CLI-only for now (TUI-pending). - `-rate ` flag (default 0 = unlimited) caps total DNS queries per second across the worker pool via a shared stdlib ticker gate inside `scan.Run`. The limiter respects context cancellation so `Ctrl+C` stays responsive. - `-type A,AAAA,CNAME` flag (default `A,AAAA`, preserving prior behavior) performs per-type DNS lookups and filters results to the requested types. The resolved record type is carried in the existing `Record` shape, so the JSON/CSV schema is unchanged. +- `-recursive` and `-depth ` flags for recursive enumeration of discovered subdomains. `scan.Run` was restructured around a dispatcher that tracks outstanding work and closes the queue only when it drains to zero, so resolved subdomains can safely enqueue depth-capped children (the previous close-after-feed shape would have panicked on a send to a closed channel). A centralized visited set provides loop and duplicate protection, and the progress total expands as new work is discovered. 
+ +### Changed +- Internal: the scan engine's worker queue lifecycle moved from a feed-then-close channel to a dispatcher-owned queue with a pending-work counter. ## [0.5.1] - 2026-06-03 diff --git a/README.md b/README.md index de8369a..53e1777 100644 --- a/README.md +++ b/README.md @@ -80,6 +80,7 @@ Or launch the interactive terminal UI with no flags: | Output Formats | Emit results as `text`, `json` (array of subdomain plus typed records), or `csv` via `-format` | | Rate Limiting | Cap total DNS queries per second across the worker pool with `-rate` (context-aware) | | Record Types | Look up and filter by `A`, `AAAA`, or `CNAME` records with `-type` | +| Recursive Enumeration | Enumerate subdomains of discovered subdomains with `-recursive` and a `-depth` cap, with loop and duplicate protection | | Interactive TUI | Form-based config and live-scrolling results via `-tui`; session values persisted |
@@ -204,6 +205,8 @@ make help # list all targets | `-format ` | `text` | Output format: `text`, `json`, or `csv` | | `-rate ` | `0` | Max DNS queries per second across all workers (0 = unlimited) | | `-type ` | `A,AAAA` | Comma-separated record types to look up: `A`, `AAAA`, `CNAME` | +| `-recursive` | `false` | Recursively enumerate subdomains of discovered subdomains | +| `-depth ` | `1` | Max recursion depth when `-recursive` is set (1 = no recursion) | | `-v` | `false` | Verbose output: IPs, timings, per-query detail (stderr) | | `-progress` | `true` | Live progress line on stderr | | `-simulate` | `false` | Simulation mode: no real DNS queries | diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md index fc7e231..efac608 100644 --- a/docs/ARCHITECTURE.md +++ b/docs/ARCHITECTURE.md @@ -87,12 +87,13 @@ internal/tui/config.go — Session persistence (load/save ~/.config/sube * **Implementation**: The worker pool logic lives in `internal/scan/runner.go` as `scan.Run(ctx, cfg, events)`. Both the CLI (`run()` in `main.go`) and the TUI (`internal/tui`) call this function. * **`scan.Config`**: A struct carrying all scan parameters (domain, entries slice, concurrency, timeout, DNS server, simulate flag, etc.). * **`scan.Event` / `scan.EventKind`**: Typed events emitted on a `chan<- scan.Event` — `EventResult`, `EventProgress`, `EventWildcard`, `EventError`, `EventDone`. - * **`subdomains := make(chan string)`**: An internal channel acts as a work queue. Entries from the pre-loaded wordlist slice are fed into it. + * **Dispatcher and work queue**: A dispatcher goroutine owns an internal `jobs` channel, the queue of pending `job{domain, depth}` items, a visited set, and a pending-work counter. It seeds the queue from the wordlist slice and feeds workers. Workers submit newly discovered children back to the dispatcher over an `enqueue` channel and signal each finished job over a `completed` channel. 
The dispatcher closes `jobs` only when the pending counter reaches zero (or the context is cancelled). This lifecycle lets resolved subdomains enqueue children safely (recursive mode) without risking a send on a closed channel. * **`var wg sync.WaitGroup`**: A `sync.WaitGroup` waits for all worker goroutines to finish. - * **Worker Goroutines Loop**: `cfg.Concurrency` goroutines are launched. Each reads prefixes from the channel, constructs the full domain, and calls `dns.ResolveDomainWithRetry()` (or `dns.SimulateResolution()` in simulate mode). - * **Progress ticker**: A separate goroutine fires every second and emits `EventProgress` events so callers can update their display. + * **Worker Goroutines Loop**: `cfg.Concurrency` goroutines are launched. Each reads a job from `jobs` (the job already holds the fully qualified domain, so no name construction is needed) and calls `dns.ResolveDomainWithRetry()` (or `dns.SimulateResolve()` in simulate mode). + * **Recursive enumeration** (optional): when `cfg.Recursive` is set and a job at depth `d < cfg.Depth` resolves, the worker enqueues one child per wordlist entry at depth `d+1`. The dispatcher's visited set deduplicates domains (loop and duplicate protection), and the progress total grows as new work is admitted. + * **Progress ticker**: A separate goroutine fires every second and emits `EventProgress` events so callers can update their display. The total is read atomically since recursion can expand it mid-scan. * **Rate limiter** (optional): when `cfg.Rate > 0`, a shared `time.Ticker` gate paces total DNS queries per second across the whole pool. Each worker waits on the gate before issuing a query, selecting on `ctx.Done()` so cancellation stays responsive. `0` means unlimited. - * **Closing the Channel**: After all entries are sent, the channel is closed, signalling workers to exit. `wg.Wait()` blocks until all workers are done, then `EventDone` is emitted. 
+ * **Completion**: `wg.Wait()` blocks until all workers exit (after the dispatcher closes `jobs`), then the progress ticker is stopped and `EventDone` is emitted. * **Interactions**: `scan.Run` is the single entry point for scanning used by both the CLI output pipeline and the Bubble Tea TUI. It decouples the scan engine from any specific display layer. ### 2.5. Output Formatting (`internal/output`) @@ -100,7 +101,7 @@ internal/tui/config.go — Session persistence (load/save ~/.config/sube * **Purpose**: Thread-safe output that keeps stdout pipe-clean. Resolved subdomains go to stdout; everything else (progress, verbose diagnostics, errors) goes to stderr. * **Implementation**: * `output.Writer` struct with mutex-protected methods: - * `Result(domain, records)` - in `text` format prints `Found: ` to stdout (and the output file if configured); in `json` format buffers `{"subdomain", "records"}` objects and writes a single array at completion; in `csv` format streams `subdomain,type,value` rows with a header. The format is selected with `-format text|json|csv` (default `text`, which is byte-for-byte identical to prior behavior). Output formats are CLI-only for now (TUI-pending). + * `Result(domain, records)` - in `text` format prints `Found: ` to stdout (and the output file if configured); in `json` format buffers `{"subdomain", "records"}` objects and writes a single array at completion; in `csv` format streams `subdomain,type,value` rows with a header. The format is selected with `-format text|json|csv` (default `text`, which is byte-for-byte identical to prior behavior). The JSON array is buffered because it is a single document and does not stream; JSONL would be the streaming-friendly alternative if needed. Output formats are CLI-only for now (TUI-pending). * `Progress(pct, processed, total, found)` — writes a carriage-return progress line to stderr. * `Info(format, args...)` — writes an informational line to stderr. 
* `Error(format, args...)` — writes an error line to stderr. diff --git a/examples/advanced_usage.md b/examples/advanced_usage.md index d655239..eff4915 100644 --- a/examples/advanced_usage.md +++ b/examples/advanced_usage.md @@ -151,6 +151,20 @@ Simulation mode with verbose output shows fake IPs and timings: ./subenum -simulate -hit-rate 25 -v -w examples/sample_wordlist.txt example.com ``` +## Recursive Enumeration + +Use `-recursive` with a `-depth` cap to enumerate subdomains of discovered subdomains. Each resolved subdomain is re-scanned with the same wordlist, up to the depth limit. A visited set provides loop and duplicate protection, and the progress total grows as new work is discovered: + +```bash +./subenum -w wordlist.txt -recursive -depth 2 example.com +``` + +Combine with simulation mode to see how the work tree expands without any network I/O: + +```bash +./subenum -simulate -hit-rate 100 -recursive -depth 3 -w examples/sample_wordlist.txt example.com +``` + ## Record Types By default `subenum` looks up `A` and `AAAA` records. Use `-type` to choose which record types to query and treat as a hit. A subdomain counts as found if any requested type resolves: diff --git a/internal/scan/runner.go b/internal/scan/runner.go index 05bc67f..fb5f697 100644 --- a/internal/scan/runner.go +++ b/internal/scan/runner.go @@ -21,8 +21,17 @@ type Config struct { Attempts int Force bool Verbose bool - Rate int - Types []string + Rate int // max DNS queries per second across all workers (0 = unlimited) + Types []string // record types to look up (A, AAAA, CNAME); empty = A,AAAA + Recursive bool // enumerate subdomains of discovered subdomains + Depth int // max recursion depth (1 = no recursion) +} + +// job is a single unit of work: a fully qualified domain to test and its depth +// in the recursion tree (initial entries are depth 1). +type job struct { + domain string + depth int } // EventKind categorises a scan event. 
@@ -40,7 +49,7 @@ const ( type Event struct { Kind EventKind Domain string // EventResult: the resolved subdomain - Records []dns.Record // EventResult: the resolved records + Records []dns.Record // EventResult: the resolved records (A/AAAA/CNAME) Processed int64 // EventProgress Total int64 // EventProgress Found int64 // EventProgress / EventDone @@ -53,8 +62,13 @@ type Event struct { func Run(ctx context.Context, cfg Config, events chan<- Event) { defer close(events) - total := int64(len(cfg.Entries)) - var processed, found int64 + var total, processed, found int64 + atomic.StoreInt64(&total, int64(len(cfg.Entries))) + + maxDepth := cfg.Depth + if maxDepth < 1 { + maxDepth = 1 + } // Wildcard detection (skip in simulation mode). if !cfg.Simulate { @@ -73,9 +87,17 @@ func Run(ctx context.Context, cfg Config, events chan<- Event) { } } - subdomains := make(chan string) var wg sync.WaitGroup + // Work queue channels. The dispatcher owns the lifecycle: it tracks + // outstanding work and closes jobs only once every enqueued job has + // completed. This lets workers safely enqueue depth-capped children after + // the initial feed, which the old "close right after feeding" shape could + // not do without risking a send on a closed channel. + jobs := make(chan job) + enqueue := make(chan job) + completed := make(chan struct{}) + // Optional rate limiter: a shared ticker gate paces total queries per second // across the whole worker pool. nil means unlimited. 
var limiter <-chan time.Time @@ -104,7 +126,7 @@ func Run(ctx context.Context, cfg Config, events chan<- Event) { p := atomic.LoadInt64(&processed) f := atomic.LoadInt64(&found) select { - case events <- Event{Kind: EventProgress, Processed: p, Total: total, Found: f}: + case events <- Event{Kind: EventProgress, Processed: p, Total: atomic.LoadInt64(&total), Found: f}: case <-tickerDone: return case <-ctx.Done(): @@ -118,52 +140,72 @@ func Run(ctx context.Context, cfg Config, events chan<- Event) { } }() + // Dispatcher: owns the queue, the visited set (loop/dup protection), and the + // pending-work counter. It closes jobs when pending reaches zero (all work + // done) or when the context is cancelled. + go func() { + visited := make(map[string]bool, len(cfg.Entries)) + queue := make([]job, 0, len(cfg.Entries)) + for _, entry := range cfg.Entries { + d := entry + "." + cfg.Domain + if !visited[d] { + visited[d] = true + queue = append(queue, job{domain: d, depth: 1}) + } + } + pending := len(queue) + atomic.StoreInt64(&total, int64(pending)) + if pending == 0 { + close(jobs) + return + } + for { + var out chan job + var next job + if len(queue) > 0 { + out = jobs + next = queue[0] + } + select { + case <-ctx.Done(): + close(jobs) + return + case j := <-enqueue: + // Children candidates arrive here; dedup centrally so workers + // need no shared lock. Only new domains add to pending/total. + if !visited[j.domain] { + visited[j.domain] = true + queue = append(queue, j) + pending++ + atomic.AddInt64(&total, 1) + } + case out <- next: + queue = queue[1:] + case <-completed: + pending-- + if pending == 0 { + close(jobs) + return + } + } + } + }() + // Worker pool. 
for i := 0; i < cfg.Concurrency; i++ { wg.Add(1) go func() { defer wg.Done() - for prefix := range subdomains { - if ctx.Err() != nil { - atomic.AddInt64(&processed, 1) - continue - } - if limiter != nil { - select { - case <-limiter: - case <-ctx.Done(): - atomic.AddInt64(&processed, 1) - continue - } - } - fullDomain := prefix + "." + cfg.Domain - var resolved bool - var records []dns.Record - if cfg.Simulate { - records, resolved = dns.SimulateResolve(fullDomain, cfg.HitRate, cfg.Verbose, cfg.Types) - } else { - records, resolved = dns.ResolveDomainWithRetry(ctx, fullDomain, cfg.Timeout, cfg.DNSServer, cfg.Verbose, cfg.Attempts, cfg.Types) - } - if resolved { - atomic.AddInt64(&found, 1) - events <- Event{Kind: EventResult, Domain: fullDomain, Records: records} + for j := range jobs { + processJob(ctx, cfg, j, maxDepth, limiter, events, enqueue, &processed, &found) + select { + case completed <- struct{}{}: + case <-ctx.Done(): } - atomic.AddInt64(&processed, 1) } }() } - // Feed entries into the worker pool. - for _, entry := range cfg.Entries { - select { - case <-ctx.Done(): - goto drain - case subdomains <- entry: - } - } - -drain: - close(subdomains) wg.Wait() // Stop the ticker goroutine and wait for it to fully exit before emitting // EventDone, so the deferred close(events) can never race an in-flight @@ -174,7 +216,49 @@ drain: events <- Event{ Kind: EventDone, Processed: atomic.LoadInt64(&processed), - Total: total, + Total: atomic.LoadInt64(&total), Found: atomic.LoadInt64(&found), } } + +// processJob resolves a single job and, on success, optionally enqueues +// depth-capped children for recursive enumeration. 
+func processJob(ctx context.Context, cfg Config, j job, maxDepth int, limiter <-chan time.Time, events chan<- Event, enqueue chan<- job, processed, found *int64) { + defer atomic.AddInt64(processed, 1) + + if ctx.Err() != nil { + return + } + if limiter != nil { + select { + case <-limiter: + case <-ctx.Done(): + return + } + } + + var resolved bool + var records []dns.Record + if cfg.Simulate { + records, resolved = dns.SimulateResolve(j.domain, cfg.HitRate, cfg.Verbose, cfg.Types) + } else { + records, resolved = dns.ResolveDomainWithRetry(ctx, j.domain, cfg.Timeout, cfg.DNSServer, cfg.Verbose, cfg.Attempts, cfg.Types) + } + if !resolved { + return + } + + atomic.AddInt64(found, 1) + events <- Event{Kind: EventResult, Domain: j.domain, Records: records} + + if cfg.Recursive && j.depth < maxDepth { + for _, entry := range cfg.Entries { + child := job{domain: entry + "." + j.domain, depth: j.depth + 1} + select { + case enqueue <- child: + case <-ctx.Done(): + return + } + } + } +} diff --git a/internal/scan/runner_test.go b/internal/scan/runner_test.go index 14ef353..e87b4a5 100644 --- a/internal/scan/runner_test.go +++ b/internal/scan/runner_test.go @@ -66,6 +66,94 @@ func TestRunSimulateConcurrent(t *testing.T) { } } +// TestRunRecursiveEnqueuesChildren exercises the restructured queue lifecycle: +// resolved subdomains enqueue depth-capped children mid-scan. It asserts no +// panic (send on closed channel), clean completion, and that recursion expands +// the total beyond the initial entry count. Run under -race. +func TestRunRecursiveEnqueuesChildren(t *testing.T) { + // hitRate 100 so every job resolves and spawns children, maximizing the + // chance of catching a send-on-closed-channel race. 
+ cfg := Config{ + Domain: "example.com", + Entries: []string{"www", "api", "dev"}, + Concurrency: 8, + Timeout: time.Second, + Simulate: true, + HitRate: 100, + Attempts: 1, + Recursive: true, + Depth: 3, + } + + events := make(chan Event, 64) + go Run(context.Background(), cfg, events) + + var done *Event + results := 0 + for ev := range events { + switch ev.Kind { + case EventResult: + results++ + case EventDone: + e := ev + done = &e + } + } + + if done == nil { + t.Fatal("no EventDone received") + } + // Initial 3 entries, each resolving and spawning 3 children to depth 3: + // 3 + 3*3 + 3*3*3 = 39 unique jobs. + if done.Total <= 3 { + t.Errorf("expected recursion to expand total beyond initial 3, got %d", done.Total) + } + if done.Processed != done.Total { + t.Errorf("Processed %d != Total %d", done.Processed, done.Total) + } + if int64(results) != done.Found { + t.Errorf("result events %d != Found %d", results, done.Found) + } +} + +// TestRunRecursiveLoopProtection asserts the visited set prevents duplicate or +// cyclic work: with depth high and full resolution, the job count stays finite +// and equals the unique domain count. +func TestRunRecursiveLoopProtection(t *testing.T) { + cfg := Config{ + Domain: "example.com", + Entries: []string{"a", "b"}, + Concurrency: 4, + Timeout: time.Second, + Simulate: true, + HitRate: 100, + Attempts: 1, + Recursive: true, + Depth: 4, + } + + events := make(chan Event, 64) + go Run(context.Background(), cfg, events) + + var done *Event + for ev := range events { + if ev.Kind == EventDone { + e := ev + done = &e + } + } + if done == nil { + t.Fatal("no EventDone received") + } + // 2 entries to depth 4: 2 + 4 + 8 + 16 = 30 unique jobs. 
+ if done.Total != 30 { + t.Errorf("expected 30 unique jobs, got %d", done.Total) + } + if done.Processed != done.Total { + t.Errorf("Processed %d != Total %d", done.Processed, done.Total) + } +} + // TestRunRateLimit asserts that -rate paces queries: N queries at R qps should // take at least (N-1)/R seconds. Uses simulate mode so it is network-free. func TestRunRateLimit(t *testing.T) { diff --git a/main.go b/main.go index 1de7db5..dad70e1 100644 --- a/main.go +++ b/main.go @@ -104,6 +104,8 @@ type cliFlags struct { format string rate int recordTypes string + recursive bool + depth int } func parseFlags() cliFlags { @@ -125,6 +127,8 @@ func parseFlags() cliFlags { flag.StringVar(&f.format, "format", "text", "Output format: text, json, or csv") flag.IntVar(&f.rate, "rate", 0, "Max DNS queries per second across all workers (0 = unlimited)") flag.StringVar(&f.recordTypes, "type", "A,AAAA", "Comma-separated DNS record types to look up: A, AAAA, CNAME") + flag.BoolVar(&f.recursive, "recursive", false, "Recursively enumerate subdomains of discovered subdomains") + flag.IntVar(&f.depth, "depth", 1, "Max recursion depth when -recursive is set (1 = no recursion)") flag.Parse() return f } @@ -155,6 +159,10 @@ func validateFlags(f cliFlags, out *output.Writer, maxAttempts int) (string, boo out.Error("Rate (-rate) must be 0 (unlimited) or a positive integer") return "", false } + if f.depth < 1 { + out.Error("Depth (-depth) must be at least 1") + return "", false + } if !f.testMode { if err := validateDNSServer(f.dnsServer); err != nil { out.Error("DNS server %s: %v", f.dnsServer, err) @@ -327,6 +335,8 @@ func run() int { Verbose: f.verbose, Rate: f.rate, Types: recordTypes, + Recursive: f.recursive, + Depth: f.depth, } events := make(chan scan.Event, 64)