feat(regulatory): classify and publish regulatory actions #2567
lspassos1 wants to merge 6 commits into koala73:main
Add a standalone seeder that fetches and normalizes SEC, CFTC, Federal Reserve, FDIC, and FINRA regulatory feeds without introducing new dependencies. The script stays import-safe, tolerates partial feed failure, and emits JSON for the fetch/parse-only phase of the pipeline. Unit tests cover RSS/Atom parsing, deduplication, ordering, and degraded-feed behavior. Refs koala73#2492 Refs koala73#2493 Refs koala73#2494 Refs koala73#2495
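The partial-failure tolerance, deduplication, and ordering described in this commit could be sketched roughly as follows. This is a minimal illustration, not the PR's code: `collectSettled`, `dedupeAndSort`, `fetchAllFeeds`, and the record fields `agency`, `link`, and `publishedAt` are hypothetical names, and deduplicating by link is an assumption. The only detail taken from the thread is that feeds are fetched with `Promise.allSettled`.

```javascript
// Hypothetical sketch: function names and record shape are assumptions.

// Split settled results into parsed items and the names of feeds that failed.
function collectSettled(feeds, results) {
  const actions = [];
  const failedFeeds = [];
  results.forEach((result, i) => {
    if (result.status === 'fulfilled') actions.push(...result.value);
    else failedFeeds.push(feeds[i].agency);
  });
  return { actions, failedFeeds };
}

// Keep the first item seen per link, then order newest first.
function dedupeAndSort(actions) {
  const byLink = new Map();
  for (const action of actions) {
    if (!byLink.has(action.link)) byLink.set(action.link, action);
  }
  return [...byLink.values()].sort(
    (a, b) => Date.parse(b.publishedAt) - Date.parse(a.publishedAt),
  );
}

// Fetch every feed; one rejected fetch does not discard the others.
async function fetchAllFeeds(feeds, fetchFeed) {
  const results = await Promise.allSettled(feeds.map((feed) => fetchFeed(feed)));
  const { actions, failedFeeds } = collectSettled(feeds, results);
  return { actions: dedupeAndSort(actions), failedFeeds };
}
```

Passing `fetchFeed` in as a parameter keeps the sketch import-safe and lets a unit test inject a stub that rejects for one feed.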
Build on the standalone RSS fetcher by adding keyword-based tier classification, aggregate payload counts, and runSeed integration for regulatory:actions:v1. The updated tests cover matched keywords, payload stats, and the runSeed wiring needed for Redis publication. Refs koala73#2493 Depends on koala73#2564
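A rough sketch of the tiering and payload counts this commit describes is shown here. The identifiers `HIGH_KEYWORDS`, `MEDIUM_KEYWORDS`, `classifyAction`, `buildSeedPayload`, `tier`, and `matchedKeywords` appear in the thread, but the keyword lists, the helper functions, and the millisecond `fetchedAt` value are illustrative assumptions. Word-boundary matching is used so that `ban` does not match inside `bank`, as the PR description notes.

```javascript
// Hypothetical sketch: the keyword lists and helpers are illustrative assumptions.
const HIGH_KEYWORDS = ['ban', 'fraud', 'enforcement action'];
const MEDIUM_KEYWORDS = ['proposed rule', 'guidance'];

// Escape regex metacharacters so keywords are matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Return the keywords that appear as whole words (case-insensitive).
function matchKeywords(text, keywords) {
  return keywords.filter((keyword) =>
    new RegExp(`\\b${escapeRegExp(keyword)}\\b`, 'i').test(text),
  );
}

// Try the high tier first, then medium; anything unmatched is low.
function classifyAction(action) {
  const high = matchKeywords(action.title, HIGH_KEYWORDS);
  if (high.length > 0) return { ...action, tier: 'high', matchedKeywords: high };
  const medium = matchKeywords(action.title, MEDIUM_KEYWORDS);
  if (medium.length > 0) return { ...action, tier: 'medium', matchedKeywords: medium };
  return { ...action, tier: 'low', matchedKeywords: [] };
}

// Classify every action and attach the aggregate counts.
function buildSeedPayload(actions, fetchedAt = Date.now()) {
  const classified = actions.map(classifyAction);
  return {
    actions: classified,
    fetchedAt,
    recordCount: classified.length,
    highCount: classified.filter((a) => a.tier === 'high').length,
    mediumCount: classified.filter((a) => a.tier === 'medium').length,
  };
}
```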
@lspassos1 is attempting to deploy a commit to the Elie Team on Vercel. A member of the Team first needs to authorize it.
Add regulatory:actions:v1 as a new cross-source input, map regulatory actions into the policy category, and emit CROSS_SOURCE_SIGNAL_TYPE_REGULATORY_ACTION signals for recent high/medium items. The new test covers severity scoring and composite escalation when policy, financial, and economic signals co-fire in Global Markets. Refs koala73#2494 Depends on koala73#2567
Greptile Summary
This PR completes the second step of the regulatory RSS pipeline by adding keyword-based tiering.
Key findings:
Confidence Score: 4/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Cron as Railway Cron
participant Main as main()
participant Utils as _seed-utils / runSeed
participant Feeds as Regulatory Feeds (5)
participant Redis as Upstash Redis
Cron->>Main: trigger seed-regulatory-actions.mjs
Main->>Utils: runSeed('regulatory','actions','regulatory:actions:v1', fetchFn, opts)
Utils->>Utils: acquireLock('regulatory:actions')
Utils->>Main: invoke fetchFn()
Main->>Main: fetchRegulatoryActionPayload()
Main->>Feeds: Promise.allSettled(fetchFeed × 5)
Feeds-->>Main: RSS/Atom XML responses
Main->>Main: parseFeed() → normalizeFeedItems()
Main->>Main: dedupeAndSortActions()
Main->>Main: classifyAction() × N (high/medium/low)
Main->>Main: buildSeedPayload() → {actions, fetchedAt, recordCount, highCount, mediumCount}
Main-->>Utils: resolved payload
Utils->>Redis: atomicPublish('regulatory:actions:v1', payload, TTL=7200s)
Utils->>Redis: writeFreshnessMetadata('regulatory','actions')
Utils->>Redis: releaseLock()
Reviews (1): Last reviewed commit: "feat(regulatory): classify and publish r..."
Use the repository-standard fetch wrapper in the seeder defaults, keep the documented FINRA HTTP exception in place, and include publish time in generated action ids to avoid same-day collisions. Validated with: node --test tests/regulatory-seed-unit.test.mjs; node scripts/seed-regulatory-actions.mjs | head -n 20
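To illustrate why adding the publish time helps, here is a minimal sketch of an id builder. The id format, `slugify`, and `buildActionId` are hypothetical and not the PR's actual scheme. The point is only that a date-only stamp collides when one agency publishes two items with the same slug on the same day, while a date-plus-time stamp keeps them apart.

```javascript
// Hypothetical sketch: the id format is an assumption, not the PR's scheme.

// Lowercase the text and collapse non-alphanumeric runs into single hyphens.
function slugify(text) {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '');
}

// Build an id from agency, title slug, and a compact UTC date-time stamp.
function buildActionId(agency, title, publishedAt) {
  // "2026-01-05T09:30:00.000Z" becomes "20260105T093000"
  const stamp = new Date(publishedAt).toISOString().replace(/[-:]/g, '').slice(0, 15);
  return `${slugify(agency)}-${slugify(title).slice(0, 40)}-${stamp}`;
}
```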
Clean up the leftover cherry-pick marker after carrying the shared seeder hardening changes onto this branch. Validated with: node --test tests/regulatory-seed-unit.test.mjs and a local fetchRegulatoryActionPayload smoke check.
Follow-up on the bootstrap thread: I am intentionally leaving
Review: PR #2567 (classify + Redis write)
Why this PR? Classifies fetched regulatory actions by keyword matching and writes to Redis.
Blocking
1. TTL = cron interval (1x). Must be >= 3x.
2. Missing
3. Keyword model misses obvious enforcement headlines. Fix: Expand the keyword list to include enforcement action verbs:
Also consider classifying against
4. Fix: Either remove
5.
Suggestions
Extract RSS and Atom descriptions into the normalized action payload so later classifier work can use the same parsed feed output. Also adds @ts-check and documents the FINRA HTTP feed constraint.
Raise the Redis retention window, classify against combined title and description text, reserve low for routine notices, and export the shared regulatory cache key for downstream health wiring.
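The retention change responds to the review's rule that the TTL must be at least 3x the cron interval. With a TTL equal to the interval, a single late or failed run lets the key expire before the next publish. A minimal sketch of that guard follows. The constant and function names are hypothetical, and the 7200-second interval is inferred from the review's statement that the original TTL of 7200 equaled the cron interval. The TTL the PR finally adopted is not stated in this thread.

```javascript
// Hypothetical sketch: constant and function names are assumptions.
const CRON_INTERVAL_SECONDS = 7200; // inferred: the review says the original TTL (7200s) equaled the interval
const MIN_TTL_MULTIPLIER = 3; // the review's floor: TTL >= 3x the cron interval

// Smallest TTL that survives (multiplier - 1) consecutive missed runs.
function minimumTtlSeconds(intervalSeconds, multiplier = MIN_TTL_MULTIPLIER) {
  return intervalSeconds * multiplier;
}

// Return the TTL unchanged if it meets the floor; throw otherwise.
function assertTtlCoversMissedRuns(ttlSeconds, intervalSeconds) {
  const floor = minimumTtlSeconds(intervalSeconds);
  if (ttlSeconds < floor) {
    throw new RangeError(`TTL ${ttlSeconds}s is below the ${floor}s floor`);
  }
  return ttlSeconds;
}
```

A check like this could run in a unit test so that a later change to the schedule or the TTL fails loudly instead of silently shrinking the retention window.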
Summary
This adds the second regulatory RSS step by classifying normalized actions into high/medium/low, computing aggregate counts, and publishing the final payload to regulatory:actions:v1 via runSeed.
Root cause
The fetch/parse seeder from #2564 produces stable normalized actions, but the pipeline still needed tiering logic and Redis publication before cross-source signals could consume regulatory events.
Changes
- Add HIGH_KEYWORDS and MEDIUM_KEYWORDS classification rules from #2493
- Annotate each action with tier and matchedKeywords, using word-boundary matching to avoid false positives like ban inside bank
- Build the payload with actions, fetchedAt, recordCount, highCount, and mediumCount
- Route main() through runSeed('regulatory', 'actions', 'regulatory:actions:v1', ...) with TTL 7200 and an empty-array-safe validateFn
- Extend tests/regulatory-seed-unit.test.mjs to cover classification, payload counts, fetch-to-payload flow, and the runSeed wiring
Validation
node --test tests/regulatory-seed-unit.test.mjs
node -e "import('./scripts/seed-regulatory-actions.mjs').then(async (m) => { const data = await m.fetchRegulatoryActionPayload(); process.stdout.write(JSON.stringify({recordCount: data.recordCount, highCount: data.highCount, mediumCount: data.mediumCount, first: data.actions[0]}, null, 2) + '\\n'); })"
node -e "import('./scripts/seed-regulatory-actions.mjs').then(() => process.stdout.write('import-ok\\n'))"
Risk
Low risk. This only evolves the new regulatory seeder introduced in #2564 and adds focused test coverage around the new classification/publish behavior.
Note
This branch is currently stacked on #2564. Until #2564 merges, this PR includes the parent fetch/parse commit in the diff.
Depends on #2564
Closes #2493
Refs #2494
Refs #2495