pm_edge_engine

Deterministic Rust (tokio) Polymarket sports edge engine with independent probability models.

Status

Early-stage open-source project focused on reproducible market evaluation and candidate order generation for Polymarket sports markets.

This repository is opinionated about three things:

- deterministic model output for the same input state
- explicit JSON contracts for downstream tooling
- risk-aware candidate generation instead of unconstrained trade chasing

Why this repo exists

Most prediction-market tooling either copies market prices, hides model logic behind opaque services, or mixes research and execution state in ways that are hard to audit. pm_edge_engine is meant to be a transparent baseline that keeps pricing, calibration, mapping, and order filtering inspectable.

Features

Independent probabilities (no LLM, no price->prob copying)
Data ingestion:
- Polymarket Gamma API (/markets, /markets/slug/{slug})
- football-data.org v4 matches/results
- OpenLigaDB public fallback for supported competitions when FOOTBALL_DATA_TOKEN is absent
- TheSportsDB targeted event lookup for missing match mappings in selected leagues
SQLite cache with WAL mode
Models:
- ELO baseline (time decay)
- League Poisson attack/defense model (weighted MLE-like optimization)
- Hybrid blend (V1 default 0.55 Poisson + 0.45 ELO)
V2 upgrades:
- Odds fusion plugin interface (odds_provider.rs)
- Calibration (isotonic / platt in calibration.rs)
- Match confidence gate in market mapping
- Dynamic cost model and dynamic min-edge in order engine
- League-wise Poisson auto-degrade (<800 matches => ELO-only)
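
As a rough illustration of the hybrid blend and the league-wise auto-degrade listed above, the sketch below mixes two three-way probability vectors with the quoted V1 weights and falls back to ELO-only when a league has fewer than 800 matches. The type `Outcome3` and the function `hybrid_blend` are hypothetical names chosen for this example and are not taken from the crate; the renormalization step is also an assumption.

```rust
/// Three-way outcome probabilities: home win, draw, away win.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Outcome3 {
    pub home: f64,
    pub draw: f64,
    pub away: f64,
}

/// V1 default blend weights quoted in the feature list.
pub const W_POISSON: f64 = 0.55;
pub const W_ELO: f64 = 0.45;
/// Leagues with fewer matches than this use ELO only.
pub const POISSON_MIN_MATCHES: usize = 800;

/// Blend Poisson and ELO probabilities, or degrade to ELO-only
/// when the league sample is too small (assumed behavior).
/// Inputs are assumed to be valid distributions that sum to 1.
pub fn hybrid_blend(poisson: Outcome3, elo: Outcome3, league_matches: usize) -> Outcome3 {
    if league_matches < POISSON_MIN_MATCHES {
        return elo;
    }
    let mix = |p: f64, e: f64| W_POISSON * p + W_ELO * e;
    let home = mix(poisson.home, elo.home);
    let draw = mix(poisson.draw, elo.draw);
    let away = mix(poisson.away, elo.away);
    // Renormalize to guard against floating-point drift.
    let total = home + draw + away;
    Outcome3 {
        home: home / total,
        draw: draw / total,
        away: away / total,
    }
}
```

With a Poisson estimate of (0.5, 0.3, 0.2) and an ELO estimate of (0.4, 0.3, 0.3), a league with 1000 matches would blend to roughly (0.455, 0.300, 0.245), while a league with 500 matches would return the ELO estimate unchanged.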

Safety boundaries

This repository does not claim guaranteed profitability. It is an execution-support engine with explicit filters and conservative defaults.

Invalid market state should resolve to no action.
Model confidence and market liquidity gates are enforced before order generation.
Near-event and high-cost setups are filtered out.
Generated orders are candidate outputs; production deployment still requires separate operational controls, monitoring, and secrets handling.
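
One way to read "invalid market state should resolve to no action" is as a pure predicate over the market record. The sketch below is an assumption about what such a check could look like, not the engine's actual validation. The field names mirror the markets_file input shape; `MarketState` and `is_actionable` are hypothetical.

```rust
/// Subset of the market record fields relevant to state validation.
/// Hypothetical struct for illustration only.
pub struct MarketState {
    pub active: bool,
    pub closed: bool,
    pub accepting_orders: bool,
    pub best_bid: f64,
    pub best_ask: f64,
    pub prices: Vec<f64>,
    pub outcome_count: usize,
}

/// Returns true only when every basic sanity check passes.
/// Any failure means "no action" for this market.
pub fn is_actionable(m: &MarketState) -> bool {
    let in_unit = |p: f64| p.is_finite() && (0.0..=1.0).contains(&p);
    // Market must be open and taking orders.
    let flags_ok = m.active && !m.closed && m.accepting_orders;
    // Bid and ask must be valid probabilities and must not cross.
    let book_ok = in_unit(m.best_bid) && in_unit(m.best_ask) && m.best_bid <= m.best_ask;
    // One valid price per outcome.
    let prices_ok =
        m.prices.len() == m.outcome_count && m.prices.iter().all(|p| in_unit(*p));
    flags_ok && book_ok && prices_ok
}
```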

Requirements

Rust stable
Optional environment variable:
- FOOTBALL_DATA_TOKEN (used as the primary football-data source when present)

Without FOOTBALL_DATA_TOKEN, fetch can fall back to OpenLigaDB for supported competitions such as Bundesliga and selected UEFA competitions.

Quick Start

cargo build
cp config.toml.example config.toml

1) Fetch data

cargo run -- fetch

2) Train models

cargo run -- train

3) Predict fair probs from markets file

cargo run -- predict --markets_file examples/markets_input.json > fair_probs.json

4) Generate candidate orders

cargo run -- candidates --markets_file examples/markets_input.json --equity_usd 50 > orders.json

5) Produce a shadow-book report

cargo run -- shadow --markets_file examples/markets_input_wait.json --equity_usd 100

This prints a machine-readable JSON report containing:

candidate decisions and reason codes
any generated orders
resolved shadow PnL for orders whose mapped match already has a final score
summary metrics such as buy_count, settled_orders, total_pnl_usd, and roi_pct
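
For readers who want to sanity-check the shadow PnL figures by hand, the sketch below shows one plausible settlement rule for a BUY on a binary outcome: shares pay 1 USD each if the outcome wins and nothing otherwise. It ignores fees and slippage, and the ROI denominator (total staked USD) is an assumption; the engine's own definitions may differ. Both function names are hypothetical.

```rust
/// PnL in USD for a BUY of `size_usd` at `limit_price`, settled at 1 or 0.
/// Assumes no fees and a full fill at the limit price.
pub fn settle_buy_pnl(size_usd: f64, limit_price: f64, won: bool) -> f64 {
    let shares = size_usd / limit_price;
    if won {
        // Each share pays out 1 USD; subtract the stake.
        shares - size_usd
    } else {
        // The whole stake is lost.
        -size_usd
    }
}

/// ROI in percent, using total staked USD as the denominator (assumed).
pub fn roi_pct(total_pnl_usd: f64, total_staked_usd: f64) -> f64 {
    100.0 * total_pnl_usd / total_staked_usd
}
```

For example, a 5 USD buy at 0.50 settles to +5 USD on a win and -5 USD on a loss.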

6) Diagnose market mapping

cargo run -- diagnose --markets_file examples/demo_real_market_input.json > mapping_diagnostics.json

To generate a paste-ready GitHub issue body instead:

cargo run -- diagnose --markets_file examples/demo_real_market_input.json --issue-body > mapping_issue.md

Without --issue-body, diagnose prints a machine-readable JSON report containing:

mapping state for each market
reason codes, match confidence, and match references
summary counts for mapped, unmapped, remote-lookup, and no-match rows

Use this when filing the mapping-miss report or when you need a compact view of why a market did not map cleanly.

7) Replay dated snapshots in backtest mode

cargo run -- backtest --snapshots_file examples/backtest_input_inline.json --equity_usd 200

or use a manifest that references many snapshot files:

cargo run -- backtest --snapshots_file examples/backtest_manifest.json --equity_usd 200

or run the tail-window replay example with local odds snapshots:

cargo run -- backtest --snapshots_file examples/backtest_tail_manifest.json --equity_usd 200

or run the 5/10/15-minute tail replay example with replay-only overrides:

cargo run -- backtest --snapshots_file examples/backtest_tail_515_manifest.json --equity_usd 200

or build a manifest from live archived sports-tail snapshots and replay that:

cargo run -- tail-manifest \
  --snapshots_dir ../polymarket-bot/state/pm_edge_tail_history/snapshots \
  --manifest_out ../polymarket-bot/state/pm_edge_tail_history/manifests/latest_tail.json \
  --from_utc 2026-03-01 \
  --to_utc 2026-03-12

cargo run -- backtest --snapshots_file ../polymarket-bot/state/pm_edge_tail_history/manifests/latest_tail.json --equity_usd 200

Each backtest run prints a machine-readable JSON report containing:

per-snapshot counts for BUY / WAIT, newly entered orders, and settled trades
a trade ledger with entry price, size, mapped match, and realized PnL
breakdowns by entry date and by league with realized PnL and hit rate
a by_minutes_to_start section for pre-kickoff bucket analysis (5-9, 10-14, 15-29, etc.)
bankroll summary metrics such as trades_entered, settled_trades, total_pnl_usd, roi_pct, and max_drawdown_usd
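
The summary metrics above can be recomputed from the trade ledger. The sketch below shows a conventional peak-to-trough drawdown and a simple hit rate over a sequence of realized per-trade PnL values. It is a generic formulation, assumed rather than copied from the engine, and the function names are hypothetical.

```rust
/// Largest peak-to-trough drop in cumulative PnL, in USD.
/// Cumulative PnL starts at 0 before the first trade.
pub fn max_drawdown_usd(trade_pnls: &[f64]) -> f64 {
    let mut equity = 0.0_f64;
    let mut peak = 0.0_f64;
    let mut worst = 0.0_f64;
    for pnl in trade_pnls {
        equity += pnl;
        peak = peak.max(equity);
        worst = worst.max(peak - equity);
    }
    worst
}

/// Share of settled trades with strictly positive PnL.
/// Returns None when there are no settled trades.
pub fn hit_rate(trade_pnls: &[f64]) -> Option<f64> {
    if trade_pnls.is_empty() {
        return None;
    }
    let wins = trade_pnls.iter().filter(|p| **p > 0.0).count();
    Some(wins as f64 / trade_pnls.len() as f64)
}
```

For the ledger [2, -1, -3, 5, -2], cumulative PnL runs 2, 1, -2, 3, 1, so the maximum drawdown is 4 USD (from the peak of 2 down to -2) and the hit rate is 2 of 5.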

Usage notes:

backtest accepts a self-contained file with matches plus snapshots, or a snapshots-only file that reuses your local sqlite match cache
backtest also accepts a manifest with matches_files and snapshot_files; relative paths resolve from the manifest file location
backtest accepts optional tail_window filters so you can restrict replay to late-market samples before kickoff
backtest accepts optional inline odds or manifest odds_files; replay uses the freshest odds snapshot at or before each as_of_utc
backtest accepts optional replay-only overrides so you can relax timing or model gates for controlled experiments without changing config.toml
if replay ends with open trades but the mapped match result is already known, backtest now settles them at the end of the run instead of leaving them artificially OPEN
tail-manifest scans an archive directory of one-snapshot JSON files and writes a standard backtest manifest
polymarket-bot/scripts/live/clawx_signal_pm_edge.sh now auto-archives sports-tail snapshots into state/pm_edge_tail_history/snapshots/ when that signal path runs
with the default max_single_trade_equity_pct=0.0075 and a 1 USD minimum order size, the largest allowed trade on a bankroll below about 133.34 USD (1 / 0.0075) is under 1 USD, so such bankrolls produce no trades
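
The bankroll threshold in the last note follows from simple arithmetic, worked through in the sketch below. The function names are hypothetical; only the 0.0075 cap and the 1 USD minimum come from the note. The cap is treated as a fraction of equity (0.75%), which is what the 133.34 USD figure implies.

```rust
/// Largest single trade allowed for a given bankroll.
/// `max_single_trade_equity_pct` is treated as a fraction (0.0075 = 0.75%).
pub fn max_single_trade_usd(equity_usd: f64, max_single_trade_equity_pct: f64) -> f64 {
    equity_usd * max_single_trade_equity_pct
}

/// Smallest bankroll whose capped trade size reaches the minimum order size.
pub fn min_tradable_bankroll_usd(min_order_usd: f64, max_single_trade_equity_pct: f64) -> f64 {
    min_order_usd / max_single_trade_equity_pct
}

/// True when the capped trade size is at least the minimum order size.
pub fn can_place_min_order(
    equity_usd: f64,
    max_single_trade_equity_pct: f64,
    min_order_usd: f64,
) -> bool {
    max_single_trade_usd(equity_usd, max_single_trade_equity_pct) >= min_order_usd
}
```

With the defaults, 1 / 0.0075 is about 133.33, so a 100 USD bankroll caps each trade at 0.75 USD and cannot place the 1 USD minimum order, while 133.34 USD can.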

8) Run scheduler daemon

cargo run -- run

Scheduler behavior:

every 15 minutes: refresh Polymarket markets
every 60 minutes: refresh football-data (or the public fallback) and retrain
after refresh: writes fair_probs.json + orders.json (if enabled)
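
The cadence above can be pictured as a pure function from elapsed minutes to the jobs that are due. This is only a std-only sketch of the schedule; the real daemon runs on tokio and its timer wiring is not shown in this README. The `Job` enum and `due_jobs` are hypothetical names, and running every job at minute 0 is an assumption.

```rust
/// Jobs the scheduler may run on a tick. Hypothetical enum for illustration.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Job {
    RefreshMarkets,
    RefreshFootballAndRetrain,
    WriteOutputs,
}

/// Jobs due at a given minute since daemon start.
/// Outputs are written after any refresh, as described above.
pub fn due_jobs(elapsed_minutes: u64) -> Vec<Job> {
    let mut jobs = Vec::new();
    if elapsed_minutes % 15 == 0 {
        jobs.push(Job::RefreshMarkets);
    }
    if elapsed_minutes % 60 == 0 {
        jobs.push(Job::RefreshFootballAndRetrain);
    }
    if !jobs.is_empty() {
        jobs.push(Job::WriteOutputs);
    }
    jobs
}
```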

Primary workflow

Fetch market and match data.
Train or refresh league models.
Map Polymarket markets to football fixtures.
Produce fair probabilities.
Generate candidate orders only when edge, confidence, liquidity, and timing filters pass.
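
The final step of the workflow can be sketched as an ordered chain of gates that yields either BUY or WAIT with a reason. Everything below is illustrative: the reason strings, the gate order, the struct names, and the definition of edge as fair probability minus best ask are assumptions. Only the idea of edge, confidence, liquidity, and timing filters comes from this README.

```rust
/// Outcome of candidate evaluation. Reason strings are hypothetical.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Decision {
    Buy,
    Wait(&'static str),
}

/// Thresholds for each gate. Hypothetical struct for illustration.
#[derive(Debug, Clone, Copy)]
pub struct Gates {
    pub min_edge: f64,
    pub min_match_confidence: f64,
    pub min_liquidity_usd: f64,
    pub min_time_to_event_minutes: i64,
}

/// Inputs for one market outcome. Hypothetical struct for illustration.
#[derive(Debug, Clone, Copy)]
pub struct Candidate {
    pub fair_prob: f64,
    pub best_ask: f64,
    pub match_confidence: f64,
    pub liquidity_usd: f64,
    pub minutes_to_start: i64,
}

/// Apply the gates in a fixed order; the first failure decides the reason.
pub fn decide(c: &Candidate, g: &Gates) -> Decision {
    if c.match_confidence < g.min_match_confidence {
        return Decision::Wait("LOW_MATCH_CONFIDENCE");
    }
    if c.liquidity_usd < g.min_liquidity_usd {
        return Decision::Wait("LOW_LIQUIDITY");
    }
    if c.minutes_to_start < g.min_time_to_event_minutes {
        return Decision::Wait("TOO_CLOSE_TO_EVENT");
    }
    // Edge is assumed to be fair probability minus the price to buy.
    if c.fair_prob - c.best_ask < g.min_edge {
        return Decision::Wait("EDGE_BELOW_MIN");
    }
    Decision::Buy
}
```

Keeping the gates in a fixed order makes the reason code deterministic for a given input, which fits the repository's emphasis on reproducible output.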

CLI

pm_edge_engine fetch
pm_edge_engine train
pm_edge_engine predict --markets_file input.json
pm_edge_engine candidates --markets_file input.json
pm_edge_engine shadow --markets_file input.json
pm_edge_engine diagnose --markets_file input.json
pm_edge_engine diagnose --markets_file input.json --issue-body
pm_edge_engine backtest --snapshots_file input.json
pm_edge_engine tail-manifest --snapshots_dir dir --manifest_out file
pm_edge_engine run

JSON I/O

Input `markets_file`

Either:

[
  {
    "market_slug": "...",
    "question": "...",
    "outcomes": ["Yes", "No"],
    "prices": [0.47, 0.53],
    "best_bid": 0.46,
    "best_ask": 0.48,
    "spread": 0.02,
    "liquidity": 5000,
    "volume": 12000,
    "volume_5m": 900,
    "start_time_utc": "2026-02-18T18:00:00Z",
    "event_title": "Team A vs Team B",
    "event_slug": "...",
    "event_home_team": "Team A",
    "event_away_team": "Team B",
    "league_hint": "PL",
    "active": true,
    "closed": false,
    "accepting_orders": true
  }
]

or:

{ "markets": [ ... ] }

Input `snapshots_file`

backtest accepts:

{
  "matches": [
    {
      "id": "fixture-match",
      "league": "PL",
      "season": "2026",
      "datetime_utc": "2026-02-18T18:00:00Z",
      "home_team": "Team A",
      "away_team": "Team B",
      "home_goals": 2,
      "away_goals": 0,
      "status": "FINISHED"
    }
  ],
  "snapshots": [
    {
      "as_of_utc": "2026-02-18T12:00:00Z",
      "markets": [ ... ]
    }
  ]
}

Notes:

matches is optional; when omitted, backtest uses the local sqlite cache for historical match rows
odds is optional; when present, rows are matched by league/team names and event time, then filtered to snapshots fetched at or before as_of_utc
tail_window is optional and can restrict replay to markets whose start_time_utc falls within a chosen minute range
snapshots[*].as_of_utc anchors both model training decay and execution filters to that point in time
markets uses the same MarketRecord shape as predict and candidates
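
The odds note above amounts to picking the latest odds row that was fetched at or before the snapshot time, so that replay never sees future information. A minimal sketch follows, assuming all timestamps use the same RFC 3339 UTC format shown in the examples (for instance 2026-02-18T12:00:00Z), so that string comparison matches chronological order. The struct and its field names are hypothetical.

```rust
/// One odds row for a single fixture. Hypothetical shape for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct OddsSnapshot {
    /// Fetch time, assumed to be formatted like "2026-02-18T12:00:00Z".
    pub fetched_at_utc: String,
    pub home_win_prob: f64,
}

/// Freshest odds row fetched at or before `as_of_utc`, if any.
/// Relies on lexicographic order of same-format UTC timestamps.
pub fn freshest_at_or_before<'a>(
    odds: &'a [OddsSnapshot],
    as_of_utc: &str,
) -> Option<&'a OddsSnapshot> {
    odds.iter()
        .filter(|o| o.fetched_at_utc.as_str() <= as_of_utc)
        .max_by(|a, b| a.fetched_at_utc.cmp(&b.fetched_at_utc))
}
```

A production implementation would more likely parse timestamps into a proper date-time type; the string comparison here only keeps the sketch dependency-free.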

It also accepts a manifest:

{
  "tail_window": {
    "min_minutes_to_start": 5,
    "max_minutes_to_start": 15,
    "require_start_time": true
  },
  "overrides": {
    "engine": {
      "min_time_to_event_minutes": 5
    },
    "model": {
      "poisson_min_matches": 200
    }
  },
  "matches_files": ["backtest_batch/matches.json"],
  "odds_files": ["backtest_tail_batch/odds.json"],
  "snapshot_files": [
    "backtest_batch/2026-02-18T12-00-00Z.json",
    "backtest_batch/2026-02-18T22-00-00Z.json"
  ]
}

Manifest notes:

matches_files and snapshot_files are optional lists that are merged with any inline matches / snapshots
odds_files is an optional list merged with any inline odds
overrides is optional and only affects the replay process; it does not modify your persistent runtime config
match files may be either a raw array or { "matches": [...] }
odds files may be either a raw array or { "odds": [...] }
snapshot files may be either a single { "as_of_utc": ..., "markets": [...] } object or { "snapshots": [...] }
relative paths resolve from the manifest file's directory
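
To make the tail_window semantics concrete, the sketch below filters a market by its minutes to kickoff. The three field names match the manifest example; treating both bounds as inclusive, and dropping markets without a start time only when require_start_time is true, are assumptions about behavior this README does not spell out.

```rust
/// Mirror of the manifest's tail_window object.
pub struct TailWindow {
    pub min_minutes_to_start: i64,
    pub max_minutes_to_start: i64,
    pub require_start_time: bool,
}

/// Whether a market belongs in the replay.
/// `minutes_to_start` is None when the market has no start_time_utc.
pub fn in_tail_window(minutes_to_start: Option<i64>, w: &TailWindow) -> bool {
    match minutes_to_start {
        // Bounds are assumed to be inclusive on both ends.
        Some(m) => m >= w.min_minutes_to_start && m <= w.max_minutes_to_start,
        // Without a start time, keep the market only if one is not required.
        None => !w.require_start_time,
    }
}
```

Under these assumptions, the example window of 5 to 15 minutes keeps a market 10 minutes from kickoff and drops one that is 30 minutes out.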

fair_probs output

{"results":[{"market_slug":"...","fair_probs":[0.5,0.5]}]}

orders output

{"orders":[{"market_slug":"...","side":"BUY","outcome_index":0,"limit_price":0.42,"size_usd":5.0,"order_type":"maker"}]}

Schema and compatibility notes:

See docs/JSON_CONTRACT.md for schema/versioning expectations.
Machine-readable reference schemas live under schemas/.
Consumers should join on market_slug and treat example numeric values as illustrative, not frozen snapshots.
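
A downstream consumer that joins the two outputs on market_slug might look like the sketch below. The structs carry only the fields needed for the join and are filled by hand here; a real consumer would deserialize the JSON with a library of its choice. The function name and the "edge" calculation (fair probability minus limit price) are hypothetical additions for illustration.

```rust
use std::collections::HashMap;

/// One entry of the fair_probs output (subset of fields).
pub struct FairProb {
    pub market_slug: String,
    pub fair_probs: Vec<f64>,
}

/// One entry of the orders output (subset of fields).
pub struct Order {
    pub market_slug: String,
    pub outcome_index: usize,
    pub limit_price: f64,
}

/// For each order with a matching fair_probs row, return
/// (market_slug, fair probability minus limit price).
/// Orders without a match or with an out-of-range index are skipped.
pub fn order_edges(fair: &[FairProb], orders: &[Order]) -> Vec<(String, f64)> {
    let by_slug: HashMap<&str, &FairProb> =
        fair.iter().map(|f| (f.market_slug.as_str(), f)).collect();
    orders
        .iter()
        .filter_map(|o| {
            let f = by_slug.get(o.market_slug.as_str())?;
            let p = f.fair_probs.get(o.outcome_index)?;
            Some((o.market_slug.clone(), *p - o.limit_price))
        })
        .collect()
}
```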

Examples

See examples/README.md for:

minimal and extended market input payloads
annotated notes for the extended example
a deterministic WAIT fixture and empty-order example
a self-contained backtest snapshot example
a batch backtest manifest example
a tail-window backtest manifest with local odds snapshots
example fair-probability and order outputs
copy-paste commands for local prediction and candidate generation
JSON schema references for downstream tooling

See docs/DEMO.md for a short walkthrough with captured CLI outputs from a real-team sample payload.

Config

See config.toml.example.

Runtime env overrides:

PM_EDGE_CONFIG
PM_EDGE_DB_PATH
FOOTBALL_DATA_TOKEN
FOOTBALL_COMPETITIONS
PM_EDGE_PUBLIC_FOOTBALL_FALLBACK_ENABLED
PM_EDGE_SPORTSDB_LOOKUP_ENABLED
PM_EDGE_BASE_MIN_EDGE
PM_EDGE_MIN_MATCH_CONFIDENCE
PM_EDGE_ODDS_ENABLED
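
As an illustration of how a numeric override such as PM_EDGE_BASE_MIN_EDGE could take precedence over a configured value, the sketch below reads the variable with the standard library and falls back to the supplied default. Falling back silently on a missing or malformed value is an assumption; the engine may instead report an error. Both function names are hypothetical.

```rust
use std::env;

/// Parse an optional raw override, falling back to `default`
/// when the value is absent or not a valid number (assumed behavior).
pub fn parse_f64_override(raw: Option<&str>, default: f64) -> f64 {
    raw.and_then(|s| s.trim().parse::<f64>().ok())
        .unwrap_or(default)
}

/// Resolve the base minimum edge: environment first, then the config value.
pub fn base_min_edge(config_value: f64) -> f64 {
    let raw = env::var("PM_EDGE_BASE_MIN_EDGE").ok();
    parse_f64_override(raw.as_deref(), config_value)
}
```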

Notes

No API keys are hardcoded.
Model output is deterministic for the same input state.
If FOOTBALL_DATA_TOKEN is missing and public fallback is enabled, fetch uses OpenLigaDB for supported competition codes (BL1, CL/UCL, EL/UEL, best-effort PL).
Unsupported fallback competition codes remain explicit skips, not silent substitutions.
If a market still fails local match mapping, predict/candidates can query TheSportsDB by event name for supported leagues (PL, PD, SA, FL1, BL1) before falling back to NO_MATCH_MAPPING.
TheSportsDB is used here as a low-volume mapping repair path, not as a bulk historical training source.
If FOOTBALL_DATA_TOKEN is missing and public fallback is disabled, fetch skips football ingestion without crashing.
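
The source-selection rules in these notes reduce to a small decision function, sketched below under the assumption that the competition codes listed above are matched exactly. The enum and function names are hypothetical, and PL is included even though the notes describe it as best-effort.

```rust
/// Where football match data comes from for one competition.
/// Hypothetical enum for illustration.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum FootballSource {
    FootballData,
    OpenLigaDb,
    /// Explicit skip: no token and no usable fallback.
    Skip,
}

/// Pick the ingestion source for a competition code.
pub fn choose_source(
    token_present: bool,
    fallback_enabled: bool,
    competition: &str,
) -> FootballSource {
    if token_present {
        return FootballSource::FootballData;
    }
    if !fallback_enabled {
        return FootballSource::Skip;
    }
    match competition {
        // Codes the notes list as supported by the public fallback.
        "BL1" | "CL" | "UCL" | "EL" | "UEL" | "PL" => FootballSource::OpenLigaDb,
        // Unsupported codes are skipped, never silently substituted.
        _ => FootballSource::Skip,
    }
}
```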

Development

Run the local validation loop:

cargo fmt --all
cargo check --all-targets
cargo test --all-targets

CI runs the same checks on pushes to main and on pull requests. The test suite now includes example-driven fixture coverage for both a mapped predict flow and a deterministic WAIT candidate path. Dependabot tracks Cargo and GitHub Actions updates weekly, and CodeQL runs on pushes, pull requests, and a scheduled scan.

Roadmap

Expand unit and fixture-based test coverage across market mapping and calibration flows.
Add more examples for input preparation and output interpretation.
Add release notes and tagged versions as the CLI and JSON contracts stabilize.
Broaden odds-provider integrations while keeping deterministic fallbacks.

Current milestone: v0.2.0

Contributing

See CONTRIBUTING.md.

Support

Use the GitHub issue templates for bugs and feature requests. Include repro steps, example payloads, and the commit or release you tested against. For mapping misses, use the dedicated mapping-miss template and include the expected fixture plus the raw market payload. If you have a local payload, run cargo run -- diagnose --markets_file <file> --issue-body and paste the Markdown into the issue body. Use --issue-body only when you want a ready-to-edit report; otherwise attach the raw JSON diagnostics.

If you want to share the project externally, see docs/OUTREACH.md for ready-to-post copy. If you need to integrate the CLI into another tool, start with docs/JSON_CONTRACT.md.

Changelog

See CHANGELOG.md.

License

MIT
