HTML Backtest reporting

## Summary

Replace the current Plotly/Jinja2-based `BacktestReport` with a self-contained HTML dashboard that dynamically adapts to the data:

- **1 strategy with N runs** → single-strategy deep-dive (equity, drawdown, trades, risk, heatmap)
- **N strategies, each with one or more runs** → comparison view (ranking table, metric bars, per-strategy drill-down)

The new API should work seamlessly with both live `Backtest` objects (from `run_vector_backtests`) and disk-loaded results.

---

## Motivation

### Current state (v3.7.3)

| Limitation | Detail |
|---|---|
| **Single-run only** | `_create_html_report()` reads `self.backtests[0]` — the rest are ignored |
| **No comparison view** | No way to visualise N strategies side-by-side |
| **No multi-run awareness** | No concept of strategy → runs hierarchy |
| **External deps** | Plotly + Jinja2 bloat the HTML (~3 MB per chart) and require CDN |
| **Flat storage model** | `_is_backtest()` expects `results.json` + `metrics.json` at a single level — doesn't walk `{strategy}/{runs}/{run_name}/` |
| **No `run_vector_backtests` (plural)** | Only `run_vector_backtest` (singular) exists — no batch API |

### New storage format (already in use)

```
backtest_results/
└── batch_1/                          # ← BacktestReport.open() entry point
    ├── 04a159e7/                     # strategy (hash or name)
    │   ├── algorithm_id.json         # {"algorithm_id": "..."}
    │   ├── summary.json              # aggregated metrics across runs
    │   ├── risk_free_rate.json
    │   ├── metadata.json             # optional strategy metadata
    │   └── runs/
    │       ├── backtest_EUR_20240101_20241231/
    │       │   ├── metrics.json      # BacktestMetrics serialised
    │       │   └── run.json          # BacktestResult serialised (trades, etc.)
    │       ├── backtest_EUR_20240331_20250331/
    │       │   ├── metrics.json
    │       │   └── run.json
    │       └── ...
    ├── 1a9ecb38/
    │   └── ...
    └── ...
```

This issue proposes changes across three areas:

1. **New batch API** — `app.run_vector_backtests()` (plural)
2. **Redesigned `BacktestReport`** — new constructor, `__getitem__`, unified HTML generator
3. **Self-contained HTML dashboard** — zero external deps, canvas-based charts, dark/light theme

---

## Proposed API

### 1. From live objects (after `run_vector_backtests`)

```python
from investing_algorithm_framework import BacktestReport

backtests = app.run_vector_backtests(
    strategies=strategies,
    backtest_date_ranges=backtest_windows,
    initial_amount=1000,
)
# Returns: Dict[str, List[Backtest]]
#   key   = strategy identifier (algorithm_id or hash)
#   value = list of Backtest objects (one per date range)

# ── Single strategy (with all its runs) ──
report = BacktestReport(backtests["04a159e7"])
report.show()                          # inline in Jupyter
report.show(browser=True)              # opens in default browser
report.save("single_strategy.html")    # save to file

# ── Compare multiple strategies ──
report = BacktestReport(backtests)     # pass the full dict
report.show()                          # comparison dashboard
```

### 2. From disk (reload in a new session)

```python
# Load all strategies from a batch → comparison view
report = BacktestReport.open("./backtest_results/batch_1/")
report.show()

# Load a single strategy → single-strategy view
report = BacktestReport.open("./backtest_results/batch_1/04a159e7/")
report.show()
```

### 3. Drill-down from comparison to single

```python
report = BacktestReport.open("./backtest_results/batch_1/")
report.show()                          # comparison of all 12 strategies

single = report["04a159e7"]            # select one strategy
single.show()                          # single-strategy dashboard
```

---

## Implementation Plan

### A. New method: `app.run_vector_backtests()` (plural)

```python
# app/app.py

def run_vector_backtests(
    self,
    strategies: List[TradingStrategy],
    backtest_date_ranges: List[BacktestDateRange],
    initial_amount: float = 1000,
    snapshot_interval: SnapshotInterval = SnapshotInterval.DAILY,
    risk_free_rate: Optional[float] = None,
    output_directory: Optional[str] = None,
) -> Dict[str, List[Backtest]]:
    """
    Run vectorised backtests for multiple strategies across multiple
    date ranges.

    Returns a dict keyed by strategy identifier, where each value is
    a list of Backtest objects (one per date range).

    If output_directory is provided, results are persisted to disk in
    the hierarchical format:
        {output_directory}/{strategy_id}/runs/{run_name}/
    """
    results: Dict[str, List[Backtest]] = {}

    for strategy in strategies:
        strategy_id = strategy.algorithm_id  # or hash
        strategy_backtests = []

        for date_range in backtest_date_ranges:
            backtest = self.run_vector_backtest(
                strategy=strategy,
                backtest_date_range=date_range,
                initial_amount=initial_amount,
                snapshot_interval=snapshot_interval,
                risk_free_rate=risk_free_rate,
            )
            strategy_backtests.append(backtest)

        results[strategy_id] = strategy_backtests

        if output_directory:
            self._save_strategy_backtests(
                strategy_id, strategy_backtests, output_directory
            )

    return results
```
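
The method above calls `self._save_strategy_backtests(...)`, which this issue does not define. The sketch below shows one way the hierarchical layout could be written. It is a sketch under assumptions, not the framework's implementation: the function name `save_strategy_runs`, the plain-dict run shape (`name`, `metrics`, `results`), and the placeholder `number_of_runs` key in `summary.json` are all hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def save_strategy_runs(
    strategy_id: str,
    runs: List[Dict[str, Any]],
    output_directory: str,
) -> Path:
    """Write runs to {output_directory}/{strategy_id}/runs/{run_name}/.

    Each run is a plain dict with "name", "metrics" and "results" keys
    (an assumed shape that mirrors the internal format in section B).
    """
    strategy_dir = Path(output_directory) / strategy_id
    strategy_dir.mkdir(parents=True, exist_ok=True)

    for run in runs:
        run_dir = strategy_dir / "runs" / run["name"]
        run_dir.mkdir(parents=True, exist_ok=True)
        (run_dir / "metrics.json").write_text(
            json.dumps(run["metrics"]), encoding="utf-8"
        )
        (run_dir / "run.json").write_text(
            json.dumps(run["results"]), encoding="utf-8"
        )

    (strategy_dir / "algorithm_id.json").write_text(
        json.dumps({"algorithm_id": strategy_id}), encoding="utf-8"
    )
    # Placeholder summary (hypothetical key): the real file would hold
    # metrics aggregated across runs.
    (strategy_dir / "summary.json").write_text(
        json.dumps({"number_of_runs": len(runs)}), encoding="utf-8"
    )
    return strategy_dir
```

Writing `summary.json` last means a directory without it can be treated as incomplete by the loader, which only recognises strategy directories that contain that file.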

### B. Redesigned `BacktestReport`

```python
# app/reporting/backtest_report.py
import json
import os
from typing import Dict, List, Optional


class BacktestReport:
    """
    Unified backtest report that adapts to the data:
      - 1 strategy  → single-strategy dashboard (runs as pages)
      - N strategies → comparison dashboard (strategies as pages)
    """

    def __init__(self, backtests):
        """
        Accept multiple input shapes:
          - List[Backtest]            → single strategy, multiple runs
          - Dict[str, List[Backtest]] → multiple strategies
        """
        # Internal: dict of {strategy_id: {"summary": dict, "runs": [...]}}.
        # Set per instance rather than via @dataclass fields: a hand-written
        # __init__ suppresses the generated one, so a default_factory
        # would never run and self._strategies would not exist.
        self._strategies: Dict[str, dict] = {}
        self._html: Optional[str] = None

        if isinstance(backtests, list):
            # Single strategy: infer ID from first backtest's metadata
            strategy_id = self._infer_strategy_id(backtests)
            self._strategies[strategy_id] = self._build_strategy_entry(backtests)
        elif isinstance(backtests, dict):
            for strategy_id, bt_list in backtests.items():
                self._strategies[strategy_id] = self._build_strategy_entry(bt_list)
        else:
            raise TypeError(
                "backtests must be a List[Backtest] or Dict[str, List[Backtest]]"
            )

    @staticmethod
    def open(directory_path: str) -> "BacktestReport":
        """
        Load from the hierarchical disk format.

        If directory_path points to a single strategy (has summary.json),
        load a single-strategy report.

        If it contains subdirectories with summary.json, load all as a
        comparison report.
        """
        strategies = {}

        if os.path.isfile(os.path.join(directory_path, "summary.json")):
            # Single strategy directory
            strategy_id, entry = BacktestReport._load_strategy_dir(directory_path)
            strategies[strategy_id] = entry
        else:
            # Batch directory — scan for strategy subdirectories
            for name in sorted(os.listdir(directory_path)):
                subdir = os.path.join(directory_path, name)
                if os.path.isdir(subdir) and os.path.isfile(
                    os.path.join(subdir, "summary.json")
                ):
                    strategy_id, entry = BacktestReport._load_strategy_dir(subdir)
                    strategies[strategy_id] = entry

        if not strategies:
            raise OperationalException(
                f"No valid backtest data found in {directory_path}"
            )

        report = BacktestReport.__new__(BacktestReport)
        report._strategies = strategies
        report._html = None
        return report

    def __getitem__(self, strategy_id: str) -> "BacktestReport":
        """
        Drill-down: select a single strategy from a comparison report.
        Returns a new BacktestReport with just that strategy.
        """
        if strategy_id not in self._strategies:
            raise KeyError(f"Strategy '{strategy_id}' not found. "
                           f"Available: {list(self._strategies.keys())}")
        report = BacktestReport.__new__(BacktestReport)
        report._strategies = {strategy_id: self._strategies[strategy_id]}
        report._html = None
        return report

    @property
    def is_single_strategy(self) -> bool:
        return len(self._strategies) == 1

    @property
    def strategy_ids(self) -> list:
        return list(self._strategies.keys())

    def show(self, browser: bool = False):
        """Display the dashboard inline (Jupyter) or in the browser."""
        if not self._html:
            self._html = self._generate_html()

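        if browser or not self._in_jupyter():
            import tempfile
            import webbrowser
            from pathlib import Path

            path = Path(tempfile.gettempdir()) / "backtest_report.html"
            # Write as UTF-8 so symbols such as € survive on platforms
            # whose default encoding differs.
            path.write_text(self._html, encoding="utf-8")
            # as_uri() builds a valid file:// URL on Windows too.
            webbrowser.open(path.as_uri())
        else:
            from IPython.display import display, HTML
            display(HTML(self._html))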

    def save(self, path: str):
        """Save the HTML dashboard to a file."""
        if not self._html:
            self._html = self._generate_html()
        with open(path, "w", encoding="utf-8") as f:
            f.write(self._html)

    def _generate_html(self) -> str:
        """
        Generate the self-contained HTML dashboard.
        Uses canvas-based charts — zero external dependencies.
        Adapts layout based on len(self._strategies).
        """
        # → calls the unified HTML generator (see section C)
        ...

    @staticmethod
    def _build_strategy_entry(backtests: List[Backtest]) -> dict:
        """Convert a list of Backtest objects into the internal format."""
        runs = []
        for bt in backtests:
            runs.append({
                "name": _derive_run_name(bt),
                "metrics": bt.backtest_metrics.to_dict(),
                "results": bt.backtest_results.to_dict(),
            })
        return {
            "summary": _aggregate_summary(runs),
            "runs": runs,
        }

    @staticmethod
    def _load_strategy_dir(directory_path: str):
        """Load a strategy from disk, returning (strategy_id, entry)."""
        with open(os.path.join(directory_path, "summary.json")) as f:
            summary = json.load(f)

        algo_id_path = os.path.join(directory_path, "algorithm_id.json")
        if os.path.isfile(algo_id_path):
            with open(algo_id_path) as f:
                strategy_id = json.load(f).get("algorithm_id", os.path.basename(directory_path))
        else:
            strategy_id = os.path.basename(directory_path)

        runs = []
        runs_dir = os.path.join(directory_path, "runs")
        if os.path.isdir(runs_dir):
            for run_name in sorted(os.listdir(runs_dir)):
                run_path = os.path.join(runs_dir, run_name)
                metrics_path = os.path.join(run_path, "metrics.json")
                results_path = os.path.join(run_path, "run.json")
                if os.path.isfile(metrics_path):
                    with open(metrics_path) as f:
                        metrics = json.load(f)
                    results = {}
                    if os.path.isfile(results_path):
                        with open(results_path) as f:
                            results = json.load(f)
                    runs.append({
                        "name": run_name,
                        "metrics": metrics,
                        "results": results,
                    })

        return strategy_id, {"summary": summary, "runs": runs}

    @staticmethod
    def _in_jupyter() -> bool:
        try:
            return get_ipython().__class__.__name__ == "ZMQInteractiveShell"
        except (NameError, ImportError):
            return False
```
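
`_build_strategy_entry` relies on two helpers, `_derive_run_name` and `_aggregate_summary`, that are not defined in this issue. A minimal sketch of what they might look like follows. The signatures are assumptions: the run name takes a trading symbol and two dates (guessing that the `EUR` segment in `backtest_EUR_20240101_20241231` is the trading symbol), and the summary is a plain mean of every scalar numeric metric shared by all runs, which may not match the aggregation the framework actually uses.

```python
from datetime import date
from statistics import mean
from typing import Any, Dict, List


def derive_run_name(trading_symbol: str, start: date, end: date) -> str:
    """Build a run directory name such as backtest_EUR_20240101_20241231."""
    return f"backtest_{trading_symbol}_{start:%Y%m%d}_{end:%Y%m%d}"


def _is_number(value: Any) -> bool:
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def aggregate_summary(runs: List[Dict[str, Any]]) -> Dict[str, float]:
    """Average each scalar numeric metric that is present in every run."""
    if not runs:
        return {}
    summary: Dict[str, float] = {}
    for key in runs[0]["metrics"]:
        values = [run["metrics"].get(key) for run in runs]
        # Series (lists) and missing values are skipped, not averaged.
        if all(_is_number(v) for v in values):
            summary[key] = mean(values)
    return summary
```

Series-valued metrics such as equity curves are left out of the summary on purpose; they stay available per run under `runs[i]["metrics"]`.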

### C. Self-contained HTML dashboard

The HTML generator is already implemented as a working prototype in `_gen_unified_dashboard.py`. It produces a **zero-dependency**, self-contained HTML file with:

| Feature | Single-Strategy Mode | Multi-Strategy Mode |
|---|---|---|
| **Sidebar** | Overview + individual runs | Overview + strategy names |
| **Overview** | Summary KPIs, runs table, equity overlay (€) | Ranking table with run-view dropdown, normalized equity overlay (%), 2×2 metric bars |
| **Detail pages** | 4 tabs: Overview · Performance · Trades · Risk | 3 tabs: Summary · Runs · Performance |
| **Trades** | Sortable table, donut by symbol, P&L bar | Summary metrics only |
| **Risk** | Rolling Sharpe (252d), underwater equity | — |
| **Charts** | Canvas-based, no external JS libs | Canvas-based, no external JS libs |
| **Theme** | Dark/light toggle | Dark/light toggle |
| **Finterion** | Sponsor page | Sponsor page |

The HTML adapts dynamically at render time based on `STRATEGIES.length === 1`.
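
To illustrate how a pure-Python generator could produce such a file, the sketch below embeds the report data as a JSON literal named `STRATEGIES` inside a single HTML string. It is not the prototype's code: `render_dashboard`, the page template, and the `id` field added to each entry are assumptions made for the example, and the page body is reduced to a single line of text instead of real charts.

```python
import json
from typing import Any, Dict

# Plain placeholder replacement keeps the JavaScript braces readable;
# an f-string would require doubling every { and }.
_PAGE = """<!DOCTYPE html>
<html lang="en">
<head><meta charset="utf-8"><title>Backtest report</title></head>
<body>
<div id="app"></div>
<script>
const STRATEGIES = __STRATEGIES_JSON__;
// One strategy selects the deep-dive layout, several the comparison.
const mode = STRATEGIES.length === 1 ? "single" : "comparison";
document.getElementById("app").textContent =
  mode + " view: " + STRATEGIES.map(s => s.id).join(", ");
</script>
</body>
</html>
"""


def render_dashboard(strategies: Dict[str, Dict[str, Any]]) -> str:
    """Return one self-contained HTML document with the data inlined."""
    payload = [{"id": sid, **entry} for sid, entry in strategies.items()]
    # Escape "</" so a string value can never close the <script> tag early.
    data = json.dumps(payload).replace("</", "<\\/")
    return _PAGE.replace("__STRATEGIES_JSON__", data)
```

Because the data travels inside the document, the saved file needs no network access and can be opened directly from disk.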

---

## Changes to Existing Code

### Files to modify

| File | Change |
|---|---|
| `app/app.py` | Add `run_vector_backtests()` method |
| `app/reporting/backtest_report.py` | Rewrite class (new constructor, `__getitem__`, unified HTML gen) |
| `app/reporting/templates/` | Remove Jinja2 templates (no longer needed) |
| `domain/backtesting/backtest.py` | No changes required |
| `domain/backtesting/backtest_metrics.py` | No changes required |

### Files to add

| File | Purpose |
|---|---|
| `app/reporting/html_generator.py` | Port of `_gen_unified_dashboard.py` — pure-Python HTML string builder |

### Dependencies to remove

| Package | Reason |
|---|---|
| `plotly` | Replaced by canvas-based charts |
| `jinja2` | Replaced by Python f-string template |

### Backward compatibility

| Concern | Mitigation |
|---|---|
| `BacktestReport(backtests=[...])` (current kwarg API) | Support for 1 release via deprecation warning; migrate callers to positional arg |
| `BacktestReport.open(backtests=[], directory_path=None)` | Keep `directory_path` kwarg; drop `backtests` kwarg (use constructor instead) |
| `_is_backtest()` checks for `results.json` | Update to also accept `run.json` |
| `_create_html_report()` | Replaced by `_generate_html()` |

---

## Storage Format Validation

The `_is_backtest` check needs updating. Currently:

```python
# Current — only works with flat format
@staticmethod
def _is_backtest(path):
    return (
        os.path.isfile(os.path.join(path, "results.json"))
        and os.path.isfile(os.path.join(path, "metrics.json"))
    )
```

Proposed detection logic:

```python
@staticmethod
def _is_strategy_dir(path):
    """A strategy dir has summary.json; its runs live under runs/."""
    return (
        os.path.isdir(path)
        and os.path.isfile(os.path.join(path, "summary.json"))
    )

@staticmethod
def _is_run_dir(path):
    """A run dir has metrics.json (and optionally run.json or results.json)."""
    return (
        os.path.isdir(path)
        and os.path.isfile(os.path.join(path, "metrics.json"))
    )
```
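
The two checks can be combined into a single classifier that tells `BacktestReport.open()` which kind of directory it was given. The sketch below is one possible arrangement; `classify_directory` and its return labels are hypothetical and not part of the proposal.

```python
import os


def classify_directory(path: str) -> str:
    """Return "strategy", "run", "batch" or "unknown" for a results path."""
    if not os.path.isdir(path):
        return "unknown"
    # Strategy level: summary.json sits directly in the directory.
    if os.path.isfile(os.path.join(path, "summary.json")):
        return "strategy"
    # Run level: metrics.json sits directly in the directory.
    if os.path.isfile(os.path.join(path, "metrics.json")):
        return "run"
    # Batch level: at least one child directory is a strategy directory.
    for name in os.listdir(path):
        if os.path.isfile(os.path.join(path, name, "summary.json")):
            return "batch"
    return "unknown"
```

Note that a legacy flat directory (`results.json` next to `metrics.json`) is reported as `"run"` by this sketch, since it also contains `metrics.json`.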

---

## Field Name Normalisation

Two naming conventions exist in serialised `metrics.json` files:

| Metric | Convention A (event backtest) | Convention B (vector backtest) |
|---|---|---|
| Equity curve | `equity_curve` | `equity` |
| Drawdown series | `drawdown_series` | `drawdown` |
| Cumulative return | `cumulative_return_series` | `cumulative_return` |
| Monthly returns | `monthly_returns` | `monthly_return` |

The HTML generator should normalise on load:

```python
eq = metrics.get("equity_curve") or metrics.get("equity", [])
dd = metrics.get("drawdown_series") or metrics.get("drawdown", [])
```

Long-term, `BacktestMetrics.to_dict()` should standardise the keys.
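
The two-line fallback above covers only equity and drawdown. A table-driven helper that handles all four pairs might look like the sketch below. Treating Convention A as the canonical naming is an assumption made for the example, as are the names `normalise_metrics` and `_ALIASES`.

```python
from typing import Any, Dict

# Canonical key (Convention A) -> alias (Convention B), from the table above.
_ALIASES = {
    "equity_curve": "equity",
    "drawdown_series": "drawdown",
    "cumulative_return_series": "cumulative_return",
    "monthly_returns": "monthly_return",
}


def normalise_metrics(metrics: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `metrics` keyed by Convention A names."""
    normalised = dict(metrics)  # leave the caller's dict untouched
    for canonical, alias in _ALIASES.items():
        if canonical not in normalised and alias in normalised:
            normalised[canonical] = normalised.pop(alias)
        # Missing series default to an empty list so charts can skip them.
        normalised.setdefault(canonical, [])
    return normalised
```

Unlike the `or`-based fallback, this version decides by key presence, so an empty Convention A series is kept as-is instead of being replaced by the Convention B value.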

---

## Acceptance Criteria

- [ ] `BacktestReport(list_of_backtests)` produces a single-strategy dashboard
- [ ] `BacktestReport(dict_of_backtests)` produces a comparison dashboard
- [ ] `BacktestReport.open(strategy_dir)` loads a single strategy from disk
- [ ] `BacktestReport.open(batch_dir)` loads N strategies from disk
- [ ] `report["strategy_id"]` returns a new single-strategy report
- [ ] `report.show()` renders inline in Jupyter
- [ ] `report.show(browser=True)` opens in the default browser
- [ ] `report.save("path.html")` writes a self-contained HTML file
- [ ] HTML has zero external dependencies (no Plotly, no CDN, no Jinja2)
- [ ] HTML adapts layout dynamically for 1 vs N strategies
- [ ] Dark/light theme toggle works
- [ ] `app.run_vector_backtests()` returns `Dict[str, List[Backtest]]`
- [ ] `app.run_vector_backtests(output_directory=...)` persists to disk
- [ ] Both field name conventions (`equity_curve` / `equity`) are handled

---
The report should look similar to a QuantConnect report:
![image](https://github.com/user-attachments/assets/025e4eed-39a1-4d75-bce7-e719dd85a38c)

![image](https://github.com/user-attachments/assets/d49efe34-d39c-4375-9d0e-d1c9dfbcd44a)

![image](https://github.com/user-attachments/assets/70d6fa30-256e-45d3-a257-599617b66ab2)

![image](https://github.com/user-attachments/assets/8e458cfa-eb74-4f1b-9baa-60e8f94ad619)


HTML Backtest reporting #283
