Merged
36 changes: 16 additions & 20 deletions PLAN.md
@@ -542,26 +542,22 @@ Implementation details:

 ## 7. Next steps

-**Phases 0 through 2 are complete and merged** (v1 = table-only RAG; Phase 2 = DocLayNet
-layout-crop integration, merged to `main` 2026-06-03). The active branch is
-**Phase 3 (FUNSD relation branch)**, `feature/phase3-funsd-relations` off `main`.
-
-Phase 3 V1 is implemented and scored (entirely local, CPU-only, no Colab):
-
-1. Annotation-only deterministic relation baseline: `src/funsd_extraction.py` (parse + dedupe
-+ per-answer-argmax predictor), `src/eval_funsd.py` (set-based P/R/F1),
-`scripts/evaluate_funsd.py` (+ `scripts/fetch_funsd.py`), `tests/test_funsd_relations.py`
-(17 synthetic tests). Full suite 236 passed.
-2. Headline (held-out `test_50.qa_links`): P 0.946 / R 0.590 / **F1 0.727**; secondaries in
-`DEVLOG.md` (2026-06-03) and `outputs/evaluation/phase3_funsd_relations.json`.
-
-Remaining:
-
-1. Open the Phase 3 PR.
-2. Optional (train-only): tune `HeuristicParams` on `train_149` if higher recall is wanted;
-never on `test_50`. FUNSD token classification (V2 / seqeval) and threshold-based multi-link
-matching are future work, not V1.
-3. Phase 4 (full demo + evaluation + report) is the next phase.
+**Phases 0 through 3 are complete and merged** (v1 = table-only RAG; Phase 2 = DocLayNet
+layout-crop integration; Phase 3 = FUNSD relation baseline, both merged to `main`
+2026-06-03). **Phase 4 (full demo + evaluation + report) is the next phase.**
+
+Phase 3 V1 delivered (annotation-only deterministic relation baseline; see
+`docs/phase3_brief.md`): `src/funsd_extraction.py` (parse + dedupe + per-answer-argmax
+predictor), `src/eval_funsd.py` (set-based P/R/F1), `scripts/evaluate_funsd.py`
+(+ `scripts/fetch_funsd.py`), `tests/test_funsd_relations.py` (17 synthetic tests),
+`notebooks/05_phase3_funsd_relations.ipynb` (runner). Headline (held-out `test_50.qa_links`):
+P 0.946 / R 0.590 / **F1 0.727**; full matrix in `DEVLOG.md` (2026-06-03) and
+`outputs/evaluation/phase3_funsd_relations.json`.
+
+Phase 3 follow-ups (optional, deferred to future work):
+- Train-only: tune `HeuristicParams` on `train_149` if higher recall is wanted; never on
+`test_50`.
+- FUNSD token classification (V2 / seqeval) and threshold-based multi-link matching.

 ---

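For readers checking the headline numbers in the PLAN.md hunk above, here is a minimal sketch of what "set-based P/R/F1" over relation links usually means. It is an illustration only: the real scorer is in `src/eval_funsd.py`, which this diff does not show. The names `set_prf1` and `f1_from` are hypothetical, as is the `(question_id, answer_id)` pair layout. The toy ids are made up and are not FUNSD data.

```python
from typing import Iterable, Tuple

# Assumption: a relation is a (question_id, answer_id) pair.
Link = Tuple[int, int]


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def set_prf1(predicted: Iterable[Link], gold: Iterable[Link]) -> Tuple[float, float, float]:
    """Set-based precision, recall, and F1 over relation pairs (hypothetical helper)."""
    pred_set, gold_set = set(predicted), set(gold)
    true_pos = len(pred_set & gold_set)  # links present in both sets
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    return precision, recall, f1_from(precision, recall)


# Toy example: 2 of 3 predictions are correct; 2 of 4 gold links are recovered.
toy_pred = [(1, 2), (3, 4), (5, 6)]
toy_gold = [(1, 2), (3, 4), (7, 8), (9, 10)]
p, r, f1 = set_prf1(toy_pred, toy_gold)

# Consistency check on the reported headline: P 0.946 and R 0.590.
headline_f1 = round(f1_from(0.946, 0.590), 3)
print(p, r, f1, headline_f1)
```

The last check confirms that the reported F1 of 0.727 is the harmonic mean of the reported precision and recall: 2 × 0.946 × 0.590 / (0.946 + 0.590) ≈ 0.7267.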
6 changes: 3 additions & 3 deletions notebooks/05_phase3_funsd_relations.ipynb
@@ -10,7 +10,7 @@
 "\n",
 "Phase 3 is annotation-only and CPU-only. The FUNSD JSON carries entity text, bbox, label, and GT linking pairs, so this notebook does not load image pixels and does not need a GPU. Logic lives in `src/` and `scripts/`, not in this notebook.\n",
 "\n",
-"Before running in Colab, make sure `feature/phase3-funsd-relations` has been pushed to GitHub. After Phase 3 merges, set `BRANCH = 'main'` in the boot cell."
+"Phase 3 is merged to `main`, so the boot cell pins `BRANCH = 'main'`. Repin it to a dev branch only if you resume Phase 3 work."
 ]
 },
 {
@@ -41,7 +41,7 @@
 "import os\n",
 "\n",
 "REPO = '/content/FinDocStructRAG'\n",
-"BRANCH = 'feature/phase3-funsd-relations' # change to 'main' after Phase 3 merges\n",
+"BRANCH = 'main' # Phase 3 merged; tracks main\n",
 "\n",
 "if not os.path.isdir(f'{REPO}/.git'):\n",
 " !git clone --quiet https://github.com/AD2000X/FinDocStructRAG.git {REPO}\n",
@@ -192,7 +192,7 @@
 "source": [
 "## Step 4a - Error table\n",
 "\n",
-"Notebook-only qualitative error analysis. This reads FUNSD JSON annotations and relation predictions; it does not write artifacts and does not affect the Phase 3 acceptance gate."
+"Notebook-only qualitative error analysis. This reads FUNSD JSON annotations and relation predictions; it does not write artifacts and does not affect the Phase 3 acceptance gate. Step 4 helper functions are notebook-local display utilities only; Phase 3 scoring remains in `src/` and `scripts/`."
 ]
 },
 {
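The Step 4 helper code is not part of this diff, so the following is only a rough sketch of what a notebook-local error table could look like, assuming predictions and gold links are both `(question_id, answer_id)` pairs. The function name `error_rows` and the row keys are hypothetical. Like the helpers the markdown cell describes, it only builds rows for display and writes no artifacts.

```python
from typing import Dict, Iterable, List, Tuple

# Assumption: a relation is a (question_id, answer_id) pair.
Link = Tuple[int, int]


def error_rows(predicted: Iterable[Link], gold: Iterable[Link]) -> List[Dict[str, object]]:
    """Label every link as TP, FP, or FN for qualitative inspection (display only)."""
    pred_set, gold_set = set(predicted), set(gold)
    rows: List[Dict[str, object]] = []
    for question_id, answer_id in sorted(pred_set | gold_set):
        link = (question_id, answer_id)
        if link in pred_set and link in gold_set:
            outcome = "TP"  # predicted and present in the gold annotations
        elif link in pred_set:
            outcome = "FP"  # predicted but absent from the gold annotations
        else:
            outcome = "FN"  # gold link the predictor missed
        rows.append({"question_id": question_id, "answer_id": answer_id, "outcome": outcome})
    return rows


# Toy ids only, not FUNSD data.
rows = error_rows(predicted=[(1, 2), (5, 6)], gold=[(1, 2), (7, 8)])
for row in rows:
    print(row)
```

Filtering the rows to `FN` would be one way to inspect the recall gap behind the R 0.590 headline without touching the scoring code in `src/` and `scripts/`.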