Merged
152 changes: 152 additions & 0 deletions notebooks/06_demo.ipynb
@@ -0,0 +1,152 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Phase 4 - Demo (Colab runner)\n",
"\n",
"Runner only: mount Drive, pull the Phase 4 branch, install gradio, regenerate the summary, then\n",
"launch the artifact-backed Gradio demo. Logic lives in `scripts/run_demo.py` and `src/`, not in\n",
"this notebook (P1/P2).\n",
"\n",
"The demo degrades gracefully: it launches with **no** `OPENROUTER_API_KEY` (metrics, artifact\n",
"views, and BM25 retrieval all work; the answer-generation tab is shown as disabled). Dense/RRF retrieval\n",
"becomes available only if the embedding stack is installed. Set the key in the optional cell below to enable\n",
"grounded answer generation."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Boot"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 1. Mount Drive so config.OUTPUT_ROOT (the staged artifacts + chunks) is available.\n",
"from google.colab import drive\n",
"drive.mount('/content/drive')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 2. Get the code onto the VM and pin the Phase 4 branch.\n",
"# Repin BRANCH to 'main' once the Phase 4 PRs are merged.\n",
"import os\n",
"\n",
"REPO = '/content/FinDocStructRAG'\n",
"BRANCH = 'feature/phase4-demo' # PR-C; flip to 'main' after merge\n",
"\n",
"if not os.path.isdir(f'{REPO}/.git'):\n",
"    !git clone --quiet https://github.com/AD2000X/FinDocStructRAG.git {REPO}\n",
"\n",
"!cd {REPO} && git fetch origin --quiet\n",
"!cd {REPO} && git checkout {BRANCH} && git pull --ff-only origin {BRANCH}\n",
"!cd {REPO} && echo branch: $(git rev-parse --abbrev-ref HEAD) HEAD: $(git log --oneline -1)\n",
"%cd /content/FinDocStructRAG"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Install demo deps\n",
"\n",
"gradio is a demo-only dependency (not in `requirements-core.txt`)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!python -m pip install -q gradio"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## (Optional) enable answer generation\n",
"\n",
"Leave this cell as-is to run retrieval-only (no key needed). To enable grounded answer generation,\n",
"store `OPENROUTER_API_KEY` in Colab Secrets and uncomment the two lines."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"# from google.colab import userdata\n",
"# os.environ['OPENROUTER_API_KEY'] = userdata.get('OPENROUTER_API_KEY')\n",
"print('answer generation:', 'enabled' if os.getenv('OPENROUTER_API_KEY') else 'disabled (retrieval-only)')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Build the summary\n",
"\n",
"Regenerate the summary so the Overview tab and the embedded metrics table are fresh."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!python scripts/build_phase4_summary.py"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Launch the demo\n",
"\n",
"This cell stays running and prints a public `share` URL. Open it to use the tabs: Overview,\n",
"Table QA, Table Extraction, Layout, FUNSD Relations, Limitations. Stop the cell to shut the app\n",
"down."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!python scripts/run_demo.py"
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
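
The notebook's intro describes graceful degradation: BM25 always works, Dense/RRF retrieval depends on the embedding stack, and answer generation depends on `OPENROUTER_API_KEY`. A minimal sketch of how such a capability check might look is below. This is not the logic in `scripts/run_demo.py`; the function name `detect_capabilities`, the returned keys, and the default embedding package name `sentence_transformers` are all assumptions made for illustration.

```python
import importlib.util
import os


def detect_capabilities(env=None, embedding_module="sentence_transformers"):
    """Report which demo features could be enabled in the current runtime.

    Hypothetical helper: `embedding_module` is an assumed package name,
    not taken from the repository.
    """
    env = os.environ if env is None else env

    # Answer generation needs a non-empty API key in the environment.
    has_key = bool(env.get("OPENROUTER_API_KEY"))

    # find_spec returns None when a top-level module is not installed,
    # so nothing heavy is imported just to probe for it.
    has_embeddings = importlib.util.find_spec(embedding_module) is not None

    return {
        "bm25": True,  # lexical retrieval has no optional dependency here
        "dense_rrf": has_embeddings,
        "answer_generation": has_key,
    }


if __name__ == "__main__":
    for feature, enabled in detect_capabilities().items():
        print(f"{feature}: {'enabled' if enabled else 'disabled'}")
```

Probing with `importlib.util.find_spec` rather than a bare `import` keeps the check cheap, and passing `env` explicitly makes the function easy to exercise without touching the real environment.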