IBM · nianjunz · Mar 19, 2026 · Feb 4, 2026
diff --git a/.env.public b/.env.public
@@ -1,8 +1,9 @@
 # ── CouchDB (IoTAgent server) ────────────────────────────────────────────────
 COUCHDB_URL=http://localhost:5984
-COUCHDB_DBNAME=chiller
 COUCHDB_USERNAME=admin
 COUCHDB_PASSWORD=password
+IOT_DBNAME=chiller
+WO_DBNAME=workorder
 
 # ── IBM WatsonX (plan-execute runner) ────────────────────────────────────────
 WATSONX_APIKEY=

diff --git a/.github/PULL_REQUEST_TEMPLATE/bugfix.md b/.github/PULL_REQUEST_TEMPLATE/bugfix.md
@@ -0,0 +1,17 @@
+## Description
+## Fix Details
+## Impact on Benchmarking
+- [ ] **No change to baselines**: This fix only improves stability/performance.
+- [ ] **Baseline change**: This fix corrects a scoring error. (Please provide "Before vs. After" results).
+
+## Related Issues
+- Fixes: #
+
+## Verification Steps
+1. Run the following command: `uv run pytest tests/integration`
+2. Describe any manual verification performed:
+
+## Checklist
+- [ ] I have added tests that prove my fix is effective.
+- [ ] My code follows the project's Ruff formatting and linting rules.
+- [ ] I have signed off my commits (DCO).
diff --git a/.github/PULL_REQUEST_TEMPLATE/chore.md b/.github/PULL_REQUEST_TEMPLATE/chore.md
@@ -0,0 +1,8 @@
+## Description
+## Changes
+- [ ] Dependency update (`uv lock`)
+- [ ] Documentation / Tutorial update
+- [ ] Refactoring (no logic change)
+
+## Checklist
+- [ ] I have signed off my commits (DCO).
diff --git a/.github/PULL_REQUEST_TEMPLATE/feature.md b/.github/PULL_REQUEST_TEMPLATE/feature.md
@@ -0,0 +1,21 @@
+## Description
+## Type of Change
+- [ ] New Benchmark Scenario (Industry/Asset type)
+- [ ] Evaluation Metric / Scorer
+- [ ] Agentic Orchestration Logic (ReAct, Plan-Execute, etc.)
+- [ ] Infrastructure / Tooling Improvement
+
+## Industry Relevance
+## Related Issues
+- Refs: #
+
+## Testing & Validation
+- [ ] **Unit Tests**: `uv run pytest tests/unit` passed.
+- [ ] **Scenario Validation**: Verified that the agent can execute the trajectory.
+- [ ] **Data Integrity**: Checked that no PII or sensitive industrial data is included.
+
+## Checklist
+- [ ] My code follows the project's Ruff formatting and linting rules.
+- [ ] I have performed a self-review of my code.
+- [ ] I have updated the documentation (README or /docs) accordingly.
+- [ ] I have signed off my commits (DCO).
diff --git a/.gitignore b/.gitignore
@@ -199,4 +199,5 @@ benchmark/cods_track2/.env.local
 CLAUDE.md
 mcp/couchdb/sample_data/bulk_docs.json
 .env
-mcp/servers/tsfm/artifacts/tsfm_models/
+mcp/servers/tsfm/artifacts/tsfm_models/
+src/tmp/
diff --git a/.python-version b/.python-version
@@ -1 +1 @@
-3.14
+3.12
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -0,0 +1,129 @@
+# Contributing to AssetOpsBench
+
+Thank you for your interest in contributing to **AssetOpsBench**! This project aims to advance the state of Industrial AI by providing a rigorous benchmarking framework for autonomous asset operations.
+
+## How to Contribute
+
+1. **Fork the repository** to your own GitHub account.
+2. **Create a feature branch** from `main` in your fork: `git checkout -b feature/<short-topic>`.
+3. **Keep PRs small and focused**: We prefer PRs with fewer than 300 changed lines to ensure high-quality reviews.
+4. **Follow Conventional Commits** for all commits and PR titles.
+5. **Run formatting and tests** locally before opening a pull request.
+6. **Open a Pull Request** from your fork to `main` with a clear description of the benchmarking impact.
+
+> **Note:** All PRs are merged using **Squash and merge**. The PR title will become the final commit message. Please write it carefully using the Conventional Commits format.
+
+---
+
+## DCO: Developer Certificate of Origin
+
+This repository requires a **DCO 1.1 signoff** on every commit. This is a legal statement asserting that you have the right to submit the code. You can sign off by adding the `-s` or `--signoff` flag:
+
+```bash
+git commit -s -m 'feat(eval): add predictive maintenance scoring for pumps'
+
+```
+
+If you have already made commits without a signoff, you can fix them:
+
+* **Last commit only:** `git commit --amend --no-edit --signoff`
+* **Multiple commits:** `git rebase --signoff HEAD~<n>` (where `<n>` is the number of commits).
+
+Then force-push the rewritten commits to your fork with `git push -f`.
+
+---
+
+## Commit & Branching Standards
+
+### Conventional Commits
+
+We follow the [Conventional Commits](https://www.conventionalcommits.org/) specification.
+
+**Structure:** `<type>[optional scope]: <description>`
+
+* `feat`: New benchmark scenario, asset model, or agentic logic (e.g., ReAct).
+* `fix`: Bug fix in evaluation scripts or data loaders.
+* `docs`: Documentation improvements.
+* `refactor`: Code changes that neither fix a bug nor add a feature.
+* `perf`: Improvements to evaluation speed or data processing.
+
+### Branch Naming
+
+Use the structure: `<type>/<description>`
+
+* **Good:** `feature/hvac-chiller-scenario`, `bugfix/fix-jsonl-loader`
+* **Bad:** `update1`, `feature_new_stuff` (avoid vague names and underscores)
+
+---
+
+## Local Development Setup
+
+We use `uv` for Python dependency management.
+
+### 1. Install Dependencies
+
+```bash
+uv sync --dev
+source .venv/bin/activate
+
+```
+
+### 2. Code Quality & Formatting
+
+We use `ruff` for both linting and formatting. Run these before every commit:
+
+```bash
+uv run ruff format .
+uv run ruff check --fix .
+
+```
+
+### 3. Security Scanning
+
+To protect industrial metadata and API keys, run the IBM `detect-secrets` scan:
+
+```bash
+uv pip install --upgrade "git+https://github.com/ibm/detect-secrets.git@master#egg=detect-secrets"
+detect-secrets scan --update .secrets.baseline
+detect-secrets audit .secrets.baseline
+
+```
+
+---
+
+## Running Tests & Validation
+
+### Unit Tests
+
+Validate core logic for metrics and data parsing:
+
+```bash
+uv run pytest tests/unit
+
+```
+
+### Integration & Benchmark Validation
+
+Verify that agent trajectories and environment simulations run correctly:
+
+```bash
+chmod +x ./scripts/run_tests.sh
+./scripts/run_tests.sh
+
+```
+
+This script validates:
+
+* **Linting**: Ruff validation.
+* **Agentic Logic**: Verification of ReAct and Plan-Execute orchestration.
+* **Asset Consistency**: Ensuring industrial asset IDs (e.g., FailureSensorIQ) match registry definitions.
+
+---
+
+## Pull Request Guidelines
+
+* **Benchmark Integrity**: If your change modifies existing scoring logic, please include a "Before vs. After" comparison in the PR description.
+* **Asset Privacy**: Ensure no real-world sensitive telemetry data is included in scenarios without anonymization.
+* **Documentation**: Update the relevant dataset cards (e.g., for FailureSensorIQ) if you modify the underlying data structures.
+* **PR Templates**: Use the provided templates for Features, Bug Fixes, or Chores to ensure consistent review cycles.
+
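
The Conventional Commits structure (`<type>[optional scope]: <description>`) and the branch naming rule (`<type>/<description>`) described in CONTRIBUTING.md can be checked mechanically before opening a PR. The sketch below is one possible approach, not a script shipped with the repository. The function names, the exact regular expressions, and the choice to accept only the five commit types listed in the guide by default are assumptions.

```python
import re

# Commit types listed in CONTRIBUTING.md; extend as needed (e.g. "chore").
LISTED_TYPES = ("feat", "fix", "docs", "refactor", "perf")

# Assumed branch rule: lowercase prefix, slash, hyphen-separated words.
BRANCH_RE = re.compile(r"[a-z]+/[a-z0-9]+(?:-[a-z0-9]+)*")


def is_conventional_title(title: str, types: tuple[str, ...] = LISTED_TYPES) -> bool:
    """Check a commit or PR title against `<type>[optional scope]: <description>`."""
    type_alternation = "|".join(re.escape(t) for t in types)
    # Optional "(scope)", optional "!" for breaking changes, then ": " and text.
    pattern = rf"(?:{type_alternation})(?:\([^()\s]+\))?!?: \S.*"
    return re.fullmatch(pattern, title) is not None


def is_valid_branch(name: str) -> bool:
    """Check a branch name against `<type>/<description>` without underscores."""
    return BRANCH_RE.fullmatch(name) is not None


if __name__ == "__main__":
    print(is_conventional_title("feat(eval): add predictive maintenance scoring for pumps"))
    print(is_valid_branch("feature/hvac-chiller-scenario"))
```

Because PRs are squash-merged and the PR title becomes the commit message, running a check like this against the PR title catches most formatting problems. The allowed types are a parameter so that a project can widen the list without editing the pattern.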