Merged
28 commits
6435850
feat(gateway/sandbox): add global and sandbox runtime settings flow
johntmyers Mar 17, 2026
70c32f6
feat(settings): wip sandbox settings channel and typed registry
johntmyers Mar 17, 2026
e9ee7e7
feat(settings): wip global settings get and full key materialization
johntmyers Mar 18, 2026
b4bb0a2
fix(settings): use prefixed ID for sandbox settings to avoid object s…
johntmyers Mar 18, 2026
28f77ef
feat(tui): add global settings tab with typed editing and HITL confir…
johntmyers Mar 18, 2026
e73124f
feat(settings): support sandbox-scoped setting delete when not global…
johntmyers Mar 18, 2026
6864636
feat(tui): add per-sandbox settings tab with scope indicators and edi…
johntmyers Mar 18, 2026
0d66f3d
refactor(sandbox): improve poll loop logging to diff settings and con…
johntmyers Mar 18, 2026
cd1d42c
update arch docs for new settings comms channel
johntmyers Mar 18, 2026
87b097e
fix(settings): add mutex to serialize settings mutations and prevent …
johntmyers Mar 18, 2026
8a4a7fb
fix(settings): prefix global ID, use wrapping_add, add --json output,…
johntmyers Mar 18, 2026
2273894
refactor(proto): rename UpdateSandboxPolicy to UpdateSettings for con…
johntmyers Mar 18, 2026
051df3c
fix(settings): address remaining review findings (W3-W6, S1)
johntmyers Mar 18, 2026
c659d93
feat(settings): add global policy versioning with revision history an…
johntmyers Mar 18, 2026
2dac61e
feat(settings): add global policy versioning, dashboard indicator, an…
johntmyers Mar 19, 2026
befaf78
chore: fix rustfmt import ordering
johntmyers Mar 19, 2026
aef42f2
fix(e2e): update Python tests for UpdateSandboxPolicy -> UpdateSettin…
johntmyers Mar 19, 2026
f0fdbd0
fix(tui): add Left arrow key to sandbox policy/settings tab switching
johntmyers Mar 19, 2026
11ff1e9
chore(settings): gate dev keys behind feature flag, filter stale keys…
johntmyers Mar 19, 2026
efe4c52
chore: fix rustfmt import ordering in settings.rs
johntmyers Mar 19, 2026
1b761b0
fix(settings): gate CLI tests referencing dev-settings keys
johntmyers Mar 19, 2026
633e970
fix(settings): block draft chunk approval when global policy is active
johntmyers Mar 20, 2026
408650f
fix(settings): ensure global policy dedup still writes settings blob
johntmyers Mar 20, 2026
c9166c7
fix(settings): supersede global policy revisions when global policy i…
johntmyers Mar 20, 2026
c95de83
fix(settings): skip dedup when latest global policy revision is super…
johntmyers Mar 20, 2026
1998927
docs(architecture): document global policy lifecycle, state machine, …
johntmyers Mar 20, 2026
682cbbf
refactor(proto): rename GetSandboxSettings/GetGatewaySettings to GetS…
johntmyers Mar 20, 2026
b0c4c85
fix(e2e): update remaining UpdateSandboxPolicy reference in Python test
johntmyers Mar 20, 2026
3 changes: 3 additions & 0 deletions .github/workflows/docker-build.yml
@@ -90,4 +90,7 @@ jobs:
env:
DOCKER_BUILDER: openshell
OPENSHELL_CARGO_VERSION: ${{ steps.version.outputs.cargo_version }}
# Enable dev-settings feature for test settings (dummy_bool, dummy_int)
# used by e2e tests.
EXTRA_CARGO_FEATURES: openshell-core/dev-settings
run: mise run --no-prepare docker:build:${{ inputs.component }}
9 changes: 6 additions & 3 deletions architecture/README.md
@@ -224,17 +224,19 @@ Sandbox behavior is governed by policies written in YAML and evaluated by an emb

Inference routing to `inference.local` is configured separately at the cluster level and does not require network policy entries. The OPA engine evaluates only explicit network policies; `inference.local` connections bypass OPA entirely and are handled by the proxy's dedicated inference interception path.

Policies are not intended to be hand-edited by end users in normal operation. They are associated with sandboxes at creation time and fetched by the sandbox supervisor at startup via gRPC. For development and testing, policies can also be loaded from local files.
Policies are not intended to be hand-edited by end users in normal operation. They are associated with sandboxes at creation time and fetched by the sandbox supervisor at startup via gRPC. For development and testing, policies can also be loaded from local files. A gateway-global policy can override all sandbox policies via `openshell policy set --global`.

For more detail, see [Policy Language](security-policy.md).
In addition to policy, the gateway delivers runtime **settings** -- typed key-value pairs (e.g., `log_level`) that can be configured per-sandbox or globally. Settings and policy are delivered together through the `GetSandboxSettings` RPC and tracked by a single `config_revision` fingerprint. See [Gateway Settings Channel](gateway-settings.md) for details.

For more detail on the policy language, see [Policy Language](security-policy.md).
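
The change detection that the `config_revision` fingerprint enables can be sketched as below. This is a minimal illustration, not the sandbox's actual poll loop: the `Snapshot` and `Poller` types are hypothetical, and representing `config_revision` as a `u64` is an assumption.

```rust
/// Hypothetical view of one poll result. Only `config_revision` is named in
/// the docs; its `u64` type and the `log_level` field are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Snapshot {
    pub config_revision: u64,
    pub log_level: Option<String>,
}

/// Remembers the last applied revision so unchanged polls can be skipped.
pub struct Poller {
    last_revision: Option<u64>,
}

impl Poller {
    pub fn new() -> Self {
        Self { last_revision: None }
    }

    /// Returns true when the snapshot carries a revision that has not been
    /// applied yet, and records it as applied.
    pub fn observe(&mut self, snapshot: &Snapshot) -> bool {
        if self.last_revision == Some(snapshot.config_revision) {
            return false;
        }
        self.last_revision = Some(snapshot.config_revision);
        true
    }
}
```

Because policy and settings share one fingerprint, a caller only needs a single comparison per poll to decide whether anything has to be reapplied.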

### Command-Line Interface

The CLI is the primary way users interact with the platform. It provides commands organized into five groups:

- **Gateway management** (`openshell gateway`): Deploy, stop, destroy, and inspect clusters. Supports both local and remote (SSH) targets.
- **Sandbox management** (`openshell sandbox`): Create sandboxes (with optional file upload and provider auto-discovery), connect to sandboxes via SSH, and delete sandboxes.
- **Top-level commands**: `openshell status` (cluster health), `openshell logs` (sandbox logs), `openshell forward` (port forwarding), `openshell policy` (sandbox policy management).
- **Top-level commands**: `openshell status` (cluster health), `openshell logs` (sandbox logs), `openshell forward` (port forwarding), `openshell policy` (sandbox policy management), `openshell settings` (effective sandbox settings and global/sandbox key updates).
- **Provider management** (`openshell provider`): Create, update, list, and delete external service credentials.
- **Inference management** (`openshell cluster inference`): Configure cluster-level inference by specifying a provider and model. The gateway resolves endpoint and credential details from the named provider record.

@@ -297,4 +299,5 @@ This opens an interactive SSH session into the sandbox, with all provider creden
| [Policy Language](security-policy.md) | The YAML/Rego policy system that governs sandbox behavior. |
| [Inference Routing](inference-routing.md) | Transparent interception and sandbox-local routing of AI inference API calls to configured backends. |
| [System Architecture](system-architecture.md) | Top-level system architecture diagram with all deployable components and communication flows. |
| [Gateway Settings Channel](gateway-settings.md) | Runtime settings channel: two-tier key-value configuration, global policy override, settings registry, CLI/TUI commands. |
| [TUI](tui.md) | Terminal user interface for sandbox interaction. |
2 changes: 1 addition & 1 deletion architecture/gateway-security.md
@@ -229,7 +229,7 @@ These are used to build a `tonic::transport::ClientTlsConfig` with:
- `identity()` -- presents the shared client certificate for mTLS.

The sandbox calls two RPCs over this authenticated channel:
- `GetSandboxPolicy` -- fetches the YAML policy that governs the sandbox's behavior.
- `GetSandboxSettings` -- fetches the sandbox's effective configuration: the YAML policy that governs its behavior, plus runtime settings.
- `GetSandboxProviderEnvironment` -- fetches provider credentials as environment variables.

## SSH Tunnel Authentication
561 changes: 561 additions & 0 deletions architecture/gateway-settings.md

Large diffs are not rendered by default.

36 changes: 23 additions & 13 deletions architecture/gateway.md
@@ -82,7 +82,7 @@ Proto definitions consumed by the gateway:
| `proto/openshell.proto` | `openshell.v1` | `OpenShell` service, sandbox/provider/SSH/watch messages |
| `proto/inference.proto` | `openshell.inference.v1` | `Inference` service: `SetClusterInference`, `GetClusterInference`, `GetInferenceBundle` |
| `proto/datamodel.proto` | `openshell.datamodel.v1` | `Sandbox`, `SandboxSpec`, `SandboxStatus`, `Provider`, `SandboxPhase` |
| `proto/sandbox.proto` | `openshell.sandbox.v1` | `SandboxPolicy`, `NetworkPolicyRule` |
| `proto/sandbox.proto` | `openshell.sandbox.v1` | `SandboxPolicy`, `NetworkPolicyRule`, `SettingValue`, `EffectiveSetting`, `SettingScope`, `PolicySource`, `GetSandboxSettingsRequest/Response`, `GetGatewaySettingsRequest/Response` |

## Startup Sequence

@@ -141,6 +141,9 @@ pub struct ServerState {
pub sandbox_index: SandboxIndex,
pub sandbox_watch_bus: SandboxWatchBus,
pub tracing_log_bus: TracingLogBus,
pub ssh_connections_by_token: Mutex<HashMap<String, u32>>,
pub ssh_connections_by_sandbox: Mutex<HashMap<String, u32>>,
pub settings_mutex: tokio::sync::Mutex<()>,
}
```

@@ -149,6 +152,7 @@ pub struct ServerState {
- **`sandbox_index`** -- in-memory bidirectional index mapping sandbox names and agent pod names to sandbox IDs. Used by the event tailer to correlate Kubernetes events.
- **`sandbox_watch_bus`** -- `broadcast`-based notification bus keyed by sandbox ID. Producers call `notify(&id)` when the persisted sandbox record changes; consumers in `WatchSandbox` streams receive `()` signals and re-read the record.
- **`tracing_log_bus`** -- captures `tracing` events that include a `sandbox_id` field and republishes them as `SandboxLogLine` messages. Maintains a per-sandbox tail buffer (default 200 entries). Also contains a nested `PlatformEventBus` for Kubernetes events.
- **`settings_mutex`** -- serializes settings mutations (global and sandbox) to prevent read-modify-write races. Held for the duration of any setting set/delete or global policy set/delete operation. See [Gateway Settings Channel](gateway-settings.md#global-policy-lifecycle).
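
The read-modify-write protection described for `settings_mutex` can be sketched as follows. This is a simplified, synchronous stand-in: it uses `std::sync::Mutex` instead of `tokio::sync::Mutex`, an in-memory `StoredSettings` struct instead of the persistent store, and hypothetical `set`/`delete` helpers. Bumping a revision counter with `wrapping_add` is also an assumption for illustration.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical in-memory stand-in for the persisted settings blob.
#[derive(Clone, Debug, Default)]
pub struct StoredSettings {
    pub revision: u64,
    pub values: HashMap<String, String>,
}

pub struct SettingsState {
    /// Stand-in for the store: each load or save is its own short critical section.
    store: Mutex<StoredSettings>,
    /// Guard held across the whole load-modify-save sequence.
    settings_mutex: Mutex<()>,
}

impl SettingsState {
    pub fn new() -> Self {
        Self {
            store: Mutex::new(StoredSettings::default()),
            settings_mutex: Mutex::new(()),
        }
    }

    pub fn load(&self) -> StoredSettings {
        self.store.lock().unwrap().clone()
    }

    fn save(&self, next: StoredSettings) {
        *self.store.lock().unwrap() = next;
    }

    /// Sets a key and returns the new revision. Without `_guard`, two callers
    /// could both load the same blob and the later save would drop one update.
    pub fn set(&self, key: &str, value: &str) -> u64 {
        let _guard = self.settings_mutex.lock().unwrap();
        let mut next = self.load();
        next.values.insert(key.to_string(), value.to_string());
        next.revision = next.revision.wrapping_add(1);
        let revision = next.revision;
        self.save(next);
        revision
    }

    /// Deletes a key; returns whether it was present.
    pub fn delete(&self, key: &str) -> bool {
        let _guard = self.settings_mutex.lock().unwrap();
        let mut next = self.load();
        let removed = next.values.remove(key).is_some();
        if removed {
            next.revision = next.revision.wrapping_add(1);
            self.save(next);
        }
        removed
    }
}
```

The unit type inside the mutex reflects that the lock protects a sequence of operations rather than a particular value, which matches the `Mutex<()>` shape of the field in `ServerState`.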

## Protocol Multiplexing

@@ -225,13 +229,14 @@ Full CRUD for `Provider` objects, which store typed credentials (e.g., API keys
| `UpdateProvider` | Updates an existing provider by name. Preserves the stored `id` and `name`; replaces `type`, `credentials`, and `config`. |
| `DeleteProvider` | Deletes a provider by name. Returns `deleted: true/false`. |

#### Policy and Provider Environment Delivery
#### Policy, Settings, and Provider Environment Delivery

These RPCs are called by sandbox pods at startup to bootstrap themselves.
These RPCs are called by sandbox pods at startup and during runtime polling.

| RPC | Description |
|-----|-------------|
| `GetSandboxPolicy` | Returns the `SandboxPolicy` from a sandbox's spec, looked up by sandbox ID. |
| `GetSandboxSettings` | Returns effective sandbox config looked up by sandbox ID: policy payload, policy metadata (version, hash, source, `global_policy_version`), merged effective settings, and a `config_revision` fingerprint for change detection. Two-tier resolution: registered keys start unset, sandbox values overlay, global values override. The reserved `policy` key in global settings can override the sandbox's own policy. When a global policy is active, `policy_source` is `GLOBAL` and `global_policy_version` carries the active revision number. See [Gateway Settings Channel](gateway-settings.md). |
| `GetGatewaySettings` | Returns gateway-global settings only (excluding the reserved `policy` key). Returns registered keys with empty values when unconfigured, and a monotonic `settings_revision`. |
| `GetSandboxProviderEnvironment` | Resolves provider credentials into environment variables for a sandbox. Iterates the sandbox's `spec.providers` list, fetches each `Provider`, and collects credential key-value pairs. First provider wins on duplicate keys. Skips credential keys that do not match `^[A-Za-z_][A-Za-z0-9_]*$`. |
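
The two-tier resolution described for `GetSandboxSettings` (registered keys start unset, sandbox values overlay, global values override) can be sketched as below. The `Scope` and `Effective` types and the `resolve` function are hypothetical simplifications, not the proto's `SettingScope` or `EffectiveSetting`. Values are plain strings here rather than typed values, and dropping unregistered keys is an assumption.

```rust
use std::collections::BTreeMap;

/// Where an effective value came from (hypothetical simplification).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Scope {
    Unset,
    Sandbox,
    Global,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Effective {
    pub value: Option<String>,
    pub scope: Scope,
}

/// Resolves every registered key: global wins over sandbox, and a key with
/// no value at either tier is reported as unset. Keys that are not
/// registered are ignored (an assumption of this sketch).
pub fn resolve(
    registered: &[&str],
    sandbox: &BTreeMap<String, String>,
    global: &BTreeMap<String, String>,
) -> BTreeMap<String, Effective> {
    let mut out = BTreeMap::new();
    for key in registered {
        let effective = if let Some(v) = global.get(*key) {
            Effective { value: Some(v.clone()), scope: Scope::Global }
        } else if let Some(v) = sandbox.get(*key) {
            Effective { value: Some(v.clone()), scope: Scope::Sandbox }
        } else {
            Effective { value: None, scope: Scope::Unset }
        };
        out.insert((*key).to_string(), effective);
    }
    out
}
```

Iterating over the registry rather than over the stored maps means every registered key appears in the result, including those with no configured value.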

#### Policy Recommendation (Network Rules)
@@ -242,9 +247,9 @@ These RPCs support the sandbox-initiated policy recommendation pipeline. The san
|-----|-------------|
| `SubmitPolicyAnalysis` | Receives pre-formed `PolicyChunk` proposals from a sandbox. Validates each chunk, persists via upsert on `(sandbox_id, host, port, binary)` dedup key, notifies watch bus. |
| `GetDraftPolicy` | Returns all draft chunks for a sandbox with current draft version. |
| `ApproveDraftChunk` | Approves a pending or rejected chunk. Merges the proposed rule into the active policy (appends binary to existing rule or inserts new rule). |
| `RejectDraftChunk` | Rejects a pending chunk or revokes an approved chunk. If revoking, removes the binary from the active policy rule. |
| `ApproveAllDraftChunks` | Bulk approves all pending chunks for a sandbox. |
| `ApproveDraftChunk` | Approves a pending or rejected chunk. Merges the proposed rule into the active policy (appends binary to existing rule or inserts new rule). **Blocked when a global policy is active** -- returns `FailedPrecondition`. |
| `RejectDraftChunk` | Rejects a pending chunk or revokes an approved chunk. If revoking, removes the binary from the active policy rule. Rejection of `pending` chunks is always allowed. **Revoking approved chunks is blocked when a global policy is active** -- returns `FailedPrecondition`. |
| `ApproveAllDraftChunks` | Bulk approves all pending chunks for a sandbox. **Blocked when a global policy is active** -- returns `FailedPrecondition`. |
| `EditDraftChunk` | Updates the proposed rule on a pending chunk. |
| `GetDraftHistory` | Returns all chunks (including rejected) for audit trail. |
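
The global-policy guard on the draft chunk RPCs can be sketched as below. This is an illustration under assumptions: `ChunkStatus`, `GuardError`, and both check functions are hypothetical, and a local enum stands in for the gRPC `FailedPrecondition` status. How a reject request on an already rejected chunk is handled is not stated above, so the sketch simply lets it through.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChunkStatus {
    Pending,
    Approved,
    Rejected,
}

/// Local stand-in for a gRPC status with code `FailedPrecondition`.
#[derive(Debug, PartialEq, Eq)]
pub enum GuardError {
    FailedPrecondition(&'static str),
}

/// Approval merges a rule into the active policy, so it is refused
/// whenever a global policy is active.
pub fn check_approve(global_policy_active: bool) -> Result<(), GuardError> {
    if global_policy_active {
        return Err(GuardError::FailedPrecondition("global policy is active"));
    }
    Ok(())
}

/// Rejecting a pending chunk never touches the active policy and is always
/// allowed. Revoking an approved chunk edits the active policy, so it is
/// refused while a global policy is active.
pub fn check_reject(global_policy_active: bool, status: ChunkStatus) -> Result<(), GuardError> {
    match status {
        ChunkStatus::Approved if global_policy_active => {
            Err(GuardError::FailedPrecondition("global policy is active"))
        }
        _ => Ok(()),
    }
}
```

The asymmetry follows from what each operation touches: only approval and revocation modify the active sandbox policy, which a global policy overrides.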

@@ -457,12 +462,16 @@ Objects are identified by `(object_type, id)` with a unique constraint on `(obje

### Object Types

| Object type string | Proto message | Traits implemented |
|--------------------|---------------|-------------------|
| `"sandbox"` | `Sandbox` | `ObjectType`, `ObjectId`, `ObjectName` |
| `"provider"` | `Provider` | `ObjectType`, `ObjectId`, `ObjectName` |
| `"ssh_session"` | `SshSession` | `ObjectType`, `ObjectId`, `ObjectName` |
| `"inference_route"` | `InferenceRoute` | `ObjectType`, `ObjectId`, `ObjectName` |
| Object type string | Proto message / format | Traits implemented | Notes |
|--------------------|------------------------|-------------------|-------|
| `"sandbox"` | `Sandbox` | `ObjectType`, `ObjectId`, `ObjectName` | |
| `"provider"` | `Provider` | `ObjectType`, `ObjectId`, `ObjectName` | |
| `"ssh_session"` | `SshSession` | `ObjectType`, `ObjectId`, `ObjectName` | |
| `"inference_route"` | `InferenceRoute` | `ObjectType`, `ObjectId`, `ObjectName` | |
| `"gateway_settings"` | JSON `StoredSettings` | Generic `put`/`get` | Singleton, id=`"global"`. Contains the reserved `policy` key for global policy delivery. |
| `"sandbox_settings"` | JSON `StoredSettings` | Generic `put`/`get` | Per-sandbox, id=`"settings:{sandbox_uuid}"` |

The `sandbox_policies` table stores versioned policy revisions for both sandbox-scoped and global policies. Global revisions use the sentinel `sandbox_id = "__global__"`. See [Gateway Settings Channel](gateway-settings.md#storage-model) for schema details.
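
The identifier conventions in the table and paragraph above can be captured in a few helpers. The constant and function names here are hypothetical; only the literal values `"global"`, `"settings:{sandbox_uuid}"`, and `"__global__"` come from the text.

```rust
/// Id of the singleton `gateway_settings` object.
pub const GLOBAL_SETTINGS_ID: &str = "global";

/// Sentinel `sandbox_id` used for global revisions in `sandbox_policies`.
pub const GLOBAL_POLICY_SANDBOX_ID: &str = "__global__";

/// Builds the id of a per-sandbox `sandbox_settings` object.
pub fn sandbox_settings_id(sandbox_uuid: &str) -> String {
    format!("settings:{}", sandbox_uuid)
}

/// Returns true when a policy revision row belongs to the global policy.
pub fn is_global_revision(sandbox_id: &str) -> bool {
    sandbox_id == GLOBAL_POLICY_SANDBOX_ID
}
```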

### Generic Protobuf Codec

@@ -559,6 +568,7 @@ Updated by the sandbox watcher on every Applied event and by gRPC handlers durin
## Cross-References

- [Sandbox Architecture](sandbox.md) -- sandbox-side policy enforcement, proxy, and isolation details
- [Gateway Settings Channel](gateway-settings.md) -- runtime settings channel, two-tier resolution, CLI/TUI commands
- [Inference Routing](inference-routing.md) -- end-to-end inference interception flow, sandbox-side proxy logic, and route resolution
- [Container Management](build-containers.md) -- how sandbox container images are built and configured
- [Sandbox Connect](sandbox-connect.md) -- client-side SSH connection flow
2 changes: 1 addition & 1 deletion architecture/sandbox-providers.md
@@ -241,7 +241,7 @@ variables (injected into the pod spec by the gateway's Kubernetes sandbox creati

In `run_sandbox()` (`crates/openshell-sandbox/src/lib.rs`):

1. loads the sandbox policy via gRPC (`GetSandboxPolicy`),
1. loads the sandbox policy via gRPC (`GetSandboxSettings`),
2. fetches provider credentials via gRPC (`GetSandboxProviderEnvironment`),
3. if the fetch fails, continues with an empty map (graceful degradation with a warning).
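
The graceful degradation in steps 2 and 3 can be sketched as below. The helper `provider_env_or_empty` is hypothetical and is not taken from `run_sandbox()`; it uses `eprintln!` as a stand-in for whatever logging the real code uses.

```rust
use std::collections::HashMap;
use std::fmt::Display;

/// Returns the fetched provider environment, or an empty map plus a warning
/// when the fetch failed, so startup can continue without credentials.
pub fn provider_env_or_empty<E: Display>(
    fetched: Result<HashMap<String, String>, E>,
) -> HashMap<String, String> {
    match fetched {
        Ok(env) => env,
        Err(err) => {
            // Stand-in for a structured warning log.
            eprintln!("warning: provider environment fetch failed: {}", err);
            HashMap::new()
        }
    }
}
```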
