standalone websearch extension by sayan-oai · Pull Request #23823 · openai/codex

sayan-oai · 2026-05-21T04:39:57Z

Summary

Add the extension-backed standalone web.run tool so Codex can call the standalone search endpoint through the codex-api search client and return its encrypted output to Responses.

gate the new tool behind standalone_web_search
install the extension in the app-server thread registry and hide hosted web_search when standalone search is enabled for OpenAI providers so the two paths stay mutually exclusive
build search context from persisted history using a small tail heuristic: previous user message, assistant text between the last two user turns capped at about 1k tokens, and current user message

Test Plan

cargo test -p codex-web-search-extension
cargo test -p codex-api
cargo test -p codex-core hosted_tools_follow_provider_auth_model_and_config_gates

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cdd92100c7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-21T04:49:02Z

+            enabled: config.features.enabled(Feature::StandaloneWebSearch)
+                && config.model_provider.is_openai(),


Honor disabled web search before exposing web.run

For OpenAI threads with standalone_web_search enabled and web_search_mode = disabled (including requirement-constrained configs), this still marks the extension enabled, so tools() exposes web.run. The existing hosted path intentionally returns no web_search tool for WebSearchMode::Disabled, so this re-enables web access in contexts that configured it off.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T04:49:02Z

+            reasoning: None,
+            input: recent_input(&history.items),
+            commands: Some(commands),
+            settings: None,


Forward configured search settings to alpha/search

When the standalone tool replaces hosted web_search, this sends settings: None, dropping the configured cached/live mode and web_search_config constraints such as allowed domains, user location, and context size. In environments with allowed_domains or cached-only web search configured, web.run calls go to the standalone endpoint without those constraints, unlike create_web_search_tool which applies them to the hosted tool.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-21T04:49:02Z

+        if role != "assistant" {
+            return true;
+        }


Cap user messages in search context

When a user prompt contains a large paste or data-URL image, this early return excludes user messages from the only token budget even though recent_input includes the previous and current user messages. That sends unbounded model-visible input to alpha/search and can exceed the repository's hard context caps; cap or filter user content as well as assistant text.

Useful? React with 👍 / 👎.

I let you judge this one but that sounds interesting

chatgpt-codex-connector · 2026-05-21T04:49:02Z

    "ext/goal",
    "ext/guardian",
    "ext/memories",
+    "ext/web-search",


Update the Bazel lockfile for the new crate

Adding ext/web-search and the new workspace dependency changes the Rust dependency graph, but this commit does not update MODULE.bazel.lock. The repo guidance requires just bazel-lock-update after Rust dependency changes; without the refreshed lockfile, Bazel lock checks/builds will see drift once this new crate/dependency is resolved.

Useful? React with 👍 / 👎.

jif-oai · 2026-05-21T09:39:59Z

    let provider_capabilities = turn_context.provider.capabilities();
-    let web_search_mode = provider_capabilities
-        .web_search
+    let standalone_web_search = turn_context


Is you want this logic to stand, you also need to install the extension on the core-api for example + sync with training depending on how they spawn codex through the bridge

jif-oai · 2026-05-21T09:41:17Z

+
+---
+
+## Decision boundary


I don't understand why we have this section. It does not really seem related to the extension...

jif-oai · 2026-05-21T09:43:37Z

+        if role != "assistant" {
+            return true;
+        }


I let you judge this one but that sounds interesting

jif-oai · 2026-05-21T09:44:15Z

+}
+
+impl WebSearchExtensionConfig {
+    fn from_config(config: &Config) -> Self {


Could this be a proper impl From<>?

jif-oai · 2026-05-21T10:15:21Z

+
+const ASSISTANT_CONTEXT_TOKEN_LIMIT: usize = 1_000;
+
+/// Builds the persisted conversation tail for standalone web search.


This looks like most of this code should live closer to RolloutItem and the protocol in general no?

jif-oai · 2026-05-21T10:22:54Z

+pub fn install(
+    registry: &mut ExtensionRegistryBuilder<Config>,
+    auth_manager: Arc<AuthManager>,
+    thread_store: Arc<dyn ThreadStore>,


Do we want first-party tool extensions to reach back into ThreadStore to reconstruct model context themselves? The existing extension-api already pushes thread/turn/... states through scoped ExtensionData, and web.run only needs the recent search context tail here. If persisted history is intentionally part of the tool-extension contract, this seems better making explicit. Otherwise I think the API should expose the relevant recent context instead of making this extension depend on the full thread store

usgenes39-dotcom

I found two blocking standalone-search semantics issues on the current head.

Current head fixes the earlier disabled-mode and settings-forwarding problems, but recent_input still sends unbounded user text/image content to alpha/search, and hosted web_search is hidden for OpenAI providers even in ThreadManager paths that do not install the standalone extension. That creates both unbounded search-context/cost risk and a product regression where web search disappears outside app-server. CI is not fully green on this head because the Bazel workflow was cancelled, and the PR is currently reported as not mergeable.

usgenes39-dotcom · 2026-05-21T15:12:52Z

+        };
+        if role != "assistant" {
+            return true;
+        }


Current head still leaves the user side outside the only context budget. cap_assistant_text caps assistant output, but both recent user messages are retained verbatim, and the test below explicitly preserves InputImage data URLs. recent_input then sends those items to alpha/search, so a large paste or base64 image in either the previous or current user message can bypass the 1k-token cap, exceed the standalone search request budget, and leak irrelevant/high-cost payloads into every web.run call. Please cap user text too and drop or summarize non-text/image content before building SearchInput; cover huge previous/current user messages and data URL images.

usgenes39-dotcom · 2026-05-21T15:12:52Z

+        .features
+        .enabled(Feature::StandaloneWebSearch)
+        && turn_context.provider.info().is_openai();
+    let web_search_mode = (!standalone_web_search && provider_capabilities.web_search)


This suppresses hosted web_search based only on the feature flag and OpenAI provider, but the standalone extension is only installed by app-server’s thread_extensions. Other ThreadManager entry points still use empty_extension_registry() (for example mcp-server/src/message_processor.rs, core/src/prompt_debug.rs, and thread-manager-sample / core-api-style callers), so when the feature is enabled there, hosted search is hidden and no web.run tool is contributed. That silently removes web search for those surfaces. Please either install the web-search extension in every ThreadManager path that can enable the feature, or only hide hosted web_search when the standalone tool is actually present; add coverage for feature-enabled OpenAI config with an empty extension registry.

[codex] Add standalone web search extension

cdd9210

sayan-oai marked this pull request as ready for review May 21, 2026 04:41

sayan-oai requested a review from a team as a code owner May 21, 2026 04:41

sayan-oai changed the title ~~[codex] Add standalone web search extension~~ standalone websearch extension May 21, 2026

chatgpt-codex-connector Bot reviewed May 21, 2026

View reviewed changes

[codex] Preserve standalone search settings

48ce082

jif-oai reviewed May 21, 2026

View reviewed changes

usgenes39-dotcom reviewed May 21, 2026

View reviewed changes

		enabled: config.features.enabled(Feature::StandaloneWebSearch)
		&& config.model_provider.is_openai(),


		const ASSISTANT_CONTEXT_TOKEN_LIMIT: usize = 1_000;

		/// Builds the persisted conversation tail for standalone web search.

Conversation

sayan-oai commented May 21, 2026

Summary

Test Plan

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

usgenes39-dotcom left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants