diff --git a/plugins/box/.codex-plugin/plugin.json b/plugins/box/.codex-plugin/plugin.json
index cf010ba0..9556badf 100644
--- a/plugins/box/.codex-plugin/plugin.json
+++ b/plugins/box/.codex-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
   "name": "box",
-  "version": "0.0.0",
+  "version": "0.0.1",
   "description": "Search and reference your documents",
   "author": {
     "name": "OpenAI",
diff --git a/plugins/box/skills/box/SKILL.md b/plugins/box/skills/box/SKILL.md
index 18579a5e..2d732c71 100644
--- a/plugins/box/skills/box/SKILL.md
+++ b/plugins/box/skills/box/SKILL.md
@@ -54,6 +54,7 @@ Follow these steps in order when coding against Box.
 - When a task requires understanding document content — classification, extraction, categorization — use Box AI (Q&A, extract) as the first method attempted. Box AI operates server-side and does not require downloading file bodies. Fall back to metadata inspection, previews, or local analysis only if Box AI is unavailable, not authorized, or returns an error on the first attempt.
 - Pace Box AI calls at least 1–2 seconds apart. For content-based classification of many files, classify a small sample first to validate the prompt and discover whether cheaper signals (filename, extension, metadata) can sort the remaining files without additional AI calls.
 - Avoid downloading file bodies or routing content through external AI pipelines when Box-native methods (Box AI, search, metadata, previews) can answer the question server-side.
+- For connected Box app or MCP single-file reading, prefer `get_file_content` only when the file is likely to have markdown or extracted-text content. If it says markdown or text representation is unavailable, do not retry the same content call; switch to preview, metadata, or the next scoped fallback.
 - Request only the fields the application actually needs, and persist returned Box IDs instead of reconstructing paths later.
 - Run Box CLI commands strictly one at a time. The CLI does not support concurrent invocations and parallel calls cause auth conflicts and dropped operations. For bulk work (organizing, batch moves, batch metadata), default to REST over CLI.
 - Make webhook and event consumers idempotent. Box delivery and retry paths can produce duplicates.
diff --git a/plugins/box/skills/box/references/ai-and-retrieval.md b/plugins/box/skills/box/references/ai-and-retrieval.md
index f0666d79..4fdf7dc1 100644
--- a/plugins/box/skills/box/references/ai-and-retrieval.md
+++ b/plugins/box/skills/box/references/ai-and-retrieval.md
@@ -26,6 +26,18 @@ When the task requires understanding what a document contains (classification, e
 If the first Box AI call fails with a 403 or feature-not-available error, switch to the next method immediately rather than retrying AI for the remaining files.
 
+## Connected Box app text retrieval
+
+`get_file_content` requires a markdown or extracted-text representation. Use file signals to avoid sending obviously unsupported files into a text-content read:
+
+1. Search or list narrowly until you have the exact Box file ID and lightweight file context. When that call is already part of the path, request `extension` and `representations`.
+2. For one file with a text representation, prefer `get_file_content` and let the model reason over the returned text.
+3. If the file is obviously visual, binary, or preview-oriented, prefer preview or metadata paths before a text-content read.
+
+If the earlier search or list result did not include enough file signals and a text read is uncertain, use `get_file_details` with the smallest useful `fields` set, such as `["extension", "size", "representations"]`. Check for `markdown` or `extracted_text` before calling `get_file_content` when avoiding a likely text-content miss is worth that extra metadata call. Do not add this preflight for every likely text-backed document.
+
+If `get_file_content` returns `Markdown or text representation is not available for this file`, do not retry the same call. Use a preview path for previewable content, inspect metadata when it can answer the question, or choose a scoped fallback.
+
 ### Box AI via CLI
 
 **Before the first AI call**, run `box ai:ask --help` to confirm the command exists in the installed CLI version.
diff --git a/plugins/box/skills/box/references/troubleshooting.md b/plugins/box/skills/box/references/troubleshooting.md
index 497e03b4..c28adebe 100644
--- a/plugins/box/skills/box/references/troubleshooting.md
+++ b/plugins/box/skills/box/references/troubleshooting.md
@@ -9,6 +9,7 @@
 - 429
 - Webhook verification failures
 - Search quality problems
+- Missing text representation
 - CLI auth problems
 - Codex sandbox network access
 
@@ -68,6 +69,15 @@ When using Box CLI, run `box --help` before the first invocation of an
 - Expecting search to return content the current identity cannot see
 - Downloading too early instead of returning IDs and metadata first
 
+## Missing text representation
+
+`get_file_content` reads markdown or extracted text. It can fail when Box has neither representation for the selected file.
+
+- Do not retry `get_file_content` on the same file after `Markdown or text representation is not available for this file`.
+- Prefer preview or page-image tools for previewable visual content.
+- Use metadata when it can answer the question without a body read.
+- If document content is still required, choose the smallest fallback allowed by the task and actor permissions.
+
 ## CLI auth problems
 
 - `box` is installed but the current environment is not authorized
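
Reviewer sketch (not part of the patch): the routing rule this diff adds to `ai-and-retrieval.md` and `troubleshooting.md` could be expressed roughly as below. The function names and the returned route labels are hypothetical, and the shape of `representations` (a mapping with an `entries` list whose items carry a `representation` name) is an assumption about what the connected app returns. Only the tool names `get_file_content` and `get_file_details`, the representation names `markdown` and `extracted_text`, and the error string come from the diff.

```python
from typing import Any, Mapping, Optional

# Representation names and error text quoted from the diff.
TEXT_REPRESENTATIONS = {"markdown", "extracted_text"}
MISSING_TEXT_ERROR = "Markdown or text representation is not available for this file"


def has_text_representation(details: Mapping[str, Any]) -> Optional[bool]:
    """Return True/False when representations are listed, None when unknown.

    Assumes (hypothetically) a payload shaped like
    {"representations": {"entries": [{"representation": "extracted_text"}]}}.
    """
    reps = details.get("representations")
    if not isinstance(reps, Mapping):
        return None  # the earlier search/list call did not request this field
    names = {entry.get("representation") for entry in reps.get("entries", [])}
    return bool(names & TEXT_REPRESENTATIONS)


def choose_next_call(details: Mapping[str, Any]) -> str:
    """Pick the next step; the returned labels are illustrative only."""
    known = has_text_representation(details)
    if known is True:
        return "get_file_content"
    if known is False:
        return "preview_or_metadata"
    # Signals are missing: one small metadata preflight, per the diff,
    # only worth it when the text read is genuinely uncertain.
    return "get_file_details"


def is_missing_text_error(message: str) -> bool:
    """True when a content read failed for lack of a text representation."""
    return MISSING_TEXT_ERROR in message
```

A caller would treat `is_missing_text_error(...)` returning true as a signal to stop calling `get_file_content` for that file and move to preview, metadata, or a scoped fallback, as the troubleshooting section describes.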