withmargin · linyiru · Jun 1, 2026 · May 22, 2026 · Jun 1, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -7,6 +7,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Added
+
+- Anthropic Compatible provider support for local or gateway endpoints that
+  implement the Anthropic Messages-style `/v1/messages` API, including model
+  fetching via `/v1/models`, optional Bearer auth, local-provider concurrency
+  limits, and options-page setup.
+
 ## [0.3.2] - 2026-05-31
 
 ### Added

diff --git a/README.md b/README.md
@@ -31,8 +31,8 @@ The extension is usable for normal article pages, legacy text-heavy pages, and s
 - Handle legacy `table`, `font`, and `br`-separated pages.
 - Avoid common non-reading areas such as navigation, forms, buttons, code blocks, hidden text, and page chrome.
 - Use user-configured provider endpoints and API keys.
-- Support OpenAI, Anthropic Claude, and Google Gemini provider adapters.
-- Support local OpenAI-compatible runtimes such as LM Studio, Ollama, llama.cpp server, and omlx (Apple Silicon).
+- Support OpenAI, Anthropic Claude, Google Gemini, and compatible provider adapters.
+- Support local OpenAI-compatible runtimes such as LM Studio, Ollama, llama.cpp server, and omlx (Apple Silicon), plus Anthropic Messages API-compatible endpoints.
 - Fetch provider model lists from the options page.
 - Choose integrated or highlighted translation display styles.
 - Optionally show a floating page button that starts translation only after the user clicks it.
@@ -82,13 +82,14 @@ Anthropic Claude: https://api.anthropic.com/v1/messages
 Google Gemini: https://generativelanguage.googleapis.com/v1beta/models
 ```
 
-The endpoint field is shown only for OpenAI Compatible / Local LLM setups, where the user is expected to choose or enter a local endpoint.
+The endpoint field is shown only for compatible / Local LLM setups, where the user is expected to choose or enter a local endpoint.
 
 The Fetch models action reads available models from the selected provider:
 
 - OpenAI: `GET /v1/models`
 - Anthropic Claude: `GET /v1/models`
 - Google Gemini: `GET /v1beta/models`
+- OpenAI Compatible / Anthropic Compatible: `GET /v1/models`
 
 Fetched models appear in the model selector. Margin keeps the currently configured model as an option when a provider default or previously saved model is not returned by the provider list.
 
@@ -108,34 +109,42 @@ Quoted posts are disabled by default and can be enabled from options. Posts that
 
 ## Local LLMs
 
-Margin supports local LLM runtimes through the OpenAI Compatible provider. This provider uses the OpenAI-style `/v1/chat/completions` API, allows an empty API key, and uses a lower default translation concurrency for local inference.
+Margin supports local LLM runtimes through compatible providers:
 
-Common endpoint presets:
+- OpenAI Compatible uses the OpenAI-style `/v1/chat/completions` API.
+- Anthropic Compatible uses the Anthropic Messages-style `/v1/messages` API with tool `input_schema` structured output. It is a wire-protocol option for compatible local or gateway endpoints, not a separate Anthropic-hosted service.
+
+Both compatible providers allow an empty API key and use lower default translation concurrency for local inference. If an Anthropic-compatible gateway requires a key, Margin sends it as `Authorization: Bearer ...`.
+
+Common compatible endpoints:
 
 ```text
 LM Studio: http://localhost:1234/v1/chat/completions
 Ollama: http://localhost:11434/v1/chat/completions
 llama.cpp server: http://localhost:8080/v1/chat/completions
 omlx: http://localhost:8000/v1/chat/completions
+Generic Anthropic-compatible: http://localhost:8000/v1/messages
+Ollama Anthropic compatibility: http://localhost:11434/v1/messages
 ```
 
 To use a local runtime:
 
 1. Start the local model server.
 2. Open Margin options.
-3. Select OpenAI Compatible as the provider.
-4. Select an endpoint preset, or enter the endpoint URL shown by your runtime.
+3. Select OpenAI Compatible for `/v1/chat/completions`, or Anthropic Compatible for `/v1/messages`.
+4. Select an OpenAI-compatible endpoint preset, or enter the endpoint URL shown by your runtime.
 5. Leave API key empty unless your local gateway requires one.
 6. Click Fetch models and choose a served model from the model selector.
-7. Keep Request JSON mode enabled when supported. Disable it if the local runtime rejects the `response_format` request field.
+7. For OpenAI Compatible, keep Request JSON mode enabled when supported. Disable it if the local runtime rejects the `response_format` request field.
 
 Runtime notes:
 
 - LM Studio commonly serves OpenAI-compatible requests at `http://localhost:1234/v1/chat/completions`.
 - Ollama requires its OpenAI-compatible API to be available at `http://localhost:11434/v1/chat/completions`.
+- Ollama can also expose Anthropic-compatible requests at `http://localhost:11434/v1/messages`. Margin sends tools for structured output but does not force `tool_choice` for Anthropic-compatible endpoints, because some compatible runtimes accept tools but do not support forced tool selection.
 - llama.cpp server must be started with an OpenAI-compatible HTTP server enabled, commonly at `http://localhost:8080/v1/chat/completions`.
 - omlx is an Apple Silicon MLX inference server. Start it with `omlx serve` (zero-config, models from `~/.omlx/models`) or `omlx serve --model-dir /path/to/models`; the OpenAI-compatible API becomes available at `http://localhost:8000/v1/chat/completions`.
-- If Fetch models fails, confirm the local server is running, the endpoint URL ends with `/v1/chat/completions`, and the runtime exposes a compatible `/v1/models` endpoint.
+- If Fetch models fails, confirm the local server is running, the endpoint URL ends with `/v1/chat/completions` or `/v1/messages`, and the runtime exposes a compatible `/v1/models` endpoint.
 
 Local model quality, speed, context length, and JSON reliability depend on the model and runtime. Instruct models with strong multilingual ability are recommended for translation.
 

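Reviewer note: the README changes above describe two behaviours that are easy to get wrong, optional Bearer auth and fetching models from `/v1/models` next to the configured endpoint. The sketch below is one way to express them, not the code in this PR. `buildCompatibleHeaders` and `modelsUrlFor` are hypothetical names, and the `content-type` header and key trimming are assumptions. Only the `Authorization: Bearer ...` form, the absence of `x-api-key`, and `anthropic-version: 2023-06-01` come from the tests in this diff.

```typescript
// Hypothetical helpers for illustration; names are not taken from the Margin source.
type HeaderMap = Record<string, string>;

// Build request headers for an Anthropic-compatible endpoint.
// A non-empty key is sent as a Bearer token; an empty key adds no auth header at all.
function buildCompatibleHeaders(apiKey: string): HeaderMap {
  const headers: HeaderMap = {
    "content-type": "application/json", // assumption: JSON request body
    "anthropic-version": "2023-06-01",
  };
  const key = apiKey.trim(); // assumption: surrounding whitespace is ignored
  if (key !== "") {
    headers["Authorization"] = `Bearer ${key}`;
  }
  return headers;
}

// Derive the model-list URL from the configured endpoint by replacing the
// trailing /messages or /chat/completions path with /models.
function modelsUrlFor(endpoint: string): string {
  const url = new URL(endpoint);
  url.pathname = url.pathname.replace(/\/(messages|chat\/completions)\/?$/, "/models");
  return url.toString();
}
```

Deriving the models URL from the endpoint keeps a single user-facing field, which matches the troubleshooting advice that the endpoint must end with `/v1/chat/completions` or `/v1/messages`.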
diff --git a/README.zh-TW.md b/README.zh-TW.md
@@ -22,8 +22,8 @@ Margin 目前仍是早期 MVP，支援 Chrome 與其他 Chromium 系瀏覽器，
 - 支援舊式 `table`、`font`，以及以 `br` 分隔文字的頁面。
 - 避開常見的非閱讀區域，例如導覽列、表單、按鈕、程式碼區塊、隱藏文字與頁面介面。
 - 使用你自行設定的 provider endpoint 與 API key。
-- 支援 OpenAI、Anthropic Claude 與 Google Gemini provider adapter。
-- 支援本機 OpenAI-compatible runtime，例如 LM Studio、Ollama、llama.cpp server 與 omlx（Apple Silicon）。
+- 支援 OpenAI、Anthropic Claude、Google Gemini 與 compatible provider adapter。
+- 支援本機 OpenAI-compatible runtime，例如 LM Studio、Ollama、llama.cpp server 與 omlx（Apple Silicon），以及 Anthropic Messages API-compatible endpoint。
 - 可從 options 頁面取得 provider 的模型列表。
 - 可選擇融入原文或醒目提示的譯文顯示樣式。
 - 可選擇在頁面顯示浮動翻譯按鈕，且只有使用者點擊後才開始翻譯。
@@ -66,13 +66,14 @@ Anthropic Claude: https://api.anthropic.com/v1/messages
 Google Gemini: https://generativelanguage.googleapis.com/v1beta/models
 ```
 
-Endpoint 欄位只會在 OpenAI Compatible / Local LLM 設定中顯示，因為這些情境才需要使用者選擇或輸入本機 endpoint。
+Endpoint 欄位只會在 compatible / Local LLM 設定中顯示，因為這些情境才需要使用者選擇或輸入本機 endpoint。
 
 Fetch models 會從目前選擇的 provider 讀取可用模型：
 
 - OpenAI: `GET /v1/models`
 - Anthropic Claude: `GET /v1/models`
 - Google Gemini: `GET /v1beta/models`
+- OpenAI Compatible / Anthropic Compatible: `GET /v1/models`
 
 取得的模型會出現在模型選單中。如果目前設定的 provider 預設模型或已儲存模型沒有出現在 provider 回傳的列表中，Margin 會保留它作為可選項目。
 
@@ -92,34 +93,42 @@ Quoted posts 預設不會翻譯，可在 options 中啟用。X 已標示為翻
 
 ## 本機 LLM
 
-Margin 透過 OpenAI Compatible provider 支援本機 LLM runtime。這個 provider 使用 OpenAI 風格的 `/v1/chat/completions` API，允許 API key 留空，並針對本機推理使用較低的預設翻譯 concurrency。
+Margin 透過 compatible provider 支援本機 LLM runtime：
 
-常見 endpoint preset：
+- OpenAI Compatible 使用 OpenAI 風格的 `/v1/chat/completions` API。
+- Anthropic Compatible 使用 Anthropic Messages 風格的 `/v1/messages` API，並透過 tool `input_schema` 取得結構化輸出。這是給相容本機或 gateway endpoint 使用的 wire-protocol 選項，不是另一個 Anthropic 官方代管服務。
+
+兩種 compatible provider 都允許 API key 留空，並針對本機推理使用較低的預設翻譯 concurrency。如果 Anthropic-compatible gateway 需要 key，Margin 會以 `Authorization: Bearer ...` 送出。
+
+常見 compatible endpoint：
 
 ```text
 LM Studio: http://localhost:1234/v1/chat/completions
 Ollama: http://localhost:11434/v1/chat/completions
 llama.cpp server: http://localhost:8080/v1/chat/completions
 omlx: http://localhost:8000/v1/chat/completions
+Generic Anthropic-compatible: http://localhost:8000/v1/messages
+Ollama Anthropic compatibility: http://localhost:11434/v1/messages
 ```
 
 使用本機 runtime：
 
 1. 啟動本機模型 server。
 2. 開啟 Margin options。
-3. 選擇 OpenAI Compatible 作為 provider。
-4. 選擇 endpoint preset，或輸入你的 runtime 顯示的 endpoint URL。
+3. 如果 endpoint 是 `/v1/chat/completions`，選擇 OpenAI Compatible；如果 endpoint 是 `/v1/messages`，選擇 Anthropic Compatible。
+4. 選擇 OpenAI-compatible endpoint preset，或輸入你的 runtime 顯示的 endpoint URL。
 5. 除非你的本機 gateway 需要 API key，否則 API key 可以留空。
 6. 點擊 Fetch models，並從模型選單中選擇 server 提供的模型。
-7. 如果 runtime 支援，建議保持 Request JSON mode 啟用。若本機 runtime 拒絕 `response_format` request 欄位，請停用此選項。
+7. 對 OpenAI Compatible 而言，如果 runtime 支援，建議保持 Request JSON mode 啟用。若本機 runtime 拒絕 `response_format` request 欄位，請停用此選項。
 
 Runtime 注意事項：
 
 - LM Studio 通常在 `http://localhost:1234/v1/chat/completions` 提供 OpenAI-compatible request。
 - Ollama 需要 OpenAI-compatible API 可在 `http://localhost:11434/v1/chat/completions` 使用。
+- Ollama 也可以在 `http://localhost:11434/v1/messages` 提供 Anthropic-compatible request。Margin 會送出 tools 以取得結構化輸出，但不會在 Anthropic-compatible endpoint 強制指定 `tool_choice`，因為部分相容 runtime 接受 tools，卻不支援 forced tool selection。
 - llama.cpp server 必須啟動 OpenAI-compatible HTTP server，常見位址為 `http://localhost:8080/v1/chat/completions`。
 - omlx 是 Apple Silicon 上的 MLX 推論 server。以 `omlx serve`（零設定，模型從 `~/.omlx/models` 載入）或 `omlx serve --model-dir /path/to/models` 啟動後，OpenAI-compatible API 預設位於 `http://localhost:8000/v1/chat/completions`。
-- 如果 Fetch models 失敗，請確認本機 server 已啟動、endpoint URL 以 `/v1/chat/completions` 結尾，且 runtime 有提供 compatible `/v1/models` endpoint。
+- 如果 Fetch models 失敗，請確認本機 server 已啟動、endpoint URL 以 `/v1/chat/completions` 或 `/v1/messages` 結尾，且 runtime 有提供 compatible `/v1/models` endpoint。
 
 本機模型的品質、速度、context length 與 JSON 穩定性，取決於模型與 runtime。建議使用具備強多語能力的 instruct model 進行翻譯。
 

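Reviewer note: both READMEs state that Margin sends tools for structured output but does not force `tool_choice` on Anthropic-compatible endpoints. A minimal sketch of that request and response shape follows. The tool name `return_translations` and the `tool_use` response block come from the tests in this diff. `buildMessagesBody`, `extractTranslations`, the `max_tokens` value, the prompt text, and the empty-array fallback are assumptions made for illustration.

```typescript
interface Segment { id: string; text: string }
interface Translation { id: string; text: string }

// Hypothetical request builder: declares the tool and its input_schema,
// and deliberately leaves tool_choice out of the body.
function buildMessagesBody(model: string, segments: Segment[]) {
  return {
    model,
    max_tokens: 4096, // assumption: any limit large enough for the batch
    tools: [
      {
        name: "return_translations",
        description: "Return one translation per input segment.", // assumed wording
        input_schema: {
          type: "object",
          properties: {
            translations: {
              type: "array",
              items: {
                type: "object",
                properties: { id: { type: "string" }, text: { type: "string" } },
                required: ["id", "text"],
              },
            },
          },
          required: ["translations"],
        },
      },
    ],
    // assumption: segments are passed as JSON in a single user message
    messages: [{ role: "user", content: JSON.stringify(segments) }],
  };
}

// Hypothetical response reader: returns the translations from the first
// matching tool_use block, or an empty array when none is present.
function extractTranslations(response: unknown): Translation[] {
  if (typeof response !== "object" || response === null) return [];
  const content = (response as { content?: unknown }).content;
  if (!Array.isArray(content)) return [];
  for (const block of content) {
    if (block?.type === "tool_use" && block?.name === "return_translations") {
      const translations = block.input?.translations;
      if (Array.isArray(translations)) return translations as Translation[];
    }
  }
  return [];
}
```

Because the tool is not forced, a compatible runtime may answer with plain text instead of a `tool_use` block, so a real caller likely needs a fallback or an explicit error for that case.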
diff --git a/apps/extension/public/options.html b/apps/extension/public/options.html
@@ -22,6 +22,7 @@ <h2 data-i18n="provider">Provider</h2>
                 <option value="anthropic">Anthropic Claude</option>
                 <option value="google">Google Gemini</option>
                 <option value="openai-compatible">OpenAI Compatible</option>
+                <option value="anthropic-compatible">Anthropic Compatible</option>
               </select>
             </label>
             <fieldset data-provider-section="openai-compatible">
@@ -37,10 +38,6 @@ <h2 data-i18n="provider">Provider</h2>
                 </select>
                 <span class="hint" data-i18n="endpointPresetHint">Presets switch the provider to OpenAI Compatible and fill the endpoint.</span>
               </label>
-              <label>
-                <span data-i18n="providerEndpoint">Endpoint URL</span>
-                <input name="providerEndpoint" type="url" required />
-              </label>
               <label class="checkbox">
                 <input name="openAICompatibleJsonMode" type="checkbox" />
                 <span>
@@ -49,10 +46,22 @@ <h2 data-i18n="provider">Provider</h2>
                 </span>
               </label>
             </fieldset>
+            <fieldset data-provider-section="anthropic-compatible">
+              <legend data-i18n="localAnthropicPresets">Local Anthropic-compatible endpoint</legend>
+              <p class="hint" data-i18n="localAnthropicEndpointHint">
+                Use an Anthropic Messages API endpoint such as http://localhost:8000/v1/messages.
+              </p>
+            </fieldset>
+            <label data-provider-section="openai-compatible anthropic-compatible">
+              <span data-i18n="providerEndpoint">Endpoint URL</span>
+              <input name="providerEndpoint" type="url" required />
+            </label>
             <label data-field="api-key">
               <span data-i18n="apiKey">API key</span>
               <input name="apiKey" type="password" autocomplete="off" />
-              <span class="hint" data-i18n="apiKeyHint">Paste the raw provider API key. Local OpenAI-compatible endpoints can leave this empty.</span>
+              <span class="hint" data-i18n="apiKeyHint">
+                Paste the raw provider API key. Local OpenAI-compatible and Anthropic-compatible endpoints can leave this empty.
+              </span>
             </label>
             <label>
               <span data-i18n="model">Model</span>

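Reviewer note: the Endpoint URL label now carries two values in `data-provider-section` (`openai-compatible anthropic-compatible`). The options script is not part of this diff, so how it reads that attribute is unknown. The sketch below assumes the value is treated as a space-separated token list, and `sectionApplies` is a hypothetical name.

```typescript
// Hypothetical helper: true when the space-separated attribute value
// lists the selected provider as a whole token.
function sectionApplies(sectionAttr: string, provider: string): boolean {
  return sectionAttr
    .split(/\s+/)
    .filter((token) => token.length > 0)
    .some((token) => token === provider);
}
```

Whole-token matching matters here: a substring check would make the `anthropic` provider match sections meant only for `anthropic-compatible`.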
diff --git a/apps/extension/src/background/providers/anthropic.test.ts b/apps/extension/src/background/providers/anthropic.test.ts
@@ -1,7 +1,7 @@
 import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
 import { DEFAULT_SETTINGS } from "../../shared/defaults";
 import type { ExtensionSettings, TextSegment } from "../../shared/types";
-import { anthropicProvider } from "./anthropic";
+import { anthropicCompatibleProvider, anthropicProvider } from "./anthropic";
 
 const segments: TextSegment[] = [
   { id: "a", text: "Hello" },
@@ -119,6 +119,109 @@ describe("anthropicProvider.translate", () => {
   });
 });
 
+describe("anthropicCompatibleProvider.translate", () => {
+  it("uses Bearer auth and omits browser-access header when api key is provided", async () => {
+    const body = JSON.stringify({
+      content: [
+        {
+          type: "tool_use",
+          name: "return_translations",
+          input: { translations: [{ id: "a", text: "你好" }] }
+        }
+      ]
+    });
+    const { fetch: stub, calls } = stubFetch(new Response(body, { status: 200 }));
+    vi.stubGlobal("fetch", stub);
+
+    const results = await anthropicCompatibleProvider.translate(
+      segments,
+      makeSettings({
+        provider: "anthropic-compatible",
+        apiKey: "test-key",
+        providerEndpoint: "http://localhost:8000/v1/messages"
+      })
+    );
+
+    expect(results).toEqual([{ id: "a", text: "你好" }]);
+    const headers = calls[0].init.headers as Record<string, string>;
+    expect(headers.Authorization).toBe("Bearer test-key");
+    expect(headers["x-api-key"]).toBeUndefined();
+    expect(headers["anthropic-dangerous-direct-browser-access"]).toBeUndefined();
+    expect(headers["anthropic-version"]).toBe("2023-06-01");
+    expect(calls[0].body.tool_choice).toBeUndefined();
+    expect((calls[0].body.tools as Array<{ name: string }>)[0].name).toBe("return_translations");
+  });
+
+  it("omits Authorization header when api key is empty", async () => {
+    const body = JSON.stringify({
+      content: [
+        {
+          type: "tool_use",
+          name: "return_translations",
+          input: { translations: [{ id: "a", text: "你好" }] }
+        }
+      ]
+    });
+    const { fetch: stub, calls } = stubFetch(new Response(body, { status: 200 }));
+    vi.stubGlobal("fetch", stub);
+
+    const results = await anthropicCompatibleProvider.translate(
+      segments,
+      makeSettings({
+        provider: "anthropic-compatible",
+        apiKey: "",
+        providerEndpoint: "http://localhost:8000/v1/messages"
+      })
+    );
+
+    expect(results).toEqual([{ id: "a", text: "你好" }]);
+    const headers = calls[0].init.headers as Record<string, string>;
+    expect(headers.Authorization).toBeUndefined();
+    expect(headers["x-api-key"]).toBeUndefined();
+  });
+});
+
+describe("anthropicCompatibleProvider.listModels", () => {
+  it("uses Bearer auth when api key is provided", async () => {
+    const body = JSON.stringify({ data: [{ id: "local-model" }] });
+    const { fetch: stub, calls } = stubFetch(new Response(body, { status: 200 }));
+    vi.stubGlobal("fetch", stub);
+
+    const models = await anthropicCompatibleProvider.listModels(
+      makeSettings({
+        provider: "anthropic-compatible",
+        apiKey: "cloud-key",
+        providerEndpoint: "http://localhost:8000/v1/messages"
+      })
+    );
+
+    expect(models).toEqual([{ id: "local-model" }]);
+    expect(calls[0].url).toBe("http://localhost:8000/v1/models");
+    const headers = calls[0].init.headers as Record<string, string>;
+    expect(headers.Authorization).toBe("Bearer cloud-key");
+    expect(headers["x-api-key"]).toBeUndefined();
+  });
+
+  it("omits Authorization header when api key is empty", async () => {
+    const body = JSON.stringify({ data: [{ id: "local-model" }] });
+    const { fetch: stub, calls } = stubFetch(new Response(body, { status: 200 }));
+    vi.stubGlobal("fetch", stub);
+
+    const models = await anthropicCompatibleProvider.listModels(
+      makeSettings({
+        provider: "anthropic-compatible",
+        apiKey: "",
+        providerEndpoint: "http://localhost:8000/v1/messages"
+      })
+    );
+
+    expect(models).toEqual([{ id: "local-model" }]);
+    const headers = calls[0].init.headers as Record<string, string>;
+    expect(headers.Authorization).toBeUndefined();
+    expect(headers["x-api-key"]).toBeUndefined();
+  });
+});
+
 describe("anthropicProvider.listModels", () => {
   it("returns models with display_name mapped to displayName", async () => {
     const body = JSON.stringify({