diff --git a/CHANGELOG.md b/CHANGELOG.md
index a2526e8..df119d0 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Added
+
+- Anthropic Compatible provider support for local or gateway endpoints that
+  implement the Anthropic Messages-style `/v1/messages` API, including model
+  fetching via `/v1/models`, optional Bearer auth, local-provider concurrency
+  limits, and options-page setup.
+
 ## [0.3.2] - 2026-05-31
 
 ### Added
diff --git a/README.md b/README.md
index bc61c37..994c97b 100644
--- a/README.md
+++ b/README.md
@@ -31,8 +31,8 @@ The extension is usable for normal article pages, legacy text-heavy pages, and s
 - Handle legacy `table`, `font`, and `br`-separated pages.
 - Avoid common non-reading areas such as navigation, forms, buttons, code blocks, hidden text, and page chrome.
 - Use user-configured provider endpoints and API keys.
-- Support OpenAI, Anthropic Claude, and Google Gemini provider adapters.
-- Support local OpenAI-compatible runtimes such as LM Studio, Ollama, llama.cpp server, and omlx (Apple Silicon).
+- Support OpenAI, Anthropic Claude, Google Gemini, and compatible provider adapters.
+- Support local OpenAI-compatible runtimes such as LM Studio, Ollama, llama.cpp server, and omlx (Apple Silicon), plus Anthropic Messages API-compatible endpoints.
 - Fetch provider model lists from the options page.
 - Choose integrated or highlighted translation display styles.
 - Optionally show a floating page button that starts translation only after the user clicks it.
@@ -82,13 +82,14 @@ Anthropic Claude: https://api.anthropic.com/v1/messages
 Google Gemini: https://generativelanguage.googleapis.com/v1beta/models
 ```
 
-The endpoint field is shown only for OpenAI Compatible / Local LLM setups, where the user is expected to choose or enter a local endpoint.
+The endpoint field is shown only for compatible / Local LLM setups, where the user is expected to choose or enter a local endpoint.
 
 The Fetch models action reads available models from the selected provider:
 
 - OpenAI: `GET /v1/models`
 - Anthropic Claude: `GET /v1/models`
 - Google Gemini: `GET /v1beta/models`
+- OpenAI Compatible / Anthropic Compatible: `GET /v1/models`
 
 Fetched models appear in the model selector. Margin keeps the currently configured model as an option when a provider default or previously saved model is not returned by the provider list.
 
@@ -108,34 +109,42 @@ Quoted posts are disabled by default and can be enabled from options. Posts that
 
 ## Local LLMs
 
-Margin supports local LLM runtimes through the OpenAI Compatible provider. This provider uses the OpenAI-style `/v1/chat/completions` API, allows an empty API key, and uses a lower default translation concurrency for local inference.
+Margin supports local LLM runtimes through compatible providers:
 
-Common endpoint presets:
+- OpenAI Compatible uses the OpenAI-style `/v1/chat/completions` API.
+- Anthropic Compatible uses the Anthropic Messages-style `/v1/messages` API with tool `input_schema` structured output. It is a wire-protocol option for compatible local or gateway endpoints, not a separate Anthropic-hosted service.
+
+Both compatible providers allow an empty API key and use lower default translation concurrency for local inference. If an Anthropic-compatible gateway requires a key, Margin sends it as `Authorization: Bearer ...`.
+
+Common compatible endpoints:
 
 ```text
 LM Studio: http://localhost:1234/v1/chat/completions
 Ollama: http://localhost:11434/v1/chat/completions
 llama.cpp server: http://localhost:8080/v1/chat/completions
 omlx: http://localhost:8000/v1/chat/completions
+Generic Anthropic-compatible: http://localhost:8000/v1/messages
+Ollama Anthropic compatibility: http://localhost:11434/v1/messages
 ```
 
 To use a local runtime:
 
 1. Start the local model server.
 2. Open Margin options.
-3. Select OpenAI Compatible as the provider.
-4. Select an endpoint preset, or enter the endpoint URL shown by your runtime.
+3. Select OpenAI Compatible for `/v1/chat/completions`, or Anthropic Compatible for `/v1/messages`.
+4. Select an OpenAI-compatible endpoint preset, or enter the endpoint URL shown by your runtime.
 5. Leave API key empty unless your local gateway requires one.
 6. Click Fetch models and choose a served model from the model selector.
-7. Keep Request JSON mode enabled when supported. Disable it if the local runtime rejects the `response_format` request field.
+7. For OpenAI Compatible, keep Request JSON mode enabled when supported. Disable it if the local runtime rejects the `response_format` request field.
 
 Runtime notes:
 
 - LM Studio commonly serves OpenAI-compatible requests at `http://localhost:1234/v1/chat/completions`.
 - Ollama requires its OpenAI-compatible API to be available at `http://localhost:11434/v1/chat/completions`.
+- Ollama can also expose Anthropic-compatible requests at `http://localhost:11434/v1/messages`. Margin sends tools for structured output but does not force `tool_choice` for Anthropic-compatible endpoints, because some compatible runtimes accept tools but do not support forced tool selection.
 - llama.cpp server must be started with an OpenAI-compatible HTTP server enabled, commonly at `http://localhost:8080/v1/chat/completions`.
 - omlx is an Apple Silicon MLX inference server. Start it with `omlx serve` (zero-config, models from `~/.omlx/models`) or `omlx serve --model-dir /path/to/models`; the OpenAI-compatible API becomes available at `http://localhost:8000/v1/chat/completions`.
-- If Fetch models fails, confirm the local server is running, the endpoint URL ends with `/v1/chat/completions`, and the runtime exposes a compatible `/v1/models` endpoint.
+- If Fetch models fails, confirm the local server is running, the endpoint URL ends with `/v1/chat/completions` or `/v1/messages`, and the runtime exposes a compatible `/v1/models` endpoint.
 
 Local model quality, speed, context length, and JSON reliability depend on the model and runtime. Instruct models with strong multilingual ability are recommended for translation.
 
diff --git a/README.zh-TW.md b/README.zh-TW.md
index 240b860..3187284 100644
--- a/README.zh-TW.md
+++ b/README.zh-TW.md
@@ -22,8 +22,8 @@ Margin 目前仍是早期 MVP,支援 Chrome 與其他 Chromium 系瀏覽器,
 - 支援舊式 `table`、`font`,以及以 `br` 分隔文字的頁面。
 - 避開常見的非閱讀區域,例如導覽列、表單、按鈕、程式碼區塊、隱藏文字與頁面介面。
 - 使用你自行設定的 provider endpoint 與 API key。
-- 支援 OpenAI、Anthropic Claude 與 Google Gemini provider adapter。
-- 支援本機 OpenAI-compatible runtime,例如 LM Studio、Ollama、llama.cpp server 與 omlx(Apple Silicon)。
+- 支援 OpenAI、Anthropic Claude、Google Gemini 與 compatible provider adapter。
+- 支援本機 OpenAI-compatible runtime,例如 LM Studio、Ollama、llama.cpp server 與 omlx(Apple Silicon),以及 Anthropic Messages API-compatible endpoint。
 - 可從 options 頁面取得 provider 的模型列表。
 - 可選擇融入原文或醒目提示的譯文顯示樣式。
 - 可選擇在頁面顯示浮動翻譯按鈕,且只有使用者點擊後才開始翻譯。
@@ -66,13 +66,14 @@ Anthropic Claude: https://api.anthropic.com/v1/messages
 Google Gemini: https://generativelanguage.googleapis.com/v1beta/models
 ```
 
-Endpoint 欄位只會在 OpenAI Compatible / Local LLM 設定中顯示,因為這些情境才需要使用者選擇或輸入本機 endpoint。
+Endpoint 欄位只會在 compatible / Local LLM 設定中顯示,因為這些情境才需要使用者選擇或輸入本機 endpoint。
 
 Fetch models 會從目前選擇的 provider 讀取可用模型:
 
 - OpenAI: `GET /v1/models`
 - Anthropic Claude: `GET /v1/models`
 - Google Gemini: `GET /v1beta/models`
+- OpenAI Compatible / Anthropic Compatible: `GET /v1/models`
 
 取得的模型會出現在模型選單中。如果目前設定的 provider 預設模型或已儲存模型沒有出現在 provider 回傳的列表中,Margin 會保留它作為可選項目。
 
@@ -92,34 +93,42 @@ Quoted posts 預設不會翻譯,可在 options 中啟用。X 已標示為翻
 
 ## 本機 LLM
 
-Margin 透過 OpenAI Compatible provider 支援本機 LLM runtime。這個 provider 使用 OpenAI 風格的 `/v1/chat/completions` API,允許 API key 留空,並針對本機推理使用較低的預設翻譯 concurrency。
+Margin 透過 compatible provider 支援本機 LLM runtime:
 
-常見 endpoint preset:
+- OpenAI Compatible 使用 OpenAI 風格的 `/v1/chat/completions` API。
+- Anthropic Compatible 使用 Anthropic Messages 風格的 `/v1/messages` API,並透過 tool `input_schema` 取得結構化輸出。這是給相容本機或 gateway endpoint 使用的 wire-protocol 選項,不是另一個 Anthropic 官方代管服務。
+
+兩種 compatible provider 都允許 API key 留空,並針對本機推理使用較低的預設翻譯 concurrency。如果 Anthropic-compatible gateway 需要 key,Margin 會以 `Authorization: Bearer ...` 送出。
+
+常見 compatible endpoint:
 
 ```text
 LM Studio: http://localhost:1234/v1/chat/completions
 Ollama: http://localhost:11434/v1/chat/completions
 llama.cpp server: http://localhost:8080/v1/chat/completions
 omlx: http://localhost:8000/v1/chat/completions
+Generic Anthropic-compatible: http://localhost:8000/v1/messages
+Ollama Anthropic compatibility: http://localhost:11434/v1/messages
 ```
 
 使用本機 runtime:
 
 1. 啟動本機模型 server。
 2. 開啟 Margin options。
-3. 選擇 OpenAI Compatible 作為 provider。
-4. 選擇 endpoint preset,或輸入你的 runtime 顯示的 endpoint URL。
+3. 如果 endpoint 是 `/v1/chat/completions`,選擇 OpenAI Compatible;如果 endpoint 是 `/v1/messages`,選擇 Anthropic Compatible。
+4. 選擇 OpenAI-compatible endpoint preset,或輸入你的 runtime 顯示的 endpoint URL。
 5. 除非你的本機 gateway 需要 API key,否則 API key 可以留空。
 6. 點擊 Fetch models,並從模型選單中選擇 server 提供的模型。
-7. 如果 runtime 支援,建議保持 Request JSON mode 啟用。若本機 runtime 拒絕 `response_format` request 欄位,請停用此選項。
+7. 對 OpenAI Compatible 而言,如果 runtime 支援,建議保持 Request JSON mode 啟用。若本機 runtime 拒絕 `response_format` request 欄位,請停用此選項。
 
 Runtime 注意事項:
 
 - LM Studio 通常在 `http://localhost:1234/v1/chat/completions` 提供 OpenAI-compatible request。
 - Ollama 需要 OpenAI-compatible API 可在 `http://localhost:11434/v1/chat/completions` 使用。
+- Ollama 也可以在 `http://localhost:11434/v1/messages` 提供 Anthropic-compatible request。Margin 會送出 tools 以取得結構化輸出,但不會在 Anthropic-compatible endpoint 強制指定 `tool_choice`,因為部分相容 runtime 接受 tools,卻不支援 forced tool selection。
 - llama.cpp server 必須啟動 OpenAI-compatible HTTP server,常見位址為 `http://localhost:8080/v1/chat/completions`。
 - omlx 是 Apple Silicon 上的 MLX 推論 server。以 `omlx serve`(零設定,模型從 `~/.omlx/models` 載入)或 `omlx serve --model-dir /path/to/models` 啟動後,OpenAI-compatible API 預設位於 `http://localhost:8000/v1/chat/completions`。
-- 如果 Fetch models 失敗,請確認本機 server 已啟動、endpoint URL 以 `/v1/chat/completions` 結尾,且 runtime 有提供 compatible `/v1/models` endpoint。
+- 如果 Fetch models 失敗,請確認本機 server 已啟動、endpoint URL 以 `/v1/chat/completions` 或 `/v1/messages` 結尾,且 runtime 有提供 compatible `/v1/models` endpoint。
 
 本機模型的品質、速度、context length 與 JSON 穩定性,取決於模型與 runtime。建議使用具備強多語能力的 instruct model 進行翻譯。
 
diff --git a/apps/extension/public/options.html b/apps/extension/public/options.html
index a3b8687..4875eca 100644
--- a/apps/extension/public/options.html
+++ b/apps/extension/public/options.html
@@ -22,6 +22,7 @@

Provider

+
@@ -37,10 +38,6 @@

Provider

Presets switch the provider to OpenAI Compatible and fill the endpoint. -
+
+ Local Anthropic-compatible endpoint +

+ Use an Anthropic Messages API endpoint such as http://localhost:8000/v1/messages. +

+
+
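
The Anthropic Compatible behavior described in the README hunks (model list at `/v1/models`, optional `Authorization: Bearer ...`, tools with `input_schema` but no forced `tool_choice`) could look roughly like the sketch below. It is an illustration under assumptions, not the extension's actual code: the helper names `modelsUrlFor`, `buildHeaders`, and `buildMessagesBody`, the `submit_translations` tool, its schema, and the `max_tokens` value are all hypothetical, and deriving the models URL by swapping the endpoint suffix is a guess based on the troubleshooting note.

```typescript
// Hypothetical helpers for illustration only; names and shapes are assumptions.

// Swap a known chat/messages suffix for /v1/models (assumed derivation).
function modelsUrlFor(endpoint: string): string {
  const suffix = /\/v1\/(chat\/completions|messages)\/?$/;
  if (!suffix.test(endpoint)) {
    throw new Error(`Unsupported endpoint: ${endpoint}`);
  }
  return endpoint.replace(suffix, "/v1/models");
}

// An empty key is allowed; a non-empty key is sent as a Bearer token.
function buildHeaders(apiKey: string): Record<string, string> {
  const headers: Record<string, string> = { "content-type": "application/json" };
  const trimmed = apiKey.trim();
  if (trimmed !== "") {
    headers["authorization"] = `Bearer ${trimmed}`;
  }
  return headers;
}

// Messages-style body that offers a tool for structured output.
// tool_choice is deliberately omitted, since some compatible runtimes
// accept tools but reject forced tool selection.
function buildMessagesBody(model: string, texts: string[]): Record<string, unknown> {
  return {
    model,
    max_tokens: 1024, // assumed value
    messages: [{ role: "user", content: JSON.stringify(texts) }],
    tools: [
      {
        name: "submit_translations", // hypothetical tool name
        description: "Return one translation per input string.",
        input_schema: {
          type: "object",
          properties: {
            translations: { type: "array", items: { type: "string" } },
          },
          required: ["translations"],
        },
      },
    ],
  };
}
```

Because `tool_choice` is not forced, a caller of such a sketch would still need to handle replies where the model answers in plain text instead of calling the tool.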