Merged
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added

- Anthropic Compatible provider support for local or gateway endpoints that
implement the Anthropic Messages-style `/v1/messages` API, including model
fetching via `/v1/models`, optional Bearer auth, local-provider concurrency
limits, and options-page setup.

## [0.3.2] - 2026-05-31

### Added
27 changes: 18 additions & 9 deletions README.md
@@ -31,8 +31,8 @@ The extension is usable for normal article pages, legacy text-heavy pages, and s
- Handle legacy `table`, `font`, and `br`-separated pages.
- Avoid common non-reading areas such as navigation, forms, buttons, code blocks, hidden text, and page chrome.
- Use user-configured provider endpoints and API keys.
- Support OpenAI, Anthropic Claude, and Google Gemini provider adapters.
- Support local OpenAI-compatible runtimes such as LM Studio, Ollama, llama.cpp server, and omlx (Apple Silicon).
- Support OpenAI, Anthropic Claude, Google Gemini, and compatible provider adapters.
- Support local OpenAI-compatible runtimes such as LM Studio, Ollama, llama.cpp server, and omlx (Apple Silicon), plus Anthropic Messages API-compatible endpoints.
- Fetch provider model lists from the options page.
- Choose integrated or highlighted translation display styles.
- Optionally show a floating page button that starts translation only after the user clicks it.
@@ -82,13 +82,14 @@ Anthropic Claude: https://api.anthropic.com/v1/messages
Google Gemini: https://generativelanguage.googleapis.com/v1beta/models
```

The endpoint field is shown only for OpenAI Compatible / Local LLM setups, where the user is expected to choose or enter a local endpoint.
The endpoint field is shown only for compatible / Local LLM setups, where the user is expected to choose or enter a local endpoint.

The Fetch models action reads available models from the selected provider:

- OpenAI: `GET /v1/models`
- Anthropic Claude: `GET /v1/models`
- Google Gemini: `GET /v1beta/models`
- OpenAI Compatible / Anthropic Compatible: `GET /v1/models`

Fetched models appear in the model selector. Margin keeps the currently configured model as an option when a provider default or previously saved model is not returned by the provider list.

@@ -108,34 +109,42 @@ Quoted posts are disabled by default and can be enabled from options. Posts that

## Local LLMs

Margin supports local LLM runtimes through the OpenAI Compatible provider. This provider uses the OpenAI-style `/v1/chat/completions` API, allows an empty API key, and uses a lower default translation concurrency for local inference.
Margin supports local LLM runtimes through compatible providers:

Common endpoint presets:
- OpenAI Compatible uses the OpenAI-style `/v1/chat/completions` API.
- Anthropic Compatible uses the Anthropic Messages-style `/v1/messages` API with tool `input_schema` structured output. It is a wire-protocol option for compatible local or gateway endpoints, not a separate Anthropic-hosted service.

Both compatible providers allow an empty API key and use lower default translation concurrency for local inference. If an Anthropic-compatible gateway requires a key, Margin sends it as `Authorization: Bearer ...`.
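
To make the request shape described above concrete, here is a minimal TypeScript sketch. It is not Margin's actual implementation: `buildCompatibleRequest`, the `content-type` header, the `max_tokens` value, the prompt text, and the tool description are assumptions for illustration. The `anthropic-version` value, the `return_translations` tool name, optional Bearer auth, and the omitted `tool_choice` mirror what the tests in this change assert.

```typescript
// Hypothetical types and helper; names are illustrative, not Margin's real code.
interface Segment {
  id: string;
  text: string;
}

interface CompatibleRequest {
  headers: Record<string, string>;
  body: Record<string, unknown>;
}

function buildCompatibleRequest(apiKey: string, model: string, segments: Segment[]): CompatibleRequest {
  const headers: Record<string, string> = {
    "content-type": "application/json", // assumed; typical for JSON POST bodies
    "anthropic-version": "2023-06-01"
  };
  // Bearer auth is sent only when a key is configured; x-api-key is never set.
  if (apiKey.trim() !== "") {
    headers.Authorization = `Bearer ${apiKey}`;
  }

  const body: Record<string, unknown> = {
    model,
    max_tokens: 1024, // assumed value; the Messages API requires this field
    messages: [
      { role: "user", content: `Translate these segments: ${JSON.stringify(segments)}` } // assumed prompt
    ],
    tools: [
      {
        name: "return_translations",
        description: "Return one translation per input segment.", // assumed wording
        input_schema: {
          type: "object",
          properties: {
            translations: {
              type: "array",
              items: {
                type: "object",
                properties: { id: { type: "string" }, text: { type: "string" } },
                required: ["id", "text"]
              }
            }
          },
          required: ["translations"]
        }
      }
    ]
    // tool_choice is deliberately left out: some compatible runtimes accept
    // tools but reject forced tool selection.
  };

  return { headers, body };
}
```

With an empty key the sketch sends no `Authorization` header at all, which is what lets a keyless local server accept the request.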

Common compatible endpoints:

```text
LM Studio: http://localhost:1234/v1/chat/completions
Ollama: http://localhost:11434/v1/chat/completions
llama.cpp server: http://localhost:8080/v1/chat/completions
omlx: http://localhost:8000/v1/chat/completions
Generic Anthropic-compatible: http://localhost:8000/v1/messages
Ollama Anthropic compatibility: http://localhost:11434/v1/messages
```

To use a local runtime:

1. Start the local model server.
2. Open Margin options.
3. Select OpenAI Compatible as the provider.
4. Select an endpoint preset, or enter the endpoint URL shown by your runtime.
3. Select OpenAI Compatible for `/v1/chat/completions`, or Anthropic Compatible for `/v1/messages`.
4. Select an OpenAI-compatible endpoint preset, or enter the endpoint URL shown by your runtime.
5. Leave API key empty unless your local gateway requires one.
6. Click Fetch models and choose a served model from the model selector.
7. Keep Request JSON mode enabled when supported. Disable it if the local runtime rejects the `response_format` request field.
7. For OpenAI Compatible, keep Request JSON mode enabled when supported. Disable it if the local runtime rejects the `response_format` request field.

Runtime notes:

- LM Studio commonly serves OpenAI-compatible requests at `http://localhost:1234/v1/chat/completions`.
- Ollama requires its OpenAI-compatible API to be available at `http://localhost:11434/v1/chat/completions`.
- Ollama can also expose Anthropic-compatible requests at `http://localhost:11434/v1/messages`. Margin sends tools for structured output but does not force `tool_choice` for Anthropic-compatible endpoints, because some compatible runtimes accept tools but do not support forced tool selection.
- llama.cpp server must be started with an OpenAI-compatible HTTP server enabled, commonly at `http://localhost:8080/v1/chat/completions`.
- omlx is an Apple Silicon MLX inference server. Start it with `omlx serve` (zero-config, models from `~/.omlx/models`) or `omlx serve --model-dir /path/to/models`; the OpenAI-compatible API becomes available at `http://localhost:8000/v1/chat/completions`.
- If Fetch models fails, confirm the local server is running, the endpoint URL ends with `/v1/chat/completions`, and the runtime exposes a compatible `/v1/models` endpoint.
- If Fetch models fails, confirm the local server is running, the endpoint URL ends with `/v1/chat/completions` or `/v1/messages`, and the runtime exposes a compatible `/v1/models` endpoint.
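
The last note implies that the model-list URL is derived from the configured endpoint. One way this could work is sketched below; `modelsUrlFromEndpoint` is a hypothetical helper, and whether Margin derives the URL exactly this way is an assumption. The tests in this change only confirm that `http://localhost:8000/v1/messages` maps to `http://localhost:8000/v1/models`.

```typescript
// Hypothetical helper: swap a known API suffix for /v1/models.
function modelsUrlFromEndpoint(endpoint: string): string {
  const url = new URL(endpoint);
  const suffixes = ["/v1/chat/completions", "/v1/messages"];

  // Ignore trailing slashes so ".../v1/messages/" is still recognized.
  const path = url.pathname.replace(/\/+$/, "");
  const suffix = suffixes.find((candidate) => path.endsWith(candidate));
  if (!suffix) {
    throw new Error(`Endpoint must end with ${suffixes.join(" or ")}`);
  }

  // Keep any gateway prefix in front of the suffix (for example "/proxy").
  url.pathname = path.slice(0, path.length - suffix.length) + "/v1/models";
  url.search = "";
  return url.toString();
}
```

Rejecting unknown suffixes up front gives a clearer error than a failed request against a guessed URL.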

Local model quality, speed, context length, and JSON reliability depend on the model and runtime. Instruct models with strong multilingual ability are recommended for translation.

27 changes: 18 additions & 9 deletions README.zh-TW.md
@@ -22,8 +22,8 @@ Margin 目前仍是早期 MVP,支援 Chrome 與其他 Chromium 系瀏覽器,
- 支援舊式 `table`、`font`,以及以 `br` 分隔文字的頁面。
- 避開常見的非閱讀區域,例如導覽列、表單、按鈕、程式碼區塊、隱藏文字與頁面介面。
- 使用你自行設定的 provider endpoint 與 API key。
- 支援 OpenAI、Anthropic Claude 與 Google Gemini provider adapter。
- 支援本機 OpenAI-compatible runtime,例如 LM Studio、Ollama、llama.cpp server 與 omlx(Apple Silicon)。
- 支援 OpenAI、Anthropic Claude、Google Gemini 與 compatible provider adapter。
- 支援本機 OpenAI-compatible runtime,例如 LM Studio、Ollama、llama.cpp server 與 omlx(Apple Silicon),以及 Anthropic Messages API-compatible endpoint。
- 可從 options 頁面取得 provider 的模型列表。
- 可選擇融入原文或醒目提示的譯文顯示樣式。
- 可選擇在頁面顯示浮動翻譯按鈕,且只有使用者點擊後才開始翻譯。
@@ -66,13 +66,14 @@ Anthropic Claude: https://api.anthropic.com/v1/messages
Google Gemini: https://generativelanguage.googleapis.com/v1beta/models
```

Endpoint 欄位只會在 OpenAI Compatible / Local LLM 設定中顯示,因為這些情境才需要使用者選擇或輸入本機 endpoint。
Endpoint 欄位只會在 compatible / Local LLM 設定中顯示,因為這些情境才需要使用者選擇或輸入本機 endpoint。

Fetch models 會從目前選擇的 provider 讀取可用模型:

- OpenAI: `GET /v1/models`
- Anthropic Claude: `GET /v1/models`
- Google Gemini: `GET /v1beta/models`
- OpenAI Compatible / Anthropic Compatible: `GET /v1/models`

取得的模型會出現在模型選單中。如果目前設定的 provider 預設模型或已儲存模型沒有出現在 provider 回傳的列表中,Margin 會保留它作為可選項目。

@@ -92,34 +93,42 @@ Quoted posts 預設不會翻譯,可在 options 中啟用。X 已標示為翻

## 本機 LLM

Margin 透過 OpenAI Compatible provider 支援本機 LLM runtime。這個 provider 使用 OpenAI 風格的 `/v1/chat/completions` API,允許 API key 留空,並針對本機推理使用較低的預設翻譯 concurrency。
Margin 透過 compatible provider 支援本機 LLM runtime:

常見 endpoint preset:
- OpenAI Compatible 使用 OpenAI 風格的 `/v1/chat/completions` API。
- Anthropic Compatible 使用 Anthropic Messages 風格的 `/v1/messages` API,並透過 tool `input_schema` 取得結構化輸出。這是給相容本機或 gateway endpoint 使用的 wire-protocol 選項,不是另一個 Anthropic 官方代管服務。

兩種 compatible provider 都允許 API key 留空,並針對本機推理使用較低的預設翻譯 concurrency。如果 Anthropic-compatible gateway 需要 key,Margin 會以 `Authorization: Bearer ...` 送出。

常見 compatible endpoint:

```text
LM Studio: http://localhost:1234/v1/chat/completions
Ollama: http://localhost:11434/v1/chat/completions
llama.cpp server: http://localhost:8080/v1/chat/completions
omlx: http://localhost:8000/v1/chat/completions
Generic Anthropic-compatible: http://localhost:8000/v1/messages
Ollama Anthropic compatibility: http://localhost:11434/v1/messages
```

使用本機 runtime:

1. 啟動本機模型 server。
2. 開啟 Margin options。
3. 選擇 OpenAI Compatible 作為 provider。
4. 選擇 endpoint preset,或輸入你的 runtime 顯示的 endpoint URL。
3. 如果 endpoint 是 `/v1/chat/completions`,選擇 OpenAI Compatible;如果 endpoint 是 `/v1/messages`,選擇 Anthropic Compatible。
4. 選擇 OpenAI-compatible endpoint preset,或輸入你的 runtime 顯示的 endpoint URL。
5. 除非你的本機 gateway 需要 API key,否則 API key 可以留空。
6. 點擊 Fetch models,並從模型選單中選擇 server 提供的模型。
7. 如果 runtime 支援,建議保持 Request JSON mode 啟用。若本機 runtime 拒絕 `response_format` request 欄位,請停用此選項。
7. 對 OpenAI Compatible 而言,如果 runtime 支援,建議保持 Request JSON mode 啟用。若本機 runtime 拒絕 `response_format` request 欄位,請停用此選項。

Runtime 注意事項:

- LM Studio 通常在 `http://localhost:1234/v1/chat/completions` 提供 OpenAI-compatible request。
- Ollama 需要 OpenAI-compatible API 可在 `http://localhost:11434/v1/chat/completions` 使用。
- Ollama 也可以在 `http://localhost:11434/v1/messages` 提供 Anthropic-compatible request。Margin 會送出 tools 以取得結構化輸出,但不會在 Anthropic-compatible endpoint 強制指定 `tool_choice`,因為部分相容 runtime 接受 tools,卻不支援 forced tool selection。
- llama.cpp server 必須啟動 OpenAI-compatible HTTP server,常見位址為 `http://localhost:8080/v1/chat/completions`。
- omlx 是 Apple Silicon 上的 MLX 推論 server。以 `omlx serve`(零設定,模型從 `~/.omlx/models` 載入)或 `omlx serve --model-dir /path/to/models` 啟動後,OpenAI-compatible API 預設位於 `http://localhost:8000/v1/chat/completions`。
- 如果 Fetch models 失敗,請確認本機 server 已啟動、endpoint URL 以 `/v1/chat/completions` 結尾,且 runtime 有提供 compatible `/v1/models` endpoint。
- 如果 Fetch models 失敗,請確認本機 server 已啟動、endpoint URL 以 `/v1/chat/completions` 或 `/v1/messages` 結尾,且 runtime 有提供 compatible `/v1/models` endpoint。

本機模型的品質、速度、context length 與 JSON 穩定性,取決於模型與 runtime。建議使用具備強多語能力的 instruct model 進行翻譯。

19 changes: 14 additions & 5 deletions apps/extension/public/options.html
@@ -22,6 +22,7 @@ <h2 data-i18n="provider">Provider</h2>
<option value="anthropic">Anthropic Claude</option>
<option value="google">Google Gemini</option>
<option value="openai-compatible">OpenAI Compatible</option>
<option value="anthropic-compatible">Anthropic Compatible</option>
</select>
</label>
<fieldset data-provider-section="openai-compatible">
@@ -37,10 +38,6 @@ <h2 data-i18n="provider">Provider</h2>
</select>
<span class="hint" data-i18n="endpointPresetHint">Presets switch the provider to OpenAI Compatible and fill the endpoint.</span>
</label>
<label>
<span data-i18n="providerEndpoint">Endpoint URL</span>
<input name="providerEndpoint" type="url" required />
</label>
<label class="checkbox">
<input name="openAICompatibleJsonMode" type="checkbox" />
<span>
@@ -49,10 +46,22 @@ <h2 data-i18n="provider">Provider</h2>
</span>
</label>
</fieldset>
<fieldset data-provider-section="anthropic-compatible">
<legend data-i18n="localAnthropicPresets">Local Anthropic-compatible endpoint</legend>
<p class="hint" data-i18n="localAnthropicEndpointHint">
Use an Anthropic Messages API endpoint such as http://localhost:8000/v1/messages.
</p>
</fieldset>
<label data-provider-section="openai-compatible anthropic-compatible">
<span data-i18n="providerEndpoint">Endpoint URL</span>
<input name="providerEndpoint" type="url" required />
</label>
<label data-field="api-key">
<span data-i18n="apiKey">API key</span>
<input name="apiKey" type="password" autocomplete="off" />
<span class="hint" data-i18n="apiKeyHint">Paste the raw provider API key. Local OpenAI-compatible endpoints can leave this empty.</span>
<span class="hint" data-i18n="apiKeyHint">
Paste the raw provider API key. Local OpenAI-compatible and Anthropic-compatible endpoints can leave this empty.
</span>
</label>
<label>
<span data-i18n="model">Model</span>
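
The options markup above moves the Endpoint URL field out of the OpenAI-only fieldset and tags it with a space-separated `data-provider-section` value so both compatible providers can share it. The options script is not part of this excerpt, so the following is only a sketch of how such an attribute could drive visibility; `sectionMatchesProvider`, `applyProviderVisibility`, and the use of the `hidden` property are assumptions.

```typescript
// Minimal structural type so the sketch runs without a DOM; HTMLElement satisfies it.
interface SectionLike {
  getAttribute(name: string): string | null;
  hidden: boolean;
}

// Hypothetical: true when the provider id appears in the space-separated attribute value.
function sectionMatchesProvider(attr: string | null, provider: string): boolean {
  if (!attr) return false;
  return attr.split(/\s+/).filter(Boolean).includes(provider);
}

// Hypothetical: hide every tagged section that does not list the selected provider.
function applyProviderVisibility(sections: Iterable<SectionLike>, provider: string): void {
  for (const section of sections) {
    const attr = section.getAttribute("data-provider-section");
    section.hidden = !sectionMatchesProvider(attr, provider);
  }
}
```

In a browser this could be called with the elements returned by `document.querySelectorAll<HTMLElement>("[data-provider-section]")` and the value of the provider `<select>`. Matching whole tokens rather than substrings matters here, because `anthropic` is a substring of `anthropic-compatible`.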
105 changes: 104 additions & 1 deletion apps/extension/src/background/providers/anthropic.test.ts
@@ -1,7 +1,7 @@
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { DEFAULT_SETTINGS } from "../../shared/defaults";
import type { ExtensionSettings, TextSegment } from "../../shared/types";
import { anthropicProvider } from "./anthropic";
import { anthropicCompatibleProvider, anthropicProvider } from "./anthropic";

const segments: TextSegment[] = [
{ id: "a", text: "Hello" },
@@ -119,6 +119,109 @@ describe("anthropicProvider.translate", () => {
});
});

describe("anthropicCompatibleProvider.translate", () => {
  it("uses Bearer auth and omits browser-access header when api key is provided", async () => {
    const body = JSON.stringify({
      content: [
        {
          type: "tool_use",
          name: "return_translations",
          input: { translations: [{ id: "a", text: "你好" }] }
        }
      ]
    });
    const { fetch: stub, calls } = stubFetch(new Response(body, { status: 200 }));
    vi.stubGlobal("fetch", stub);

    const results = await anthropicCompatibleProvider.translate(
      segments,
      makeSettings({
        provider: "anthropic-compatible",
        apiKey: "test-key",
        providerEndpoint: "http://localhost:8000/v1/messages"
      })
    );

    expect(results).toEqual([{ id: "a", text: "你好" }]);
    const headers = calls[0].init.headers as Record<string, string>;
    expect(headers.Authorization).toBe("Bearer test-key");
    expect(headers["x-api-key"]).toBeUndefined();
    expect(headers["anthropic-dangerous-direct-browser-access"]).toBeUndefined();
    expect(headers["anthropic-version"]).toBe("2023-06-01");
    expect(calls[0].body.tool_choice).toBeUndefined();
    expect((calls[0].body.tools as Array<{ name: string }>)[0].name).toBe("return_translations");
  });

  it("omits Authorization header when api key is empty", async () => {
    const body = JSON.stringify({
      content: [
        {
          type: "tool_use",
          name: "return_translations",
          input: { translations: [{ id: "a", text: "你好" }] }
        }
      ]
    });
    const { fetch: stub, calls } = stubFetch(new Response(body, { status: 200 }));
    vi.stubGlobal("fetch", stub);

    const results = await anthropicCompatibleProvider.translate(
      segments,
      makeSettings({
        provider: "anthropic-compatible",
        apiKey: "",
        providerEndpoint: "http://localhost:8000/v1/messages"
      })
    );

    expect(results).toEqual([{ id: "a", text: "你好" }]);
    const headers = calls[0].init.headers as Record<string, string>;
    expect(headers.Authorization).toBeUndefined();
    expect(headers["x-api-key"]).toBeUndefined();
  });
});

describe("anthropicCompatibleProvider.listModels", () => {
  it("uses Bearer auth when api key is provided", async () => {
    const body = JSON.stringify({ data: [{ id: "local-model" }] });
    const { fetch: stub, calls } = stubFetch(new Response(body, { status: 200 }));
    vi.stubGlobal("fetch", stub);

    const models = await anthropicCompatibleProvider.listModels(
      makeSettings({
        provider: "anthropic-compatible",
        apiKey: "cloud-key",
        providerEndpoint: "http://localhost:8000/v1/messages"
      })
    );

    expect(models).toEqual([{ id: "local-model" }]);
    expect(calls[0].url).toBe("http://localhost:8000/v1/models");
    const headers = calls[0].init.headers as Record<string, string>;
    expect(headers.Authorization).toBe("Bearer cloud-key");
    expect(headers["x-api-key"]).toBeUndefined();
  });

  it("omits Authorization header when api key is empty", async () => {
    const body = JSON.stringify({ data: [{ id: "local-model" }] });
    const { fetch: stub, calls } = stubFetch(new Response(body, { status: 200 }));
    vi.stubGlobal("fetch", stub);

    const models = await anthropicCompatibleProvider.listModels(
      makeSettings({
        provider: "anthropic-compatible",
        apiKey: "",
        providerEndpoint: "http://localhost:8000/v1/messages"
      })
    );

    expect(models).toEqual([{ id: "local-model" }]);
    const headers = calls[0].init.headers as Record<string, string>;
    expect(headers.Authorization).toBeUndefined();
    expect(headers["x-api-key"]).toBeUndefined();
  });
});

describe("anthropicProvider.listModels", () => {
it("returns models with display_name mapped to displayName", async () => {
const body = JSON.stringify({
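
The translate tests above stub a response whose `content` array holds a `tool_use` block named `return_translations`. As a rough sketch of how such a response could be turned into translation results, the hypothetical `extractTranslations` below scans for that block and validates its items. The function name, the error messages, and the choice to throw when no tool block is present are assumptions; the provider's real parsing is not shown in this excerpt.

```typescript
interface Translation {
  id: string;
  text: string;
}

// Hypothetical parser for an Anthropic Messages-style response body.
function extractTranslations(responseText: string): Translation[] {
  const parsed = JSON.parse(responseText) as { content?: unknown } | null;
  const content = parsed?.content;
  if (!Array.isArray(content)) {
    throw new Error("Response has no content array");
  }

  for (const block of content) {
    if (block?.type !== "tool_use" || block?.name !== "return_translations") {
      continue; // skip text blocks and unrelated tool calls
    }
    const translations = block.input?.translations;
    if (Array.isArray(translations)) {
      // Keep only well-formed items instead of trusting the model output.
      return translations.filter(
        (item): item is Translation => typeof item?.id === "string" && typeof item?.text === "string"
      );
    }
  }

  // tool_choice is not forced, so a runtime may answer with plain text only.
  throw new Error("No return_translations tool_use block found");
}
```

Because the tool call is not forced for compatible endpoints, a caller would likely need to handle the thrown error, for example by retrying or by surfacing a readable message.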