
getTextDiff #34

Merged
DoneDeal0 merged 18 commits into main from text-diff
Feb 15, 2026

@DoneDeal0 DoneDeal0 commented Jan 1, 2026

🚀 NEW FEATURE: getTextDiff

import { getTextDiff } from "@donedeal0/superdiff";

Compares two texts and returns a structured diff at a character, word, or sentence level.

FORMAT

Input

  previousText: string | null | undefined,
  currentText: string | null | undefined,
  options?: {
    separation?: "character" | "word" | "sentence", // "word" by default
    accuracy?: "normal" | "high", // "normal" by default
    detectMoves?: boolean, // false by default
    ignoreCase?: boolean, // false by default
    ignorePunctuation?: boolean, // false by default
    locale?: Intl.Locale | string // undefined by default
  }
  • previousText: the original text.
  • currentText: the current text.
  • options
    • separation: whether you want a character-, word-, or sentence-based diff.
    • accuracy:
      • normal (default): fastest mode, simple tokenization.
      • high: slower but exact tokenization. Handles all language subtleties (Unicode, emoji, CJK scripts, locale‑aware segmentation when a locale is provided).
    • detectMoves:
      • false (default): optimized for readability. Token moves are ignored so insertions don’t cascade and break equality (recommended for UI diffing).
      • true: semantically precise but noisier, since a single insertion shifts all following tokens, breaking equality.
    • ignoreCase: if true, hello and HELLO are considered equal.
    • ignorePunctuation: if true, hello! and hello are considered equal.
    • locale: the locale of your text. Enables locale‑aware segmentation in high accuracy mode.
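
To make the ignoreCase and ignorePunctuation options concrete, here is a minimal sketch of how two tokens could be normalized before comparison. It is not superdiff's implementation: normalizeToken, tokensEqual and CompareOptions are hypothetical names, and the punctuation set is a deliberately small ASCII subset chosen for illustration.

```typescript
// Hypothetical helper types and names, for illustration only.
type CompareOptions = { ignoreCase?: boolean; ignorePunctuation?: boolean };

// Returns the form of a token that is used for equality checks.
function normalizeToken(token: string, options: CompareOptions = {}): string {
  let normalized = token;
  if (options.ignoreCase) {
    normalized = normalized.toLowerCase();
  }
  if (options.ignorePunctuation) {
    // Assumption: a small ASCII punctuation set; a real tokenizer would cover more.
    normalized = normalized.replace(/[.,!?;:'"()\[\]{}-]/g, "");
  }
  return normalized;
}

// Two tokens are equal when their normalized forms match.
function tokensEqual(a: string, b: string, options: CompareOptions = {}): boolean {
  return normalizeToken(a, options) === normalizeToken(b, options);
}

console.log(tokensEqual("hello", "HELLO", { ignoreCase: true })); // true
console.log(tokensEqual("hello!", "hello", { ignorePunctuation: true })); // true
console.log(tokensEqual("hello", "HELLO")); // false
```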

Output

type TextDiff = {
  type: "text";
  status: "added" | "deleted" | "equal" | "updated";
  diff: {
    value: string;
    index: number | null;
    previousValue?: string;
    previousIndex: number | null;
    status: "added" | "deleted" | "equal" | "moved" | "updated";
  }[];
};
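
As an illustration of how this output shape might be consumed, the sketch below renders a TextDiff as a single annotated string. The TextDiff type is copied from the definition above; renderInline, the bracket markers and the sample object are hypothetical and written by hand, not produced by the library.

```typescript
type TextDiff = {
  type: "text";
  status: "added" | "deleted" | "equal" | "updated";
  diff: {
    value: string;
    index: number | null;
    previousValue?: string;
    previousIndex: number | null;
    status: "added" | "deleted" | "equal" | "moved" | "updated";
  }[];
};

// Hypothetical renderer: wraps every non-equal token in a bracketed marker.
function renderInline(result: TextDiff): string {
  return result.diff
    .map((entry) => {
      switch (entry.status) {
        case "added":
          return `[+${entry.value}]`;
        case "deleted":
          return `[-${entry.value}]`;
        case "updated":
          return `[${entry.previousValue ?? ""}=>${entry.value}]`;
        case "moved":
          return `[~${entry.value}]`;
        default:
          return entry.value;
      }
    })
    .join(" ");
}

// Hand-written sample that follows the type above.
const sample: TextDiff = {
  type: "text",
  status: "updated",
  diff: [
    { value: "The", index: 0, previousIndex: 0, status: "equal" },
    { value: "brown", index: null, previousIndex: 1, status: "deleted" },
    { value: "orange", index: 1, previousIndex: null, status: "added" },
    { value: "jumped", index: 2, previousIndex: 2, status: "equal" },
  ],
};

console.log(renderInline(sample)); // The [-brown] [+orange] jumped
```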

USAGE

WITHOUT MOVE DETECTION

This is the default output. Token moves are ignored so insertions don’t cascade and break equality. An update is rendered as two entries (one added, one deleted). The algorithm is based on the longest common subsequence (LCS), similar to GitHub diffs.

Input

getTextDiff(
- "The brown fox jumped high",
+ "The orange cat has jumped",
{ detectMoves: false, separation: "word" }
);

Output

{
      type: "text",
+     status: "updated",
      diff: [
        {
          value: 'The',
          index: 0,
          previousIndex: 0,
          status: 'equal',
        },
-       {
-         value: "brown",
-         index: null,
-         previousIndex: 1,
-         status: "deleted",
-       },
-       {
-         value: "fox",
-         index: null,
-         previousIndex: 2,
-         status: "deleted",
-       },
+       {
+         value: "orange",
+         index: 1,
+         previousIndex: null,
+         status: "added",
+       },
+       {
+         value: "cat",
+         index: 2,
+         previousIndex: null,
+         status: "added",
+       },
+       {
+         value: "has",
+         index: 3,
+         previousIndex: null,
+         status: "added",
+       },
        {
          value: "jumped",
          index: 4,
          previousIndex: 3,
          status: "equal",
        },
-       {
-         value: "high",
-         index: null,
-         previousIndex: 4,
-         status: "deleted",
-       }
      ],
    }
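
For readers unfamiliar with LCS-based diffing, the following is a minimal textbook sketch of a word-level LCS diff. It is not superdiff's code: lcsWordDiff and LcsEntry are hypothetical names, tokenization is a plain whitespace split, and the quadratic table is far less optimized than a production implementation would be. On the example above it yields the same sequence of statuses.

```typescript
type LcsEntry = {
  value: string;
  index: number | null;
  previousIndex: number | null;
  status: "added" | "deleted" | "equal";
};

// Hypothetical textbook LCS diff over whitespace-separated words.
function lcsWordDiff(previousText: string, currentText: string): LcsEntry[] {
  const prev = previousText.split(/\s+/).filter(Boolean);
  const curr = currentText.split(/\s+/).filter(Boolean);

  // table[i][j] holds the LCS length of prev[i..] and curr[j..].
  const table: number[][] = [];
  for (let i = 0; i <= prev.length; i++) {
    const row: number[] = [];
    for (let j = 0; j <= curr.length; j++) row.push(0);
    table.push(row);
  }
  for (let i = prev.length - 1; i >= 0; i--) {
    for (let j = curr.length - 1; j >= 0; j--) {
      table[i][j] =
        prev[i] === curr[j]
          ? table[i + 1][j + 1] + 1
          : Math.max(table[i + 1][j], table[i][j + 1]);
    }
  }

  // Walk the table from the start, emitting deletions before additions on ties.
  const out: LcsEntry[] = [];
  let i = 0;
  let j = 0;
  while (i < prev.length && j < curr.length) {
    if (prev[i] === curr[j]) {
      out.push({ value: curr[j], index: j, previousIndex: i, status: "equal" });
      i++;
      j++;
    } else if (table[i + 1][j] >= table[i][j + 1]) {
      out.push({ value: prev[i], index: null, previousIndex: i, status: "deleted" });
      i++;
    } else {
      out.push({ value: curr[j], index: j, previousIndex: null, status: "added" });
      j++;
    }
  }
  for (; i < prev.length; i++) {
    out.push({ value: prev[i], index: null, previousIndex: i, status: "deleted" });
  }
  for (; j < curr.length; j++) {
    out.push({ value: curr[j], index: j, previousIndex: null, status: "added" });
  }
  return out;
}

const lcsResult = lcsWordDiff("The brown fox jumped high", "The orange cat has jumped");
console.log(lcsResult.map((entry) => `${entry.status}:${entry.value}`).join(" "));
// equal:The deleted:brown deleted:fox added:orange added:cat added:has equal:jumped deleted:high
```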

WITH MOVE DETECTION

If you prefer a semantically precise diff, activate the detectMoves option. Direct token swaps are considered updated.

Input

getTextDiff(
- "The brown fox jumped high",
+ "The orange cat has jumped",
{ detectMoves: true, separation: "word" }
);

Output

{
      type: "text",
+     status: "updated",
      diff: [
        {
          value: 'The',
          index: 0,
          previousIndex: 0,
          status: 'equal',
        },
+       {
+         value: "orange",
+         index: 1,
+         previousValue: "brown",
+         previousIndex: null,
+         status: "updated",
+       },
+       {
+         value: "cat",
+         index: 2,
+         previousValue: "fox",
+         previousIndex: null,
+         status: "updated",
+       },
+       {
+         value: "has",
+         index: 3,
+         previousIndex: null,
+         status: "added",
+       },
+       {
+         value: "jumped",
+         index: 4,
+         previousIndex: 3,
+         status: "moved",
+       },
-       {
-         value: "high",
-         index: null,
-         previousIndex: 4,
-         status: "deleted",
-       }
      ],
    }
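
The output above can be read as a position-by-position comparison. The sketch below is one possible way to reproduce it and is only an assumption about the general idea, not the library's algorithm: positionalWordDiff and MoveEntry are hypothetical names, tokens are split on whitespace, and repeated tokens are not handled (indexOf only finds the first occurrence).

```typescript
type MoveEntry = {
  value: string;
  index: number | null;
  previousValue?: string;
  previousIndex: number | null;
  status: "added" | "deleted" | "equal" | "moved" | "updated";
};

// Hypothetical position-based diff: compares tokens at the same index first.
function positionalWordDiff(previousText: string, currentText: string): MoveEntry[] {
  const prev = previousText.split(/\s+/).filter(Boolean);
  const curr = currentText.split(/\s+/).filter(Boolean);
  // Marks previous tokens that were reported as the old value of an update.
  const replaced: boolean[] = prev.map(() => false);
  const out: MoveEntry[] = [];

  curr.forEach((value, index) => {
    const old = prev[index];
    const previousIndex = prev.indexOf(value);
    if (old === value) {
      out.push({ value, index, previousIndex: index, status: "equal" });
    } else if (previousIndex !== -1) {
      // Same token exists at another position in the previous text.
      out.push({ value, index, previousIndex, status: "moved" });
    } else if (old !== undefined && curr.indexOf(old) === -1) {
      // The old token at this position vanished: treat it as a direct swap.
      replaced[index] = true;
      out.push({ value, index, previousValue: old, previousIndex: null, status: "updated" });
    } else {
      out.push({ value, index, previousIndex: null, status: "added" });
    }
  });

  // Previous tokens that neither survive nor were swapped are deletions.
  prev.forEach((value, previousIndex) => {
    if (curr.indexOf(value) === -1 && !replaced[previousIndex]) {
      out.push({ value, index: null, previousIndex, status: "deleted" });
    }
  });
  return out;
}

const moveResult = positionalWordDiff("The brown fox jumped high", "The orange cat has jumped");
console.log(moveResult.map((entry) => `${entry.status}:${entry.value}`).join(" "));
// equal:The updated:orange updated:cat added:has moved:jumped deleted:high
```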

📊 BENCHMARK

| Scenario | Superdiff | diff |
| --- | --- | --- |
| 10k words | 1.13 ms | 3.68 ms |
| 100k words | 21.68 ms | 45.93 ms |
| 10k sentences | 2.30 ms | 5.61 ms |
| 100k sentences | 21.95 ms | 62.03 ms |

(Superdiff uses its normal accuracy settings to match diff's behavior)

@DoneDeal0 DoneDeal0 force-pushed the text-diff branch 10 times, most recently from ebfc55a to 6cf30a4 on January 7, 2026 20:35
@DoneDeal0 DoneDeal0 force-pushed the text-diff branch 3 times, most recently from 943efdf to aee6be8 on January 11, 2026 14:48
@DoneDeal0 DoneDeal0 self-assigned this Jan 11, 2026
@DoneDeal0 DoneDeal0 force-pushed the main branch 10 times, most recently from 42b0ec3 to 8d21774 on January 12, 2026 19:59
@DoneDeal0 DoneDeal0 force-pushed the text-diff branch 2 times, most recently from 6ef2035 to 3d054eb on January 26, 2026 19:44
@DoneDeal0 DoneDeal0 force-pushed the text-diff branch 7 times, most recently from 8a4fa0b to d5372fd on February 2, 2026 19:54
@DoneDeal0 DoneDeal0 marked this pull request as ready for review February 15, 2026 10:35
@DoneDeal0 DoneDeal0 merged commit c37f532 into main Feb 15, 2026
1 check passed
@DoneDeal0 DoneDeal0 deleted the text-diff branch February 15, 2026 10:39
@github-actions

🎉 This PR is included in version 4.1.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
