diff --git a/docs.json b/docs.json
index 4254322..0d91f0f 100644
--- a/docs.json
+++ b/docs.json
@@ -54,7 +54,15 @@
 {
 "group": "Platform (Web App)",
 "pages": [
- "overview/platform"
+ "overview/platform",
+ {
+ "group": "Create Voice",
+ "icon": "user-plus",
+ "pages": [
+ "overview/platform/create-voice/professional-voice-clone",
+ "overview/platform/create-voice/voice-design"
+ ]
+ }
 ]
 },
 {
diff --git a/images/platform/create-voice/01-hub.png b/images/platform/create-voice/01-hub.png
new file mode 100644
index 0000000..d4d701c
Binary files /dev/null and b/images/platform/create-voice/01-hub.png differ
diff --git a/images/platform/create-voice/pvc-00-confirm.png b/images/platform/create-voice/pvc-00-confirm.png
new file mode 100644
index 0000000..252b2a4
Binary files /dev/null and b/images/platform/create-voice/pvc-00-confirm.png differ
diff --git a/images/platform/create-voice/pvc-01-upload.png b/images/platform/create-voice/pvc-01-upload.png
new file mode 100644
index 0000000..58df75c
Binary files /dev/null and b/images/platform/create-voice/pvc-01-upload.png differ
diff --git a/images/platform/create-voice/pvc-03-verify.png b/images/platform/create-voice/pvc-03-verify.png
new file mode 100644
index 0000000..7b83484
Binary files /dev/null and b/images/platform/create-voice/pvc-03-verify.png differ
diff --git a/images/platform/create-voice/pvc-04-training.png b/images/platform/create-voice/pvc-04-training.png
new file mode 100644
index 0000000..5eae1cd
Binary files /dev/null and b/images/platform/create-voice/pvc-04-training.png differ
diff --git a/overview/capabilities.mdx b/overview/capabilities.mdx
index 57184e8..154ff61 100644
--- a/overview/capabilities.mdx
+++ b/overview/capabilities.mdx
@@ -22,6 +22,10 @@ Fish Audio is a voice AI platform. Every core feature is available three ways: i
 Clone a voice instantly from a clip, or train a persistent model.
+
+ Clone a real, verified voice at studio quality in the web app.
+
+
 Stream audio as it generates — for voice agents and live apps.
@@ -36,6 +40,10 @@ Fish Audio is a voice AI platform. Every core feature is available three ways: i
 These run in the browser, no code required — see the [Platform guide](/overview/platform).
+
+ Design a voice from a plain-text description — best for original characters.
+
+
 Transform existing audio into a different voice.
@@ -44,8 +52,8 @@ These run in the browser, no code required — see the [Platform guide](/overvie
 Produce multi-speaker, long-form audio — audiobooks and narration.
-
- Generate music and cinematic sound effects from a prompt.
+
+ Generate cinematic sound effects from a text prompt.
diff --git a/overview/platform.mdx b/overview/platform.mdx
index 464844e..f52c343 100644
--- a/overview/platform.mdx
+++ b/overview/platform.mdx
@@ -32,10 +32,6 @@ The [Fish Audio web app](https://fish.audio/app) gives you every capability with
 Generate cinematic sound effects from a text prompt.
-
- Generate music from a description.
-
-
 Split audio into stems (e.g. vocals and background).
diff --git a/overview/platform/create-voice/professional-voice-clone.mdx b/overview/platform/create-voice/professional-voice-clone.mdx
new file mode 100644
index 0000000..c1f8ce3
--- /dev/null
+++ b/overview/platform/create-voice/professional-voice-clone.mdx
@@ -0,0 +1,158 @@
+---
+title: "Professional Voice Clone"
+description: "Clone a real, verified voice at studio quality in the Fish Audio web app"
+icon: "microphone-stand"
+---
+
+{/*
+ DRAFT NOTE (remove before publishing): two screenshots are still placeholders (italic lines).
+ Capture each on https://fish.audio/app (logged in), save to the path below, then replace the
+ italic line with a , e.g.:
+ ...
+
+ SCREENSHOT MANIFEST — file (under images/platform/create-voice/) | where | what to show
+ - pvc-00-confirm.png | DONE | the "Create Professional Voice Clone?" confirmation dialog
+ - pvc-01-upload.png | DONE | upload area + duration timeline + uploaded clips + Start analyze
+ - pvc-03-verify.png | DONE | the verification step: language, prompt, Record
+ - pvc-04-training.png | DONE | the "Preparing training materials" status
+ - pvc-05-preview.png | TODO /app/professional-voice-clone | the preview samples and custom preview input
+ - pvc-06-release.png | TODO /app/professional-voice-clone/create-voice | the release form with the confirmation box
+*/}
+
+Professional Voice Clone (PVC) builds a studio-quality voice from recordings you provide. It
+adds identity verification — you read a short prompt aloud so Fish Audio can confirm the
+recordings are yours — then trains a model you can preview and release.
+
+
+ Looking to make an original character voice instead? See [Voice
+ Design](/overview/platform/create-voice/voice-design).
+
+
+## Before you begin
+
+
+ You must have permission to clone the voice — you own it, or have the speaker's consent. You
+ confirm this before the voice is created.
+
+
+- **Sign in** at [fish.audio/app](https://fish.audio/app).
+- **A PVC slot.** Professional Voice Clone uses a limited number of slots tied to your
+ plan; the hub shows how many you have used.
+- **Time.** Training alone takes 10–30 minutes, so set aside time for the full flow.
+
+## Open Professional Voice Clone
+
+Go to [fish.audio/app/create-voice](https://fish.audio/app/create-voice) and choose
+**Professional Voice Clone**. You are asked to confirm — note that **a Professional Voice
+Clone job cannot be deleted once created**, so start one only when you are ready. Fish Audio
+then creates the clone and opens its workspace. (If you have used all your slots, you
+are sent to billing instead.)
+
+
+ Confirmation dialog warning that a Professional Voice Clone job cannot be deleted once created
+
+
+## Create the clone
+
+
+
+ Under **Upload your audio files**, add recordings of the voice you are cloning.
+
+ - **Formats:** MP3, WAV, FLAC (other common audio and video files are also accepted).
+ - **Each clip:** less than 1 minute.
+ - **Recommended:** 12–15 clips, each 45–60 seconds, with a consistent tone.
+ - **Total:** at least 10 minutes; the timeline shows your progress toward the minimum (up
+ to a 180-minute maximum).
+
+ Each file shows its duration, size, and a status badge as it is checked. If a clip is
+ rejected (for example, *"No usable speech detected…"*) or flagged for multiple speakers,
+ delete and replace it.
+
+
+ The Professional Voice Clone upload page with supported formats, the duration timeline, the uploaded clips, and the Start analyze button
+
+
+
+
+ When you have at least 10 minutes of usable audio, click **Start analyze** (at the bottom
+ of the upload page shown above). Fish Audio checks each clip's quality and language.
+
+
+
+ In **Verify Voice Ownership**, choose the prompt **language**, click **Record**, and read
+ the on-screen text aloud (keep it under about 30 seconds). Submit the recording — Fish
+ Audio compares your voice to the uploaded clips.
+
+ If verification fails, you will see a reason such as *"The words you read didn't match the
+ prompt."* or *"Your voice didn't match the uploaded audio."* Try again in a quiet room,
+ reading the prompt exactly.
+
+
+ The Verify Voice Ownership step with a language selector, prompt text, and a Record button
+
+
+
+
+ After verification, training starts. You will see **Preparing training materials…** Training
+ typically takes 10 to 30 minutes, and Fish Audio emails you when it is done. The page
+ updates itself, so you can leave and come back.
+
+
+ The Professional Voice Clone training status showing 'Preparing training materials' with model details
+
+
+
+
+ When training finishes, the **Preview** section offers ready-made samples — click play to
+ listen. To test your own words, click **Custom preview text**, type a sentence, and click
+ **Generate Preview**.
+
+ _Screenshot: the preview samples and the custom preview input._
+
+
+
+ Click **Continue to Release** to open the release form. Enter a **Voice Name** (required),
+ adjust the optional details and **Type** (Public, Unlisted, or Private), then tick
+ the confirmation box (*"I confirm I have permission to create and share this voice
+ model."*) and click **Create Voice**. Your professional voice is published and you land on
+ its model page.
+
+ _Screenshot: the release form with the confirmation box._
+
+
+
+## Resume a clone in progress
+
+Because PVC takes a while, the Create Voice hub lists your clones in progress below the
+method cards:
+
+- **Draft** → **Continue** — resume an unfinished clone (still uploading or awaiting verification).
+- **Awaiting Release** → **Release** — finish a trained clone that is ready to publish.
+- **Trained** → **View** — open a clone that is already done.
+- **Training in progress** — clones that are still verifying or training show a spinner and update automatically.
+- **Verification Failed** — verification did not pass; there is no further action here. See [Troubleshooting](#troubleshooting) below.
+
+## Troubleshooting
+
+
+
+ You are out of Professional Voice Clone slots. Upgrade your plan to get more slots.
+
+
+
+ Usually background noise, no clear speech, or multiple speakers. Re-record in a quiet room
+ with a single speaker and upload again.
+
+
+
+ Read the prompt word for word in the selected language, in a quiet room, using the same
+ voice as your uploaded clips. If verification cannot pass, the clone is marked
+ **Verification Failed** on the Create Voice hub, the model is locked and cannot be deleted,
+ and you will need to contact customer service for help.
+
+
+
+ Editing your audio clears the current analysis results. All clips reset and must be
+ analyzed again before you can continue — confirm when prompted.
+
+
diff --git a/overview/platform/create-voice/voice-design.mdx b/overview/platform/create-voice/voice-design.mdx
new file mode 100644
index 0000000..dc586a9
--- /dev/null
+++ b/overview/platform/create-voice/voice-design.mdx
@@ -0,0 +1,104 @@
+---
+title: "Voice Design"
+description: "Generate a custom voice from a text description in the Fish Audio web app"
+icon: "wand-magic-sparkles"
+---
+
+{/*
+ DRAFT NOTE (remove before publishing): screenshots are placeholders shown as italic lines.
+ Capture each on https://fish.audio/app (logged in), save to the path below, then replace the
+ italic line with a , e.g.:
+ ...
+
+ SCREENSHOT MANIFEST — file (under images/platform/create-voice/) | where | what to show
+ - 01-hub.png | DONE | the hub with its creation-method cards
+ - vd-01-describe.png | TODO /app/voice-design | the description box, starter prompts, Generate button
+ - vd-02-pick.png | TODO /app/voice-design (after generate) | the two sample cards, one Selected, Continue enabled
+ - vd-03-details.png | TODO /app/voice-design (step 3) | the Voice Details form with visibility + confirmation
+*/}
+
+Voice Design turns a written description into an AI voice. You write a prompt, generate a
+couple of samples, pick your favorite, then add a few details and save. It is the fastest way
+to make an original voice — best for characters rather than cloning a real person.
+
+
+ To clone a real, verified voice at studio quality instead, see [Professional Voice
+ Clone](/overview/platform/create-voice/professional-voice-clone).
+
+
+## Before you begin
+
+- **Sign in** at [fish.audio/app](https://fish.audio/app).
+- **Credits.** Each generation costs credits (or is free during a promotion — the cost is
+ shown next to the **Generate Samples** button).
+
+## Open Voice Design
+
+Go to [fish.audio/app/create-voice](https://fish.audio/app/create-voice) and, under **Pick a
+method**, choose **Voice Design**.
+
+
+ The Create Voice hub with the Instant Voice Clone and Voice Design methods
+
+
+## Create the voice
+
+
+
+ Under **Describe the voice you want**, type one or two specific sentences — for example,
+ *"A warm, slightly raspy middle-aged narrator with a soft American accent. Calm, deliberate
+ pace, great for late-night audiobooks."* Your description must be between 20 and 1000
+ characters.
+
+ To start faster, click a starter prompt such as **Warm late-night radio host**,
+ **Documentary narrator**, or **Children's storyteller**. To control exactly what the
+ samples say, click **Set preview text** and enter a short line (5–150 characters);
+ otherwise Fish Audio writes a sample for you.
+
+ Check the cost note (for example, **2000 credits per generation**, or **Free for a limited
+ time** during a promotion), then click **Generate Samples**.
+
+ _Screenshot: the description box, starter prompts, and the Generate button._
+
+
+
+ When generation finishes, the **Pick a voice** section shows two samples. Click **play** on
+ a card to listen, and click the **waveform** to jump to a spot. Not happy with either?
+ Click **Re-generate Samples** for a new pair.
+
+ Click **Select** on the sample you want — it changes to **Selected** — then click
+ **Continue**.
+
+ _Screenshot: the two sample cards with one Selected and Continue enabled._
+
+
+
+ On the **Voice Details** screen, a preview card updates as you fill in the form:
+
+ - **Voice Name** (required).
+ - **Cover**, **Description**, **Use Cases**, and **Voice Style** (optional — some are
+ pre-filled from the sample).
+ - **Gender** (Female, Male, or Neutral) and **Age** (Young, Middle-Aged, or Elderly).
+ - **Type** — visibility: **Public** (shown in discovery), **Unlisted** (shareable by
+ link), or **Private** (only you). On the free tier, Unlisted and Private are locked with
+ an upgrade hint.
+
+ Tick the confirmation box (*"I confirm this voice does not impersonate a real, identifiable
+ person and complies with Fish Audio's usage policy."*), then click **Create Voice**. Your
+ voice is saved and you land on **My Voices**.
+
+ _Screenshot: the Voice Details form with the visibility options and confirmation box._
+
+
+
+## Next steps
+
+
+
+ Clone a real, verified voice at studio quality.
+
+
+
+ Find, edit, and use the voices you have created.
+
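
The Voice Design guide added in this diff states two length limits: the description must be between 20 and 1000 characters, and the optional preview text must be 5 to 150 characters. The sketch below shows one way a draft prompt could be checked locally before it is pasted into the web app. It is only an illustration: the function name and constants are hypothetical, they are not part of any Fish Audio API, and the guide does not say how the app counts characters, so counting with `len()` after trimming surrounding whitespace is an assumption.

```python
# Hypothetical local check; not part of any Fish Audio API.
# The numeric limits come from the Voice Design guide in this diff.
# Assumption: characters are counted with len() after trimming whitespace.
DESCRIPTION_RANGE = (20, 1000)
PREVIEW_TEXT_RANGE = (5, 150)


def check_voice_design_input(description, preview_text=None):
    """Return a list of problems; an empty list means both lengths are in range."""
    problems = []

    low, high = DESCRIPTION_RANGE
    description_length = len(description.strip())
    if not low <= description_length <= high:
        problems.append(
            f"description is {description_length} characters; expected {low} to {high}"
        )

    # Preview text is optional in the guide, so it is only checked when given.
    if preview_text is not None:
        low, high = PREVIEW_TEXT_RANGE
        preview_length = len(preview_text.strip())
        if not low <= preview_length <= high:
            problems.append(
                f"preview text is {preview_length} characters; expected {low} to {high}"
            )

    return problems
```

Calling `check_voice_design_input(draft)` with only a description skips the preview check, which mirrors the guide's note that Fish Audio writes a sample line when no preview text is set.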