10 changes: 9 additions & 1 deletion docs.json
@@ -54,7 +54,15 @@
{
"group": "Platform (Web App)",
"pages": [
"overview/platform"
"overview/platform",
{
"group": "Create Voice",
"icon": "user-plus",
"pages": [
"overview/platform/create-voice/professional-voice-clone",
"overview/platform/create-voice/voice-design"
]
}
]
},
{
Binary file added images/platform/create-voice/01-hub.png
Binary file added images/platform/create-voice/pvc-00-confirm.png
Binary file added images/platform/create-voice/pvc-01-upload.png
Binary file added images/platform/create-voice/pvc-03-verify.png
Binary file added images/platform/create-voice/pvc-04-training.png
12 changes: 10 additions & 2 deletions overview/capabilities.mdx
@@ -22,6 +22,10 @@
Clone a voice instantly from a clip, or train a persistent model.
</Card>

<Card title="Professional Voice Clone" icon="microphone-stand" href="/overview/platform/create-voice/professional-voice-clone">
Clone a real, verified voice at studio quality in the web app.
</Card>

<Card title="Realtime Streaming" icon="bolt" href="/features/realtime-streaming">
Stream audio as it generates — for voice agents and live apps.
</Card>
@@ -36,16 +40,20 @@
These run in the browser, no code required — see the [Platform guide](/overview/platform).

<CardGroup cols={2}>
<Card title="Voice Design" icon="wand-magic-sparkles" href="/overview/platform/create-voice/voice-design">
Design a voice from a plain-text description — best for original characters.
</Card>

<Card title="Voice Changer" icon="wand-magic-sparkles" href="/overview/platform">
Transform existing audio into a different voice.
</Card>

<Card title="Story Studio" icon="book-open" href="/overview/platform">
Produce multi-speaker, long-form audio — audiobooks and narration.

</Card>

<Card title="Music & Sound Effects" icon="music" href="/overview/platform">
Generate music and cinematic sound effects from a prompt.
<Card title="Sound Effects" icon="volume-high" href="/overview/platform">
Generate cinematic sound effects from a text prompt.
</Card>

<Card title="Audio Separation" icon="scissors" href="/overview/platform">
4 changes: 0 additions & 4 deletions overview/platform.mdx
@@ -8,7 +8,7 @@

<Note>
This page is an orientation map of the web app. Detailed, step-by-step
walkthroughs for each tool are coming. To build with code instead, see the
[SDK Quickstart](/developer-guide/sdk-guide/quickstart) or the [API
Reference](/api-reference/introduction).
</Note>
@@ -32,10 +32,6 @@
Generate cinematic sound effects from a text prompt.
</Card>

<Card title="Music" icon="music" href="https://fish.audio/app/music-generation">
Generate music from a description.
</Card>

<Card title="Audio Separation" icon="scissors" href="https://fish.audio/app/audio-separation">
Split audio into stems (e.g. vocals and background).
</Card>
@@ -60,7 +56,7 @@
## Produce projects

<Card title="Story Studio" icon="book" href="https://fish.audio/app/story-studio">
Assemble multi-speaker, long-form audio — audiobooks, dialogue, and narration
— in a project editor.
</Card>

@@ -80,7 +76,7 @@

<CardGroup cols={2}>
<Card title="API Keys" icon="key" href="https://fish.audio/app/api-keys">
Create and manage keys for the API and SDKs.

</Card>

<Card title="Plans & Billing" icon="credit-card" href="https://fish.audio/app/billing">
158 changes: 158 additions & 0 deletions overview/platform/create-voice/professional-voice-clone.mdx
@@ -0,0 +1,158 @@
---
title: "Professional Voice Clone"
description: "Clone a real, verified voice at studio quality in the Fish Audio web app"
icon: "microphone-stand"
---

{/*
DRAFT NOTE (remove before publishing): two screenshots are still placeholders (italic lines).
Capture each on https://fish.audio/app (logged in), save to the path below, then replace the
italic line with a <Frame>, e.g.:
<Frame><img src="/images/platform/create-voice/pvc-05-preview.png" alt="..." /></Frame>

SCREENSHOT MANIFEST — file (under images/platform/create-voice/) | where | what to show
- pvc-00-confirm.png | DONE | the "Create Professional Voice Clone?" confirmation dialog
- pvc-01-upload.png | DONE | upload area + duration timeline + uploaded clips + Start analyze
- pvc-03-verify.png | DONE | the verification step: language, prompt, Record
- pvc-04-training.png | DONE | the "Preparing training materials" status
- pvc-05-preview.png | TODO /app/professional-voice-clone | the preview samples and custom preview input
- pvc-06-release.png | TODO /app/professional-voice-clone/create-voice | the release form with the confirmation box
*/}

Professional Voice Clone (PVC) builds a studio-quality voice from recordings you provide. It
adds identity verification — you read a short prompt aloud so Fish Audio can confirm the
recordings are yours — then trains a model you can preview and release.

<Note>
Looking to make an original character voice instead? See [Voice
Design](/overview/platform/create-voice/voice-design).
</Note>

## Before you begin

<Warning>
You must have permission to clone the voice — you own it, or have the speaker's consent. You
confirm this before the voice is created.
</Warning>

- **Sign in** at [fish.audio/app](https://fish.audio/app).
- **A PVC slot.** Professional Voice Clone uses a limited number of slots tied to your
plan; the hub shows how many you have used.
- **Time.** Training alone typically takes 10–30 minutes, so set aside time for the full flow.

## Open Professional Voice Clone

Go to [fish.audio/app/create-voice](https://fish.audio/app/create-voice) and choose
**Professional Voice Clone**. A dialog asks you to confirm. **A Professional Voice Clone job
cannot be deleted once created**, so start one only when you are ready. After you confirm,
Fish Audio creates the clone and opens its workspace. (If you have used all your slots, you
are sent to billing instead.)

<Frame>
<img src="/images/platform/create-voice/pvc-00-confirm.png" alt="Confirmation dialog warning that a Professional Voice Clone job cannot be deleted once created" />
</Frame>

## Create the clone

<Steps>
<Step title="Upload your audio">
Under **Upload your audio files**, add recordings of the voice you are cloning.

- **Formats:** MP3, WAV, FLAC (other common audio and video files are also accepted).
- **Each clip:** less than 1 minute.
- **Recommended:** 12–15 clips, each 45–60 seconds, with a consistent tone.
- **Total:** at least 10 minutes and at most 180 minutes; the timeline shows your progress
toward the minimum.
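
The limits above can be checked before you upload. The following is a minimal sketch in Python, assuming you already know each clip's duration in seconds. `check_clips` is a hypothetical helper written for this guide, not part of any Fish Audio SDK, and the web app's own checks remain authoritative.

```python
# Hypothetical pre-upload check; thresholds are copied from the limits listed above.
MAX_CLIP_SECONDS = 60          # each clip must be less than 1 minute
MIN_TOTAL_SECONDS = 10 * 60    # at least 10 minutes in total
MAX_TOTAL_SECONDS = 180 * 60   # 180-minute maximum


def check_clips(durations_seconds):
    """Return a list of problems; an empty list means the set fits the limits."""
    problems = []
    for index, seconds in enumerate(durations_seconds, start=1):
        if seconds >= MAX_CLIP_SECONDS:
            problems.append(f"clip {index} is {seconds:.0f}s; keep each clip under 60s")
    total = sum(durations_seconds)
    if total < MIN_TOTAL_SECONDS:
        problems.append(f"total is {total / 60:.1f} min; at least 10 min is required")
    elif total > MAX_TOTAL_SECONDS:
        problems.append(f"total is {total / 60:.1f} min; the maximum is 180 min")
    return problems
```

Note that the recommended range does not guarantee the minimum: twelve 45-second clips total only 9 minutes, so `check_clips([45] * 12)` reports a shortfall, while `check_clips([45] * 15)` (11.25 minutes) returns an empty list.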

Each file shows its duration, size, and a status badge as it is checked. If a clip is
rejected (for example, *"No usable speech detected…"*) or flagged for multiple speakers,
delete and replace it.

<Frame>
<img src="/images/platform/create-voice/pvc-01-upload.png" alt="The Professional Voice Clone upload page with supported formats, the duration timeline, the uploaded clips, and the Start analyze button" />
</Frame>
</Step>

<Step title="Analyze">
When you have at least 10 minutes of usable audio, click **Start analyze** (at the bottom
of the upload page shown above). Fish Audio checks each clip's quality and language.
</Step>

<Step title="Verify it is your voice">
In **Verify Voice Ownership**, choose the prompt **language**, click **Record**, and read
the on-screen text aloud (keep it under about 30 seconds). Submit the recording — Fish
Audio compares your voice to the uploaded clips.

If verification fails, you will see a reason such as *"The words you read didn't match the
prompt."* or *"Your voice didn't match the uploaded audio."* Try again in a quiet room,
reading the prompt exactly.

<Frame>
<img src="/images/platform/create-voice/pvc-03-verify.png" alt="The Verify Voice Ownership step with a language selector, prompt text, and a Record button" />
</Frame>
</Step>

<Step title="Wait for training">
After verification, training starts. You will see **Preparing training materials…** Training
typically takes 10 to 30 minutes, and Fish Audio emails you when it is done. The page
updates itself, so you can leave and come back.

<Frame>
<img src="/images/platform/create-voice/pvc-04-training.png" alt="The Professional Voice Clone training status showing 'Preparing training materials' with model details" />
</Frame>
</Step>

<Step title="Preview the voice">
When training finishes, the **Preview** section offers ready-made samples — click play to
listen. To test your own words, click **Custom preview text**, type a sentence, and click
**Generate Preview**.

_Screenshot: the preview samples and the custom preview input._
</Step>

<Step title="Name and release">
Click **Continue to Release** to open the release form. Enter a **Voice Name** (required),
adjust the optional details and **Type** (Public, Unlisted, or Private), then tick
the confirmation box (*"I confirm I have permission to create and share this voice
model."*) and click **Create Voice**. Your professional voice is published and you land on
its model page.

_Screenshot: the release form with the confirmation box._
</Step>
</Steps>

## Resume a clone in progress

Because PVC takes a while, the Create Voice hub lists your clones in progress below the
method cards:

- **Draft** → **Continue** — resume an unfinished clone (still uploading or awaiting verification).
- **Awaiting Release** → **Release** — finish a trained clone that is ready to publish.
- **Trained** → **View** — open a clone that is already done.
- **Training in progress** — clones that are still verifying or training show a spinner and update automatically.
- **Verification Failed** — verification did not pass; there is no further action here. See [Troubleshooting](#troubleshooting) below.

## Troubleshooting

<AccordionGroup>
<Accordion title="You've reached the limits of your current plan">
You are out of Professional Voice Clone slots. Upgrade your plan to get more slots.
</Accordion>

<Accordion title="A clip was rejected">
Usually background noise, no clear speech, or multiple speakers. Re-record in a quiet room
with a single speaker and upload again.
</Accordion>

<Accordion title="Verification keeps failing">
Read the prompt word for word in the selected language, in a quiet room, using the same
voice as your uploaded clips. If verification ultimately fails, the clone is marked
**Verification Failed** on the Create Voice hub. The model is then locked and cannot be
deleted, so contact customer service for help.
</Accordion>

<Accordion title="You changed clips after analysis">
Editing your audio clears the current analysis results. All clips reset and must be
analyzed again before you can continue — confirm when prompted.
</Accordion>
</AccordionGroup>
104 changes: 104 additions & 0 deletions overview/platform/create-voice/voice-design.mdx
@@ -0,0 +1,104 @@
---
title: "Voice Design"
description: "Generate a custom voice from a text description in the Fish Audio web app"
icon: "wand-magic-sparkles"
---

{/*
DRAFT NOTE (remove before publishing): screenshots are placeholders shown as italic lines.
Capture each on https://fish.audio/app (logged in), save to the path below, then replace the
italic line with a <Frame>, e.g.:
<Frame><img src="/images/platform/create-voice/vd-01-describe.png" alt="..." /></Frame>

SCREENSHOT MANIFEST — file (under images/platform/create-voice/) | where | what to show
- 01-hub.png | DONE | the hub with its creation-method cards
- vd-01-describe.png | TODO /app/voice-design | the description box, starter prompts, Generate button
- vd-02-pick.png | TODO /app/voice-design (after generate) | the two sample cards, one Selected, Continue enabled
- vd-03-details.png | TODO /app/voice-design (step 3) | the Voice Details form with visibility + confirmation
*/}

Voice Design turns a written description into an AI voice. You write a prompt, generate a
couple of samples, pick your favorite, then add a few details and save. It is the fastest way
to make an original voice — best for characters rather than cloning a real person.

<Note>
To clone a real, verified voice at studio quality instead, see [Professional Voice
Clone](/overview/platform/create-voice/professional-voice-clone).
</Note>

## Before you begin

- **Sign in** at [fish.audio/app](https://fish.audio/app).
- **Credits.** Each generation costs credits unless a promotion makes it free; the cost is
shown next to the **Generate Samples** button.

## Open Voice Design

Go to [fish.audio/app/create-voice](https://fish.audio/app/create-voice) and, under **Pick a
method**, choose **Voice Design**.

<Frame>
<img src="/images/platform/create-voice/01-hub.png" alt="The Create Voice hub with the Instant Voice Clone and Voice Design methods" />
</Frame>

## Create the voice

<Steps>
<Step title="Describe the voice">
Under **Describe the voice you want**, type one or two specific sentences — for example,
*"A warm, slightly raspy middle-aged narrator with a soft American accent. Calm, deliberate
pace, great for late-night audiobooks."* Your description must be between 20 and 1000
characters.

To start faster, click a starter prompt such as **Warm late-night radio host**,
**Documentary narrator**, or **Children's storyteller**. To control exactly what the
samples say, click **Set preview text** and enter a short line (5–150 characters);
otherwise Fish Audio writes a sample for you.
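
If you draft prompts outside the app, the length limits above are easy to check first. The following is a minimal sketch in Python; `check_prompt` is a hypothetical helper, and it assumes the bounds are inclusive and that characters are counted the way Python's `len` counts them, which may differ from the app's counter.

```python
# Hypothetical length check; the limits are copied from the step above.
DESCRIPTION_RANGE = (20, 1000)   # voice description, in characters
PREVIEW_TEXT_RANGE = (5, 150)    # optional preview text, in characters


def within(text, bounds):
    """True if the length of text falls inside the (assumed inclusive) bounds."""
    low, high = bounds
    return low <= len(text) <= high


def check_prompt(description, preview_text=None):
    """Return a list of problems; preview_text may be None because it is optional."""
    problems = []
    if not within(description, DESCRIPTION_RANGE):
        problems.append("description must be 20-1000 characters")
    if preview_text is not None and not within(preview_text, PREVIEW_TEXT_RANGE):
        problems.append("preview text must be 5-150 characters")
    return problems
```

An empty list means both texts fit. Leaving `preview_text` as `None` mirrors the case where Fish Audio writes the sample line for you.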

Check the cost note (for example, **2000 credits per generation**, or **Free for a limited
time** during a promotion), then click **Generate Samples**.

_Screenshot: the description box, starter prompts, and the **Generate Samples** button._
</Step>

<Step title="Generate and pick a sample">
When generation finishes, the **Pick a voice** section shows two samples. Click **play** on
a card to listen, and click the **waveform** to jump to that point in the sample. Not happy
with either? Click **Re-generate Samples** for a new pair.

Click **Select** on the sample you want — it changes to **Selected** — then click
**Continue**.

_Screenshot: the two sample cards with one Selected and Continue enabled._
</Step>

<Step title="Add details and save">
On the **Voice Details** screen, a preview card updates as you fill in the form:

- **Voice Name** (required).
- **Cover**, **Description**, **Use Cases**, and **Voice Style** (optional — some are
pre-filled from the sample).
- **Gender** (Female, Male, or Neutral) and **Age** (Young, Middle-Aged, or Elderly).
- **Type** — visibility: **Public** (shown in discovery), **Unlisted** (shareable by
link), or **Private** (only you). On the free tier, Unlisted and Private are locked with
an upgrade hint.

Tick the confirmation box (*"I confirm this voice does not impersonate a real, identifiable
person and complies with Fish Audio's usage policy."*), then click **Create Voice**. Your
voice is saved and you land on **My Voices**.

_Screenshot: the Voice Details form with the visibility options and confirmation box._
</Step>
</Steps>

## Next steps

<CardGroup cols={2}>
<Card title="Professional Voice Clone" icon="microphone-stand" href="/overview/platform/create-voice/professional-voice-clone">
Clone a real, verified voice at studio quality.
</Card>

<Card title="Manage your voices" icon="user" href="https://fish.audio/app/my-voices">
Find, edit, and use the voices you have created.
</Card>
</CardGroup>