VoiceCraft Pro: Advanced AI Voice Solutions

Clone voices with precision and convert text to speech using your custom or available voices with VoiceCraft Pro!

Overview

VoiceCraft Pro offers AI-powered voice cloning and text-to-speech conversion. You can create realistic voice clones from audio samples, then use those clones or other available voices to convert text into speech, with fine-grained control over parameters such as engine, quality, speed, and emotion.

Clone Voice

Create a digital replica of a voice from uploaded audio samples.

Text to Speech

Convert text into natural-sounding speech using cloned or pre-defined voices.

Advanced Controls

Fine-tune voice cloning and text-to-speech generation with various parameters.

History Tracking

Keep track of your cloned voices and generated audio.

Clone Voice: Create a Voice Clone

This feature creates a new AI voice from one or more uploaded audio samples of the voice you want to replicate.

Name Your Voice

In the Voice Name field, enter a name for the voice clone you are creating.

Describe the Voice (Optional)

In the Description field, enter a short description of the voice if you want one.

Select Audio Files

Click the Select Audio Files button to upload one or more audio samples of the voice you want to clone. Higher quality and longer samples generally yield better results.

Remove Background Noise (Optional)

Check the Remove Background Noise box if you want the AI to attempt to clean up the audio samples during the cloning process.

Clone Voice

Click the Clone Voice button. The AI will process your audio samples and create a new Voice ID for the cloned voice. This process costs credits.

Clone Voice Input Parameters:

Voice Name
string
required

A name for the voice clone.

Description
string

(Optional) A description of the voice.

Audio Files
array of files
required

One or more audio samples of the voice to clone.

Remove Background Noise
boolean

Toggle to enable background noise removal during cloning.
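
The parameter list above can be sketched as a small data structure with a check for the two required fields. This is a minimal illustration only: the class name CloneVoiceRequest and its field names are hypothetical, chosen to mirror the form labels, and do not describe an actual VoiceCraft Pro API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CloneVoiceRequest:
    """Hypothetical container mirroring the Clone Voice form fields."""
    voice_name: str                         # required
    audio_files: List[str]                  # required: one or more sample paths
    description: str = ""                   # optional
    remove_background_noise: bool = False   # optional toggle

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the form is complete."""
        problems = []
        if not self.voice_name.strip():
            problems.append("Voice Name is required")
        if not self.audio_files:
            problems.append("At least one audio file is required")
        return problems


request = CloneVoiceRequest("Narrator", ["sample1.wav", "sample2.wav"],
                            remove_background_noise=True)
print(request.validate())  # prints [] because both required fields are set
```

Checking the required fields before submitting avoids starting a cloning run that would fail on incomplete input.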

Text to Speech: Convert Text to Audio

This feature converts text into speech using a selected voice, either one you have cloned or another available voice.

Enter Text to Speak

In the main text area, enter the text you want to convert into speech. The cost in credits will be estimated based on the character count.

Select Voice Source

Choose the source of the voice:

  • Use Cloned Voices: Select from voices you have cloned using the Clone Voice feature.
  • Use Custom Voice URL: Enter a URL for a custom voice model.
  • If neither option is selected, you will likely choose from a list of pre-defined voices.

Choose Voice

Select the specific voice you want to use from the dropdown list, based on your selected voice source.
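
The voice source choice can be sketched as a small decision function. The function name and return labels are hypothetical, and two behaviors are assumptions rather than documented facts: that the two checkboxes are mutually exclusive, and that pre-defined voices are the fallback when neither is checked.

```python
def resolve_voice_source(use_cloned: bool, use_custom_url: bool,
                         custom_url: str = "") -> str:
    """Return which voice list applies: 'cloned', 'custom_url', or 'predefined'."""
    if use_cloned and use_custom_url:
        # Assumption: the two checkboxes cannot be combined.
        raise ValueError("Select only one voice source")
    if use_custom_url:
        if not custom_url.strip():
            raise ValueError("Custom Voice URL is required when the box is checked")
        return "custom_url"
    if use_cloned:
        return "cloned"
    # Assumption: with neither box checked, the pre-defined voices are used.
    return "predefined"


print(resolve_voice_source(use_cloned=True, use_custom_url=False))   # prints cloned
print(resolve_voice_source(use_cloned=False, use_custom_url=False))  # prints predefined
```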

Adjust Parameters

Configure various parameters to fine-tune the speech generation:

  • Voice Engine: Select the underlying AI model for speech synthesis (e.g., PlayHT 2.0, Play 3.0 Mini, Play Dialog).
  • Quality: Choose the desired audio quality (e.g., Draft, Low, Medium, High, Premium).
  • Speed: Adjust the speaking speed.
  • Voice Guidance / Style Guidance: (Availability may depend on Voice Engine) Control how closely the AI adheres to the voice’s original characteristics or a specific style.
  • Emotion: (Availability may depend on Voice Engine) Select an emotion for the voice to convey (e.g., Happy, Sad, Angry).

Generate Speech

Click the Generate Speech button. The AI will convert your text into audio using the selected voice and parameters. This process costs credits based on the length of the text.

Text to Speech Input Parameters:

Text to Speak
string
required

The text content to be converted to audio.

Use Cloned Voices Checkbox
boolean

Toggle to select from your cloned voices.

Use Custom Voice URL Checkbox
boolean

Toggle to enter a custom voice URL.

Custom Voice URL
string

(If “Use Custom Voice URL” is checked) The URL of a custom voice model.

Voice (Dropdown)
enum
required

Select the specific voice ID or name to use.

Voice Engine
enum

Select the speech synthesis model (e.g., PlayHT 2.0).

Quality
enum

Select the output audio quality.

Speed
float

Adjust the speaking speed.

Voice Guidance / Style Guidance
integer

Control adherence to voice characteristics or style (range may vary).

Emotion
enum

Select an emotion for the voice (options may vary by engine).
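
The parameters above can be sketched as a typed settings object that enforces the two required fields and leaves out any optional value you did not set. The names SpeechSettings, Quality, and to_payload are hypothetical, and the quality values are only the examples given in this guide, so the real option list may differ.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Quality(Enum):
    # Values taken from the examples in this guide; the real list may differ.
    DRAFT = "Draft"
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"
    PREMIUM = "Premium"


@dataclass
class SpeechSettings:
    """Hypothetical container mirroring the Text to Speech form fields."""
    text: str                             # required
    voice: str                            # required
    voice_engine: Optional[str] = None    # e.g. "PlayHT 2.0"
    quality: Optional[Quality] = None
    speed: Optional[float] = None
    guidance: Optional[int] = None        # Voice Guidance / Style Guidance
    emotion: Optional[str] = None         # options may vary by engine

    def to_payload(self) -> dict:
        """Build a dict of settings, omitting optional values left unset."""
        if not self.text.strip():
            raise ValueError("Text to Speak is required")
        if not self.voice:
            raise ValueError("Voice is required")
        payload = {"text": self.text, "voice": self.voice}
        optional = {
            "voice_engine": self.voice_engine,
            "quality": self.quality.value if self.quality else None,
            "speed": self.speed,
            "guidance": self.guidance,
            "emotion": self.emotion,
        }
        payload.update({k: v for k, v in optional.items() if v is not None})
        return payload


settings = SpeechSettings(text="Hello there.", voice="Narrator",
                          quality=Quality.HIGH, speed=1.0)
print(settings.to_payload())
```

Omitting unset options keeps engine-dependent settings such as emotion out of the request unless you chose them explicitly.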

Text to Speech Credits:

Converting text to speech costs credits based on the length of the text. The cost is approximately 1 credit per 20 characters. The exact cost is displayed before generation.
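
The stated rate can be turned into a rough estimate before you generate. This is a sketch only: the function name is hypothetical, and rounding up to a whole credit is an assumption, since the guide gives the rate as approximate. The cost shown in the interface remains the authoritative figure.

```python
import math

CHARS_PER_CREDIT = 20  # from this guide: approximately 1 credit per 20 characters


def estimate_credits(text: str) -> int:
    """Estimate the credit cost of a text, rounding up (rounding is an assumption)."""
    if not text:
        return 0
    return math.ceil(len(text) / CHARS_PER_CREDIT)


print(estimate_credits("x" * 250))  # prints 13, since 250 / 20 = 12.5 rounds up
```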

History

VoiceCraft Pro keeps track of your cloned voices and generated audio.

Credits

Your total credit balance is displayed at the top left of the interface. Cloning voices and generating speech both consume credits.

Click the Buy More button to purchase additional credits if needed.

Tips for Best Results

High-Quality Audio Samples (Clone)

For voice cloning, use clean, high-fidelity audio samples with minimal background noise. Longer samples (up to a reasonable limit) can also improve quality.

Clear Text Input (TTS)

For text-to-speech, ensure your text is well-formatted and free of typos for the most natural-sounding output.

Experiment with Parameters (TTS)

Adjust Voice Engine, Quality, Speed, and Emotion settings to find the perfect combination for your specific audio needs.

Manage Credits

Be mindful of the credit cost for both cloning and text-to-speech, especially for longer text inputs.

Conclusion

VoiceCraft Pro provides powerful tools for advanced voice manipulation, from creating highly realistic voice clones to generating customized speech from text. Its detailed controls and history tracking make it a comprehensive solution for professionals and enthusiasts working with AI audio.