Voice Cloning Process:
- Data Collection: The process begins with collecting a substantial amount of audio samples of the target voice.
- Preprocessing: The collected audio data is then preprocessed. This involves cleaning the audio (removing noise, etc.), segmenting it into smaller parts, and sometimes converting it into a spectrogram or other formats suitable for analysis.
- Training the Model: An AI model, typically a deep neural network like a convolutional neural network (CNN) or a recurrent neural network (RNN), is trained on this preprocessed data.
- Synthesis: Once trained, the AI model can generate speech that mimics the target voice.
- Refinement and Evaluation: The synthesized voice is evaluated and refined for naturalness and fidelity to the original voice.
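The preprocessing step above (segmenting audio and converting it into a spectrogram) can be illustrated with a minimal sketch. It assumes a 16 kHz sample rate, 25 ms frames with a 10 ms hop, and a synthetic 440 Hz tone standing in for recorded speech. The function names are hypothetical and chosen only for illustration; this is not the pipeline of any particular voice cloning system.

```python
import numpy as np

def frame_signal(signal, frame_len, hop_len):
    """Split a 1-D waveform into overlapping frames (one frame per row)."""
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    starts = hop_len * np.arange(n_frames)[:, None]   # start index of each frame
    offsets = np.arange(frame_len)[None, :]           # sample offsets within a frame
    return signal[starts + offsets]

def magnitude_spectrogram(signal, frame_len=400, hop_len=160):
    """Magnitude of a short-time Fourier transform using a Hann window."""
    frames = frame_signal(signal, frame_len, hop_len)
    window = np.hanning(frame_len)
    return np.abs(np.fft.rfft(frames * window, axis=1))

# Assumed input: one second of a 440 Hz tone at 16 kHz (a stand-in for speech).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

spec = magnitude_spectrogram(tone)
print(spec.shape)  # (number of frames, number of frequency bins)
```

Each row of `spec` describes the frequency content of one short slice of audio, which is the kind of representation a model is typically trained on in place of the raw waveform.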
Vulnerability of Indians to Voice Scams:
- According to a McAfee report, a significant share of Indian respondents said they would respond to an urgent request for money that appeared to come from a family member or friend.
- Common scam tactics involve messages claiming robbery, accidents, or urgent travel needs.
Regulatory Actions and Challenges:
- The U.S. FTC launched a Voice Cloning Challenge and is considering an Impersonation Rule to combat deceptive voice cloning.
- Rapid advancements in AI voice cloning technologies are outpacing regulatory efforts.
Why in News:
- A recent McAfee report titled ‘The Artificial Imposter’ found that 47% of surveyed Indians had either experienced an AI-generated voice scam or knew someone who had, the highest share among the countries surveyed.