Meta Platforms, the parent company of Facebook, Instagram, WhatsApp, and Oculus VR, has released a voice cloning program called Audiobox. Audiobox is a free program that can replicate a person’s voice and generate custom audio from voice inputs and text prompts. It is built on Audiobox SSL, a model trained with self-supervised learning, in which the model derives its own training labels from unlabeled data. The researchers trained Audiobox on 160K hours of speech, 20K hours of music, and 6K hours of sound samples. Meta has also released interactive demos that showcase Audiobox’s capabilities, letting users clone their own voice or generate new voices from text descriptions. However, the demos may not be used commercially and are not available to residents of Illinois or Texas because of state laws. Meta plans to invite researchers and academic institutions to conduct safety and responsibility research with Audiobox in the future. Commercial versions of voice cloning technology are expected to emerge soon.
