DeepDrake ft. BTS-GAN and TayloRVC: An Exploratory Analysis of Musical Deepfakes and Hosting Platforms

Abstract

Recent advancements in voice conversion and text-to-speech technology have facilitated the creation of musical deepfakes: audio tracks featuring the voices of celebrity artists, typically without the artists’ involvement. Several deepfakes have already gone viral, leaving the music industry scrambling to sort out the potential impacts. While the media have primarily focused on specific high-profile incidents, journalists and researchers have paid less attention to the broader trends in musical deepfakes, including the communities creating them, the modeling techniques they employ, and the sites on which they congregate. In this paper, we investigate two leading sources of musical deepfake models, the AI Hub Discord server and the Uberduck website, which are dedicated to the training, utilization, and distribution of these deepfakes. We find that musical deepfakes target hundreds of artists of different backgrounds, levels of success, and musical styles. In light of the economic, legal, and ethical issues raised by deepfakes of so many artists, we warn about the generation of discriminatory content and about potential financial and contractual problems for artists. We recommend that more research be conducted in this area, especially to probe people’s perceptions of this technology and to devise approaches that mitigate potential harms.

Publication
In the Workshop on Human-Centric Music Information Research
Chris Donahue
Dannenberg Assistant Professor