Communications Lab: Hypercinema - Week 3

kantakanvay
Sep 22, 2022

Synthetic Media

1. Synthetic Media Example: Never Before Heard Sounds/Holly+

Never Before Heard Sounds is an AI-powered music studio founded by Yotam Mann and Chris Deaner. It is designed to let musicians use machine learning in a fun and exciting new way. Beyond voice cloning and adapting distinctive singing styles, it lets you manipulate sound in many other ways, including converting the sound of one instrument into another, timbre transfer (for example, voice to an instrument), and rhythm generation.

Holly+ is a project by Holly Herndon in partnership with Never Before Heard Sounds. It's meant to be a first-of-its-kind digital twin of Holly: anyone can upload an audio clip and hear it sung back in Holly's distinctive voice. Holly+ has been trained on many hours of Herndon’s speaking and singing voice. She created Holly+ to experiment with new technology while still retaining control of her digital self.

2. How was it made?

Never Before Heard Sounds started as real-time hardware that performs timbre transformations on a performer's sound. It used pre-trained machine learning models to automatically change which instrument the performance sounds like it is being played on.

In a later interview, the creators mentioned that the instrument is a custom machine learning model, trained on recordings that they either carefully sourced from permissively licensed datasets or recorded themselves. The model resynthesizes any input audio using a neural net trained on vocal and instrumental recordings. The result is an AI interpretation of the audio that retains the pitches and rhythms of the original but adds the texture and timbre learned from the training set.
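To get a feel for what "keep the pitch and rhythm, swap the timbre" means, here is a toy sketch in plain Python. It is not the studio's model: their system is a trained neural net, while this stand-in uses simple signal processing. It measures the pitch and loudness of each short frame of the input, then rebuilds the sound from a fixed stack of harmonics. All names here (`estimate_pitch`, `analyze`, `resynthesize`, `harmonic_weights`) and the hand-picked harmonic weights are my own assumptions for illustration; in the real thing, the timbre is learned from training recordings.

```python
import math

SAMPLE_RATE = 8000  # Hz; kept low so the pure-Python loops stay fast


def sine_tone(freq_hz, seconds, amplitude=0.5):
    """Generate a plain sine wave that stands in for the input audio."""
    n = int(SAMPLE_RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]


def estimate_pitch(frame, min_hz=80, max_hz=400):
    """Estimate one frame's fundamental frequency via autocorrelation."""
    min_lag = SAMPLE_RATE // max_hz
    max_lag = SAMPLE_RATE // min_hz
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # How well does the frame match itself when shifted by `lag` samples?
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return SAMPLE_RATE / best_lag


def rms(samples):
    """Root-mean-square level, used here as a simple loudness measure."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def analyze(signal, frame_size=400):
    """Return a (pitch_hz, loudness) pair for each full frame of the signal."""
    starts = range(0, len(signal) - frame_size + 1, frame_size)
    frames = [signal[s:s + frame_size] for s in starts]
    return [(estimate_pitch(f), rms(f)) for f in frames]


def resynthesize(features, harmonic_weights, frame_size=400):
    """Rebuild audio from pitch and loudness using a fixed harmonic 'timbre'.

    harmonic_weights is a hand-picked assumption; a learned model would
    supply the timbre instead.
    """
    # RMS of the unscaled harmonic stack, used to match the input loudness.
    norm = math.sqrt(sum(w * w for w in harmonic_weights) / 2)
    out, phase = [], 0.0
    for pitch_hz, loudness in features:
        step = 2 * math.pi * pitch_hz / SAMPLE_RATE
        for _ in range(frame_size):
            sample = sum(w * math.sin((k + 1) * phase)
                         for k, w in enumerate(harmonic_weights))
            out.append(loudness * sample / norm)
            phase += step  # carry phase across frames to avoid clicks
    return out


# Usage: a 200 Hz sine goes in, a brighter tone at the same pitch comes out.
source = sine_tone(200, 0.25)
features = analyze(source)
converted = resynthesize(features, harmonic_weights=[1.0, 0.5, 0.25])
```

The converted signal follows the same notes at the same loudness as the source, but its waveform is different because the overtones have changed. That is the part a trained model would fill in with the texture of, say, a choir or a string quartet.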

A few sample models they used when creating their Features Music track:

* Choir model, trained on unaccompanied church choirs

* String Quartet

* Alto and Tenor solo voices from the VocalSet dataset

* 8-bit model, trained on Nintendo game scores

3. Ethical Ramifications

* One or more people could 'own' the sound of your voice.

* Others could copyright your voice.

* Musicians may not get paid if anyone can simply use models of their recorded instruments and/or vocals.
