Last updated: Oct 20, 2020
For this sketch, I was thinking about time-stretched audio and video. At one point, slowing down or speeding up audio meant affecting its pitch. Now we have algorithms that are much better at granular processing, essentially holding the pitch in place while stretching the texture of the audio. For video, our tools capture frames in sequence. Even an extremely high-framerate camera is only capturing more frames per second; there is no way to stretch out the space between one frame and the next.
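The granular idea can be sketched in a few lines: read short windowed grains of the signal at one hop size, then lay them back down at a larger hop. Each grain is an unmodified slice of the original, so the pitch stays roughly where it was while the overall texture stretches. This is a minimal illustration, not any particular product's algorithm; the grain and hop sizes are arbitrary choices for the example.

```python
import numpy as np

def granular_stretch(signal, stretch, grain=2048, hop=512):
    """Naive granular time stretch: read grains at the analysis hop,
    write them at a larger synthesis hop. Pitch is roughly preserved
    because each grain is an untouched slice of the original audio."""
    window = np.hanning(grain)
    syn_hop = int(hop * stretch)
    n_grains = (len(signal) - grain) // hop
    out = np.zeros(n_grains * syn_hop + grain)
    norm = np.zeros_like(out)
    for i in range(n_grains):
        g = signal[i * hop : i * hop + grain] * window
        out[i * syn_hop : i * syn_hop + grain] += g
        norm[i * syn_hop : i * syn_hop + grain] += window
    norm[norm < 1e-8] = 1.0  # avoid dividing by zero at the edges
    return out / norm

# A one-second 440 Hz tone stretched to roughly twice its length:
sr = 22050
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
slow = granular_stretch(tone, stretch=2.0)
```

Real time-stretchers (phase vocoders, WSOLA) add phase alignment and grain matching on top of this to avoid the metallic flutter the naive version produces, but the core move is the same: decouple the rate at which you read the audio from the rate at which you write it.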
Machine learning has recently made it possible to slow down video in this way, but what does it mean for the algorithm to use the information that is there to manifest an idea about what is not there? I slowed down a sequence from Mamoru Oshii’s film Angel’s Egg, which I’m linking here. The first time through, we see three shots at standard speed, around 24fps. Then the clip is slowed to 12.5% of its original speed by stepping from one frame to the next. Then it’s slowed ten, then around a hundred, and finally close to one thousand times. For the last 30-40 seconds of the clip, the algorithm is interpolating between just two frames, adding approximately one thousand frames of its own creation.
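For scale, here is the dumbest possible version of "a thousand frames between two frames": a linear crossfade. This is emphatically not what the ML interpolator in the clip does; learned models estimate motion so that objects appear to travel between the frames, where a crossfade just ghosts one image into the other. But it shows the shape of the problem, inventing in-betweens from two endpoints.

```python
import numpy as np

def interpolate_frames(frame_a, frame_b, n_between):
    """Generate n_between in-between frames by linear crossfade.
    Motion-aware ML interpolators replace this blend with an estimate
    of where each pixel is moving."""
    # Evenly spaced blend weights, excluding the two endpoint frames.
    steps = np.linspace(0.0, 1.0, n_between + 2)[1:-1]
    return [(1 - t) * frame_a + t * frame_b for t in steps]

# Two tiny 2x2 grayscale "frames", with ~1000 frames invented between them:
a = np.zeros((2, 2))
b = np.ones((2, 2))
between = interpolate_frames(a, b, 1000)
```

Run at 24fps, those thousand invented frames would stretch the gap between two frames, originally 1/24th of a second, into more than 40 seconds of screen time.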