READ MORE: How an award-winning AI film was brought to life by text-to-video generation (The Next Web)
Sooner or later an AI, or several of them, is going to make an entire narrative film from script to screen. Computer artist Glenn Marshall has brought that inevitable day a step closer.
Marshall’s works are created entirely through programming and code art. In 2008 he won the prestigious Prix Ars Electronica for a Peter Gabriel music video built purely from algorithms, and he has also made an AI-generated Daft Punk video.
READ MORE: This AI-generated Daft Punk video is the perfect tribute to the electro pioneers (The Next Web)
The Crow is a finalist for The Lumen Prize, considered one of the most prestigious digital arts awards in the world, and is also eligible for submission to the BAFTA Awards.
“I had been heavily getting into the idea of AI style transfer using video footage as a source,” Marshall told The Next Web. “So every day I would be looking for something on YouTube or stock video sites, and trying to make an interesting video by abstracting it or transforming it into something different using my techniques.
“It was during this time I discovered Painted on YouTube — a short live-action dance film — which would become the basis of The Crow.”
Marshall fed the video frames of Painted to CLIP, a neural network created by OpenAI that learns to match images with natural-language descriptions.
He then prompted the system to generate a video of “a painting of a crow in a desolate landscape.”
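Marshall has not published his pipeline, but the underlying idea, using CLIP to score how well an image matches a text prompt and then nudging the image toward that prompt, can be sketched in a few lines. The following is a minimal illustration only, assuming PyTorch and OpenAI's open-source CLIP package; the file names, step count, and learning rate are hypothetical, and Marshall's actual tools will differ.

```python
# Minimal sketch of CLIP-guided frame stylisation (illustrative only).
# Requires: pip install torch torchvision git+https://github.com/openai/CLIP
import torch
import clip
from PIL import Image
from torchvision import transforms

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
model = model.float()  # use full precision so gradients flow cleanly

# Encode the guiding prompt once.
prompt = "a painting of a crow in a desolate landscape"
with torch.no_grad():
    text_features = model.encode_text(clip.tokenize([prompt]).to(device))
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)

# Load one frame of the source video (path is hypothetical).
frame = Image.open("painted_frame_0001.png").convert("RGB")
to_tensor = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
normalize = transforms.Normalize(
    (0.48145466, 0.4578275, 0.40821073),
    (0.26862954, 0.26130258, 0.27577711),
)

# Treat the frame's pixels as the parameters being optimised.
image = to_tensor(frame).unsqueeze(0).to(device).requires_grad_(True)
optimizer = torch.optim.Adam([image], lr=0.01)

for step in range(200):
    optimizer.zero_grad()
    image_features = model.encode_image(normalize(image.clamp(0, 1)))
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    # Maximise cosine similarity between the frame and the prompt.
    loss = -(image_features * text_features).sum()
    loss.backward()
    optimizer.step()

out = image.detach().squeeze(0).clamp(0, 1).cpu()
transforms.ToPILImage()(out).save("crow_frame_0001.png")
```

Run over every frame of the source footage, an optimisation loop like this is what pulls live action toward the painterly look Marshall describes; treat the sketch purely as an illustration of the principle rather than his method.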
Marshall says the outputs required little cherry-picking. He attributes this to the similarity between the prompt and underlying video, which depicts a dancer in a black shawl mimicking the movements of a crow.
“It’s this that makes the film work so well, as the AI is trying to make every live action frame look like a painting with a crow in it. I’m meeting it half way, and the film becomes kind of a battle between the human and the AI — with all the suggestive symbolism.”
Marshall says he’s exploring CLIP-guided video generation, which can add detailed text-based directions, such as specific camera movements.
That could lead to entire feature films produced by text-to-video systems. Yet Marshall believes even his current techniques could attract mainstream recognition.
Deep learning is not coming to Hollywood. It is already here.
See for yourself in the video below: