The AI Generation: Cinema and Audiovisual Tune into Artificial Intelligence

Federico Bo
Nov 28, 2023
Image generated via DALL-E 3; inspired by the Lumière brothers’ early “animated views”.

This text is an excerpt from an article that will be published soon in the magazine of the Italian Sentieri Selvaggi film school.

Artificial Intelligence (AI) has been used for several years in cinema and audiovisual production by the major studios, the streaming platforms and, more generally, the large companies in the sector.

Why has AI, in particular generative AI, only now become a central topic among authors, screenwriters, actresses/actors, technicians? Why is curiosity about this latest revolution in the film and audiovisual sector tinged with fear and hostility?

One reason could be that the newly arrived AI systems seem to undermine a dogma: that creativity is an essential characteristic of human nature, not replicable by machines.

Yet in many ways, often little known, this technology could democratize the production, distribution and promotional pipeline. Resources that were until now reserved for major production and distribution companies, because of the high costs and skills required to use them, are becoming available (together with new possibilities) to filmmakers and to small and medium-sized companies in the sector.

Why has the quality of AI systems made such a leap in recent years, considering that research in the field began about 70 years ago? Large Language Models (“LLMs”) are one of the reasons for this qualitative leap. These models, along with “diffusion” models, are the basis of generative AI, which can produce new content (text, images, videos, sounds) starting from textual descriptions or requests. How do they do it? By identifying patterns and recurrences among the billions of data points (for example texts and words) that make up the knowledge they acquired during training.

Given a sequence of words, they infer (in a probabilistic way) what the next one should be. The process may appear “mechanical” (and indeed it is) and in no way intelligent, but it can produce interesting results. This “intelligence” does not interact directly with physical reality (for now …) but is based on human knowledge, developing an indirect (“second order”, as I have defined it) and alternative perspective. It is neither totally alien nor completely familiar. It is “other”.
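To make the idea of “inferring the next word in a probabilistic way” concrete, here is a deliberately tiny sketch. It is not how an LLM is built: real models use neural networks trained on billions of examples and look at long stretches of context, while this toy only counts which word follows which in one invented sentence. The corpus and the function names are my own assumptions, chosen purely for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def next_word_probabilities(counts, word):
    """Turn the raw follow-up counts for `word` into probabilities."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # the word was never seen with a successor
    return {w: n / total for w, n in followers.items()}


def generate(counts, start, length=8, seed=0):
    """Build a short text by repeatedly sampling a likely next word."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(length - 1):
        probs = next_word_probabilities(counts, output[-1])
        if not probs:
            break  # no known continuation, so stop early
        candidates = list(probs)
        weights = [probs[w] for w in candidates]
        output.append(rng.choices(candidates, weights=weights)[0])
    return " ".join(output)


# Hypothetical one-sentence "training set", invented for this example.
corpus = "the camera follows the actor and the actor leaves the set"
model = train_bigram(corpus)

# After "the", the toy model has seen "actor" twice, "camera" and "set" once.
print(next_word_probabilities(model, "the"))
# {'camera': 0.25, 'actor': 0.5, 'set': 0.25}
print(generate(model, "the", length=6))
```

Even at this scale the “mechanical” nature of the process is visible: the program knows nothing about cameras or actors, it only reproduces the recurrences found in its training text.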

This sense of estrangement, this difficulty in fitting AI systems into existing categories, can make us suspicious of them. Even hostile. Considering them only as advanced digital tools does not help: their usefulness in accelerating the workflow, saving time and money, and offering new technical and creative possibilities, as personal computers did forty years ago, is not perceived. What authors, screenwriters and visual effects technicians fear is being replaced by machines, while actors and actresses, extras, voice actors and even stunt performers are afraid that they, living or dead, will be cloned into cheap, eternal digital servants.

Generative AI can, however, be very useful. It is able to create content based on a textual “prompt”, which makes it an excellent tool to help develop plots and screenplays, refine existing ones, and suggest ideas and changes. The ability to transform text into images can help filmmakers, authors, production designers and directors of photography to visualize scenes easily; soon it will be possible to generate modifiable storyboards, for example to choose the right shooting angle. Even sounds and very short videos can be generated easily, giving other professionals such as sound engineers, composers and costume designers advanced drafts to work from.

In addition to generating special effects, algorithms based on artificial intelligence can analyze and modify footage, apply visual effects and assist in color grading, sound design and video editing. Although, as mentioned, some of these functions are already available in the complex and expensive software used by major production and post-production companies, the innovation is that AI-based services will allow a significant reduction in costs as well as simpler and faster processes. Paradoxically, even films whose budgets are far from meager use these functions as needed: in the Oscar-winning film “Everything Everywhere All At Once”, a popular suite of “magic tools” from the startup Runway was used to create a video that would have been too expensive and time-consuming to produce on a film set or as a CGI effect.

AI-based generative algorithms can write production plans directly from screenplays, help predict the box office and TV-VOD performance of similar content, and anticipate critics’ reviews and audience engagement, providing insights and recommendations for writing, production and distribution strategies and for revenue projections. With specific tools, casting and the selection of extras can be driven directly by scene analysis, as can the search for locations, taking into account precise constraints (such as the need to shoot in a certain area or to recreate particular settings). Production times can be optimized (think of the complex production cycles of animation), and post-production processes (like visual effects) can be planned and scheduled almost automatically and in real time.

All this requires the analysis of far less data than before: “fine-tuning” allows AI models trained on huge datasets to be specialized on smaller, more circumscribed ones, retaining their broad knowledge while becoming more precise and efficient in the context at hand.

And what about content creation, i.e. plots and screenplays? AI can be used as a source of inspiration and as a storytelling assistant that the author constantly interacts with, refining some scenes, some dialogues, some settings. You can create a storyboard extracted from the screenplay, or have a draft soundtrack composed for musicians to develop. The potential is considerable: some of it is already concrete, and the rest will arrive as the models, and the software built on them, evolve.

Open issues remain around the training of LLMs, especially the copyright of the content used and created, the biases that AI inherits from its training content, and the new contractual forms that will have to be defined for authors, screenwriters and all those working in the audiovisual sector.

According to a survey by Variety, 30% of US industry professionals and companies are already using generative AI or plan to use it within the next two years; 51% do not plan to use it in the short term. The situation is more or less the same in Europe. However, speaking “off the record” with production company managers or filmmakers reveals that public statements are influenced by the hostile climate and that, beneath the surface, the film and audiovisual industry is studying and experimenting with AI-based tools, techniques and processes.

In my opinion, within five years these tools will be an integral part of the entire production and distribution workflow.

Originally published on sophia.vision (Artificial Intelligence Audiovisual Lab)

Federico Bo

Computer engineer, tech-humanist hybrid. Interested in blockchain technologies and AI.