Part three: An illustrative demo measuring prominence of TV characters

From presence to prominence: How can computer vision widen the evidence base of on-screen representation

The final part of this blog series shows two illustrative examples of measuring character prominence in broadcast TV. We start with a simpler, more intuitive example to explain how screen time is measured. It shows 30 seconds of an episode of the TV show Mock the Week (BBC2), along with a chart showing which individuals are relatively more and less prominent.

Demo of measuring prominence through screen time

We apply a face detector, a type of machine learning model, to identify faces on screen. Each detected face is indicated with a rectangular ‘bounding box’. A sequence of detections of the same face over consecutive frames is called a ‘face track’. The same person can have multiple face tracks in a clip, for example if the camera cuts to them multiple times. Each face track carries additional data such as timestamps and face sizes. These data are then aggregated and processed to compute an overall measure of prominence.
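To make the idea of a face track concrete, here is a minimal sketch of how per-frame detections could be grouped into tracks. It is not the method used for the demo: the linking rule (greedy matching on bounding-box overlap), the names `Detection`, `FaceTrack` and `link_detections`, and the thresholds `min_iou` and `max_gap` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    t: float    # timestamp in seconds
    box: tuple  # bounding box as (x1, y1, x2, y2) in pixels


@dataclass
class FaceTrack:
    detections: list = field(default_factory=list)

    @property
    def start(self):
        return self.detections[0].t

    @property
    def end(self):
        return self.detections[-1].t


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_detections(detections, min_iou=0.5, max_gap=1.0):
    """Greedily attach each detection to the best-overlapping recent track.

    min_iou and max_gap (seconds) are hypothetical thresholds. A detection
    that matches no recent track (e.g. after a camera cut) starts a new one.
    """
    tracks = []
    for det in sorted(detections, key=lambda d: d.t):
        best, best_iou = None, min_iou
        for track in tracks:
            last = track.detections[-1]
            if 0 < det.t - last.t <= max_gap:
                overlap = iou(det.box, last.box)
                if overlap >= best_iou:
                    best, best_iou = track, overlap
        if best is None:
            best = FaceTrack()
            tracks.append(best)
        best.detections.append(det)
    return tracks
```

With this rule, a person who is on screen, cut away from, and then shown again several seconds later ends up with two separate face tracks, which is why tracks still need to be matched to people afterwards.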

We define prominence as time spent on screen (as a clear and big enough face), with longer duration indicating higher relative prominence. This ‘relative prominence score’ is mostly driven by screen time, but also combines other aspects of prominence: for example, positive weights were added for a larger face in a sea of faces, and for longer periods of screen time when the face is the only one on screen. The numerical value is not meaningful on its own, but it allows an approximation of who the most (and least) prominent people are in a particular video clip or episode.
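As a rough illustration of a score of this kind, the sketch below adds a face-size bonus and a solo-face bonus to raw screen time and then ranks people by their total. The exact formula and weights used for the demo are not given here, so the formula, the weights `size_weight` and `solo_weight`, and the names `TrackSummary`, `prominence` and `rank_people` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrackSummary:
    person: str            # label for who the face track belongs to
    duration_s: float      # seconds the face is on screen
    mean_face_frac: float  # mean face area as a fraction of the frame
    solo_s: float          # seconds during which it is the only face


def prominence(track, size_weight=2.0, solo_weight=0.5):
    """Screen time plus illustrative bonuses for face size and solo time."""
    size_bonus = size_weight * track.mean_face_frac * track.duration_s
    solo_bonus = solo_weight * track.solo_s
    return track.duration_s + size_bonus + solo_bonus


def rank_people(tracks):
    """Sum track scores per person and sort from most to least prominent."""
    totals = {}
    for track in tracks:
        totals[track.person] = totals.get(track.person, 0.0) + prominence(track)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Because the weights are arbitrary, only the ordering returned by `rank_people` should be read as informative, in line with the point that the number itself has no standalone meaning.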

Demo where measuring prominence is more difficult

While panel shows usually have many clear faces towards the camera, other programmes can pose more of a challenge. We use an episode from the American sitcom Black-ish Season 1 (ABC) to test and discuss the feasibility of generating character prominence metrics when there is a greater variety of camera angles and face sizes.

The short video clip is processed in the same way to create a relative prominence score for the faces that appear on screen. The video is broken down into smaller components called ‘scenes’, as shown in the bottom-left chart. In this short clip, the grandma (played by Jenifer Lewis) was the most prominent, followed by the grandchildren next to her. This ranking of prominence is based entirely on the visual information from the short video. Processing is faster than real time when only one image per second is analysed.
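Sampling one image per second amounts to picking a sparse set of frame indices before running the face detector. The sketch below shows one way to compute those indices; the function name `sample_frame_indices` and its parameters are assumptions for illustration, and the frame rate would come from the video file's metadata.

```python
def sample_frame_indices(total_frames, fps, samples_per_second=1.0):
    """Return the indices of the frames to analyse.

    total_frames: number of frames in the clip
    fps: frame rate of the video
    samples_per_second: how many images to keep per second of video
    """
    if total_frames < 0 or fps <= 0 or samples_per_second <= 0:
        raise ValueError("total_frames must be >= 0; rates must be positive")
    # Never step by less than one frame, so indices are never repeated.
    step = max(1.0, fps / samples_per_second)
    indices = []
    k = 0
    while int(k * step) < total_frames:
        indices.append(int(k * step))
        k += 1
    return indices
```

For a 30-second clip at 25 frames per second, this keeps 30 of the 750 frames, so the detector does roughly 4% of the work it would do on every frame.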

Naturally, there are limitations to measuring prominence with a computer model. Faces that are partially out of view or seen from the side are sometimes not detected, and detections can also be missed when a face is too small or blurry. Screen time does not capture everything either: a character can be a scene stealer with limited screen time, or say or do something that is highly impactful. The prominence scores could be expanded to include information such as who is speaking. For the short demo clips shown here, we manually annotated which face tracks belong to the same character. Recent techniques, such as face clustering with an unknown number of characters, could help scale up this analysis.

Despite the limitations, the illustrative demo shows how computer vision can be used to measure relative prominence on screen. The method provides a ranking of more (and less) prominent characters which can be incorporated with further analysis to generate new insights about representation on-screen.

How to widen the evidence base with computer vision

The methods here can be extended in many directions: we propose potential applications to widen the evidence base around on-screen representation for four groups.

First, for diversity leads and monitors, computer vision could be used, supplemented with manual review, to generate more frequent and richer data about representation. By plugging evidence gaps (beyond presence, and across under-represented groups), computer vision can generate real-time insights into on-screen representation in broadcasts. Measurements can prompt rethinking around the stories which are told and funded.

Second, for content producers, computer vision may open up opportunities to create new features that help viewers find major and minor characters.

It is not just the responsibility of the regulator or diversity groups to request and compile data. The content producer can potentially use richer data around character prominence to create new production features. This can generate additional value for viewers and fans, e.g. allowing viewers to interact with a visual summary of more and less prominent characters across a series.

Third, for editors and commissioners, computer vision can be used to analyse character prominence before a show airs. Currently, evidence is gathered long after the broadcast date. As processing an episode for prominence metrics is quicker than real time, especially with downsampling, it is possible to run it in post-production, so that some concerns around representation can be addressed upstream. These methods can also be helpful during commissioning, or during screen-writing or editing between series.

Finally, researchers can form partnerships to better understand the models under which content rights holders can open up broadcast data for research, so that collections can be more commonly treated as data, and responsibly opened up to answer important social research questions.

For these application areas to develop, a great deal more research is needed to address the ethical and logistical barriers to wide deployment across different types of programmes. Current benchmark datasets, against which research methods are evaluated, often include just a few TV programmes. Faces on screen vary greatly in viewpoint, head pose, face size, skin reflectance and lighting. We need a better understanding of how these methods scale to a range of programmes. More annotated datasets can be shared, and data standards for on-screen representation can be developed.

Inclusion is more than just numbers

Increasingly, screen industry bodies are formally addressing diversity. In 2020, for example, the new BAFTA diversity steering group was established and several broadcasters (the BBC, Channel 4, ITV, and Sky) renewed their inclusion and diversity commitments, all referencing the global anti-racism movement.

In this series of blogs, we focus on diversity data and computer vision. This is because a big part of evidencing progress revolves around effective measurement. But it bears repeating that representation is just one part of inclusion. And inclusion is of course far more than a numbers game.

If measurements are used for box-ticking, the resulting representations will be tokenistic. Truly embedding inclusion into cultural production requires pushing systemic levers, such as re-evaluating creative risk, giving opportunities to and funding storytellers from different backgrounds, and regularly re-thinking what stories are worth telling and how these should be told. The ultimate goal of the measurements discussed here is that no group of people persistently feels mis-/under-represented by our mass media. As the BFI says, inclusion “fuels creativity, engages new audiences and makes good business sense.” A measurably more representative broadcast landscape is one step towards that goal.

The broadcast content we used was made available via Learning on Screen’s BoB archive with permission from the Educational Recording Agency.

Mock the Week (Season 15 episode 6) was shown on BBC2 in March 2017 (originally aired in July 2016). It was produced by Angst Productions and Ewan Phillips. Black-ish (Season 1 episode 10) was shown on E4 in June 2019 (originally on ABC in December 2014). It was directed by Elliot Hegarty and produced by ABC Studios.

The next issue of ViewFinder – Learning on Screen’s specialist online magazine dedicated to the moving image and education – will be exploring AI and its relationship with audiovisual media.

Photo by Joseph Redfield

Authors

Raphael Leung

Data Science Fellow at Nesta

Bartolomeo Meletti

Creative Director for CREATe at the University of Glasgow