The term “big data” may bring to mind swaths of private information held by tech companies. But lots of big data is, in fact, visible to all – we just may not think of it as “data”.
If you’ve been to the movies recently, you will have seen a dataset of credits – listing the cast and crew members alongside their roles. While the credits from any one film may not be that useful, the credits from every film can form a big dataset. At Nesta and the PEC (a new policy and evidence centre for the creative industries), we have been exploring how these types of non-confidential big datasets can shine new light on gender representation in the creative industries.
Gender representation has traditionally been gauged using surveys of workers. But most surveys haven’t been going for that long and it can take several years (after launching a new survey) before we can tell how the gender mix is changing. Also, surveys often don’t go beyond counting the number of women and men – and so can’t shed light on how prominent each group was in the creative process, or how they were portrayed in a particular art form.
Digging deep
We looked recently at the media’s reporting of women in the creative industries using more than half a million articles from The Guardian newspaper, published between 2000 and 2018, from sections of the paper relating to the creative industries (such as Books, Film, Fashion and Games).
In the past five years, there has been a large increase in references to women. From 2000 to 2013, less than one-third of gendered pronouns within articles (for example, “he” and “she”) referred to women. But this began to change in 2014 – and by 2018 the percentage of gendered pronouns that were female had reached 40%. By contrast, the share of women among workers in the UK’s creative industries has remained flat in recent years, and sits at around 37%.
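As a rough illustration of how a pronoun share like this could be computed, here is a minimal Python sketch. It is not the code used in the study: the pronoun lists, the `(year, text)` input format and the function name `female_pronoun_share` are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: only these pronoun forms are counted. The study's exact list is not given.
FEMALE = {"she", "her", "hers", "herself"}
MALE = {"he", "him", "his", "himself"}


def female_pronoun_share(articles):
    """Return {year: share of gendered pronouns that are female}.

    `articles` is an iterable of (year, text) pairs (a hypothetical format).
    """
    counts = {}
    for year, text in articles:
        # Lowercase and split into simple word tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        tally = counts.setdefault(year, Counter())
        tally["female"] += sum(t in FEMALE for t in tokens)
        tally["male"] += sum(t in MALE for t in tokens)
    # Skip years with no gendered pronouns to avoid dividing by zero.
    return {
        year: tally["female"] / (tally["female"] + tally["male"])
        for year, tally in counts.items()
        if tally["female"] + tally["male"] > 0
    }


# Made-up sentences, purely for illustration.
sample = [
    (2013, "He directed the film. She starred in it, and he produced it."),
    (2018, "She wrote the book. He edited it. She designed the cover herself."),
]
shares = female_pronoun_share(sample)
print(shares)
```

Run over a real archive, the same function would give one share per year, which is the kind of series that can be set against survey figures on the workforce.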
We also studied the words that followed the pronouns “he” and “she”, to gain insight into the media’s portrayal of creative workers. We found that articles focused more on particular sounds made by women than by men, such as “laughs”, “cries”, “giggles”, and “coos”, and on non-verbal reactions, such as “smiles”, “grins” and “nods”. These words were never used frequently, but when they were used, they were more likely than other words to refer to women rather than men.
In contrast, words relating to past creative achievements and leadership activities more frequently referred to men. For example, you’re much more likely to see “he directed” than “she directed”, and similarly “he performed”, “he designed”, “he managed” and “he founded”. This finding is consistent with the long-running gender imbalances in the creative industries.
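The comparison of words that follow “he” and “she” can be sketched in a similar way. The Python below is an assumption-laden illustration, not the study’s method: the function names, the simple tokeniser and the sample sentences are all hypothetical, and the tokeniser ignores sentence boundaries.

```python
import re
from collections import Counter


def following_word_counts(texts):
    """Count the word that immediately follows "he" and "she"."""
    counts = {"he": Counter(), "she": Counter()}
    for text in texts:
        # Simplification: punctuation is dropped, so sentence boundaries are ignored.
        tokens = re.findall(r"[a-z']+", text.lower())
        for pronoun, next_word in zip(tokens, tokens[1:]):
            if pronoun in counts:
                counts[pronoun][next_word] += 1
    return counts


def female_share_by_word(counts, min_count=1):
    """For each following word, the share of its uses that follow "she"."""
    shares = {}
    for word in set(counts["he"]) | set(counts["she"]):
        she, he = counts["she"][word], counts["he"][word]
        if she + he >= min_count:
            shares[word] = she / (she + he)
    return shares


# Made-up sentences, purely for illustration.
texts = [
    "She laughs. He directed the play. She smiles and he directed again.",
    "He founded the studio, she laughs, and she directed the sequel.",
]
counts = following_word_counts(texts)
shares = female_share_by_word(counts)
print(sorted(shares.items()))
```

A word’s share only means something relative to the overall share of “she” among the two pronouns, and rare words need a higher `min_count` before their shares can be trusted.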
In another study, we used a dataset from the British Film Institute (BFI) that contained the credits from every UK feature-length film released to cinema.
After the BFI inferred people’s gender from their first names, we found that the on-screen gender mix hasn’t changed meaningfully since the end of World War II – and in 2017 women still only made up around 30% of cast members and 34% of crew members.
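The article does not describe how the BFI inferred gender from first names. The Python sketch below shows one simple way such an inference could work, using a lookup table. The table `NAME_TO_GENDER`, the function names and the sample credits are all hypothetical.

```python
from collections import Counter

# Hypothetical lookup table. A real one would come from a large names dataset.
NAME_TO_GENDER = {
    "alice": "female",
    "maria": "female",
    "john": "male",
    "david": "male",
}


def infer_gender(full_name):
    """Return "female", "male" or "unknown", based on the first name only."""
    parts = full_name.strip().split()
    if not parts:
        return "unknown"
    return NAME_TO_GENDER.get(parts[0].lower(), "unknown")


def female_share(names):
    """Share of women among the names whose gender could be inferred."""
    tally = Counter(infer_gender(name) for name in names)
    known = tally["female"] + tally["male"]
    return tally["female"] / known if known else None


# Made-up credits, purely for illustration.
credits = ["Alice Smith", "John Brown", "David Jones", "Sam Taylor", "Maria Lopez"]
share = female_share(credits)
print(share)
```

Names missing from the table, such as “Sam” here, are labelled unknown and left out of the share rather than guessed.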
This dataset also showed gender-based differences in the jobs of on-screen characters. Since 2005, for example, only 16% of on-screen “doctors” (in unnamed roles) have been played by women, which jars with the fact that women make up 46% of doctors in the UK.
Creative fairness
We are by no means the only researchers showing the potential of non-confidential sources of big data to inform gender metrics in the creative industries. Researchers at Google, in collaboration with the Geena Davis Institute, used facial and speech recognition technology to show that in the 100 highest-grossing live-action films in the US, in each year from 2014 to 2016, women occupied just 36% of screen time and 35% of the speaking time.
While big data studies can enrich diversity measures, there are two important sources of potential bias. First, we’re almost always inferring gender – from a face, a first name or a single pronoun – and so we may get a person’s gender wrong. Second, these inference methods typically only detect “male” and “female”, excluding or misclassifying anyone who identifies with a non-binary gender. For these reasons, big data methods are not a replacement for surveys, which allow people to self-identify or to opt out entirely.
Even bearing in mind these potential biases, there are still many big data sources that could shed new light on gender imbalances, if only they were made available to researchers. For example, access to the stills and subtitles of films and television programmes could be used to evaluate diversity schemes, while access to the content of more newspapers would enable a broader study on the media’s reporting of creative workers.
To realise the potential of these new methods, we need to encourage and support creative organisations to securely share their non-confidential data. That will hopefully allow researchers to get a little more creative about measuring gender equality in the UK’s creative industries.
First published by The Conversation on 28th August 2019.