‘It’s a Struggle’: Understanding the Challenges of Fact Checking Audio Content

As audio content has boomed in popularity over the last few years, so too have incidents of misinformation.

Although exact numbers are hard to come by, a preliminary analysis by the Brookings Institution last year found that one in 10 episodes of political podcasts contained potentially false information. False and misleading claims about COVID-19 and the effectiveness of vaccines have also become all too common, buried in longer podcasts and voice notes.

This poses a serious and growing problem for fact checkers, who struggle to keep up with the output of almost 2.5 million podcasts and some 65 million episodes, according to Buzzsprout, and thousands of audio clips that go viral every day.

This week, hundreds of organisations gather in Oslo, Norway, for Global Fact 9, the world’s largest fact-checking summit, to share lessons and discuss how to respond to the growing challenge of audio misinformation.

Kinzen will be taking part in those discussions but, before we joined the wider community, we talked to three fact checkers about the obstacles they faced when dealing with misinformation in both podcasts and voice notes.

This blog post summarises some of the major issues we heard in those conversations, which we hope will help shape conversations this week and as the practice of audio fact-checking evolves.

Time-consuming and slow

Unlike text and images, audio does not provide immediate clues about whether a piece of content is harmful and, unless transcribed, cannot be searched easily for keywords. This makes it difficult to monitor and analyse.

“With audio, you have to pinpoint to the exact minute or second when [misinformation] happened,” explains Doreen Wainainah, managing director of PesaCheck, an initiative of non-profit organisation Code for Africa that has a presence in 14 African countries. “There’s a lot of sifting through content”.

“There was a recent example with a radio interview of an aspirant for the Kenyan presidency. Somebody pulled out a clip from it and misinterpreted what he said. So we had to listen to the full hour-long interview to see if what he said was actually true. With images and text, it’s much more straightforward.”

Creating quality transcriptions is also difficult to get right and, even then, finding false narratives within that text is slow and manual work.

“It’s much more challenging to debunk false narratives shared on audio,” explains Andrés Jimenez, debunking editor at Spanish fact checker Maldita. “On a three-hour podcast, you have to listen to the full thing to find one particular hoax. We’re definitely missing that [step]. I’m very aware we’re missing a lot of misinformation shared on podcasts”.
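To illustrate why a transcript matters here, the following is a minimal sketch of keyword search over a timestamped transcript, so that a reviewer can jump to the relevant minute and second instead of listening to the whole recording. It assumes a transcript already exists, which, as noted above, is itself hard to get right. The `Segment` type, the helper functions and the sample text are all hypothetical and invented for illustration; none of them come from the fact checkers quoted here.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed chunk of audio (hypothetical structure)."""
    start: float  # seconds from the start of the recording
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a number of seconds as H:MM:SS."""
    total = int(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def find_mentions(segments, keywords):
    """Return (timestamp, keyword, text) for each segment containing a keyword."""
    hits = []
    for seg in segments:
        lowered = seg.text.lower()
        for kw in keywords:
            if kw.lower() in lowered:
                hits.append((format_timestamp(seg.start), kw, seg.text))
    return hits


# Invented sample transcript, for illustration only.
segments = [
    Segment(0.0, "Welcome back to the show."),
    Segment(3725.0, "Some people claim the vaccine changes your DNA."),
    Segment(5400.5, "Let's move on to the election results."),
]

for stamp, kw, text in find_mentions(segments, ["vaccine", "election"]):
    print(stamp, kw, text)
```

Plain substring matching like this only surfaces passages worth a listen. It will miss paraphrases and misspellings, and any errors in the transcript carry straight through, so a human check remains necessary.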

Limited tools to aid fact-checking

The rise of podcasting and audio-focused platforms such as Clubhouse and Twitter Spaces has been so sharp that fact checkers currently have only a limited pool of tools to find problematic content.

Simon Muli, senior fact-checker for PesaCheck, uses YouTube’s transcribe function and Google Translate when fact-checking a viral voice note but is often none the wiser about the language being spoken.

“What I’ve discovered is that the structure of languages can be similar and that’s something that Google cannot always detect. I did a story on Madagascar and Google detected it as French but it was Malagasy. Better detection of languages is key [to being able to do my job].”

Transcription from the original into English or other languages is also a stumbling block for fact-checking teams.

“We’ve found more and more, in both video and audio content, that [what the] person is saying and what the transcription is saying are very different,” explains Wainainah. “We always have to check the transcription in detail.”

“Amharic is a major language in Ethiopia and, due to Ethiopia’s size, one of the most widely spoken in Africa. Yet most artificial intelligence tools struggle to pick it up or find the correct meaning. Swahili is also widely spoken but we cannot get a proper translation. We normally sit and laugh at the transcriptions coming from these tools.”

Maldita’s Jimenez uses a range of techniques to detect false narratives and echoed these concerns.

“There isn’t straightforward tools we can use,” he explained. “That’s a big issue because when we face a misinformation crisis like the one in Ukraine, the first three days we didn’t stop fact-checking and we didn’t have time to listen to a three-hour podcast. That’s just how it works. Moving forward, I hope there’s a technology that can help with that.”

He added that a viral audio clip targeting a candidate in the recent Andalucia local election took a particularly long time to debunk.

“They had gone to an old interview, taken several parts and put them together to make him appear to say something that he hasn’t said. When that happens, you have to transcribe those, find what the original interview was and work out what was said. The biggest challenge [was that the] interview was 40 minutes long, so we had to listen to the whole thing.”
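The matching step described in that quote can be sketched in code. The example below is a rough illustration, not the method Maldita uses: it assumes both the viral clip and the original interview have already been transcribed, and it uses Python's standard `difflib.SequenceMatcher` to find which part of the original each spliced fragment most resembles. The `best_match` helper and all sample sentences are hypothetical.

```python
from difflib import SequenceMatcher


def best_match(fragment, original_segments):
    """Return (score, start_seconds, text) for the most similar original segment."""
    scored = [
        (SequenceMatcher(None, fragment.lower(), text.lower()).ratio(), start, text)
        for start, text in original_segments
    ]
    return max(scored, key=lambda item: item[0])


# Invented transcript of an original interview: (start in seconds, text).
original = [
    (60.0, "I have never supported raising taxes on families."),
    (900.0, "What I said is that we will review the budget next year."),
    (2100.0, "Raising taxes is something my opponents have proposed."),
]

# Invented fragments transcribed from a spliced clip.
clip_fragments = [
    "I have never supported raising taxes on families",
    "we will review the budget next year",
]

for fragment in clip_fragments:
    score, start, text = best_match(fragment, original)
    print(f"{fragment!r} -> {start:.0f}s (similarity {score:.2f})")
```

If the fragments of a clip map to points that are far apart in the original, that is a hint the audio may have been stitched together. The similarity score is only a guide to where to start listening, and it depends entirely on the quality of both transcripts.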

The sheer quantity of content

Audio, and in particular podcasts, boomed during the COVID-19 pandemic, as people around the world sought to fill the time spent locked down or off work. But that comes with new responsibilities.

“In Spain, we’ve had a recent increase in people consuming podcasts,” explained Andrea Arnal, a scientific fact checker at Catalonia-based Verificat. “It’s around 50% of the population so there are many people listening [to podcasts] but nobody is reviewing them. People are looking at social media, television, radio, and newspapers for the kind of misinformation that is there but not podcasts. It’s a huge audience that is increasing.”

Platforms such as Spotify and Clubhouse have clarified their community guidelines but ensuring users adhere to the rules is far from easy at scale.

Verificat and Kinzen partnered for a project on Mexico’s 2021 mid-term elections and will now work together to find false information about climate change in Spanish-language podcasts, funded by the International Fact-Checking Network. More information will be shared later this year.

Limited platform co-operation

The fact checkers we spoke to expressed a desire for greater openness and transparency from the platforms to aid their efforts to keep listeners safe online.

“We would like to know what they’re already doing about [misinformation]”, explained Jimenez. “Maybe they’re already bringing down podcasts with large audiences that share misinformation and we’re not aware of it. It would be useful to establish more fluid communication with these platforms”.

Part of that is knowing which podcasts have a large listenership so fact-checking teams can prioritise those for review, added Arnal.

“Data related to a podcast’s audience is not open, so it’s super hard to know if a podcast is listened to by a big audience. You can more or less know by the community that they have on social media but you cannot have a real number of listeners because platforms don’t share that. You have some lists but you have to pay for the data”.

The reflections of fact checkers working on the front line of the misinformation crisis serve as a warning and a reminder of how far tools and processes must improve if they are to be used to find harmful narratives before they cause real-world harm.
