How to Detect a Disinformation Campaign

Groups have been running disinformation campaigns on Facebook and Twitter for almost as long as those platforms have been popular. In this episode, we talk to Ray Serrato, Gianluca Stringhini, and Fabio Giglietto about how they find suspicious behavior they can tie to campaigns in places from Myanmar to Italy to the United States.

Guests: 

Ray Serrato, Gianluca Stringhini, Fabio Giglietto

Subscribe directly on Apple Podcasts or Spotify, or via our RSS feed. If you're enjoying the show, please be sure to leave us a rating on your podcast provider of choice.

Transcript

Gianluca Stringhini

So, if you're only looking at YouTube, that's it. We see a number of comments coming out of the blue, but it's difficult to figure out where they are coming from, what's the purpose behind them, and all of that. Instead, what we did was look at YouTube links being posted on some of these communities, like 4chan's Politically Incorrect board, with the goal of coordinating a harassment attack, a hate attack, and so on.

Introduction

You’re listening to Viral Networks, a look at the current state of mis- and disinformation online, with the scholars studying it from the front lines. We’re your hosts, Emily Boardman Ndulue and Fernando Bermejo.

Emily Boardman Ndulue

Hi everybody, and welcome back. You just heard from Gianluca Stringhini, a researcher from Boston University studying how communities online make deceptive content go viral.

Fernando Bermejo

Today on Viral Networks we’ll be talking about deceptive behaviors, the B in Camille Francois’s Disinformation ABC. Behaviors is a broad category. We’ll touch a bit on disinformation content and the people responsible for it, but mainly in this episode, we’re focused on how it moves around.

Emily Boardman Ndulue

That’s right. The big question today is: how does disinformation go viral? It certainly doesn’t happen by accident. If you’ve seen a viral piece of disinformation, there’s a good chance that it was the result of a coordinated disinformation campaign, and today we’ll be talking to three people who are trying to understand exactly how that coordination works.

You’ll hear from Fabio Giglietto, who studied the spread of COVID-19-related mis- and disinformation in Italy, and Gianluca Stringhini, who has developed methods for detecting how such campaigns are usually developed on one platform and carried out on another.

Fernando Bermejo

But first, we wanted to understand the basics of coordination and inauthentic behavior with the help of someone who has been studying the phenomena since before there was really language to describe it.

Emily Boardman Ndulue

That would be Ray Serrato. Ray works at Twitter now, but for years he’s studied influence operations in Indonesia, Pakistan, Sri Lanka, and Myanmar.

Ray Serrato

I am Ray Serrato. I’m an open source investigator using data analysis and open source research methods to investigate human rights violations, information operations, and, maybe more broadly, abuse that takes place on platforms.

I think the term that I like most is maybe influence operations or information operations. “Influence operations” primarily describes a kind of coordinated effort to manipulate or in some way impact public debate for some kind of goal. That goal can be political, social, or economic. That’s what I would call an influence operation.

And I think at first, post-Brexit, post-2016 elections, there was this singular focus on fake news; that was the watchword all over the world, I think. A framework began to emerge where I could understand this a bit better, personally. I kind of realized what was happening, but I didn’t think I could name it properly in a way. Back then, maybe in 2015, I wasn’t reading a lot of literature. Well, there wasn’t a lot of literature on the topic anyway, at least as it relates to information operations.

Fernando Bermejo

2015 was the year that Ray started looking at disinformation campaigns in Myanmar surrounding the country’s elections. We wanted Ray to start at square one explaining what he found.

Emily Boardman Ndulue

So what characteristics are you looking for in your research when you’re trying to identify coordinated behavior?

Ray Serrato

So I can tell you that the first time I looked at data was around 2016, 2017, when I was looking at Facebook posts from Myanmar, from Facebook groups in Myanmar.

Back then, Facebook had a rather open API, so you could retrieve data from groups, and that data would include the posts; it would also include user IDs and timestamps. Today you can’t get user IDs out of posts, right, but back then it would include user IDs. You could really look at the frequency of somebody posting in a group, or a page posting, in the same way that you could look at the frequency of posting of an account on Twitter.

So that was quite fascinating, because you could look at some of this activity and say to yourself, “OK, do I see a number of accounts posting repeatedly in a group with a specific frequency or timing? How does this differ from what the average person posts in the group?” Since you had the user IDs, you could also follow up with a manual investigation and look at the accounts themselves to see, you know, are these accounts being operated by real people? Is there some form of automation?
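
A minimal sketch of the kind of frequency check Ray describes, assuming you already have a list of (user ID, timestamp) pairs exported from a group; the field names, toy data, and the 2x-average cutoff are illustrative assumptions, not his actual tooling:

# Toy sketch: flag accounts that post in a group far more often than the average member.
# `posts` stands in for data exported from a group API; the 2x-average cutoff is arbitrary.
from collections import Counter

posts = [
    ("user_a", "2017-03-01T08:00:00"),
    ("user_a", "2017-03-01T08:05:00"),
    ("user_a", "2017-03-01T08:10:00"),
    ("user_a", "2017-03-01T08:15:00"),
    ("user_a", "2017-03-01T08:20:00"),
    ("user_b", "2017-03-01T09:00:00"),
    ("user_c", "2017-03-02T10:00:00"),
]

counts = Counter(user for user, _ in posts)
average = sum(counts.values()) / len(counts)

flagged = {user: n for user, n in counts.items() if n > 2 * average}
print(f"average posts per account: {average:.1f}")
print("accounts posting well above the group average:", flagged)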

So one aspect of it is indeed to look at temporal activity. I think what I’ve learned recently is that coordination is not only about temporal activity; it could be coordinated narrative messaging.

But when you looked at the content of the posts, the text was either identical or nearly identical. The images that they used for their texts, so the images or banners, were sometimes identical and sometimes slightly different. And this was the only thing that linked them together. In my view, it was sufficient to say that it appears these accounts were engaging in some kind of coordinated messaging, simply based on the content, and I would say also the near similarities in that content and, you know, the images and posts. It seemed they were drawing from a kind of, “Here’s our marker. Here’s our board of social media posts that we mark up, and here’s how you can go ahead and disseminate them.”
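
As a rough illustration of what “nearly identical” can mean in practice, here is a small sketch that scores pairwise similarity between post texts using only Python’s standard library; the captions and the 0.8 cutoff are made up for the example:

# Toy sketch: find pairs of posts whose text is identical or nearly identical.
# SequenceMatcher gives a 0..1 similarity ratio; the 0.8 cutoff is illustrative.
from difflib import SequenceMatcher
from itertools import combinations

captions = {
    "page_1": "Share this before they delete it! The truth about the election.",
    "page_2": "Share this before they delete it!! The truth about the election",
    "page_3": "Weekend weather looks great, see you at the market.",
}

for (a, text_a), (b, text_b) in combinations(captions.items(), 2):
    ratio = SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()
    if ratio >= 0.8:
        print(f"{a} and {b} look coordinated (similarity {ratio:.2f})")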

Emily Boardman Ndulue

Are there ways to know whether those coordinated posts, whether narrative- or timing-based, are coming from humans who are working together in some type of network, or whether they’re bots? What is the entity that’s pushing this content, and how do you investigate that?

Ray Serrato

You know, that’s actually one of the limitations of this work. I think for researchers using the public data that platforms release, coordination patterns alone are really insufficient for detecting an information campaign that is directed by a state actor. I think trying to attribute a specific activity to any specific actor is extremely difficult, unless they leave other kinds of traces. I think, increasingly, that doesn’t happen anymore. By traces, I mean, are they reusing content that was used in a previous disinformation campaign in some way? Maybe they were using memes that were used previously. The IRA did that just in 2019, if I recall.

Emily Boardman Ndulue

That’s an important takeaway. While researchers can trace coordinated behaviors, they can’t necessarily attribute campaigns to specific actors. There’s no way to determine who is behind, say, a batch of misleading memes if there isn’t some kind of direct evidence tying disinformation campaigns to specific actors. And Ray wanted to be sure to point out that it’s absolutely not always Russia’s Internet Research Agency planting disinformation campaigns in the West.

Ray Serrato

I believe with the Capitol riots, in some way, people are now saying, which I find a bit odd, “Oh yeah, it was domestic disinformation that was the problem all along in the U.S.” I guess because the year before I worked on the EU elections, and all the stuff I had seen then was domestic in origin. The things that were happening online in Poland were largely the result of people in Poland who operated these things. The same in Spain, the same in the UK, the same in France, for the EP elections.

So that’s one where I thought, OK, it seems the myth is being obliterated that disinformation operations are the realm of Russia and China. And I think, in part, the Capitol riots are now something that people are seeing, and not only the Capitol riots but also, you know, anti-lockdown protestors, COVID-19 conspiracists. I think people are waking up to the idea, to the reality, that this is coming from inside the house, not outside. I guess the one thing to emphasize is that because we have those vulnerabilities, they are easily exploited by external actors.

Fernando Bermejo

Ray’s point there is well taken. Disinformation can be homegrown, and sometimes the people creating it are groups of politically motivated users who gather on social media networks.

Emily Boardman Ndulue

Our next guest is Gianluca Stringhini, who studies disinformation campaigns. He has found that groups of users will often target specific social networks where they’re confident some content will go viral. The interesting part, though, is that these groups get together elsewhere on social media to organize these campaigns.

Fernando Bermejo

In other words, Gianluca doesn’t just study how disinformation goes viral, but how it leaps across platforms.

Gianluca Stringhini

I'm Gianluca Stringhini. I'm an assistant professor in the electrical and computer engineering department at Boston University. And broadly speaking, I do research on data-driven security. So my work is on analyzing large-scale data sets to figure out and mitigate all sorts of bad activity online, of which disinformation and misinformation are one example.

We always need to be careful, because coordination is not necessarily an indicator of bad activity happening. A lot of activism and political campaigning and whatnot may look quite similar to coordinated disinformation if you don't look at it closely enough, right? So we always need to keep that in mind.

So the big challenge of web content in general is that it's impossible for people to tell where something comes from. And they kind of need to take it at face value, right? So if I see a news story, just the text, imagine, for now, I kind of have to either believe it or not. I don't have much information. Maybe I need to base my judgment on my past experience, or my view of the world, or whatever. So, automatically identifying whether something is misinformation or not just based on content is very challenging, and honestly I'm not sure if it's even possible.

So in our work, we focus on taking a holistic view of the web and web communities, so we don't look at single online communities like Twitter, or Reddit, or Facebook. We look at how different communities discuss topics, organize activity, and basically influence each other. What we found in our work is that there are polarized online communities where a lot of this hateful activity is actually organized and orchestrated, as well as conspiracy theories.

Emily Boardman Ndulue

Gianluca gave us examples of a few of the more influential polarized online communities, and they’re ones you’ve probably heard about in the news: 4chan, 8chan, Parler, Gab, and subreddits like r/The_Donald.

Gianluca Stringhini

So there is this whole conspiratorial type of talk and toxic type of behavior and so on, which eventually trickles out to mainstream communities. And this can be people organizing to go and attack mainstream communities, to go and post toxic content and harass whatever YouTube posters, or Instagram stars, or Twitter personalities, or whatever. But it can also be them actually going and pushing these narratives, which somehow matured on these communities, out to the mainstream.

So if we look at a single community, and we base all our research on that single community, we miss a lot of context. You can think of it as, we look at Twitter, we look at how misinformation spreads on Twitter. That's kind of the tip of the iceberg. We see popular accounts on Twitter posting about all sorts of crazy conspiracy theories. But actually, they did not come up with them, right? These conspiracy theories were created and discussed and sort of evolved on other communities. And so by putting all these things together, we can actually get better context and say, "Yes, that specific politician posted this crazy conspiracy theory, but it actually comes from 4chan, or comes from r/The_Donald, or whatever."

Fernando Bermejo

Can you explain a little bit about that work, how you guys detected that, and what the signals were that pointed you to the fact that there was some coordination going on?

Gianluca Stringhini

So imagine that there is a YouTube video that suddenly starts receiving hate comments. So, if you're only looking at YouTube, that's it. We see a number of comments coming out of the blue, but it's difficult to figure out where they are coming from, what's the purpose behind them, and all of that. Instead, what we did was look at YouTube links being posted on some of these communities, like 4chan's Politically Incorrect board, with the goal of coordinating a harassment attack, a hate attack, and so on.

So what we would see is that there is a thread being posted with a link to this YouTube video, which typically starts with something like "You know what to do," or whatever. And then we would see these comments appearing, and you would see that there is basically a temporal correlation between the discussion on the 4chan thread and the comments appearing on YouTube. So what we did, as engineers, is we modeled these as signals. You can treat the timing of the comments being made on the 4chan thread, and the timing of the comments on the YouTube video, as signals, and you can look for coordination.

We have a lot of techniques that can tell you whether two signals are coordinated, synchronized, and all of that. This is interesting, because it allows us to quantify how synchronized YouTube comments are with a 4chan thread. And what we find is that the closer the synchronization, the higher the hate speech, basically. So we find that if the video is posted on 4chan for any reason other than calling for an attack, people may go and comment there, but the hate speech will be low. Whereas if the purpose actually is a coordinated attack, we see a spike in hate speech.
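
A stripped-down sketch of that signal idea, assuming two lists of comment times (minutes since the thread was created) and using a simple lagged correlation as a stand-in for the cross-correlation analysis the researchers describe; all numbers are invented:

# Toy sketch: treat comment activity on a 4chan thread and on a YouTube video as two
# time series (comments per minute) and measure how synchronized they are.
import numpy as np

thread_minutes = [0, 0, 1, 1, 2, 2, 2, 3, 5, 8]   # minutes at which thread replies arrived
video_minutes = [1, 1, 2, 2, 3, 3, 3, 4, 6, 9]    # minutes at which YouTube comments arrived

horizon = max(max(thread_minutes), max(video_minutes)) + 1
thread_series = np.bincount(thread_minutes, minlength=horizon)
video_series = np.bincount(video_minutes, minlength=horizon)

# YouTube comments usually trail the thread, so scan a few lags and keep the best match.
best_lag, best_corr = max(
    ((lag, np.corrcoef(thread_series[: horizon - lag], video_series[lag:])[0, 1])
     for lag in range(4)),
    key=lambda pair: pair[1],
)
print(f"strongest synchronization at a lag of {best_lag} min (correlation {best_corr:.2f})")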

Fernando Bermejo

But again, Gianluca isn’t necessarily interested in the content itself. Understanding how to detect coordination could be just as helpful for platforms trying to moderate disinformation.

Gianluca Stringhini

But this coordination measure basically allows us to identify these attacks without even looking at the hate speech. And this solves a number of problems, because language is difficult to model, and there are new slurs appearing all the time, which we may not be modeling, and all these kinds of things, and different communities have different jargon. So they might use different terms, and so on. Just by looking at this coordination, we can identify which videos are being attacked.

And then the next step there: once we identified these videos that had been attacked, we took a step back, looked at the videos, and tried to figure out what characteristics these videos have, which may tell us something about the type of video that actually gets attacked. Can we use this to develop tools that will tell YouTube, as soon as somebody uploads a video, you should be careful, because this video may attract certain types of attacks? So you may, whatever, give priority to it when doing moderation and things like that.

Fernando Bermejo

Gianluca finds hope in techniques like this, since it is much more difficult for an automated moderation system to detect disinformation when it’s embedded in a video or meme.

Gianluca Stringhini

Instead of reading a wall of text, can you convey the same information with an image, or a meme, or a short video, and so on? And that is happening, and we are seeing it happening in a number of contexts, including mis- and disinformation. And so the challenge there becomes, can we make sense of other types of media, and analyze them at scale and so on, and figure out how things are actually spreading online?

Emily Boardman Ndulue

Gianluca described to us an especially sophisticated and networked approach to coordinating viral campaigns. What you have to understand, though, is that there is no playbook for running one of these campaigns. Oftentimes, actors have to adapt to the ways platforms are changing.

Fernando Bermejo

That’s right. Social media companies try to suppress user accounts, URLs, and even types of posts often associated with disinformation. The people pushing that content therefore have to adapt.

Emily Boardman Ndulue

Our next guest, Fabio Giglietto, studies how actors respond to social media platforms’ attempts to crack down on coordinated behavior. In the past few years, his research has focused on COVID-19 mis- and disinformation in Italy, but his findings apply to a much wider variety of content too.

Fabio Giglietto

I am Fabio Giglietto, associate professor at the University of Urbino, and I have been working in the field of mis- and disinformation for the last few years.

When Facebook started using the term coordinated inauthentic behavior, they hit an interesting point. Maybe they knew it, maybe they were just lucky, but the point is that looking at the coordination between social media actors writing certain content is definitely one of the signals that you can use to spot mis- and disinformation.

And it is a useful kind of signal, not only because it is effective but also because it is completely different from the traditional signals used in this kind of field. So basically, most of the time we deal with analysis of content because you want to know if something is true or false, or something in the middle.

Fernando Bermejo

In 2020, Fabio and his colleagues examined what they call coordinated link sharing behavior – in which networks of public accounts on Facebook share the same news articles around the same time.
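
Here is a toy sketch of how coordinated link sharing can be surfaced from share data, assuming rows of (account, URL, timestamp in seconds); the 30-second window and the example records are illustrative assumptions, not the thresholds used in the published research:

# Toy sketch: flag pairs of accounts that repeatedly share the same URL within a short
# time window of each other. Data and the 30-second window are invented for illustration.
from collections import defaultdict
from itertools import combinations

shares = [
    ("page_a", "http://example-news.xyz/story1", 100),
    ("page_b", "http://example-news.xyz/story1", 112),
    ("page_c", "http://example-news.xyz/story1", 5000),
    ("page_a", "http://example-news.xyz/story2", 200),
    ("page_b", "http://example-news.xyz/story2", 215),
]

WINDOW = 30  # seconds
co_shares = defaultdict(int)

by_url = defaultdict(list)
for account, url, ts in shares:
    by_url[url].append((account, ts))

for url, posts in by_url.items():
    for (acc1, t1), (acc2, t2) in combinations(posts, 2):
        if acc1 != acc2 and abs(t1 - t2) <= WINDOW:
            co_shares[tuple(sorted((acc1, acc2)))] += 1

# Pairs that co-share more than once start to look like a coordinated network.
for pair, n in co_shares.items():
    if n >= 2:
        print(f"{pair[0]} and {pair[1]} shared the same link within {WINDOW}s on {n} occasions")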

Fabio Giglietto

So, for example, you deal with a list of actors well known for having spread disinformation before. In both cases, the analysis needed to reach the point where you can actually use this kind of approach requires a lot of work, because even analyzing a single piece of content requires a lot of work from people trained to fact-check things.

Emily Boardman Ndulue

The challenge with this research, though, is that oftentimes those fact-checkers, and researchers like Fabio, aren’t as fast as the bad actors.

Fabio Giglietto

It means that someone fact-checked certain content, and they end up finding what Facebook calls repeat offenders: users, websites, and pages which repeatedly publish false information. The problem is that this kind of list tends to get outdated very soon. At least, this is what comes out of our experience when we look at the history of one specific network.

You can clearly see that the same network of pages changes the domain it shares multiple times over the years. What happens is that when a domain becomes too well known for spreading problematic content, they simply drop that domain and move to some other domain.

They retain the power of the infrastructure they built on social media, for example a network of Facebook Pages, and they start sharing a different domain. So if you design a study based on a list of domains built two years ago, it is highly possible that you are basically observing something which is not operative anymore, and there is a risk of underestimating the phenomenon.
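
A small sketch of what that domain rotation looks like in data, assuming you have share records of (page, domain, month) for a known network of pages; the records are invented:

# Toy sketch: for a known network of pages, list which domains they push each month.
# A domain disappearing while the same pages keep posting is the rotation Fabio describes.
from collections import defaultdict

network_shares = [
    ("page_a", "fakenews-one.example", "2019-01"),
    ("page_b", "fakenews-one.example", "2019-01"),
    ("page_a", "fakenews-one.example", "2019-06"),
    ("page_a", "fakenews-two.example", "2020-02"),
    ("page_b", "fakenews-two.example", "2020-02"),
]

domains_by_month = defaultdict(set)
for _page, domain, month in network_shares:
    domains_by_month[month].add(domain)

for month in sorted(domains_by_month):
    print(month, "->", ", ".join(sorted(domains_by_month[month])))
# A study keyed to the 2019 domain list would miss everything this network does in 2020.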

Emily Boardman Ndulue

Fabio and his team also have to contend with the challenge of actors changing their techniques to make disinformation go viral, often adapting to new constraints imposed by platforms.

Fabio Giglietto

What these people are doing now is avoiding the post being recognized as containing a link, partly because they probably want to avoid certain controls, but also because the alternative to posting a link, as a link-type post, is to post a photo and add the link in the description of the photo.

And we all know that photos and images, generally speaking, tend to perform better within Facebook, so they get better performance for those posts. And the very latest evolution of this technique is to post the link in a comment on the post. So, nothing more in the description: they just post the photo, which usually includes a clickbait title and an emoji pointing down, so as to say, “Look at the link in the comment.” And then they immediately, automatically of course, publish the link as the first comment on the post.
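
As an illustration of how that pattern could be checked for, here is a sketch that flags photo posts where the page itself drops a URL in the first comment; the post structure and field names are invented for the example:

# Toy sketch: flag photo posts where the page itself puts a URL in the first comment,
# a pattern used to dodge link-based checks. Post structure here is invented.
import re

posts = [
    {
        "page": "page_a",
        "type": "photo",
        "message": "You won't BELIEVE what they found 👇",
        "first_comment": {"author": "page_a", "text": "Full story: http://example-clickbait.xyz/story"},
    },
    {
        "page": "page_b",
        "type": "photo",
        "message": "Sunset from the office tonight",
        "first_comment": {"author": "random_user", "text": "Beautiful!"},
    },
]

URL_RE = re.compile(r"https?://\S+")

for post in posts:
    comment = post["first_comment"]
    if (
        post["type"] == "photo"
        and comment["author"] == post["page"]   # the page comments on its own post
        and URL_RE.search(comment["text"])      # and that comment carries the link
    ):
        print(f"{post['page']}: link-in-first-comment pattern detected")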

Fernando Bermejo

It sounds like coordinated campaigns are constantly adapting to social media platforms. The actors behind them are always on the lookout for new ways to exploit these systems.

Emily Boardman Ndulue

Exactly. It’s why studying disinformation is by no means a simple task. Next episode we’ll zoom out a little further, trying to understand how these platforms are manipulated by taking a look at the viral content itself – and how that content is often tailored for specific platforms.

Fernando Bermejo

Until next time, thanks for joining us on Viral Networks.

Credits:

Viral Networks is a production of Media Ecosystems Analysis Group. We’re your hosts Emily Boardman Ndulue and Fernando Bermejo. All episodes are produced and edited by Mike Sugarman. Julia Hong joined us as a script writer and provided additional research. Music on this show was composed by Nil and our producer Mike. Funding to produce this series was provided by the Bill and Melinda Gates Foundation. And last but certainly not least, we want to give a big thank you to all of the experts who joined us for interviews on this show.