Episode 4: What You Think Is What You Find
We like to think of ourselves as savvy searchers, but the truth is that most of us have no idea how search engines work—especially given how much we rely on them. For example, do you know whether different people get personalized results for the same searches? What are data voids, and what do they mean for how we assess information we find online?
Search isn’t magic, and in this episode, host Francesca Tripodi discusses the ins and outs of how search engine algorithms work, how media manipulators game the results, and how our own perceptions and biases shape our results before we even open the search bar.
About our experts
Host: Francesca Tripodi
Francesca Tripodi is a sociologist and media scholar whose research examines the relationship between social media, political partisanship, and democratic participation, revealing how Google and Wikipedia are manipulated for political gains. She is an Assistant Professor at the UNC School of Information and Library Science (SILS) and an affiliate at the Data & Society Research Institute. She holds a PhD and MA in sociology from the University of Virginia, as well as an MA in communication, culture, and technology from Georgetown University. In 2019, Tripodi testified before the U.S. Senate Judiciary Committee on her research, explaining how search processes are gamed to maximize exposure and drive ideologically based queries. This research is the basis of her book, which is under contract with Yale University Press. She also studies patterns of gender inequality on Wikipedia, shedding light on how knowledge is contested in the 21st century. Her research has been covered by The Washington Post, The New York Times, The New Yorker, The Columbia Journalism Review, Wired, The Guardian, and the Nieman Journalism Lab.
Guest: Clement Wolf
Clement Wolf is Google’s global public policy senior manager for information integrity. He helps develop policies, products, and initiatives across Google and YouTube to address misinformation; combat influence operations; and develop further dialogue with experts across civil society, academia, and government. Prior to this position, he advised Google’s Search and News teams on product and policy development and worked in communications at Google France. He was also a 2020 and 2021 Assembly fellow with Harvard’s Berkman Klein Center for Internet and Society.
Guest: Danny Sullivan
Danny Sullivan is Google’s Public Liaison for Search. His role is to help the public better understand how Google Search works and to engage with the outside community to hear feedback on how search can be improved. Sullivan joined Google in 2017, after retiring from journalism. He began his career in 1989 with daily newspapers, first with the Los Angeles Times and then the Orange County Register. In 1996, he started his own publication, Search Engine Watch, and a decade later launched Search Engine Land, which both focused on covering search engines and the search marketing space.
Guest: Eszter Hargittai
Eszter Hargittai is a Professor in the Department of Communication and Media Research at the University of Zurich, where she holds the Chair in Internet Use & Society and heads the Web Use Project research group. Hargittai’s research focuses on the social and policy implications of digital media with a particular interest in how differences in people’s Internet skills/digital literacy influence what they do online. Hargittai is editor of Research Exposed: How Empirical Social Science Gets Done in the Digital Age (Columbia University Press, 2021), co-editor (with Christian Sandvig) of Digital Research Confidential: The Secrets of Studying Behavior Online (The MIT Press, 2015), and editor of Research Confidential: Solutions to Problems Most Social Scientists Pretend They Never Have (University of Michigan Press 2009).
Guest: Jennifer Mercieca
Dr. Jennifer Mercieca is a historian of American political rhetoric and a Professor in the Department of Communication at Texas A&M University. She writes about American political discourse, especially as it relates to citizenship, democracy, and the presidency. Jennifer is the author of three books about political rhetoric: Founding Fictions; The Rhetoric of Heroic Expectations: Establishing the Obama Presidency; and Demagogue for President: The Rhetorical Genius of Donald Trump, which has been highly recommended by Politico; called a "must read" by Salon; named one of the "best books of summer" and the "most anticipated books of 2020" by LitHub; and called "one of the most important political books of this perilous summer" by The Washington Post.
CITAP panelist: Alice Marwick, Associate Professor – UNC Chapel Hill Department of Communication, CITAP Principal Investigator
CITAP panelist: Will Partin, PhD candidate at UNC – Chapel Hill, CITAP Affiliate
In this episode, we talked about or referred to:
- Searching for Alternative Facts: Analyzing Scriptural Inference in Conservative News Practices. Data & Society. Francesca Tripodi
- Demagogue for President: The Rhetorical Genius of Donald Trump. Texas A&M University Press. Jennifer Mercieca.
- “The basics of how Search works.” Google.
- Strangers in Their Own Land: Anger and Mourning on the American Right. The New Press. Arlie Russell Hochschild.
- Algorithms of Oppression: How Search Engines Reinforce Racism. NYU Press. Safiya Noble.
- How wide a Web? Inequalities in accessing information online. Princeton University (dissertation). Eszter Hargittai.
- “Searching for a “Plan B”: Young Adults’ Strategies for Finding Information about Emergency Contraception Online.” Policy & Internet. Eszter Hargittai, Heather Young.
- Data Voids: Where Missing Data Can Easily Be Exploited. Data & Society. Michael Golebiewski, danah boyd.
- “Stop the Presses? Moving From Strategic Silence to Strategic Amplification in a Networked Media Ecosystem.” American Behavioral Scientist. Joan Donovan, danah boyd.
- “How to Verify Online Census Media.” Data & Society.
Francesca Tripodi 0:00
We have to realize, it’s like search engines aren’t there for intellectual discovery or questioning, or existential beliefs. They’re literally programmed to best match the keywords we put into them, as well as the click-through traffic of thousands, if not millions, of other people in your geographic location.
Kathryn Peters 00:28
Welcome to Does Not Compute, a podcast about technology, people and power from the Center for Information, Technology, and Public Life at the University of North Carolina. Our host this week is Dr. Francesca Tripodi, an assistant professor in the School of Information and Library Science and a Senior Faculty Researcher here at CITAP. She’s exploring how we understand and misunderstand search engines, and how what we search for determines what we find. Our guests today are Dr. Eszter Hargittai, Danny Sullivan and Clement Wolf of Google, and Dr. Jennifer Mercieca. We’ll also talk to CITAP researchers Alice Marwick and Will Partin.
Many of us have probably heard about filter bubbles—those ominous silos that close us off from seeing information that tech companies think we’ll disagree with. Many people worry they’re killing democracy, but what if we have the power to pop them? Today, we’re going to peel back the layers of algorithmic ordering and consider the way in which our keywords are coated with bias before they even hit the search bar. It starts by understanding how search engines work, because most of us really don’t know what we’re doing when we head to Google or DuckDuckGo or Bing.
Jennifer Mercieca 01:42
One of the things that surprises my students every semester is how naive they are. These are people who have grown up online and they know not to trust anything. They’ve been told, “Don’t go to Wikipedia,” or whatever. They know all that stuff, the basic stuff that people have told them. What they don’t really realize is that like the info wars are ongoing every minute of the day, and every time you do a search, every time you hop online, every time you get on Facebook, every time you do anything on the internet, you are participating in the info wars. Most people, I think, are really naive to that. They don’t know that the way that search works is gamed.
That’s Jennifer Mercieca, she’s a professor of communication at Texas A&M and studies the way Trump used rhetorical strategies to gain followers. We talked about her research, but also about how little we understand the information systems we’ve become so reliant on. It got me thinking, if college kids don’t get how search works, what’s the chance people with less formal education, or adults of legal drinking age, get what’s going on? The word algorithm might seem complicated, but it’s not magic. It can be understood as a set of instructions given to a computer.
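To make that plainer, here is a toy example of an algorithm in exactly that sense: a fixed set of instructions a computer follows step by step. This is an illustrative sketch, not anything a search engine actually runs.

```python
# An "algorithm" in the plainest sense: a fixed set of instructions a
# computer follows step by step. This toy example finds the largest
# number in a list; search ranking applies the same idea at vastly
# greater scale and complexity.
def largest(numbers):
    best = numbers[0]          # start with the first number
    for n in numbers[1:]:      # look at each remaining number
        if n > best:           # keep whichever is bigger
            best = n
    return best

print(largest([3, 41, 7]))  # prints 41
```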
Since Google’s algorithms are proprietary, we can’t definitively know how the search engine models the internet to match users with relevant information, but we can interview people who do. I reached out to Google directly and I had the chance to sit down with Danny Sullivan, the Public Liaison for Search, and Clement Wolf, the Global Public Policy Lead for Information Integrity at Google. I wanted to know, how does Google work really?
Danny Sullivan 3:36
We have a million people who are constantly looking at all the questions that come in and answering them in real time.
Just kidding. It’s not that weird or that reliant on human beings. That was just Danny making a joke. In reality, it is pretty simple. First, the company makes copies of webpages and stores them in an index, kind of like a big book of the internet. Then they narrow down the results to try and facilitate searches that are relevant and helpful.
Clement Wolf 4:06
As a search engine, you try to understand what people are doing, how the web evolves, how they create content, and try to then figure out what’s the most useful content for users in response to their searches.
This is where the whole idea of a filter bubble comes in. The company says it ranks information based on relevance, but through this matching process, it also controls what we see and what we don’t. The idea that Google might be hiding information from us is a huge matter of contention. Many people also think that if two people sitting next to each other search the same keyword, the company will return very different results, tailored to their perceived interests.
Yes, this is a big myth that the search results are personalized or tailored to individual people. They really are not. When you do a search, it’s very rare that the actual search results ranking and what appears will change. It’s extremely rare that it would change based on who you are.
That’s Danny Sullivan again, being serious this time. He says that our returns are much more likely to be driven by localization than personalization.
What tends to happen is that results can be localized. Someone who is in Des Moines, and someone who is in Los Angeles, both search for pizza and they get different results, or they both search for earthquakes and they get different results. Some of them will still have some commonality, but some of them will be different because the location makes sense. If I’m in California and looking for earthquake information, what resources I need tend to be much different than someone who’s in Des Moines. It’s just the same thing if I want to order a pizza.
The only thing I would add is that sometimes, I believe, it is the case that your search may, depending on how it’s routed, hit a different data center, which might have a slightly older version of the cache, which might also explain some minor discrepancy when that happens. That’s very infrequent. All of this is to say, when you’re seeing a discrepancy with someone you’re side by side with, localization is more likely to be the explanation than anything else.
According to Google, it’s not giving conservatives and progressives different sets of information as some senators might have you believe. Now, that’s not to say our political leanings don’t keep us from seeing information that we might disagree with. To think more about why our starting points are so important, I spoke with Eszter Hargittai, professor and chair of Internet Use & Society in the Department of Communication and Media Research at the University of Zurich. She summed it up nicely.
Eszter Hargittai 6:45
What you think is what you get. I actually arrived at this topic through studying how people search for information. About two decades ago, when I was working on my dissertation, I was interested in seeing how people find content on a web that looked very different, but already had plenty of information. It wasn’t clear that different people would find the same information.
I put together a study where I came up with tasks for people to solve, basically information for people to seek out online, to see if they could complete the different types of tasks. Some of these had to do with how they would do their taxes, getting information about that, or finding information about emergency contraception, all sorts of questions. Some of these were more neutral than others. I think this actually dovetails well with what you work on, Francesca, which is really fascinating and important work about how someone’s political ideology influences what they search for.
What Eszter is getting at here is that how we see the world and the phrases we use to describe our positions, what I refer to as ideological dialects, shape how, what, and when we search. This is especially true when we seek confirmation and connection to what Arlie Hochschild calls our “deep stories,” the ones we’ve heard and told ourselves for so long, they already feel like truth.
Take, for example, immigration in the United States. Does your deep story connect immigration to the American dream? Is it one that welcomes those foreign born and advocates for their rights, or is yours a story that feels threatened by the arrival of foreigners and is concerned about what they will do to your vision of the United States? These worldviews shape the kind of information returned.
For example, if someone seeks out information about illegal aliens and voter fraud, you’re going to see very different returns than searching for something like, “let immigrants vote.” More importantly, these kinds of silos can’t be fixed by getting off Google. Keywords shape returns regardless of your search engine of choice, but really, what’s all the harm in this siloed thinking? Well, keywords can really lead you into some dark places on the internet, especially if they’re connected to dangerous conspiracy theories.
Joining me today is Eszter Hargittai. I wanted to talk with you about search engines and the important role of inputs. We think a lot about filter bubbles, but not too many of us consider the role we play in shaping these silos. You have done a lot of work regarding internet access and internet skills. I was wondering if you could just talk a little bit about your work. Your article about how people search for information regarding emergency contraception I thought was so informative. Would you mind talking just a little bit about that specifically? Sometimes what you see– well not what you see is what you get, but I don’t know, what’s a better way of saying it?
What you think is what you get? I have pulled up the paper– so it was published in 2012, but the data come from 2007 and 2008, so it’s really quite a while ago, and these were also college students, in some ways more educated than many other people out there.
Basically, what you were saying is, or what I got from this article, was that if people already knew about the morning-after pill or about emergency contraception, they would Google things like, “Where do I go to get emergency contraception?”, and I’m sorry I said Google, but search for things. What you found, though, is that sometimes people took the prompt directly and would just do very general searches for “prevent pregnancy,” but that phrase was not really what they needed.
Exactly. Again, it goes really well with your work about how important those search terms are and how much that could be coming from a political perspective.
Some of the people sought out words that said adoption. They weren’t even in a mindset of abortion being a thing that you would look for. I would notice even in your article, just the way that you see the world really shapes these kinds of returns. Then in turn, the way you see the world, how you value that information, or how you framed your query, really does drive the kinds of returns that you receive.
Yes, absolutely. It’s also worth mentioning that it’s actually quite something that this came through in this dataset, which, just to clarify, is over 200 young adults, but they were all at an urban university, so they’re probably nowhere near as diverse on the political spectrum as you could have in certain other areas of the country.
Also, something I think has really stood the test of time is this obsession with the rank of results, that people really value the top information as somehow more accurate, more relevant, better information. You talk about this in much of your work. Why do you think people think that top return is so much better or more factual?
As you see in your work as well, as we see in this paper, people do trust Google in particular. In this case, it really is that brand in particular that they’ve trusted for a long time. Presumably, that’s partly because they have been happy with the results. There was a time in there, and I think I may have done studies at the time and would have screenshots, when some websites were doing a pretty “good” job, good in quotes, manipulating Google search results and things were ending up on top that were actually definitely not the best.
A really important example of this, which is no longer the case but was the case for way too long, was that when you searched for information about Martin Luther King, Jr., you’d get this website called martinlutherking.org, which was sponsored by a white supremacist organization. Actually, many people would say, “Oh, .org, it must be reliable.” Just to bring that back to this paper, the condom breaking situation, there too, we found some .org websites that seemed like they were trying to help people. They would come up when you would search for, say, the morning-after pill, but they were actually pointing you to adoption or something.
It was called morningafterpill.org. I think what’s interesting is a lot of researchers right now are thinking about this gaming of SEO, but it’s been around for so many years. I think we aren’t really skeptical of digital-first content and how it might also be manipulated in those same ways: tagged with key content, tricking search engine optimization so that it rises to the top, by people who really understand the way others think about information. I think it’s extremely strategic that this pro-life organization had purchased the domain name morningafterpill.org, knowing how the internet works and how information seeking works.
One of the things that we haven’t touched upon explicitly is the focus I’ve had on socioeconomic differences, and how educational background and other background factors, in addition to political ideology, which is your focus, influences what people search for, how they search for things, and then how they interpret what they find.
One of my favorite things that you talk about is that it’s not just access, it’s how people assess information. It’s not just accessing information; it’s how are you assessing information? The second-level digital divide, which I still think is such a fantastic way of thinking about digital inequality, captures that there are so many levels of skill, and that framing it simply as, “Well, do you have access to the internet or not?”, just so far misses the point of how people interact and engage with content online. This has really been delightful. Thank you so much for your time. It’s a pleasure.
Oh, it’s been so fun. Thank you so much.
Joining us today is Danny Sullivan, the public liaison for Search and Clement Wolf, the Global Public Policy Lead for Information Integrity, both at Google. Thank you so much for joining us today. One thing that I noticed in my research and many other information seeking studies is that people really don’t seem to know how search works. Might it be feasible for either of you to break it down for someone who might not have a computer science background, what are we really doing when we go to Google?
In its basic form, we go out and look for content across the web. We visit trillions of webpages. It’s not enough just to go to a webpage once; you have to go back to it, say, each day or every week, because pages change a lot. We basically read the content that’s on the pages. We store the content of what we read, web pages and, to the degree that we can, images and things that we understand from videos, in what you can consider to be a big book of the internet. We call it an index.
Then lastly, we can’t just give you 100,000 or 100 million pages that all match the words that you entered, so we have to apply some ranking systems to it. That comes into where people sometimes hear about the algorithm. It’s really where the search engine gets its name from: we search through this index, and then we try to find the pages that we think are most relevant according to our systems, which determine things like, “Did this page use the words you were looking for?” “Did it use the words higher on the page rather than lower?” “Did it use them a little more frequently than other pages, but not so frequently that you’d think someone was just trying to game the situation?”
Also, what can we tell about the page to determine if other pages or other sites seem to think that it’s very relevant or very helpful information?
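Danny’s crawl-then-index-then-rank description can be sketched as a toy program. To be clear, this is a hypothetical illustration, not Google’s actual system: real ranking draws on hundreds of signals, including the link-based authority he mentions, and the scoring rules below are invented solely to mirror the intuitions in his answer (word matching, position on the page, and diminishing returns for repetition to blunt keyword stuffing).

```python
# Toy sketch of a search engine: build an index ("big book of the
# internet"), then rank matching pages. Illustrative only.
from collections import defaultdict

def build_index(pages):
    """Map each word to the set of pages containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def rank(query, pages, index):
    """Rank candidate pages: earlier mentions score higher, and
    repeated words give diminishing returns (anti-keyword-stuffing)."""
    words = query.lower().split()
    candidates = set().union(*(index.get(w, set()) for w in words))
    scores = {}
    for url in candidates:
        tokens = pages[url].lower().split()
        score = 0.0
        for w in words:
            count = tokens.count(w)
            if count:
                score += min(count, 2)              # diminishing returns
                score += 1.0 / (1 + tokens.index(w))  # earlier is better
        scores[url] = score
    return sorted(candidates, key=scores.get, reverse=True)

pages = {
    "a.com": "pizza pizza pizza pizza pizza",      # keyword stuffing
    "b.com": "best pizza in des moines and where to order it",
}
index = build_index(pages)
print(rank("pizza des moines", pages, index))  # ['b.com', 'a.com']
```

Note how the stuffed page loses: capping repeated-word credit means matching all three query terms beats repeating one term five times, a crude stand-in for the anti-gaming checks Danny alludes to.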
Francesca Tripodi 17:53
This idea of gaming the system. Can you talk to us a little bit about what that means?
From the early days of Google, it’s been the case that as you design a search engine, people want to come up on top of searches that relate to their business, or their activity, or whatever interest they have. Some do this in the best way possible, by trying to create great content and make sure it’s easily readable for search engines. Some of them don’t want to do this the right way, and are very keen to deceive search engines: to understand exactly how search engines operate, what kind of criteria they look at, what kind of mechanisms they use, and then optimize for deceiving those mechanisms into thinking that a page is extremely relevant, extremely authoritative, extremely high-quality on topics where it may not be.
I’m trying to contextualize this idea of a filter bubble, and people are always like, “Oh, the algorithm’s not allowing me to see what I want to see.” Almost everybody I talk to is like, “Yes, but your search return is totally different from mine.” I feel like in some ways, that might be true, but also in some ways, that’s a gross misunderstanding of what Google’s doing.
There’s been this assumption that that’s the case, and everyone’s in filter bubbles and so, “You better use incognito mode so you can get away from it or whatever.” It’s like, you really don’t have to. While the localization will happen, it is the same localization for everybody in that area. Everybody in Los Angeles is not getting personalized results; you would all see the same localization. That’s why it’s not personal, because it’s not personal to you, it’s local to the area that you’re in.
One of my central arguments in thinking about filter bubbles is thinking a little bit less about the role algorithms play and a little bit more about the role the human plays and the context plays. We’ve touched on this a little bit, but I would love for one of you to walk through the importance of the query, and why your search term matters.
The core of what we do still tends to be word matching. You enter words and we go out and we try to find content that is related to those words. We’ve gotten smarter over the years. One of the examples we use, if you type in like, “volleyball island movie”, we’ll figure out you meant Cast Away, even though you didn’t use the word cast away.
When someone does a search, one thing that’s really helpful for us is if they provide as much information as they can, however they think about it, that may help us. The difficulty we can run into is that sometimes people will search for things, and there’s just not a lot of content that’s out there, so we might return it, or we used to return it.
We match the query you were looking for, but the information might not have been as helpful. It might not even have been accurate. Part of what our systems try to do is figure out, “Wow, you’re searching for something, but we’re getting enough signals that make us think, even though you’re looking for these specific words, the content that may be coming might not be as helpful or relevant.”
In addition to there sometimes just not being very much information, there also seems to be a process where content creators recognize that and create a whole bunch of content, so the automated system might not recognize those signals as bouncing around to nothing, but instead might return a ton of information that might not necessarily hold the highest level of information integrity. Have you run up against this, or how would you talk about this?
Sometimes it is the case that there is a deliberate strategy to crowd out a space using terms or content that echo searches you expect to be popular, and try to supercharge them. First, I think, over the years, we’ve gotten better at anticipating this kind of content, writ large. There was a time when Google bombs were really a thing. You hear much less about those these days, and there’s a reason for that: the ranking team has been hard at work.
There’s another example of a context in which that happened, which is the case of a change I believe we made two or three years ago now, to the way we display results in response to searches that arise after violent events. The long and short of it is we took note that when such events occur, it’s typically the case that bad actors are faster to the keyboard than good actors.
That’s a known problem. The good news is that because it’s something that is definable, it’s something that our systems can try to understand it and address. Now, in response to such events, our systems try to emphasize authority, sometimes possibly at the expense of freshness, to make sure that we don’t inadvertently elevate random pages casting out someone’s name, absent more robust information in this space.
Obviously, nothing is foolproof; no ranking system is perfect. It gets a bit harder with terms that are those of a niche community, because, one, they usually move fast, and second, because the function of a search engine is to give people what they’re searching for. That’s something we never, ever lose track of. Ultimately, you, the user, are in control of the results you see, and that’s good. That’s what we want to build and want to preserve.
As much as we can try to elevate authoritative information in response to searches, and, when we have it, use things like, “Oh, maybe the word you typed here is a synonym for a word we have better content for,” or whatever it is, ultimately, if what you tell us is that what you really want to see is this piece of content, or a set of pieces of content that our systems tell us are not trustworthy or authoritative, well, that’s your prerogative. We should absolutely show them to you, in the sense that this is what the search engine is for. I think that’s the balance we’re trying to reconcile.
When there are these diametrically opposed search terms that are really driven by the way you see the world or what I talk about, like the deep stories that animate our search. Deep stories being the whole notion of truths that you already know or you think you know. One thing I think a lot about is ideologically opposed language, pro-life, pro-choice, or there’s the language about this with gun reform, and estate tax, there’s so many things. What role, if any, do you think a company like Google has, in this idea of parallel internets forming around these very specific keywords or very just ideologically driven keywords?
Even though we’re doing a lot of work in these spaces, it’s not the case that we have this fully figured out. As the ranking team would tell you, the amount of work that still goes into improving search is staggering and often misunderstood. People tend to think of search as a solved problem; it is not, and it won’t be for a very long time. That’s not because these are such hard questions in and of themselves, but because the nature of the web, the ways people create content, and the ways people find information or like to ask for information keep evolving. I’m pretty sure that if we have this conversation again a year or two from now, we’ll have more things to say on both sides.
Awesome. All right, Danny, Clement, thank you so much for being here today. It was really fantastic to talk with both of you. Thank you so much for your time.
We have with us today Jennifer Mercieca, who is a professor of communication at Texas A&M. I wanted to thank you so much for taking the time to be with us. She has written one of the most fascinating books I’ve read in a long time, Demagogue for President. Something that I see a lot in the conservative media that I study is not only drawing on the rhetorical strategies that you outline in your book, but also this idea of don’t trust us, go out there and do it for yourself.
Something that, in my own research, I found that conservatives leverage a form of media literacy that I call Scriptural inference, which is this compare and contrast of textual documents in order to find the truth, and seek out the truth. Even though it’s linked to very Protestant biblical practices, it’s really trickled down into most of conservatism, constitutional conservatism. I think it’s really interesting and you hear Trump often evoke the same kind of idea. Don’t just trust me, go out there and do your own research.
I write about him as a connoisseur and a purveyor of conspiracy. He loves the stuff. He’s constantly recirculating it, amplifying it when it serves his needs. He didn’t ever urge his followers to do their own research, instead he positioned himself again, as this authoritarian truth teller, and always within the conspiracist frame. It was always like, “They don’t want you to know this. No one’s going to tell you this but me. Here I am telling you the truth and you’re not going to believe this.” He always positioned himself as the one who had done the research, and was now going to convey these truths to his followers. I didn’t actually hear him say like, “Look it up.”
I guess what I was trying to say is that some of the politicians who were involved in the Stop the Steal narrative, for example, had been able to maintain plausible deniability, because they aren’t saying this election was stolen. Some of them were, but they were also saying the facts just aren’t out, and we need to do more research on this stolen election.
Something that I think is really fascinating is most of the way people verify information is via Google. The way you see the world is really going to shape how you shape your query. If I see this as an election fraud, that’s going to very much shape the kinds of keywords that I put into it. I would love to hear your thoughts.
You’re right. The way that issues are framed, whether it’s by politicians or whether it’s by other kinds of institutions, media, advocacy groups, people who are trying to game the internet itself, people fighting the info wars, all of that is very strategic and it’s done very purposefully. Of course, you hear an interesting phrase that somebody has probably engineered to be interesting and pique your curiosity. Of course, you’re going to throw it into the Google and see what comes up. Then, of course, you’re going to find the content that they have asked Google to show you. They’re really, really good at it.
I would love to talk with you a little bit more about narrative laundering. It’s a new concept for me, and I’d like to think more about it, if you don’t mind talking about it.
My next project is about the way that propaganda works today and how different it is from persuasion. I’m somebody from a communication department, so we are always talking about effective and ethical communication practices in the things that we study and the ways that we try to understand communication. Propaganda, at least the way I understand it, is persuasion without consent. Narrative laundering seems to be a more recent iteration of the same idea, which is to say, like money laundering, you put in dirty narratives, you put in dirty information, and you cycle it through multiple sources, and it comes out clean, it comes out looking innocent.
Obviously, with 2016, there are tons and tons of examples of Russian propaganda ending up being a part of the American information space, and of the way those narratives were laundered into mainstream political discourse via WikiLeaks and things like that. In my book, I tell a couple of stories. Alex Jones says something on Infowars, and then Donald Trump starts to say it almost immediately. It happened so many times that Alex Jones actually did a segment of all the times that Donald Trump used his talking points, to claim how influential he was in Donald Trump’s rise.
Another example is, and you brought this up earlier, the Stop the Steal movement, and that’s a great one, because you have Roger Stone and Alex Jones trying to get Trump to launder their narrative. They have these special bulletins and broadcasts where they’re like, “Alert Donald Trump,” like they’re speaking to him, and they’re like, “You need to start saying that Hillary Clinton is going to steal this election. You need to say that if there are these irregularities, we’re going to protest,” and he, of course, does that.
I don’t know that your average Republican or Fox News watcher in 2016 would have taken Infowars seriously. I don’t know that they would be like, “Oh, Alex Jones, he’s my guy. I believe him to be a credible source of news and information.” But by the time it gets repurposed and recirculated through Donald Trump and his Twitter, and all of his ads and such, it looks very different.
Where I see this connecting to search engine optimization is when you seek out more information on these concepts. Where I first came into thinking about this was Safiya Noble’s Algorithms of Oppression, where she talks about how Dylann Roof searched for “Black on white crime” and then became connected to the Council of Conservative Citizens. To me, the idea of squatting on, or holding, preemptively understood and known phrases, that’s different than just the way the internet works.
I’ve been reading about white nationalists and how they create a propaganda playbook for their followers. There’s a thing called Swarm Front. It’s like Stormfront, but swarm front. They have an entire propaganda manual where they tell their stormers, or swarmers, or whatever, which words to use, which words not to use, how to engage in these kinds of debates, ways to create fake accounts and have arguments with yourself in comments so that people jump in. This is what I was trying to say: most of us are so naive as we stumble through our online reality, where we just accept that it is what it is. A lot of times, it is what it isn’t. I think that the people who are really savvy at gaming all of it, and making it appear to be something that it’s not, are perverting our public sphere.
Yes. There are systems that are retooling the information, and yes, it’s being manipulated by other people, but also, we play a role in our information environment. I remember you saying something about being complicit, and I thought that was really powerful.
A lot of us are very naive about how the internet works on the one hand, why it works the way it works, and then also that there are people who understand how it works and are trying to game it constantly. I think that a lot of us are naive about that, but even for those of us who know some of this, you have to wonder at some point like are we complicit in the fact that the propaganda circulates? Are we complicit in the way that search works for us?
This episode thinks more specifically about the role users play in shaping these “filter bubbles” or silos and how keywords play an important part of that process. Thinking a little more about the role of inputs, Will and I have talked about how these processes are exploited by conspiracy theorists, and explicitly about the way QAnon believers seek out information, and how these starting points can lead them into algorithmic rabbit holes where their beliefs are ultimately just confirmed. This notion of searching for alternative facts, or starting with a point that you think you know the answer to, is not necessarily problematic but can take you into some dark corners of the internet.
Alice Marwick 35:21
When people think about conspiracy theorists, they often assume they’re gullible, or they’re dumb, or they don’t do their due diligence, that they believe anything that’s given to them. When you drill down into these kinds of conspiratorial communities, you find the opposite: there are a great many people providing what they see as facts and evidence. There are these enormous databases that QAnon adherents construct that are full of information. They create these books of proofs that in some cases are like 200 or 300-page PDFs, all annotated with diagrams and snippets from Wikipedia or news stories.
If you go and search for some of these QAnon terms, you’re going to bring up these big treasure troves of QAnon-related information. It’s going to lead you down this path where you can see the evidence being put together, and you can see it being constructed, and that’s supposed to make it more legible or more credible to these believers than something coming from an elite source or from the top down.
In 2018, Michael Golebiewski and danah boyd coined this term “data void,” which is basically when there aren’t search engine results for something, and that’s when manipulators, or conspiracy theorists, or whoever can rush in there and populate that content. For example, in QAnon, there’s this belief that elites are murdering children, and harvesting them for something called adrenochrome, which is like, I don’t know, some kind of chemical in the pituitary gland and that somehow makes people immortal or something.
This is how ridiculous these theories are, but the problem is if you Google for adrenochrome, you mostly get QAnon related stuff that talks about this, because nobody is putting up pages saying, “Hey, adrenochrome is just this naturally occurring chemical, and people don’t ingest it, and it’s not being consumed by elites, and it doesn’t do anything.” Because why would you create that information? Until someone actually gets around to doing that debunking, you have these sorts of holes and you certainly have groups and communities that are very eager to fill them.
I actually think about the tension with the strategic silence that danah boyd, Joan Donovan, and also Whitney Phillips have talked about: the idea that, in order to make sure you don’t amplify hate, a very distinct journalistic practice has been passed down where you don’t talk about these things, almost for exactly the reason you were saying. Because why would you give credibility and weight to conspiracies that are not legitimate? Something I think about is that there is this tension, because by not reporting on it, or by not providing any counter-information about it, you effectively leave that void to the manipulators.
Will Partin 38:29
I can talk to some of the work we did on combating communication threats against the 2020 census, which might seem a little distant, but actually, I think, overlaps here quite a bit. One of the things we tried to do was plug data voids by creating alternative search results around highly charged terms, but in ways that didn’t rely on amplification or putting it on the front page of The Times. I think you’re right to point out that there is a really interesting tension. These data voids can clearly be exploited.
I think this is also an opportunity for working directly with platforms because it provides an opportunity to actually indicate these are the things we are concerned about. These are the kinds of things where you might make interventions before this becomes a major public issue. As part of a lot of that work, we did frequently have meetings with platforms where we spoke with them about, “Here’s what we’re seeing. Here’s what we think could be concerning. Here’s a potential narrative that might emerge.”
I think there are two distinct problems here. The first is people who are really embedded in communities that have epistemologies that don’t map to the reality-based community or what’s being covered in the press, or what the government is talking about. The second is exposing new people to these ideas or getting new people involved in these communities. I think the second is easier to solve than the first. I think the second is about making these things less discoverable, plugging the data voids, things like that.
With the first, a lot of people are enmeshed in these communities because they want to seek out people who share their views, their political views, or their views of the world. They are often people who have a very distinct political identity that’s very much part of their personal self-concept. By denying, for example, that Trump really won the election, you’re threatening their very self-concept and also their sense of community. Because people in these communities don’t just talk about the conspiracy theory, they talk about their lives.
In terms of how these expectations shape how searching for information can lead us in directions that we might not expect, is there more you can say on that?
I’m very interested not just in how social platforms are hosts to discourse, but also in the ways they actively shape it through their design. There was a really interesting case called the Home Affairs hoax, where in South Africa’s census a couple of years ago, two people impersonated the actual door-to-door census workers, the enumerators, and ended up robbing homes while posing as census workers.
There was a post originally made about it, this pink flyer, that was uploaded to Facebook. It was all over South Africa then, but it recurred in the US, and a very, very sharp reader might look and say, “We don’t have a Department of Home Affairs. We call it something else.” Nevertheless, census, Home Affairs, it seemed specific enough to be a real threat even though it wasn’t.
Many outlets went and wrote debunkings, saying, “Hey, you might see this, but it’s not true.” Then people would go and share it, but because of the very design of Facebook, what got shared wasn’t necessarily the debunking text, it was the flyer itself. You’d actually have to click through, hit See More, to see the debunking. To me, that was a really interesting case where something that was good to do, trying to provide credible and accurate information, was actually undermined by the technical design of the platform.
I think that’s such a good point, because I’ve been thinking an awful lot about how the way Google orders information is changing dramatically. It used to be just this list of links where people would explore for themselves what was on these hyperlinks, but it’s becoming less of an intermediary; with the knowledge graph, it’s directly answering your questions. I do wonder, as we seek out more information on our own, when the platform looks to answer the question for us instead of pointing us in a direction to find the answers ourselves, how that might shape the kinds of conclusions people walk away with.
Among this community of researchers in the broader QAnon universe I mentioned before, there’s this need to constantly reaffirm that things like Q’s drops and Trump’s tweets are reliable sources of information. They’re constantly looking back: “Oh, this tweet from whenever predicted this thing that happened later,” and often there’s all kinds of confirmation bias going on here. One of the ways they did this was through actually a kind of instrument. It’s not just keywords necessarily; it was basically a map of Q’s drops laid onto a clock.
You started at twelve o’clock with the very first day Q dropped, and then the next second was the next day. Then you could go and look at all of the drops on a single axis, and basically keep iterating on that until you found, “Oh, these two things use the same word,” or, “These two things are talking about the same thing, so we need to look at them, it’s part of the same narrative.” Essentially, it was a machine for creating context.
When you have this textual focus, and a belief that you can find the truth on your own, privileging individualist interpretations over elite purveyors of knowledge, I think it lends itself to the kind of information exploration that you’re describing.
I think it’s really important for us to not just talk about what different people say they know, but to get one level deeper, to understand how they know it and how they got there.
This idea of encouraging exploration is a huge part of closing the propaganda feedback loop, because it empowers these individuals to think. No one likes the idea that they’re being manipulated, but if you provide them the tools to feel that they are coming to the conclusions on their own, it makes it seem like their own idea and it makes it a lot easier to follow the breadcrumbs, or the drops. The IKEA effect is that when people put furniture together themselves, they value it more. Even if it’s a crappy piece of furniture, given an equally low-quality piece, they will value their own low-quality piece more if they’ve put it together.
Likewise with information. If you’re providing them the tools and saying, “Do it yourself,” even if the instructions are super bad, people get really excited afterwards, because they did it on their own, and they’ll feel good about what they’ve made, more invested in their misinformation. Definitely. Thank you, Will and Alice.
If the IKEA effect means we treasure that uneven-legged coffee table or creaky chair because we put in the work of making it, how much more do we value the information we know because we searched for it and sought it out? It’s really something to think about how much the words we use determine the results we find, and how much work others will do to put their preferred terms at our fingertips. Thank you, Francesca, for piecing this all together. Thank you also to our guests, Eszter Hargittai, Danny Sullivan, Clement Wolf, and Jennifer Mercieca, for bringing their expertise and insight to bear. Thanks to Alice Marwick and Will Partin for helping us pull all the threads together. And thank you for listening and joining us for this episode.
Next week on Does Not Compute, Rachel Kuo will delve into how disinformation targets Asian American communities online, with guests who are working to counter those campaigns. If you aren’t yet familiar with auntie information networks, you’re in for a treat.
Does Not Compute is the work of many hands, including our researcher hosts, our wonderful guests, and executive producer, Jacob Kramer-Duffield, senior producer and editor, Amarachi Anakaraonye, and CITAP project coordinator and production assistant Joanna Burke. Our music is by Ketsa. You can find us on Apple Podcasts, Spotify, or your favorite podcast listening platform as Does Not Compute. On the web, visit us at citap.unc.edu or connect with us on Twitter @unc_citap.
Does Not Compute is supported by a grant from the John S. and James L. Knight Foundation.
Made by: Amarachi Anakaraonye (Senior Producer), Joanna Burke (CITAP project coordinator), Jacob Kramer-Duffield (Executive Producer), and Kathryn Peters (etc)
Music: Ketsa, Parallel Worlds
Art: Logo by Contia’ Janetta Prince