Data voids: What are they and do they threaten democracy?

Data voids are intricate systems in which media manipulators find opportunities to spread content that might be problematic and therefore harmful to our democracy.

Courses
Interaction in the hybrid media system

Introduction

Data voids can be a major problem for search engine companies like Google, Bing and YouTube, because media manipulators use them to spread problematic content. False information encountered online can affect our democracies, so these companies should do everything they can to stop misinformation from spreading through data voids.

What is the data void?

Since the emergence of the Internet, a lot has changed in our society. More and more people turn to search engines like Google, Bing, and YouTube as a primary source of information. Golebiewski and boyd explain in their report Data Voids: Where Missing Data Can Easily Be Exploited (2019): “There are many search terms for which the available relevant data is limited, nonexistent, or deeply problematic. Recommender systems also struggle when there’s little available data to recommend. We call these low-quality data situations ‘data voids.’” (p. 5). A data void can consequently lead to “low quality or low authority content because that’s the only content available.” (p. 5). In this article, I will explain how data voids come to exist and what their implications are for our current media system and our democracy.

The term ‘data void’ was coined by Golebiewski and boyd (2019). “These voids occur when obscure search queries have few results associated with them, making them ripe for exploitation by media manipulators with ideological, economic, or political agendas.” (Golebiewski & boyd, 2019, p. 2). The data void itself is not the problem; the threat lies in the space it creates for media manipulators to share false information.

Users of these platforms can stumble upon misinformation while searching, or can be led to it by what Golebiewski and boyd call ‘media manipulators’. Search engines work with massive amounts of data that are meant to provide searchers with the information they are looking for, but how many results you encounter depends on the specific term or combination of terms. Furthermore, the quality of the information provided can vary from high-quality, legitimate news to low-quality, problematic content.

One example is the word ‘basketball’. When you google this word, you will encounter many results, ranging from the basketball as an object to the nearest and most recent basketball game. The results you get are based on your current location and on search engine technology that bases its logic on what earlier searchers were looking for.

Searching for something common like ‘basketball’ will probably not cause many problems, but what happens when you type in random letters? You will probably find few or no results. “There is a long tail between a term like ‘basketball,’ which promises a seemingly infinite number of results, and one with zero results. In that long tail, there are plenty of search queries that can drop people into a data void rife with existing but deeply problematic results.” (Golebiewski & boyd, 2019, p. 5). Problematic content might be of low quality or low authority, and “some of these data voids are intentionally exploited to introduce disturbing content, while others are created to promote political propaganda.” (Golebiewski & boyd, 2019, p. 5).

According to Golebiewski and boyd (2019), there are five different sorts of data voids. I will discuss briefly how these occur and how media manipulators might look for them to spread content that we consider to be problematic.

  1. Breaking news data voids come to exist after a major news event. The combinations of terms associated with this specific event lead to gaps in search queries that need to be filled with content. In this process of content production, media manipulators have the opportunity to link their content to these specific terms.
  2. Some media manipulators create what Golebiewski and boyd call strategic new terms. In this case, information is created and linked to terms that connect only to this problematic content. The coronavirus, for example, has opened doors for manipulators to spread problematic content, as is explained in an article by Maly (2020).
  3. Outdated terms are less likely to be searched for. They can, however, be used to connect new, problematic information to these terms. Because no new content is made for outdated terms, openings arise for those who want to spread problematic content.
  4. Fragmented concepts are data voids that manipulators use to connect seemingly unrelated terms to the content they want to spread. As an example, we can look at an article by Maly (2019) in which the Alt-right used an interview with Charlie Kirk to spread a new term that was directly linked to content against him.
  5. Searching for a problematic query will automatically lead to content that might not be factual. Those who look for information linked to conspiracy theories, for example, could stumble upon information that ‘proves’ those theories rather than disproves them. “Many conspiracy theorists, hate groups, and media manipulators attempt to push searchers to use specific, problematic search queries they know will lead to these voids” (Golebiewski & boyd, 2019, p. 34).

The influence of the internet on democracy

In our democracy, relevant and reliable information is important to maintain a healthy public sphere. The idea of the public sphere, as Habermas (1991) constructed it in his book The Structural Transformation of the Public Sphere: An Inquiry into a Category of Bourgeois Society, is that people in a society can come together to discuss important issues that are common to all. This concept helps us visualize how democracy is built and how we can maintain it. One of the affordances of technological development, and especially of the internet, is that it opens up the public sphere. We can encounter others more easily online, for instance on social media, and information spreads faster than ever. This could be beneficial, as we can more easily share and encounter ideas outside our own perception. But telling true from false information can be more difficult for those who come across it online.

Social media are not the only factor in this changed handling of information online. The influence of search engines and the massive amounts of information that are within reach for everyone with access to the internet has an impact on society. In his book The Googlization of Everything: (And Why We Should Worry), Vaidhyanathan (2012) argues that Google shapes our idea of what information is useful for its users: “its process of collecting, ranking, linking and displaying knowledge determines what we consider to be good, true, valuable, and relevant.” (Vaidhyanathan, 2012, p. 7). Because Google is completely integrated into our society, companies try to appear at the top of its search results. “Google will affect the ways that organizations, firms, and governments act, both for and at times against their ‘users’.” (Vaidhyanathan, 2012, p. 3). But we should not forget that Google and other search engines are not neutral: “its biases (valuing popularity over accuracy, established sites over new, and rough rankings over more fluid or multidimensional models of presentation) are built into its algorithms.” (Vaidhyanathan, 2012, p. 7).

Now that we know that we use Google as a prime source for looking up information, imagine what could happen when people start googling a politician. Information about this person can vary from trivial stories about the politician's personal life to campaign programs. But it gets tricky when we cannot be sure whether the information we read and see is true or manipulated. False information can greatly impact our democracy, as it could influence how we vote.

One example is deepfake technology, which is appearing more and more often. A deepfake is the product of a technology that can alter the way we perceive a person by layering their face over that of another person, thereby changing their apparent behavior, and it is being used to create fake news. As deepfakes are often videos, they are a direct threat to democracy, because we are inclined to believe a video as soon as we see it, and “disinformation can have a much greater effect than its subsequent debunking.” (Parkin, 2019). These videos are most likely shared through social media or spread through a data void.

How does it work in our media system?

In his book The hybrid media system. Politics and power, Chadwick (2017) explains how our media system has become hybrid through new technologies that make shifting between different forms of media easier. He says: “the hybrid media system is built upon interactions among older and newer media logics—where logics are defined as bundles of technologies, genres, norms, behaviors, and organizational forms—in the reflexively connected social fields of media and politics.” (Chadwick, 2017, p. 4).

Information today flows on different levels. An example that Chadwick uses is that of political election campaigns, such as Obama's in 2008: “Campaign teams can no longer assume that they will reach audiences en masse. They now create content targeted at different audience segments and they disseminate this content across different media.” (Chadwick, 2017, p. 9). Election campaigns thus make use of the hybrid media system to reach their audience on different channels and levels and to enhance their influence on society.

The implication of using these different (online) media is that media manipulators also find ways to spread false or low-quality information. So while one can encounter true, high-quality data when searching, one can also easily stumble across false information that is spread through a data void. In this way, data voids can be a threat to our democracy.

Another impact that the data void can have on our media system and, likewise, our democracy is that of bias. Potential bias on the part of the designers can lead search results to reflect biases in society. For example, searching for the term ‘businessman’ gives you pictures of white males. And we should be even more careful when dealing with political content: “For politically charged content, any decision by search engine designers becomes political itself.” (Golebiewski & boyd, 2019, p. 32). One can rarely be truly objective, but we need to make sure that search engines are as objective as possible.

How do we deal with data voids?

The problem with data voids is that they are difficult to detect and even more difficult to solve. In the vast amount of data available online, it is hard to filter out what is false and what is legitimate.

Search engine optimization (SEO) is used by content makers to prepare their content for uptake by search engines. But of course, it can also be used by media manipulators to make their misleading information easier to find or to link it to popular content. In the recommendation part of YouTube, for example, you can sometimes encounter videos with questionable content. “Major search engines consistently struggle to alter their systems so that they return high-quality results under a constant barrage of manipulation attempts.” (Golebiewski & boyd, 2019, p. 43). But even this cannot prevent media manipulators from finding opportunities to spread false information.

The main problem lies in how search engines work. “Bing and Google do not produce new websites; they bring to the surface content that other people produce and publish elsewhere on third-party platforms. Without new content being created, certain data voids cannot be easily cleaned up.” (Golebiewski & boyd, 2019, p. 44). Some data voids are easier to solve than others. For example, the breaking news data void usually resolves itself as more high-quality information about the breaking news event is created and shown with a higher page rank on the search engines.

Other data voids might be hard to detect and this makes for a constant battle between search engine companies and media manipulators. As hard as they are to detect, companies like Google and Bing should work on improving their systems to detect low-quality information.

To prevent bias from interfering in these search engines, companies should make sure that the content shown is inclusive and as free of bias as possible. A strong company commitment to preserving this inclusivity would be an important principle in the process of designing the systems. Governmental interference is difficult because Google and Bing are companies and not related to state control in any way.

As we have seen above, data voids are intricate systems in which media manipulators find opportunities to spread content that might be problematic and therefore harmful to our democracy. In our hybrid media system, search engines play a bigger role than ever, since many of us treat them as a reliable source when looking for information. Data voids exist in many different forms, and media manipulators do their utmost to use these voids to share content with the broader public. The availability of reliable, high-quality news is important to create a healthy democracy, and therefore we do not want media manipulators to interfere in politics or any other part of our society. Data voids are hard to detect, and even harder to solve. We should be aware of the fact that this is happening, and search engine companies should do everything in their power to solve the voids.

 

References:

Chadwick, A. (2017). The hybrid media system. Politics and power. Oxford University Press.

Golebiewski, M., & boyd, d. (2019, October). Data Voids: Where Missing Data Can Easily Be Exploited. Data & Society.

Habermas, J. (1991). The Structural Transformation of the Public Sphere: An Inquiry into a Category of Bourgeois Society (Studies in Contemporary German Social Thought) (Sixth Printing ed.). The MIT Press.

Parkin, S. (2019, June 11). The rise of the deepfake and the threat to democracy.T

Vaidhyanathan, S. (2012). The Googlization of Everything: (And Why We Should Worry) (First Edition, Updated ed.). University of California Press.

MA Art and Media Studies alumni. Working as a researcher and curator in the field of digital art, new media, and artistic research.
