Accurate, Focused Research on Law, Technology and Knowledge Discovery Since 2002

Bias, Skew and Search Engines Are Sufficient to Explain Online Toxicity

Association for Computing Machinery. Scholar One Manuscripts. Bias, Skew and Search Engines Are Sufficient to Explain Online Toxicity: “U.S. political discourse seems to have fissioned into discrete bubbles, each reflecting its own distorted image of the world. Many blame machine-learning algorithms that purportedly maximize “engagement” — serving up content that keeps YouTube or Facebook users watching videos or scrolling through their feeds — for radicalizing users or strengthening their partisanship. Sociologist Shoshana Zuboff even argues that “surveillance capitalism” uses optimized algorithmic feedback for “automated behavioral modification” at scale, writing the “music” that users then “dance” to. There is debate over whether such algorithms in fact maximize engagement (their objective functions also typically contain other desiderata). More recent research offers an alternative explanation, suggesting that people consume this content because they want it, independent of the algorithm. It is impossible to tell which is right, because we cannot readily distinguish the consequences of machine learning from users’ pre-existing proclivities. How much demand comes from algorithms that maximize on engagement or some other commercially valuable objective function, and how much would persist if people got information some other way? Even if we can’t answer this question in any definitive way, we need to do the best we can. There are many possible interface technologies that can help organize vast distributed repositories of knowledge and culture like the Web. These include:

  • Traditional systems of categorization (like the Dewey Decimal System, or the original Yahoo!)
  • Systems such as Wikipedia and Reddit in which human volunteers collate, organize, present and revise information, providing an information resource, and a means for searching it, and human-selected links to external sources.
  • “Traditional” search algorithms like Google’s original PageRank algorithm that rank items in terms of relevance, estimated as a function of the text of the options and the query, the number and “quality” of inbound links, etc.
  • Modern social media algorithms: machine-learning driven systems that rank content to maximize some observable notion of users’ engagement with it or other profit-related measure, updating the ranking model depending on user responses to the options presented.
  • Large Language Model driven interfaces that generate outputs based on a set of statistical weights that loosely summarize some larger corpus of text and associated data…”
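
To make the “traditional” search item in the list above more concrete, here is a minimal sketch of link-based ranking in the spirit of PageRank. It is an illustration only, not the paper's model and not Google's production system: the function name, the damping value of 0.85, the fixed iteration count and the four-page toy web are all assumptions chosen for the example. It shows how both the number of inbound links and the "quality" (rank) of the linking pages feed into a page's score.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy power-iteration PageRank (illustrative sketch, not a production ranker).

    links: dict mapping each page to the list of pages it links to.
    damping and iterations are assumed values chosen for the example.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly among its outbound links,
                # so a link from a highly ranked page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" has the most inbound links, "d" has none.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Note that nothing in this ranking depends on how users respond to the results. That is the contrast with the engagement-driven systems in the list, which update their ranking model based on user responses.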
