Ⅱ. Current Technology and Problems with the YouTube Algorithm
2.1 Circumstances of YouTube in South Korea
Before delving into the issues arising from the YouTube algorithm, it is essential to examine the impact of YouTube in South Korea. According to Playboard, a specialized YouTube statistics analysis firm, as of the end of 2020, South Korea had 97,934 revenue-generating YouTube channels (channels with over 1,000 subscribers), a ratio of one channel for every 529 individuals. In comparison, the United States, considered the birthplace of YouTube, had a ratio of one channel per 666 individuals. Additionally, according to DataReportal, an internet data analysis agency in Singapore, South Korea recorded an overwhelming YouTube usage rate of 39.9% in the 2022 Monthly Social Media App Usage report [Fig. 1]. This far surpassed other platforms, with TikTok at 16.9%, KakaoTalk at 11%, Facebook at 7.8%, and Daum at 7.7%. Notably, TikTok experienced the highest annual growth at 22%, while YouTube grew by 5%. Despite this relatively modest growth rate, YouTube's substantial user base is evident, and it consistently maintains its dominance in the market[1].
These statistics reflect the global trend, indicating the continuous growth of video platforms in South Korea. Furthermore, Mobile Index, an app analysis service based on a data management platform (DMP), reported that as of September 2022, the monthly active users (MAU) of YouTube in South Korea surpassed 43.19 million, accounting for 83% of the total population of 51.78 million. The average monthly viewing time on YouTube in South Korea was reported as 30 hours and 34 minutes, significantly exceeding the global average of 23 hours and 24 minutes. Moreover, YouTube users in South Korea accessed the platform on an average of 16.9 days per month. Usage frequency increased with younger age groups: teenagers accessed the platform on 20 days per month, individuals in their 20s on 19.1 days, those in their 30s on 16.7 days, those in their 40s on 16.1 days, those in their 50s on 16.3 days, and those aged 60 and above on 15.8 days. This indicates a higher frequency of YouTube access among the younger demographic[1,2].
[Fig. 1] Time spent using social media apps in South Korea
2.2 YouTube Algorithm
YouTube has not publicly disclosed the specific algorithm it employs when recommending videos to users. However, it is widely known to utilize machine learning methods based on collaborative filtering and content-based filtering technologies.
Collaborative filtering is a robust recommendation technique that draws on user behavior records and preferences to deliver personalized suggestions. By analyzing interactions among users, it predicts or recommends items favored by users with similar tastes. Despite its effectiveness when extensive user and item data are available, this method faces challenges such as the “cold start problem” for new users or items and the “long tail problem” of over-emphasizing popular items.
To address these challenges, YouTube has integrated content-based filtering [Table 1] into its recommendation system. Content-based filtering recommends videos based on the intrinsic characteristics of the videos themselves, irrespective of other users' preferences or behavior. It analyzes the features of each video, compares them to a user profile, and selects relevant items for recommendation. Although content-based filtering mitigates the cold start problem and offers personalized recommendations, its reliance on item characteristics alone can limit its ability to capture diverse perspectives and individual preferences[3].
In response to these limitations, YouTube has developed a hybrid recommendation system that blends collaborative filtering and content-based filtering. By combining collaborative filtering's ability to capture user preferences from past interactions with content-based filtering's focus on video characteristics, the hybrid approach improves overall recommendation accuracy and addresses both the cold start and long tail problems. This synergistic combination allows YouTube to cater to the diverse preferences of its vast user base[3,4].
[Table 1] Method of content-based filtering
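To make the hybrid approach concrete, the following is a minimal sketch in Python that blends a user-based collaborative-filtering score with a content-based score. The toy interaction matrix, the two-dimensional topic features, and the blending weight alpha are all illustrative assumptions; YouTube's production system is proprietary and operates at an entirely different scale.

```python
# A minimal hybrid-recommender sketch over toy data; all values are assumed.
import numpy as np

# Hypothetical watch signals: rows = users, columns = videos (1 = watched).
interactions = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
], dtype=float)

# Hypothetical content features per video (e.g., topic weights A and B).
video_features = np.array([
    [0.9, 0.1],   # video 0: mostly topic A
    [0.8, 0.2],   # video 1: mostly topic A
    [0.2, 0.8],   # video 2: mostly topic B
    [0.1, 0.9],   # video 3: mostly topic B
])

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def collaborative_scores(user):
    """Score videos by what similar users watched (user-based CF)."""
    sims = np.array([cosine(interactions[user], interactions[u])
                     for u in range(len(interactions))])
    sims[user] = 0.0                      # exclude the user themself
    return sims @ interactions            # weighted sum of neighbors' watches

def content_scores(user):
    """Score videos by similarity to the user's content profile (CBF)."""
    profile = interactions[user] @ video_features  # sum of watched features
    return np.array([cosine(profile, f) for f in video_features])

def hybrid_scores(user, alpha=0.5):
    """Blend both signals; alpha trades CF off against CBF."""
    cf, cb = collaborative_scores(user), content_scores(user)
    cf, cb = cf / (cf.max() + 1e-9), cb / (cb.max() + 1e-9)  # normalize
    return alpha * cf + (1 - alpha) * cb

scores = hybrid_scores(user=0)
scores[interactions[0] > 0] = -np.inf     # do not re-recommend watched videos
print("Recommend video:", int(np.argmax(scores)))
```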
The results obtained from collaborative filtering and content-based filtering undergo a nuanced ranking process before being presented to users. This crucial step involves the consideration of various factors to ensure a curated and personalized list of recommended videos. The key factors taken into account during the ranking process include[5]:
· Personalized Interest: Calculating personalized interest by analyzing user preferences, behavior, and past interactions.
· Current Popularity: Prioritizing videos that are currently popular among a large number of users.
· Diversity: Ensuring diversity in the recommendations by including videos from various genres and topics in the ranking.
· Playback Time: Considering the playback time of videos to recommend content that aligns with user time preferences.
· User Behavior: Adapting the recommendation list based on the user's previous video views, clicks, and liked videos.
· Recommendation Algorithm Weights: Assigning weights to the results of collaborative filtering and content-based filtering, determining the significance of each video in the recommendation list.
This ranking process, integral to YouTube's recommendation system, is designed to continually enhance the user experience. By weighing a diverse array of factors, YouTube aims to present users with a curated list of videos aligned with their individual interests and preferences, fostering a more engaging and tailored experience.
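As a hedged illustration of how such multi-factor ranking might combine into a single score, the sketch below weights the factors listed above. The factor names, their 0-to-1 scales, and the weights themselves are assumptions made for exposition; the actual factors and weights YouTube uses are not public.

```python
# Illustrative multi-factor ranking; factor scales and weights are assumed.
from dataclasses import dataclass

@dataclass
class Candidate:
    video_id: str
    personal_interest: float    # from user history analysis, 0..1
    popularity: float           # current popularity signal, 0..1
    diversity_bonus: float      # boost for underrepresented genres, 0..1
    expected_watch_time: float  # predicted fraction of video watched, 0..1
    source_weight: float        # weight of the CF/CBF result it came from, 0..1

# Hypothetical weights for each ranking factor.
WEIGHTS = {
    "personal_interest": 0.35,
    "popularity": 0.20,
    "diversity_bonus": 0.10,
    "expected_watch_time": 0.25,
    "source_weight": 0.10,
}

def rank(candidates):
    """Order candidates by a weighted sum of the ranking factors."""
    def score(c):
        return sum(w * getattr(c, name) for name, w in WEIGHTS.items())
    return sorted(candidates, key=score, reverse=True)

ranked = rank([
    Candidate("a", 0.9, 0.2, 0.1, 0.8, 0.7),
    Candidate("b", 0.4, 0.9, 0.3, 0.5, 0.5),
])
print([c.video_id for c in ranked])   # highest-scoring video first
```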
2.3 Definition of Disinformation
The terminology used to characterize news containing inaccurate information encompasses a range of terms, and the absence of clear-cut criteria contributes to their interchangeable use. Commonly encountered expressions include “fake news,” “rumor,” “misinformation,” and “disinformation,” all falling under the umbrella of false information. Among these terms, “fake news” is the most frequently employed, signifying instances where media reporting deviates from factual accuracy. “Misinformation” denotes unverified information that may possess some basis in truth, while “rumor” closely aligns with misinformation, predominantly spreading through informal channels like word of mouth. “Disinformation,” by contrast, involves the deliberate distortion of facts with the intent to deceive[6].
Despite the distinctions, the definitions of these concepts remain fluid, contributing to potential confusion. In this paper, false information is defined as any information lacking a clear source and not verified as true. Even if the content itself may hold factual accuracy, it is categorized as false information if sufficient reliability is not established. It is crucial to note that false information, regardless of the specific term used, has the potential to instigate various societal issues. These may include public confusion, erosion of trust in information sources, and the manipulation of public opinion. False information can contribute to the spread of baseless beliefs, exacerbating social divisions and hindering informed decision-making processes[7].
Building upon this foundation, the paper aims to explore the interconnected societal issues arising from false information, particularly as related to the YouTube algorithm.
2.4 The Seriousness of Fake News in South Korea
Overall, Koreans' viewership of news on YouTube is 1.5 times the global average. At the same time, a significant majority of the Korean population, 89%, perceives fake news as a serious issue, indicating a general aversion to fake news among citizens. Nevertheless, according to a global survey conducted by Ipsos across 25 countries on perceptions of “fake news,” 85% of respondents in Korea reported having fallen victim to fake news[8]. This statistic underscores the undeniable impact of fake news and the importance of not underestimating its influence. Notably, YouTube emerged as the primary route for disseminating false information, with 22% of internet users admitting to distributing misinformation on the platform[9]. Additionally, 34% reported having watched or received videos on YouTube that they perceived as fake news, meaning that one in three YouTube users has encountered or shared content classified as fake news. This trend was particularly pronounced among individuals in their 20s and the elderly, the two ends of the age spectrum[10].
[Fig. 2] The route through which false information is circulated the most
[Fig. 3] Fake news access experience
[Fig. 4] Recognition of fake news
Furthermore, in a survey of 1,200 adults aged 20-59, both male and female, the factor respondents identified as most important in fake news was its perceived political intent, suggesting a significant likelihood that fake news is produced for political motives. Additionally, a study by Embrain Trend Monitor among men and women aged 19-59 [Fig. 4] found that over 85% of respondents were concerned about the excessive prevalence of fake news and its role in deepening societal divisions[8,11]. Despite the continuous increase in YouTube viewing time in South Korea, the production of fake news persists, and the public recognizes it as a pressing societal issue.
2.4.1 News/Politics on YouTube and the YouTube Algorithm
The YouTube algorithm does not operate at the user's discretion; it runs automatically and unavoidably whenever the platform is used. Providing tailored content in this manner has significant commercial advantages, as it can extend user engagement on the platform. However, it raises concerns that the type and content of the material shown are confined within a certain framework. Currently, there are over 500 channels dedicated to the “News/Politics”[4] category on YouTube. Among them, channels are operated not only by media organizations but also by individuals, including politicians. Although media channels generally have higher viewership than individual channels, as of December 2023, individually operated channels are not far behind. Moreover, YouTube's sponsorship feature, Super Chat, is popular enough to involve substantial monetary transactions [Fig. 5]. According to prior research, videos produced by media outlets and those created by individuals are not consistently classified under the same category.
Additionally, prior research has sought to verify the recommendation tendencies of the AI recommendation algorithm. In this experiment, two accounts were created: Account 1 exclusively viewed news from media outlets, while Account 2 exclusively consumed user-generated content (UGC). Three keywords were searched via the top search bar, four videos were selected and watched for each keyword, and the videos recommended on the YouTube homepage at the time of account creation were compared with those recommended after the experiment. A chi-square test on the results indicated an association between YouTube's algorithmic recommendations and the type of content viewed (news or UGC) on each account[3].
[Fig. 5] YouTube Korea News/Politics Super Chat ranking
[Fig. 6] Experiment outcome of the YouTube algorithm
Specifically, as shown in [Fig. 6], when a user exclusively watched videos created by media outlets, the recommendation ratio between media outlet-produced content and user-generated content (UGC) was 6:4. When the user consumed only videos created by individuals, this ratio shifted to 9:1 in favor of UGC[3]. Although videos unrelated to the searched keywords also appeared, the overarching conclusion is that when users predominantly watch personally created videos in the 'News/Politics' category, YouTube infers an interest in channels operated by individuals.
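The reported association can be illustrated with a standard chi-square test of independence, sketched below. The contingency counts are hypothetical, scaled only to mirror the 6:4 and 9:1 ratios above rather than the study's raw data.

```python
# Chi-square test of independence between viewing history and the type of
# recommended content; counts are made up to mirror the reported ratios.
from scipy.stats import chi2_contingency

# Rows: account viewing history; columns: (media-produced, UGC) recommendations.
observed = [
    [60, 40],   # account that watched only news-media videos (~6:4)
    [10, 90],   # account that watched only UGC (~9:1 toward UGC)
]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4g}")
# A small p-value indicates the recommendation mix is not independent of
# viewing history, consistent with the cited finding[3].
```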
2.5 Echo-Chamber Effects
In tandem with the surge of fake news on YouTube, the emergence and fortification of echo-chamber effects among the platform's user base pose a substantial concern. An echo chamber denotes a scenario in which individuals encounter information that aligns with and reinforces their pre-existing beliefs, creating a self-perpetuating loop of confirmation bias.
YouTube's recommendation algorithm, designed to amplify user engagement, inadvertently contributes to the formation of echo chambers. The algorithm, propelled by collaborative filtering and content-based filtering, frequently suggests videos akin to a user's past engagements. While this personalized strategy seeks to cater to individual preferences, it unintentionally restricts the range of perspectives users are exposed to[6].
Users may find themselves confined within a content bubble that mirrors their current worldview, curtailing exposure to diverse opinions and alternative viewpoints. This echo-chamber effect can deepen the entrenchment of existing beliefs, fostering polarization and impeding open discourse. The algorithm's emphasis on user engagement metrics, such as watch time and clicks, may prioritize content that evokes strong reactions or aligns with pre-existing beliefs, thereby amplifying the echo-chamber phenomenon.
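This feedback loop can be made tangible with a toy simulation, sketched below. The topic labels, engagement values, and simple additive update are assumptions for illustration; this is not YouTube's algorithm, but it shows how rewarding engagement can progressively narrow what a user sees.

```python
# Toy echo-chamber feedback loop: engagement feeds recommendation weights.
import random

random.seed(0)
TOPICS = ["politics-A", "politics-B", "sports", "science", "music"]

weights = {t: 1.0 for t in TOPICS}  # start with uniform exposure
# Assume the user engages strongly with one topic and weakly with the rest.
engagement = {t: (0.9 if t == "politics-A" else 0.1) for t in TOPICS}

for _ in range(200):
    # Recommend a topic in proportion to current weights...
    topic = random.choices(TOPICS, weights=[weights[t] for t in TOPICS])[0]
    # ...and let engagement with it feed back into those weights.
    weights[topic] += engagement[topic]

total = sum(weights.values())
for t in TOPICS:
    print(f"{t:11s} {weights[t] / total:.1%}")
# politics-A comes to dominate the mix, illustrating how engagement
# optimization can entrench a single viewpoint over time.
```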
Research underscores that individuals entrenched in echo chambers are more likely to have their existing beliefs reinforced while being less exposed to information challenging their perspectives. This holds broader societal implications, including heightened polarization, diminished understanding between distinct ideological groups, and a potential erosion of shared realities[12].
Recognizing and remedying the echo-chamber effects within YouTube's recommendation system becomes pivotal for cultivating a more diverse and inclusive information environment. Striking a delicate balance between personalized content recommendations and ensuring exposure to a spectrum of perspectives is indispensable for nurturing a robust, healthy, and well-informed public discourse on the platform.
2.6 Filter Bubble Effects
While echo-chamber effects on YouTube reinforce existing beliefs by presenting users with content that aligns with their preferences, another noteworthy phenomenon is the emergence of filter bubble effects. The term “filter bubble,” coined by internet activist Eli Pariser, refers to the personalized information ecosystems that result from algorithmic curation, where users are selectively exposed to content that matches their past online behavior, preferences, and demographics.
In contrast to echo chambers, filter bubbles are characterized by a more individualized and personalized information experience. The YouTube recommendation algorithm, in its pursuit of enhancing user engagement, tailors content suggestions based on an individual's watch history, likes, and clicks. This personalized curation can create a unique information bubble around each user, where their exposure to diverse perspectives is selectively limited[3].
The filter bubble effect is driven by algorithms prioritizing relevance and user engagement metrics. As a result, users may find themselves in a curated digital space where their pre-existing preferences are continually reinforced, and dissenting or diverse viewpoints are underrepresented. Unlike echo chambers, filter bubbles focus on the individual's online journey, crafting a digital environment that aligns with their preferences.
The implications of the filter bubble extend beyond reinforcing existing beliefs. Users may be unintentionally shielded from information that challenges their worldview, hindering a comprehensive understanding of various issues. While the echo chamber tends to foster polarization by reinforcing group beliefs, the filter bubble accentuates an individual's isolation from diverse content, potentially leading to a more personalized yet narrow information landscape[12].
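One simple, assumed way to quantify how narrow such a personalized feed has become is the Shannon entropy of its category mix, sketched below with made-up category labels; a diverse feed scores higher than a bubble-like feed.

```python
# Shannon entropy of a feed's category mix; lower entropy = narrower bubble.
from collections import Counter
from math import log2

def feed_entropy(recommended_categories):
    """Entropy (in bits) of the category distribution of a feed."""
    counts = Counter(recommended_categories)
    n = len(recommended_categories)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# Hypothetical feeds: one diverse, one dominated by a single category.
broad_feed = ["news", "music", "science", "sports", "news", "comedy"]
narrow_feed = ["politics"] * 9 + ["news"]

print(f"broad feed:  {feed_entropy(broad_feed):.2f} bits")
print(f"narrow feed: {feed_entropy(narrow_feed):.2f} bits")
```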
2.7 Political Abuse
The confluence of filter bubble and echo chamber effects on online platforms, particularly in the political domain, has given rise to a concerning phenomenon known as political abuse. Political abuse occurs when individuals, groups, or entities exploit the algorithmic mechanisms of platforms like YouTube to manipulate or weaponize political content for their own gain, strategically leveraging the algorithm's tendency to curate personalized content and foster ideological echo chambers[13]. Recognizing the interplay between these effects is essential for devising strategies to mitigate political abuse, preserve information diversity, and uphold the integrity of democratic discourse in the digital age.
2.7.1 Filter Bubble and Political Abuse
The filter bubble effect, driven by personalized content curation, can be strategically employed for political manipulation. Bad actors can capitalize on the algorithm's tendency to prioritize relevance and engagement, tailoring political content to cater specifically to users' existing beliefs. In this context, political abuse involves the deliberate creation and dissemination of content that reinforces a particular political narrative, isolating users within their personalized information bubbles.
By leveraging the filter bubble, political abusers can ensure that users are consistently exposed to content that aligns with a specific ideological agenda. This strategic manipulation of information flow can amplify confirmation biases, limit exposure to alternative viewpoints, and foster an environment where individuals are more susceptible to propaganda or misleading political narratives[14].
2.7.2 Echo Chamber and Political Abuse
Echo chambers, characterized by the reinforcement of existing beliefs within homogeneous communities, offer fertile ground for political abuse. In a politically charged echo chamber, misinformation or politically biased content can spread rapidly and be amplified by like-minded individuals. Political abuse within echo chambers involves the intentional dissemination of content that exploits the emotional resonance of a particular political narrative within a close-knit community.
The echo chamber effect facilitates the creation of ideological silos where dissenting opinions are marginalized or excluded. Political abusers can manipulate these closed environments to disseminate propaganda, sow discord, or even engage in targeted disinformation campaigns. By capitalizing on the echo chamber's tendency to strengthen group identity, political abuse can deepen societal polarization and erode trust in democratic processes[14].