Beyond in-person racism, it turns out chatbots can be susceptible to racism as well. The Washington Post recently published an article uncovering a troubling truth: one of the data sets most widely used to train AI chatbots contains a plethora of right-wing content.
For those unfamiliar with artificial intelligence, it is important to understand that AI programs such as ChatGPT are not capable of independent thinking. Instead, companies train them on enormous amounts of data scraped from across the internet, and the AI uses the statistical patterns in that data to simulate human writing. So if your chatbot friend starts sharing conspiracy theories about the September 11 attacks, the data set it was trained on likely contained an excessive amount of content from far-right sources like Alex Jones.
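To make the point concrete, here is a toy illustration of "a model can only echo its training data." This is a minimal bigram Markov sketch, nothing like the architecture of a real chatbot: every word it generates is a word that literally followed the previous word somewhere in its tiny training corpus.

```python
import random
from collections import defaultdict

def train_bigram_model(corpus):
    """Build a bigram table: each word maps to the words that follow it."""
    model = defaultdict(list)
    words = corpus.split()
    for current_word, next_word in zip(words, words[1:]):
        model[current_word].append(next_word)
    return model

def generate(model, start_word, length=8, seed=0):
    """Generate text by repeatedly sampling a word that followed the last one."""
    random.seed(seed)
    output = [start_word]
    for _ in range(length):
        followers = model.get(output[-1])
        if not followers:
            break
        output.append(random.choice(followers))
    return " ".join(output)

# A toy corpus: the "model" can only ever recombine patterns found here.
corpus = "the data shapes the model and the model echoes the data"
model = train_bigram_model(corpus)
print(generate(model, "the"))
```

Feed this sketch a corpus full of conspiracy theories and it will dutifully recombine conspiracy theories; real language models are vastly more sophisticated, but the dependence on training data is the same.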
According to the Washington Post investigation, the news websites used in one of the most widely used AI data sets include numerous far-right and non-reputable sources. The data set in question is Google's C4 data set, which powers some of the largest AI models in the world, including Facebook's and Google's. The list of sources includes Breitbart, the Russian state propaganda website RT.com, and the anti-immigration group VDare.com.
Breitbart has long been accused of pushing racist content. In 2016, right-wing commentator Ben Shapiro expressed his disdain for the website, saying it promoted "white ethno-nationalism." Compounding the problem, chatbots typically do not cite their sources, so a user may ask a chatbot a question and never know the answer is drawn from a right-wing site that spreads hateful content.
MSNBC's Sarah Posner, who covers the right, expressed concern about the dangers of feeding these inputs into an algorithm: the building blocks of chatbots are scraped from the internet, which means they may include bigoted content or disinformation.
While an individual reading the web can choose to navigate away from a toxic site, a chatbot user cannot see where the material originated. Content from sites like Breitbart and VDare, which publish transphobic, anti-immigrant, and racist content, may therefore be woven into a chatbot's responses.
One of the major problems with AI is that it absorbs our own biases and judgments from the data it is trained on. Until we address this issue or implement better safeguards, racist chatbots will remain a risk.