Facebook trained its “Blender” bot on human chats. Along the way, the program also picked up racism and fake news from people.
The ideal by which robots and artificial intelligence (AI) are commonly measured is that of humanity. That holds for science-fiction films of the 1920s just as it does for our everyday life, in which voice assistants and smart apps have long played a part. In April, Facebook’s AI department announced that it had developed “Blender”, a chatbot said to be particularly human in conversation and thus to outperform similar applications. The program was fed more than 1.5 billion conversation entries from publicly accessible portals such as Reddit: intensive training in a wide variety of topics, but also in natural language and behavioral patterns.
However, the freely available data also has a dark side: in tests, the chat machine’s answers repeatedly included insults, discriminatory statements and fake news. Nevertheless, or precisely because of this, Blender evidently managed to impress with its humanity. 67 percent of testers said the bot was more convincing and more human than its Google counterpart Meena, which is considered one of the most advanced of its kind to date. 49 percent could not distinguish Blender from a human conversation partner.
Chatbots are mainly used in companies’ digital customer service, where they are supposed to provide information on specific subject areas. Anyone who wants to be informed about e-banking, telephone tariffs or insurance conditions by direct message is speaking less and less often to real people. In recent years, bots have also increasingly served developers as a testing ground for bringing artificial intelligence closer to people. Microsoft’s “Tay” and “Mitsuku” from Pandorabots were trained on the chats that millions of users conducted with them. Besides knowledge, personality and empathy play a decisive role in whether people find the conversation with a machine enriching. Through machine learning, the programs can develop continuously and almost independently during conversations: the more they communicate with people, the more they learn from them.
Insults and propaganda
But that has caused difficulties before. In direct conversation with users, Microsoft’s “Tay” was swamped with insults and propaganda and duly incorporated them into its vocabulary. With the enormous volume of existing conversation data that Blender was fed, things are evidently no different. The culture of conversation on social media and forums such as Reddit thrives not least on insults and the spread of false information. That makes manual filtering indispensable for anyone who wants to prevent chatbots from adopting and disseminating racism, sexism or conspiracy theories. However, not even Facebook can muster the workforce this would require.
The blog Israellycool recently published unsettling excerpts from a chat log with Blender. Asked what it thinks of Jews, the bot answers that they are “terrible people” who are constantly killing non-Jewish people. Such anti-Semitic statements are commonplace in hate-filled corners of forums like Reddit, and they are now part of Blender’s repertoire. Unlike explicit swear words, prejudices that seep in this way are likely to be far harder to filter out of the program, and their spread is all the more damaging. According to Facebook, the fact that Blender acted anti-Semitically in this case does not allow any direct conclusions about how Judaism is most frequently commented on online: the generated answers are not copied verbatim but are based on a complex combination of different conversation parameters.
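Why explicit slurs are easier to catch than prejudiced statements can be seen in a deliberately naive sketch. This is purely hypothetical illustration code, not Facebook’s actual safety mechanism; the blocklist words are invented for the example:

```python
# Hypothetical illustration: a naive keyword blocklist, NOT Facebook's
# actual safety mechanism. The word list is invented for this example.
BLOCKLIST = {"idiot", "stupid"}  # explicit insults are easy to enumerate

def passes_filter(reply: str) -> bool:
    """Return True if the reply contains no blocklisted word."""
    words = {w.strip(".,!?").lower() for w in reply.split()}
    return BLOCKLIST.isdisjoint(words)

# An explicit insult is caught...
print(passes_filter("You are an idiot."))          # False
# ...but a prejudiced claim built entirely from neutral words slips through.
print(passes_filter("They are terrible people."))  # True
```

The second reply contains no individually offensive word, which is exactly why such statements would require a model that understands context and intent, rather than a word list, to be filtered out.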
Monitoring and responsibility
Marion Weissenberger-Eibl, head of the Fraunhofer Institute for Systems and Innovation Research, sees the problem in the learning process itself. AI still finds it difficult to correctly classify linguistic and contextual subtleties. She says: “To prevent chatbots from acting in this way, one would first have to teach them what racism is. A problem that has not yet been solved, because a chatbot, for example, cannot differentiate between black humor and racism.” For her, the question is how the program’s learning can be monitored, and who is responsible for it.
The research team behind Blender said at launch that it had worked on safeguards against toxic language, but that much remained to be done. Blender was published as open-source software, so its code is freely available. In response to an inquiry from the SZ, Facebook stated that the Blender bot is a pure research project, released not for the market but initially only to the developer community, in order to jointly optimize the conversational AI. The version used by Israellycool was provided by the company Cocohub, which changed the code and in doing so also removed built-in protection mechanisms. In the original version, these would prevent an answer to the question of what Blender thinks of Judaism from being generated at all. According to Facebook, the Reddit dataset the bot was fed is too large for toxic content to be sorted out manually. For this reason too, the bot is currently not intended for private individuals or for commercial use.