Facebook’s BlenderBot chat AI no longer has the mental capacity of a goldfish

The bot can now recall past conversations that span weeks or even months.


Last April, Facebook’s AI research lab (FAIR) announced and released as open source its BlenderBot social chat app. While the neophyte AI immediately proved far less prone to racist outbursts than previous attempts, BlenderBot was not without its shortcomings. For one, the system had the recollection capacity of a goldfish: any subject or data point the AI wasn’t initially trained on simply didn’t exist in its online reality, as evidenced by the OG BB’s continued insistence that Tom Brady still plays for the New England Patriots. For another, due to its limited knowledge of current events, the system had a strong tendency to hallucinate knowledge, like a digital Dunning-Kruger effect. But the advancements on display in BlenderBot 2.0, which FAIR debuted on Friday, should make the AI far more sociable, knowledgeable, and capable.

While BlenderBot 1.0 could only maintain its memory for a single discussion, its successor can remember topics of conversation over the course of multiple talks that can take days, weeks or even months to complete thanks to the implementation of a long-term memory module. What’s more, the AI can actively update its knowledge base by searching the internet for the latest news and details on any subject that the user wishes to speak about.

“BlenderBot 2 queries the Bing API for search results based on a generated search query, and conditions its response on the top few results,” Kurt Shuster, Research Engineer at Facebook AI, told Engadget. “We rely on Bing to provide high quality search results.” As such, BlenderBot 2.0 is now capable of speaking coherently about breaking news and new media, not just the data it was trained upon.
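The flow Shuster describes (generate a search query, fetch results, condition the reply on the top few) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not FAIR's implementation: `build_prompt`, `make_query`, and `search` are hypothetical names, and the stub functions stand in for the real query generator and the Bing API, neither of which is called here.

```python
from typing import Callable, List


def build_prompt(
    user_message: str,
    search: Callable[[str], List[str]],
    make_query: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Assemble the text a response generator would be conditioned on.

    All names here are hypothetical; the article only says that a search
    query is generated and the reply is conditioned on the top few results.
    """
    query = make_query(user_message)      # step 1: generate a search query
    results = search(query)[:top_k]       # step 2: keep only the top few results
    context = "\n".join(f"- {r}" for r in results)
    # step 3: the generator sees the results alongside the user's message
    return (
        f"Search query: {query}\n"
        f"Top results:\n{context}\n"
        f"User: {user_message}\n"
        f"Bot:"
    )


# Stubs standing in for a learned query generator and a real search engine.
def fake_query(message: str) -> str:
    return message.lower().rstrip("?")


def fake_search(query: str) -> List[str]:
    return [f"result {i} for {query}" for i in range(5)]


prompt = build_prompt("Who plays QB for the Patriots?", fake_search, fake_query)
print(prompt)
```

Swapping the stubs for a real query model and search client would leave `build_prompt` unchanged, which is the point of passing them in as functions.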


“BlenderBot 2 is limited only by what a powerful search engine can provide,” Jason Weston, Research Scientist at Facebook AI, added. So, for example, if you are more interested in learning about Tom Yewcic (the Patriots’ combo QB/punter from the 1962 season) than you are about Tom Brady, BB 2.0 has you covered. It’s the same with more scholarly subjects, like photosynthesis or redox reactions, Weston continued. So long as the information is available on the web, “there is no reason BlenderBot 2 cannot discuss this.”

By actively searching the internet for information, BlenderBot 2.0 can also reduce the instances in which it hallucinates knowledge. “Providing the system with more commonsense reasoning will allow BlenderBot to make sure it does not confuse subtle concepts,” Weston explained, “such as a movie director versus a producer or a pitching coach versus a hitting coach.”


The only real wrinkle occurs when discussing non-English media, such as Demon Slayer: Kimetsu no Yaiba. “It is reasonable to conclude Bing will surface information about it and BlenderBot 2 can use that information accordingly,” he said. “We currently focus on English-based search results, so non-English references may not be fully covered.” The system will, however, recognize that Demon Slayer is of interest to you and will be more likely to bring up manga-centric subjects in future discussions.

FAIR has taken multiple steps to ensure that BlenderBot does not become the next Tay. “BlenderBot 2 does not learn directly from user input, as Tay did,” Shuster said. “We have taken extensive safety steps to ensure that BlenderBot 2 can handle adversarial users. Specifically, we employ both baked-in and two-stage techniques. BlenderBot 2 can detect itself if the incoming context will result in an offensive response, and additional safety layers where a safety classifier can detect if either the user input or the bot's output is offensive. Each handles the response appropriately.”
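The second of the two techniques Shuster mentions, a separate classifier that checks both the user input and the bot's output, might look something like the sketch below. Everything in it is an assumption made for illustration: `safe_reply`, `is_offensive`, `generate`, and the fallback text are hypothetical, the keyword check is a toy stand-in for a trained safety classifier, and the "baked-in" safety behavior of the model itself is not modeled at all.

```python
from typing import Callable

# Hypothetical canned response; the article does not say what the bot replies with.
FALLBACK = "I'd rather not talk about that. What else is on your mind?"


def safe_reply(
    user_message: str,
    generate: Callable[[str], str],
    is_offensive: Callable[[str], bool],
    fallback: str = FALLBACK,
) -> str:
    """Wrap a response generator with input-side and output-side checks."""
    # Check 1: screen the incoming message before generating anything.
    if is_offensive(user_message):
        return fallback
    reply = generate(user_message)
    # Check 2: screen the generated reply before it reaches the user.
    if is_offensive(reply):
        return fallback
    return reply


# Toy stand-ins: a real system would use trained models for both.
def toy_classifier(text: str) -> bool:
    return "badword" in text.lower()


def toy_generator(message: str) -> str:
    return f"Interesting! Tell me more about {message}"


print(safe_reply("the 1962 Patriots", toy_generator, toy_classifier))
print(safe_reply("some BADWORD here", toy_generator, toy_classifier))
```

Checking both sides matters because an innocuous-looking prompt can still elicit an offensive reply, and an adversarial prompt should be stopped before the generator ever sees it.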

And while the system is currently focused on chewing its way through the English language corpus, FAIR does see BlenderBot eventually extending to other languages as well. “While not in our immediate plans, the goal of our team is to build a superhuman conversationalist,” Shuster said. “This kind of agent requires multilingual understanding.”

Recent internal benchmarking found that, according to human evaluators, BlenderBot 2.0 outperformed its predecessor by 17 percent in its engagingness score and by 55 percent in its use of previous conversation sessions, per a Friday blog post from FAIR. What’s more, BlenderBot’s rate of knowledge hallucinations dropped from 9 percent (!) in BB 1.0 down to just 3 percent in the current iteration.

Looking ahead, “humans interacting with AI systems via discourse is the future of AI,” Weston asserted, “and ensuring that humans have an engaging, informative experience is critical to that future. BlenderBot 2 combines the engagingness of BlenderBot 1.0 with the knowledge capabilities of a system with access to the entire internet, so ostensibly we are on the right track.”