In games like chess and Go, everything is laid out in the open. In poker, though, there's hidden information: the cards your opponents hold. That gives rise to complex strategies, such as bluffing, that perfect-information games don't call for. As a result, AI bots have typically struggled to account for hidden information and act on it effectively.
Bluffing poses a particularly interesting challenge. A successful bluff can dramatically change a poker game in your favor, but do it too much and your deception becomes predictable. So the bot has to balance bluffing with betting on legitimately strong hands.
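To make that balance concrete, here is a minimal sketch of the textbook bluff-to-value calculation for a single bet. It assumes a toy situation (one final bet, a bettor who holds either an unbeatable hand or nothing, and an opponent who can only call or fold) and is not a description of how Pluribus chooses its bets. The function names are made up for illustration.

```python
def balanced_bluff_fraction(pot: float, bet: float) -> float:
    """Share of a betting range that can be bluffs in the toy model.

    Assumption: the bettor has either the best hand or a pure bluff, and
    the caller wins pot + bet when the bettor is bluffing and loses bet
    otherwise. At this fraction the caller gains nothing by calling.
    """
    if pot <= 0 or bet <= 0:
        raise ValueError("pot and bet must be positive")
    return bet / (pot + 2 * bet)


def caller_expected_value(pot: float, bet: float, bluff_fraction: float) -> float:
    """Expected chips won by calling, given how often the bet is a bluff."""
    return bluff_fraction * (pot + bet) - (1 - bluff_fraction) * bet


# A pot-sized bet: 100 chips into a 100-chip pot.
fraction = balanced_bluff_fraction(pot=100, bet=100)
print(f"bluff share of betting range: {fraction:.3f}")
print(f"caller EV at that share: {caller_expected_value(100, 100, fraction):.3f}")
print(f"caller EV if bluffing half the time: {caller_expected_value(100, 100, 0.5):.3f}")
```

In this toy model, bluffing more often than the balanced share makes calling profitable for the opponent, which is the sense in which too much deception becomes exploitable.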
Pluribus is a more advanced version of Carnegie Mellon's bot Libratus, which beat pros in heads-up (two-player) play in 2017. A new online search algorithm let Pluribus weigh its options by looking just a few moves ahead rather than searching all the way to the end of the game. It also had "faster self-play algorithms for games with hidden information," Facebook said, meaning it learned more efficiently how to handle hidden information in the games it played against copies of itself.
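For a rough sense of what learning by self-play can look like, the sketch below runs regret matching, a simple building block of the counterfactual regret minimization family of algorithms, on rock-paper-scissors. This is an illustrative stand-in under stated assumptions, not Pluribus's actual training code: the game, the starting bias toward rock, and all function names are hypothetical choices made to keep the example small.

```python
# Payoff to a player choosing action `a` against an opponent choosing `b`.
# Actions: 0 = rock, 1 = paper, 2 = scissors. The game is symmetric, so
# both copies of the learner can read their payoff from the same table.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)


def action_values(opponent_strategy):
    """Expected payoff of each action against the opponent's mixed strategy."""
    return [
        sum(PAYOFF[a][b] * opponent_strategy[b] for b in range(3))
        for a in range(3)
    ]


def self_play(iterations=50_000):
    """Two copies of the learner play each other; return their average strategies."""
    # Assumption for illustration: copy 0 starts with a bias toward rock,
    # so there is a flaw for the other copy to exploit and correct.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            values = action_values(strategies[1 - player])
            expected = sum(s * v for s, v in zip(strategies[player], values))
            for a in range(3):
                # Regret: how much better action `a` would have done.
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


average_strategies = self_play()
for player, strategy in enumerate(average_strategies):
    print(player, [round(p, 3) for p in strategy])
```

The strategy played on any single iteration keeps cycling, but the average over all iterations drifts toward playing each action about a third of the time, which is the unexploitable strategy for this game.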
As a result, it's a lot more efficient than many other game-playing AI bots. It uses less than 128 GB of memory and runs on just two CPUs while playing. By comparison, AlphaGo tapped into 1,920 CPUs and 280 GPUs while facing off against Go professional Lee Sedol in 2016. Pluribus also typically plays twice as fast as pros, taking an average of 20 seconds per hand when it played against copies of itself.
In 10,000 hands of poker over 12 days, Pluribus faced off against several pros, including World Series of Poker Main Event champions and World Poker Tour winners. Among them were Chris Ferguson, Greg Merson, Darren Elias and Jimmy Chou. All have won at least $1 million in pro play, and had a monetary incentive to play their best.
"If each chip was worth a dollar, Pluribus would have won an average of about $5 per hand and would have made about $1,000/hour playing against five human players," Facebook wrote. "These results are considered a decisive margin of victory by poker professionals."
The pros seemed intrigued by the types of strategies Pluribus employed, such as the atypical (for humans) move of kicking off a round with a bet after calling the previous go-round. "It was incredibly fascinating getting to play against the poker bot and seeing some of the strategies it chose," Michael Gagliano said. "There were several plays that humans simply are not making at all, especially relating to its bet sizing."
"Pluribus is a very hard opponent to play against," Ferguson said. "It's really hard to pin him down on any kind of hand. He's also very good at making thin value bets on the river. He's very good at extracting value out of his good hands."