Darren Elias knows poker. The 32-year-old is the only person to have won four World Poker Tour titles and has earned more than $7 million at tournaments. Despite his expertise, he learned something new this spring from an artificial intelligence bot.
Elias was helping to test new software from researchers at Carnegie Mellon University and Facebook. He and another pro, Chris “Jesus” Ferguson, each played 5,000 hands over the internet in six-way games against five copies of a bot called Pluribus.
At the end, the bot was ahead by a good margin. Along the way Elias noticed something: Although machines are often thought of as uninspired, this bot was ballsier than your typical poker pro. “It will bet two or three times the pot, which humans don’t do very much,” Elias says. “These huge bets are interesting to me and something I will incorporate into my own play.”
Pluribus is significant not just because a new bot taught an old pro new tricks. The software is the first to beat top professionals at multiplayer No-Limit Texas Hold ’Em, seen as the elite form of poker. A paper published Thursday in the journal Science describes how Pluribus took on Elias and Ferguson, and also won handily in scenarios where a single copy of the bot played five human professionals for 10,000 hands.
“If you sit this bot down with five elite professional humans, it is going to beat them and make money off them,” says Noam Brown, a researcher in Facebook’s AI lab and co-creator of Pluribus. “This is really the gold standard as far as poker goes.”
Michael Littman, a professor at Brown University who has worked on computer poker but wasn’t involved in the project, agrees. Poker has long been seen as a grand challenge for AI researchers, with properties similar to many real-world situations. Unlike in chess, poker players must choose actions without knowing what cards their opponents hold, as is the case in politics, business, and war. The complexity this creates in a six-way game had previously put multiplayer Hold ’Em out of reach for AI researchers, and most work has been on two-player games. Now the last major milestone for poker AI has fallen, Littman says. “This is really the end of a multi-decade effort involving many researchers,” he says.
Brown built Pluribus with Tuomas Sandholm, a Carnegie Mellon professor. Brown was previously a grad student in Sandholm’s lab, where the pair built a 2017 bot called Libratus that became the first software to beat professionals at the much simpler two-player form of No-Limit Hold ’Em.
Brown started the Pluribus project after joining Facebook, but says the social media giant doesn’t have specific applications of the technology in mind. “The goal is fundamental research on imperfect information and large-scale multiagent systems,” he says—a phrase that also aptly describes Facebook’s main service. Longer term, ideas tested in Pluribus could help self-driving cars predict the actions of other drivers, or improve fraud detection algorithms, he says.
Sandholm at CMU says he has already proven the commercial—and even national security—value of software that can strategize. He has established two companies to commercialize work on AI-strategizing techniques from his lab.
One of those companies, Strategic Machine, works on uses such as improving bots in videogames, and helping companies set optimum prices that take into account how competitors will respond. The other, Strategy Robot, signed a two-year contract worth up to $10 million with the Pentagon in 2018; Sandholm and the Pentagon decline to discuss the contract. But Sandholm has said one of Strategy Robot’s selling points is using ideas proven in poker and his other AI projects to make simulated—or even real—battlefield strategies more robust against enemy actions. Nothing from the project with Facebook will be licensed to either of Sandholm’s companies, although some techniques central to Pluribus pre-date the project.
Pluribus is similar to Libratus in that it built up its skills by playing trillions of hands against versions of itself. After each hand, the system reviews what happened and what might have worked better—any improvements are added to its core strategy.
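The loop described above, in which a system plays itself, reviews each hand, and shifts toward what would have worked better, can be illustrated with a toy. The sketch below is not the researchers' code, and the article does not name the algorithm they used. It assumes a simple, well-known technique called regret matching and swaps poker for rock-paper-scissors. Every name in it (`PAYOFF`, `current_strategy`, `self_play`) is made up for illustration.

```python
import random

ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors (toy stand-in for poker)

# PAYOFF[a][b] is the payoff to a player who chose a against an opponent who chose b.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def current_strategy(regrets):
    """Play each action in proportion to its accumulated positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / ACTIONS] * ACTIONS  # no regrets yet: play uniformly


def self_play(iterations, seed=0):
    """Two copies of the same learner play each other and update after every round."""
    rng = random.Random(seed)
    regrets = [[0.0] * ACTIONS for _ in range(2)]
    strategy_sums = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [current_strategy(r) for r in regrets]
        moves = [rng.choices(range(ACTIONS), weights=s)[0] for s in strategies]
        for player in range(2):
            mine, theirs = moves[player], moves[1 - player]
            earned = PAYOFF[mine][theirs]
            for alt in range(ACTIONS):
                # "What might have worked better": how much more alt would have earned.
                regrets[player][alt] += PAYOFF[alt][theirs] - earned
                strategy_sums[player][alt] += strategies[player][alt]
    # The long-run average strategy is what the learner keeps.
    return [[s / iterations for s in sums] for sums in strategy_sums]
```

Calling `self_play(100_000)` returns each player's average strategy. In this toy game both should drift toward playing each move about a third of the time, the unexploitable mix for rock-paper-scissors.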
The new bot is able to play a much more complex game than its predecessor in large part because it’s better at fine-tuning that core strategy by projecting possible outcomes from a particular point in a game—known as the search function. Brown and Sandholm’s earlier bot attempted to map out all the possible twists and turns of a game to the end. But it would take too much computing power to explore the almost endless possibilities of a six-player game.
Instead, Brown and Sandholm developed a search function that looks only a few moves ahead at a time. To avoid nasty surprises, it also takes into account how the value of different actions would change if opponents shifted their strategies. This kind of search had not previously been adapted so well to a game like poker, where some information is hidden.
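A rough way to picture this kind of search: look ahead only a short distance, and at the point where the lookahead stops, score each outcome under several possible opponent strategies rather than one. The sketch below is a heavily simplified illustration, not Pluribus's search. It ignores hidden cards entirely, and it assumes the searcher plans for the worst case among the opponent strategies considered, which the article does not state. The `Node` class, the action names, and the payoff numbers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    player: str = "leaf"  # "me", "opponent", or "leaf" (where the lookahead stops)
    children: dict = field(default_factory=dict)  # action name -> Node
    # At a leaf: my estimated payoff under each hypothetical opponent strategy.
    leaf_values: tuple = ()


def robust_value(node):
    """Value of a node, assuming the opponent may shift to whichever strategy hurts most."""
    if node.player == "leaf":
        return min(node.leaf_values)
    child_values = [robust_value(child) for child in node.children.values()]
    return max(child_values) if node.player == "me" else min(child_values)


def best_action(root):
    """Pick the action whose worst-case value is highest."""
    return max(root.children, key=lambda action: robust_value(root.children[action]))


# Hypothetical example: "raise" looks best if the opponent keeps its current
# strategy (first number) but does badly if the opponent adapts (third number).
root = Node(
    player="me",
    children={
        "call": Node(leaf_values=(2.0, 1.5, 1.0)),
        "raise": Node(leaf_values=(5.0, 4.0, -3.0)),
    },
)
chosen = best_action(root)
```

Here a search that trusted only the first estimate at each leaf would raise. Accounting for the opponent's possible shift makes calling the safer choice, which is the sort of nasty surprise the lookahead is meant to avoid.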
Brown says that new approach also has the advantage of requiring less computing power, making Pluribus relatively cheap to run. The bot needed eight days of playing against itself on a single powerful server with 64 processor cores to master the game. AI bots developed for complex videogames such as Dota 2 have required weeks of training on hundreds of thousands of processors. “You could develop something like this on a cloud computing service for about $150, which makes it really feasible to apply this to other domains,” Brown says. The comparable figure for Libratus, which played against itself on a supercomputer for two months, would be on the order of $1 million, he says.
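Those figures can be turned into a back-of-envelope calculation. The sketch below assumes all 64 cores ran around the clock for the full eight days, which the article does not specify, so the implied price per core-hour is only a rough estimate. The variable names are illustrative.

```python
# Figures quoted in the article.
TRAINING_DAYS = 8
PROCESSOR_CORES = 64
PLURIBUS_CLOUD_COST_USD = 150
LIBRATUS_COST_USD = 1_000_000  # "on the order of" this amount

# Assumption: every core is busy 24 hours a day for the whole run.
core_hours = TRAINING_DAYS * 24 * PROCESSOR_CORES

# Cloud price per core-hour implied by the $150 estimate.
implied_price_per_core_hour = PLURIBUS_CLOUD_COST_USD / core_hours

# How many times more expensive the Libratus run would have been.
cost_ratio = LIBRATUS_COST_USD / PLURIBUS_CLOUD_COST_USD

print(f"{core_hours} core-hours, about ${implied_price_per_core_hour:.4f} per core-hour")
print(f"Libratus estimate is roughly {cost_ratio:,.0f} times the Pluribus estimate")
```

Under that assumption the run comes to 12,288 core-hours, or a little over a cent per core-hour, and the Libratus figure is several thousand times larger.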
One application the pair don’t have in mind for their code is winning money at poker. “We’re not going to release the code in part because this would have a major impact on the online poker community,” Brown says. “We’re trying to make this accessible to people in the AI community, not people who want to make poker AIs.”
All the same, he admits that the techniques will likely spread anyway. A year from now, will other people have developed Pluribus-style bots? “I think it’s entirely possible,” Brown says.
Elias, the human poker champ, expects it. Since the arrival of Libratus, he says, people don’t play high-stakes online games as much because bots have become more sophisticated. “If you’re playing a high-stakes sit-and-go online you’re likely playing against a bot or a human being helped by a bot,” Elias says.
Elias says the latest AI advance shouldn’t deter poker pros and fans from playing, and that it can even enhance the game. He was happy to help test Pluribus because he appreciates the science of AI, and the potential for new insights like the value of betting bigger. The bot’s penchant for “donk betting,” in which a player who matched the betting in one round switches to raising in the next, also calls into question the poker lore that the tactic is a bad idea.
All the same, Elias admits to a little sadness. The arrival of Pluribus, the ultimate poker bot, marks a historic waypoint for the game. “I’ve done nothing but play poker since I was 16 years old and dedicated my life to it so it’s just very humbling to be beaten by a machine,” he says. “The first time the AI wins is the last time the human will ever win.”