Researchers at the University of Alberta have just announced
that “heads-up limit hold ‘em”, the simplest form of poker commonly played for
cash stakes, has been “solved” (science paper, summary on
nature.com). This means that the researchers have found a complete strategy,
which describes what plays to make in every possible situation of the game, and
that is essentially unbeatable. This strategy will not lose against any other
strategy, a defensive strategy that therefore makes it almost certain to win
over the long run. This is a landmark in artificial intelligence, as another
game is dominated by computers and not humans at the highest level (think
chess). But what does this mean for the psychology of how humans play poker,
and does it mean that downloading the perfect strategy will make you rich?
Game theory is the study of optimal decisions involving two
or more competing interests (players). Games such as noughts-and-crosses, or
indeed poker, were the initial inspiration for game theory, but now the theory
of games covers any situation where the strategies and payoffs for two or more
players can be formally modeled. The current research marks a landmark in at
least one way: this is the first time that a significant game of “incomplete
information” has been solved. Previous games to be solved (e.g. checkers) are
games of “complete information”, where both players know the exact state of the
game. In poker, each player’s private cards mean that neither player knows the
exact game state (until the hand is over), and this leads to the phenomenon of
bluffing where players pretend to be strong when they’re actually weak. Games
of incomplete information are more realistic as models of real-life games,
where the assumption of complete information can be far-fetched.
Computers solve games via brute-force, trialing different
strategies until they find one that cannot be beaten. The complexity of most
games is what makes this difficult. To make progress with complex games, researchers
generally need to combine a massive amount of computing power with some clever
programming tricks to design strong strategies. These were the main factors also underlying this
breakthrough: a new solution algorithm and over two months of calculations
over 4,000 CPUs were enough to do the job. This was enough computing power to
surpass previous heads-up limit hold ‘em agents (which have been involved in
annual competitions for some time).
This means that the latest breakthrough is excellent work
but is not a real game-changer. In terms of playing style, the latest agent
plays pretty much the same strategy to a neutral observer as previous agents.
The only difference is now that the new agent is so close to the optimal game
theoretic strategy, that researchers have called this form of poker “solved” in
a bid to move research to more complex games. There are many more complex forms
of poker that have not been solved, and will become a larger object of focus
after this result. Merely adding one extra player to the game (the term “heads-up”
refers to when only two players are involved in a game), or changing the
betting rules slightly (in “limit” poker a player can only bet a set amount at
each situation, in “no-limit” players can choose their bet size) is sufficient
to create a much more complex game. These more complex forms of poker are
unlikely to be solved any time soon.
As a game played for cash stakes, heads-up limit hold ‘em
poker has been declining in popularity for some time. One reason is the
relative simplicity of this form of poker; in fact poker agents have been
(illegally) playing this game online for some time against unsuspecting punters
across various poker networks. You can even play this form of poker against
computers on modern betting terminals in Las Vegas. These games are free of
charge, but the owners are so confident their strategy is better than any human
that all-comers are accepted. So this
breakthrough is unlikely to make you a fortune if you get hands on a copy: the
low-hanging fruit have been harvested some time ago in this form of poker.
Of course, as humans we have no way of remembering the
terabyte-or-so of information (in heavily compressed form) required to describe
this unbeatable strategy. So how can humans possibly hope to compete? Well,
humans will never be as good as the computer’s strategy in a straight battle,
but the approximations and heuristics that skilled humans use have proven
remarkably effective. Many professional players use two key heuristics on the first
round of heads-up limit hold ‘em: raise-or-fold as the first player (never
calling), and always call as the first player if the second player re-raises.
It turns out that the computer agrees with these simple heuristics well over
99% of the time. A boundedly-rational human plays almost exactly the same way
as a computer with complete knowledge! (Without requiring the million hours of
CPU time.) These heuristics have been an open secret among top players for more
than five years.
Humans have further advantages over computers. Tweak a few
parameters of the game and most expert humans can easily adjust, but an
optimising agent has to completely start from scratch. Secondly, humans can
often win a lot more money against imperfect opponents than this agent (using
what are called exploitive strategies), an important criterion in the
real-world where winning money and not prestige are the main goals of a
professional gambler. To sum up, solving heads-up limit hold ‘em is a great
achievement in artificial intelligence, but one that just goes to underscore
how amazing human intelligence can be.
No comments:
Post a Comment