⚠️ This is an experimental API. It will most definitely contain bugs, but that's why you are here!
pip install holdem
As far as I know, this is the first OpenAI Gym No-Limit Texas Hold'em* (NLTH) environment written in Python. It's an experiment to build a Gym environment that is synchronous and supports any number of players, while also appealing to the general public that wants to learn how to "solve" NLTH.
*Python 3 supports arbitrary length integers 💸
Right now, this is a work in progress, but I believe the API is mature enough for some preliminary experiments. Join me in making some interesting progress on multi-agent Gym environments.
There is limited documentation at the moment. I'll try to make this less painful to understand.
env = holdem.TexasHoldemEnv(n_seats, max_limit=1e9, debug=False)
Creates a gym environment representing an NLTH table from the parameters:
n_seats - number of available player seats at the current table. No players are initially allocated to the table. You must call env.add_player(seat_id, ...) to populate the table.
max_limit - used to define the gym.spaces API for the class. It does not actually determine any NLTH limits; in support of
debug - adds debug statements to play; will probably be removed in the future.
env.add_player adds a player to the table at the specified seat (seat_id), with the initial amount of chips allocated to the player's stack. If the table does not have enough seats according to the n_seats used by the constructor, a gym.error.Error will be raised.
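The seat rule above can be pictured with a small stand-in class. This is only a sketch: `ToyTable` is a hypothetical name, it is not how holdem is implemented, it assumes "not enough seats" means a seat_id outside `[0, n_seats - 1]`, and it raises `ValueError` where holdem is documented to raise gym.error.Error, so that the snippet has no dependencies.

```python
class ToyTable:
    """Hypothetical stand-in that mirrors the seat rule described above."""

    def __init__(self, n_seats):
        self.n_seats = n_seats
        self.stacks = {}  # seat_id -> chips

    def add_player(self, seat_id, stack):
        # Assumption: a seat_id outside [0, n_seats - 1] is what triggers the error.
        # ValueError stands in for gym.error.Error here.
        if not 0 <= seat_id < self.n_seats:
            raise ValueError(
                f"seat {seat_id} is outside a table with {self.n_seats} seats"
            )
        self.stacks[seat_id] = stack


table = ToyTable(n_seats=2)
table.add_player(0, stack=2000)
table.add_player(1, stack=2000)

rejected = None
try:
    table.add_player(2, stack=2000)  # only seats 0 and 1 exist
except ValueError as err:
    rejected = str(err)
```

With the real environment, the same pattern would wrap env.add_player in a try/except for gym.error.Error.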
(player_states, community_states) = env.reset()
env.reset resets the NLTH table to a new hand state. It does not reset any of the players' stacks or any of the blinds. New behavior is reserved for a special, future portion of the API; it is yet another feature that is not standard in Gym environments and is a work in progress.
The observation returned is a tuple of the following by index:
0. player_states - a tuple where each entry is tuple(player_info, player_hand). All states and hands can be gathered with (player_infos, player_hands) = zip(*player_states).
   player_infos - a list of int features describing the individual player. It contains the following by index:
     0. [0, 1] - 0: seat is empty, 1: seat is not empty.
     1. [0, n_seats - 1] - player's id, where they are sitting.
     2. [0, inf] - player's current stack.
     3. [0, 1] - player is playing the current hand.
     4. [0, inf] - the player's current handrank according to
     5. [0, 1] - 0: player has not played this round, 1: player has played this round.
     6. [0, 1] - 0: player is currently not betting, 1: player is betting.
     7. [0, 1] - 0: player is currently not all-in, 1: player is all-in.
     8. [0, inf] - player's last sidepot.
   player_hands - a list of int features describing the cards in the player's pocket. The values are encoded based on the treys.Card integer representation.
1. community_states - a tuple(community_infos, community_cards), where:
   community_infos - a list by index:
     0. [0, n_seats - 1] - location of the dealer button, where big blind is posted.
     1. [0, inf] - the current small blind amount.
     2. [0, inf] - the current big blind amount.
     3. [0, inf] - the current total amount in the community pot.
     4. [0, inf] - the last posted raise amount.
     5. [0, inf] - minimum required raise amount, if above 0.
     6. [0, inf] - the amount required to call.
     7. [0, n_seats - 1] - the current player required to take an action.
   community_cards - a list of int features describing the cards in the community. The values are encoded based on the treys.Card integer representation. There are 5 int in the list, where -1 represents that there is no card present.
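Since the observation is addressed purely by index, it can help to give the fields names before writing an agent. The sketch below does that with namedtuples. The field names (`seat_taken`, `to_call`, and so on), the `decode_observation` helper, and the `STREETS` mapping are all illustrative assumptions, not identifiers provided by holdem. The observation is hand-written to match the layout described above, and its card integers are placeholders rather than real treys.Card encodings.

```python
from collections import namedtuple

# Illustrative labels for the index layout described above (not holdem names).
PlayerInfo = namedtuple("PlayerInfo", [
    "seat_taken", "seat_id", "stack", "in_hand", "hand_rank",
    "played_this_round", "betting", "all_in", "last_sidepot",
])
CommunityInfo = namedtuple("CommunityInfo", [
    "button", "small_blind", "big_blind", "pot",
    "last_raise", "min_raise", "to_call", "to_act",
])

# Number of dealt community cards -> betting round.
STREETS = {0: "preflop", 3: "flop", 4: "turn", 5: "river"}


def decode_observation(observation):
    """Split an observation into labelled player and table records."""
    player_states, (community_infos, community_cards) = observation
    player_infos, player_hands = zip(*player_states)
    players = [PlayerInfo(*info) for info in player_infos]
    table = CommunityInfo(*community_infos)
    # -1 marks an empty card slot, so drop it before counting the board.
    board = [card for card in community_cards if card != -1]
    return players, list(player_hands), table, board


# Hand-written observation for a 2-seat table; card integers are placeholders.
observation = (
    (
        ([1, 0, 1975, 1, 0, 0, 1, 0, 0], [101, 102]),
        ([1, 1, 1950, 1, 0, 0, 1, 0, 0], [103, 104]),
    ),
    ([0, 25, 50, 75, 0, 50, 25, 0], [-1, -1, -1, -1, -1]),
)

players, hands, table, board = decode_observation(observation)
street = STREETS.get(len(board), "unknown")
print(table.to_act, table.to_call, street)
```

Named fields make conditions such as `table.to_call > 0` or `players[i].all_in == 1` easier to read than raw indices. The cost is that the labels must be kept in sync by hand if the observation layout changes.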
import gym
import holdem

def play_out_hand(env, n_seats):
  # reset environment, gather relevant observations
  (player_states, (community_infos, community_cards)) = env.reset()
  (player_infos, player_hands) = zip(*player_states)

  # display the table, cards and all
  env.render(mode='human')

  terminal = False
  while not terminal:
    # play safe actions: check when no one else has raised, call when raised.
    actions = holdem.safe_actions(community_infos, n_seats=n_seats)
    (player_states, (community_infos, community_cards)), rews, terminal, info = env.step(actions)
    env.render(mode='human')

env = gym.make('TexasHoldem-v1') # holdem.TexasHoldemEnv(2)

# start with 2 players
env.add_player(0, stack=2000) # add a player to seat 0 with 2000 "chips"
env.add_player(1, stack=2000) # add another player to seat 1 with 2000 "chips"
# play out a hand
play_out_hand(env, env.n_seats)

# add one more player
env.add_player(2, stack=2000) # add another player to seat 2 with 2000 "chips"
# play out another hand
play_out_hand(env, env.n_seats)