Matching Python environment code for the Lux AI 2021 Kaggle competition, and a gym interface for RL models.

Overview

Lux AI 2021 python game engine and gym

This is a replica of the Lux AI 2021 game ported directly to Python. It also sets up a classic reinforcement learning gym environment that can be used to train RL agents.

Features (LuxAi2021)

  • Lux game engine ported to Python ✔️
  • Documentation
  • All actions supported ✔️
  • PPO example training agent ✔️
  • Example agent converges to a good policy ✔️
  • Kaggle submission format agents ✔️
  • Lux replay viewer support ✔️
  • Game engine consistency validation against the base game ✔️

Installation

This should work cross-platform, but I've only tested on Windows 10 and Ubuntu.

Important: Use Python 3.7.* for training your models. This is required because when you create a Kaggle submission, the Kaggle competition will run the code using Python 3.7.*, and you will get a model deserialization error if you train the model with Python 3.8 or newer.
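
A quick guard you can drop at the top of your training script to catch this early (a sketch; the pickle-protocol detail matches the "unsupported pickle protocol: 5" error reported in the comments below):

import sys

# Kaggle runs submissions on Python 3.7; models pickled under 3.8+ use
# pickle protocol 5 and fail to deserialize there.
if sys.version_info[:2] != (3, 7):
    raise RuntimeError(
        "Train with Python 3.7.x for Kaggle compatibility, found %s"
        % sys.version.split()[0]
    )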

Install the luxai2021 environment package by running the installer:

python setup.py install

You will need Node.js version 12 or above (https://nodejs.org/).

Python game interface

To use the ported game engine directly without the RL gym wrapper, here are a couple of example usages:

from luxai2021.game.game import Game
from luxai2021.game.actions import *
from luxai2021.game.constants import LuxMatchConfigs_Default


if __name__ == "__main__":
    # Create a game
    configs = LuxMatchConfigs_Default
    game = Game(configs)
    
    game_over = False
    while not game_over:
        print("Turn %i" % game.state["turn"])

        # Array of actions for both teams. Eg: MoveAction(team, unit_id, direction)
        actions = [] 

        game_over = game.run_turn_with_actions(actions)
    
    print("Game done, final map:")
    print(game.map.get_map_string())
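
To issue actions, construct Action objects and append them to the list passed to run_turn_with_actions. A minimal sketch, assuming the MoveAction(team, unit_id, direction) form from the comment above ("u_1" is a hypothetical unit id, and the Constants import location is an assumption):

from luxai2021.game.actions import MoveAction
from luxai2021.game.constants import Constants

actions = []
# Queue a move north for a hypothetical unit "u_1" owned by team 0.
actions.append(MoveAction(0, "u_1", Constants.DIRECTIONS.NORTH))
game.run_turn_with_actions(actions)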

Python gym environment interface for RL

A gym interface and match controller were created to support custom agents and a framework for submitting them as Kaggle submissions. Keep in mind that this framework is built around one action per unit and city_tile that can act each turn. Creating a basic gym interface looks like the following, but you should look at the more complete example in the examples subfolder:

import random
from stable_baselines3 import PPO  # pip install stable-baselines3
from luxai2021.env.lux_env import LuxEnvironment, SaveReplayAndModelCallback
from luxai2021.env.agent import Agent, AgentWithModel
from luxai2021.game.game import Game
from luxai2021.game.actions import *
from luxai2021.game.constants import LuxMatchConfigs_Default
from functools import partial  # functools is part of the Python standard library
import numpy as np
from gym import spaces
import time
import sys

class MyCustomAgent(AgentWithModel):
    def __init__(self, mode="train", model=None) -> None:
        """
        Implements a custom agent driven by an RL model
        """
        super().__init__(mode, model)
        
        # Define action and observation space
        # They must be gym.spaces objects
        # Example when using discrete actions:
        self.actions_units = [
            partial(MoveAction, direction=Constants.DIRECTIONS.CENTER),  # This is the do-nothing action
            partial(MoveAction, direction=Constants.DIRECTIONS.NORTH),
            partial(MoveAction, direction=Constants.DIRECTIONS.WEST),
            partial(MoveAction, direction=Constants.DIRECTIONS.SOUTH),
            partial(MoveAction, direction=Constants.DIRECTIONS.EAST),
            SpawnCityAction,
        ]
        self.actions_cities = [
            SpawnWorkerAction,
            SpawnCartAction,
            ResearchAction,
        ]
        self.action_space = spaces.Discrete(max(len(self.actions_units), len(self.actions_cities)))
        self.observation_space = spaces.Box(low=0, high=1, shape=(10,1), dtype=np.float16)

    def game_start(self, game):
        """
        This function is called at the start of each game. Use this to
        reset and initialize per game. Note that self.team may have
        been changed since last game. The game map has been created
        and starting units placed.

        Args:
            game ([type]): Game.
        """
        pass

    def turn_heurstics(self, game, is_first_turn):
        """
        This is called before observations are generated each turn, to allow
        hardcoded heuristics to control a subset of units. Any unit or city tile
        that gets an action from this callback will not generate an
        observation + action.

        Args:
            game ([type]): Game in progress.
            is_first_turn (bool): True if it's the first turn of a game.
        """
        return
    
    def get_observation(self, game, unit, city_tile, team, is_new_turn):
        """
        Implements getting an observation from the current game for this unit or city tile
        """
        return np.zeros((10,1))
    
    def action_code_to_action(self, action_code, game, unit=None, city_tile=None, team=None):
        """
        Takes an action in the environment according to actionCode:
            action_code: Index of action to take into the action array.
        Returns: An action.
        """
        # Map action_code index into to a constructed Action object
        try:
            x = None
            y = None
            if city_tile is not None:
                x = city_tile.pos.x
                y = city_tile.pos.y
            elif unit is not None:
                x = unit.pos.x
                y = unit.pos.y
            
            if city_tile is not None:
                action = self.actions_cities[action_code % len(self.actions_cities)](
                    game=game,
                    unit_id=unit.id if unit else None,
                    unit=unit,
                    city_id=city_tile.city_id if city_tile else None,
                    citytile=city_tile,
                    team=team,
                    x=x,
                    y=y
                )
            else:
                action = self.actions_units[action_code % len(self.actions_units)](
                    game=game,
                    unit_id=unit.id if unit else None,
                    unit=unit,
                    city_id=city_tile.city_id if city_tile else None,
                    citytile=city_tile,
                    team=team,
                    x=x,
                    y=y
                )
            
            return action
        except Exception as e:
            # Not a valid action
            print(e)
            return None
    
    def take_action(self, action_code, game, unit=None, city_tile=None, team=None):
        """
        Takes an action in the environment according to action_code:
            action_code: Index of the action to take into the action array.
        """
        action = self.action_code_to_action(action_code, game, unit, city_tile, team)
        self.match_controller.take_action(action)
    
    def get_reward(self, game, is_game_finished, is_new_turn, is_game_error):
        """
        Returns the reward function for this step of the game. Reward should be a
        delta increment to the reward, not the total current reward.
        """
        if is_game_finished:
            if game.get_winning_team() == self.team:
                return 1 # Win!
            else:
                return -1 # Loss

        return 0
    

if __name__ == "__main__":
    # Create the two agents that will play each other
    
    # Create a default opponent agent that does nothing
    opponent = Agent()
    
    # Create an RL agent in training mode
    player = MyCustomAgent(mode="train")
    
    # Create a game environment
    configs = LuxMatchConfigs_Default
    env = LuxEnvironment(configs=configs,
                     learning_agent=player,
                     opponent_agent=opponent)
    
    # Play 5 games
    obs = env.reset()
    game_count = 0
    while game_count < 5:
        # Take a random action
        action_code = random.randrange(player.action_space.n)
        (obs, reward, is_game_over, state) = env.step(action_code)
        
        if is_game_over:
            print(f"Game done turn {env.game.state['turn']}, final map:")
            print(env.game.map.get_map_string())
            obs = env.reset()
            game_count += 1
    
    # Attach a ML model from stable_baselines3 and train a RL model
    model = PPO("MlpPolicy",
                    env,
                    verbose=1,
                    tensorboard_log="./lux_tensorboard/",
                    learning_rate=0.001,
                    gamma=0.998,
                    gae_lambda=0.95,
                    batch_size=2048,
                    n_steps=2048
                )
    
    print("Training model for 100K steps...")
    model.learn(total_timesteps=10000000)
    model.save(path='model.zip')

    # Inference the agent for 5 games
    game_count = 0
    obs = env.reset()
    while game_count < 5:
        action_code, _states = model.predict(obs, deterministic=False)
        (obs, reward, is_game_over, state) = env.step(action_code)
        
        if is_game_over:
            print(f"Game done turn {env.game.state['turn']}, final map:")
            print(env.game.map.get_map_string())
            obs = env.reset()
            game_count += 1
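
After training, the saved model can be reloaded and handed back to your agent class, which is roughly what a Kaggle submission does at inference time (a sketch; the "inference" mode string is an assumption mirroring the "train" mode above):

# Reload the trained model and wrap it in the agent for inference.
model = PPO.load("model.zip")
player = MyCustomAgent(mode="inference", model=model)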



Example python ML training

Create your own agent logic, observations, actions, and rewards by modifying this example:

https://github.com/glmcdona/LuxPythonEnvGym/blob/main/examples/agent_policy.py

Then train your model by running:

python ./examples/train.py

You can then run tensorboard to monitor the training:

tensorboard --logdir lux_tensorboard

Example kaggle notebook

Here is a complete training, inference, and kaggle submission example in Notebook format:

https://www.kaggle.com/glmcdona/lux-ai-deep-reinforcement-learning-ppo-example

Preparing a kaggle submission

You have trained a model, and now you'd like to submit it as a kaggle submission. Here are the steps to prepare your submission.

Either view the above kaggle example or prepare a submission yourself:

  1. Place your trained model file as model.zip and your agent file agent_policy.py in the ./kaggle_submissions/ folder.
  2. Run python download_dependencies.py in ./kaggle_submissions/ to copy two required python package dependencies into this folder (luxai2021 and stable_baselines3).
  3. Tarball the folder into a submission: tar -czf submission.tar.gz -C kaggle_submissions .

Important: The model.zip needs to have been trained on Python 3.7.* or you will get a deserialization error, since this is the Python version the Kaggle environment uses to run inference on the model in submission.
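
For reference, the core of the main_lux-ai-2021.py entry point is just deserializing the model and handing it to your agent (a sketch; the AgentPolicy class name and the exact wiring are assumptions based on the example agent file):

from stable_baselines3 import PPO
from agent_policy import AgentPolicy  # your agent file in ./kaggle_submissions/

# Load the trained model shipped alongside the submission.
model = PPO.load("model.zip")
agent = AgentPolicy(mode="inference", model=model)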

Creating and viewing a replay

If you are using the example train.py to train your model, replays will be generated and saved, along with a copy of the model, every 100K steps. By default, 5 replay matches are saved with each model checkpoint to .\models\model(runid)_(step_count)_(rand).json so you can monitor your bot's behaviour. You can view the replays here: https://2021vis.lux-ai.org/
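
If you wire this up yourself rather than using the example train.py, the callback is passed to model.learn() like any stable-baselines3 callback. A sketch; the constructor parameter names below are assumptions, so check the class signature:

from luxai2021.env.lux_env import SaveReplayAndModelCallback

# Parameter names here are illustrative assumptions; see the class itself.
callback = SaveReplayAndModelCallback(
    save_freq=100000,       # checkpoint every 100K steps
    save_path="./models/",  # where models and replays are written
    replay_num_episodes=5,  # replays saved per checkpoint
)
model.learn(total_timesteps=10000000, callback=callback)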

Alternatively, to manually generate a replay from a model, place your trained model file model.zip and your agent file agent_policy.py in the ./kaggle_submissions/ folder. Then run a command like the following from that directory:

lux-ai-2021 ./kaggle_submissions/main_lux-ai-2021.py ./kaggle_submissions/main_lux-ai-2021.py --maxtime 100000

This will battle your agent against itself and produce a replay. This requires the official lux-ai-2021 CLI to be installed; see the instructions here: https://github.com/Lux-AI-Challenge/Lux-Design-2021

Comments
  • Error in Kaggle submission

    Hi,

    I have encountered an error after my Kaggle submission. The following is the error log from the game play in Kaggle. The game only plays for 1 turn and then stops. I used Python 3.7 to train the model.

    [[{"duration": 9.627871, "stdout": "", "stderr": "Traceback (most recent call last):\n  
    File \"./main_lux-ai-2021.py\", line 23, in <module>\n    
    model = PPO.load(f\"model.zip\")\n  
    File \"/kaggle_simulations/agent/stable_baselines3/common/base_class.py\", line 651, in load\n    
    data, params, pytorch_variables = load_from_zip_file(path, device=device, custom_objects=custom_objects)\n  
    File \"/kaggle_simulations/agent/stable_baselines3/common/save_util.py\", line 402, in load_from_zip_file\n    
    data = json_to_data(json_data, custom_objects=custom_objects)\n  
    File \"/kaggle_simulations/agent/stable_baselines3/common/save_util.py\", line 164, in json_to_data\n    
    deserialized_object = cloudpickle.loads(base64_object)\n
    ValueError: unsupported pickle protocol: 5\n"}],
     [{"duration": 0.004913, "stdout": "", "stderr": "Traceback (most recent call last):\n  
    File \"/opt/conda/lib/python3.7/site-packages/kaggle_environments/agent.py\", line 157, in act\n    
    action = self.agent(*args)\n  
    File \"/opt/conda/lib/python3.7/site-packages/kaggle_environments/agent.py\", line 129, in callable_agent\n    
    if callable(agent) \\\n  
    File \"/kaggle_simulations/agent/main.py\", line 76, in python_policy_agent\n    
    agent_process.stdin.flush()\n
    BrokenPipeError: [Errno 32] Broken pipe\n"}]]
    

    Has anyone encountered this error as well?

    Thanks Jason

    opened by hokhay 11
  • AttributeError: 'NoneType' object has no attribute 'adjacent_city_tiles'

    Hello, I encountered the following error while training; cell.city_tile is None.

    C:\StudioProjects\Lux\LuxPythonEnvGym\examples\train.py in train(args, player, opponent, load_id)
        130                                              name_prefix=f'rl_model_{run_id}')
        131     model.learn(total_timesteps=args.step_count,
    --> 132                 callback=checkpoint_callback)  # 20M steps
        133     if not os.path.exists(f'models/rl_model_{run_id}_{args.step_count}_steps.zip'):
        134         model.save(path=f'models/rl_model_{run_id}_{args.step_count}_steps.zip')
    
    C:\StudioProjects\Lux\venv\lib\site-packages\stable_baselines3\ppo\ppo.py in learn(self, total_timesteps, callback, log_interval, eval_env, eval_freq, n_eval_episodes, tb_log_name, eval_log_path, reset_num_timesteps)
        308             tb_log_name=tb_log_name,
        309             eval_log_path=eval_log_path,
    --> 310             reset_num_timesteps=reset_num_timesteps,
        311         )
    
    C:\StudioProjects\Lux\venv\lib\site-packages\stable_baselines3\common\on_policy_algorithm.py in learn(self, total_timesteps, callback, log_interval, eval_env, eval_freq, n_eval_episodes, tb_log_name, eval_log_path, reset_num_timesteps)
        235         while self.num_timesteps < total_timesteps:
        236 
    --> 237             continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, n_rollout_steps=self.n_steps)
        238 
        239             if continue_training is False:
    
    C:\StudioProjects\Lux\venv\lib\site-packages\stable_baselines3\common\on_policy_algorithm.py in collect_rollouts(self, env, callback, rollout_buffer, n_rollout_steps)
        176                 clipped_actions = np.clip(actions, self.action_space.low, self.action_space.high)
        177 
    --> 178             new_obs, rewards, dones, infos = env.step(clipped_actions)
        179 
        180             self.num_timesteps += env.num_envs
    
    C:\StudioProjects\Lux\venv\lib\site-packages\stable_baselines3\common\vec_env\base_vec_env.py in step(self, actions)
        160         """
        161         self.step_async(actions)
    --> 162         return self.step_wait()
        163 
        164     def get_images(self) -> Sequence[np.ndarray]:
    
    C:\StudioProjects\Lux\venv\lib\site-packages\stable_baselines3\common\vec_env\dummy_vec_env.py in step_wait(self)
         42         for env_idx in range(self.num_envs):
         43             obs, self.buf_rews[env_idx], self.buf_dones[env_idx], self.buf_infos[env_idx] = self.envs[env_idx].step(
    ---> 44                 self.actions[env_idx]
         45             )
         46             if self.buf_dones[env_idx]:
    
    C:\StudioProjects\Lux\venv\lib\site-packages\stable_baselines3\common\monitor.py in step(self, action)
         88         if self.needs_reset:
         89             raise RuntimeError("Tried to step environment that needs reset")
    ---> 90         observation, reward, done, info = self.env.step(action)
         91         self.rewards.append(reward)
         92         if done:
    
    c:\studioprojects\lux\luxpythonenvgym\luxai2021\env\lux_env.py in step(self, action_code)
         58         is_game_error = False
         59         try:
    ---> 60             (unit, city_tile, team, is_new_turn) = next(self.match_generator)
         61 
         62             obs = self.learning_agent.get_observation(self.game, unit, city_tile, team, is_new_turn)
    
    c:\studioprojects\lux\luxpythonenvgym\luxai2021\game\match_controller.py in run_to_next_observation(self)
        119                 if agent.get_agent_type() == Constants.AGENT_TYPE.AGENT:
        120                     # Call the agent for the set of actions
    --> 121                     actions = agent.process_turn(self.game, agent.team)
        122                     self.take_actions(actions)
        123                 elif agent.get_agent_type() == Constants.AGENT_TYPE.LEARNING:
    
    C:\StudioProjects\Lux\LuxPythonEnvGym\examples\agent_policy.py in process_turn(self, game, team)
        527         for unit in units:
        528             if unit.can_act():
    --> 529                 obs = self.get_observation(game, unit, None, unit.team, new_turn)
        530                 action_code, _states = self.model.predict(obs, deterministic=True)
        531                 if action_code is not None:
    
    C:\StudioProjects\Lux\LuxPythonEnvGym\examples\agent_policy.py in get_observation(self, game, unit, city_tile, team, is_new_turn)
        363                                     c = game.cities[game.map.get_cell_by_pos(closest_position).city_tile.city_id]
        364                                     obs[observation_index + 6] = min(
    --> 365                                         c.fuel / (c.get_light_upkeep() * 200.0),
        366                                         1.0
        367                                     )
    
    c:\studioprojects\lux\luxpythonenvgym\luxai2021\game\city.py in get_light_upkeep(self)
         37         :return:
         38         """
    ---> 39         return len(self.city_cells) * self.configs["parameters"]["LIGHT_UPKEEP"]["CITY"] - self.get_adjacency_bonuses()
         40 
         41     def get_adjacency_bonuses(self):
    
    c:\studioprojects\lux\luxpythonenvgym\luxai2021\game\city.py in get_adjacency_bonuses(self)
         46         bonus = 0
         47         for cell in self.city_cells:
    ---> 48             bonus += cell.city_tile.adjacent_city_tiles * self.configs["parameters"]["CITY_ADJACENCY_BONUS"]
         49 
         50         return bonus
    
    AttributeError: 'NoneType' object has no attribute 'adjacent_city_tiles'
    
    opened by nosound2 9
  • [Question] Reward calculation in example agent

    In the get_reward function the calculation is this:

    reward_state = city_tile_count * 0.01 + unit_count * 0.001
    return reward_state * (city_tile_count + unit_count)
    

    What is the reason for having it squared?

    opened by nosound2 8
  • map generation

    Probably the hardest part, and I think we may be better off for now generating seeds and then batch-generating maps from the JS engine. Maps generate in about 0.6 ms, so this won't ever be a bottleneck. The map generation code is a bit complicated, and additionally I have zero confidence we will get the same seeds to generate the same maps unless we just call the JS code.

    Let me know your thoughts.

    opened by StoneT2000 7
  • Problems with installation

    Hi! I saw your post on Kaggle introducing this engine and it looks very promising. I'm trying to install it to give it a try, but I'm having some problems.

    In particular, there's this error that's stopping the installation:

    AttributeError: 'BuiltinObjectType' object has no attribute 'exception_value'
    

    Here is the Dockerfile that I'm using so you can see all the dependencies installed.

    FROM ubuntu:18.04
    
    # Basic setup
    RUN apt-get update && apt-get install -y -q --no-install-recommends \
            apt-transport-https \
            ca-certificates \
            curl \
            wget \
            software-properties-common
    
    # NVM environment variables
    ENV NVM_DIR /usr/local/nvm
    ENV NODE_VERSION 14.16.0
    # Install NVM
    RUN curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.31.2/install.sh | bash
    
    # Install Node and NPM
    RUN . $NVM_DIR/nvm.sh \
        && nvm install $NODE_VERSION \
        && nvm alias default $NODE_VERSION \
        && nvm use default
    
    # Add node and npm to path so the commands are available
    ENV NODE_PATH $NVM_DIR/v$NODE_VERSION/lib/node_modules
    ENV PATH $NVM_DIR/versions/node/v$NODE_VERSION/bin:$PATH
    
    # ### Install Python ###
    # Python 3.7
    RUN apt-get update && apt-get install -y \
            python3.7 \
            python3-pip \
            ipython3
    # Set Python3.7 as default
    RUN ln -s python3.7 /usr/bin/python
    RUN ln -s pip3 /usr/bin/pip
    
    ### Install Lux AI ###
    # Recommended game engine:
    # RUN npm i -g @lux-ai/2021-challenge@latest
    
    # Alternative game engine (45x faster performance)
    RUN apt-get update && apt-get install -y python3.7-dev
    RUN apt-get update && apt-get install -y git
    RUN python3.7 -m pip install --upgrade pip
    RUN python3.7 -m pip install numpy Cython
    RUN git clone https://github.com/glmcdona/LuxPythonEnvGym
    RUN cd LuxPythonEnvGym && python setup.py install
    

    Here's the last lines of the logs:

    #16 506.5 multiprocessing.pool.RemoteTraceback: 
    #16 506.5 """
    #16 506.5 Traceback (most recent call last):
    #16 506.5   File "/usr/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    #16 506.5     result = (True, func(*args, **kwds))
    #16 506.5   File "/usr/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    #16 506.5     return list(map(*args))
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Build/Dependencies.py", line 1249, in cythonize_one_helper
    #16 506.5     return cythonize_one(*m)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Build/Dependencies.py", line 1208, in cythonize_one
    #16 506.5     result = compile_single(pyx_file, options, full_module_name=full_module_name)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Main.py", line 727, in compile_single
    #16 506.5     return run_pipeline(source, options, full_module_name)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Main.py", line 515, in run_pipeline
    #16 506.5     err, enddata = Pipeline.run_pipeline(pipeline, source)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Pipeline.py", line 355, in run_pipeline
    #16 506.5     data = run(phase, data)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Pipeline.py", line 335, in run
    #16 506.5     return phase(data)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Pipeline.py", line 52, in generate_pyx_code_stage
    #16 506.5     module_node.process_implementation(options, result)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/ModuleNode.py", line 143, in process_implementation
    #16 506.5     self.generate_c_code(env, options, result)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/ModuleNode.py", line 385, in generate_c_code
    #16 506.5     self.body.generate_function_definitions(env, code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 442, in generate_function_definitions
    #16 506.5     stat.generate_function_definitions(env, code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 442, in generate_function_definitions
    #16 506.5     stat.generate_function_definitions(env, code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 3179, in generate_function_definitions
    #16 506.5     FuncDefNode.generate_function_definitions(self, env, code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 1986, in generate_function_definitions
    #16 506.5     self.generate_function_body(env, code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 1748, in generate_function_body
    #16 506.5     self.body.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 448, in generate_execution_code
    #16 506.5     stat.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/UtilNodes.py", line 326, in generate_execution_code
    #16 506.5     self.body.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 6715, in generate_execution_code
    #16 506.5     self.body.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 448, in generate_execution_code
    #16 506.5     stat.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 6285, in generate_execution_code
    #16 506.5     self.else_clause.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 448, in generate_execution_code
    #16 506.5     stat.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 7228, in generate_execution_code
    #16 506.5     self.body.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 448, in generate_execution_code
    #16 506.5     stat.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 5156, in generate_execution_code
    #16 506.5     self.expr.generate_evaluation_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/ExprNodes.py", line 5875, in generate_evaluation_code
    #16 506.5     self.allocate_temp_result(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/ExprNodes.py", line 721, in allocate_temp_result
    #16 506.5     elif not (self.result_is_used or type.is_memoryviewslice or self.is_c_result_required()):
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/ExprNodes.py", line 5841, in is_c_result_required
    #16 506.5     if not func_type.exception_value or func_type.exception_check == '+':
    #16 506.5 AttributeError: 'BuiltinObjectType' object has no attribute 'exception_value'
    #16 506.5 """
    #16 506.5 
    #16 506.5 The above exception was the direct cause of the following exception:
    #16 506.5 
    #16 506.5 Traceback (most recent call last):
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/sandbox.py", line 154, in save_modules
    #16 506.5     yield saved
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/sandbox.py", line 195, in setup_context
    #16 506.5     yield
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/sandbox.py", line 250, in run_setup
    #16 506.5     _execfile(setup_script, ns)
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/sandbox.py", line 45, in _execfile
    #16 506.5     exec(code, globals, locals)
    #16 506.5   File "/tmp/easy_install-jufi3avs/pandas-1.3.4/setup.py", line 650, in <module>
    #16 506.5   File "/tmp/easy_install-jufi3avs/pandas-1.3.4/setup.py", line 423, in maybe_cythonize
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Build/Dependencies.py", line 1093, in cythonize
    #16 506.5     result.get(99999)  # seconds
    #16 506.5   File "/usr/lib/python3.7/multiprocessing/pool.py", line 657, in get
    #16 506.5     raise self._value
    #16 506.5 AttributeError: 'BuiltinObjectType' object has no attribute 'exception_value'
    #16 506.5 
    #16 506.5 During handling of the above exception, another exception occurred:
    #16 506.5 
    #16 506.5 Traceback (most recent call last):
    #16 506.5   File "setup.py", line 24, in <module>
    #16 506.5     tests_require=['nose2'],
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 129, in setup
    #16 506.5     return distutils.core.setup(**attrs)
    #16 506.5   File "/usr/lib/python3.7/distutils/core.py", line 148, in setup
    #16 506.5     dist.run_commands()
    #16 506.5   File "/usr/lib/python3.7/distutils/dist.py", line 966, in run_commands
    #16 506.5     self.run_command(cmd)
    #16 506.5   File "/usr/lib/python3.7/distutils/dist.py", line 985, in run_command
    #16 506.5     cmd_obj.run()
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/command/install.py", line 67, in run
    #16 506.5     self.do_egg_install()
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/command/install.py", line 117, in do_egg_install
    #16 506.5     cmd.run()
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/command/easy_install.py", line 437, in run
    #16 506.5     self.easy_install(spec, not self.no_deps)
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/command/easy_install.py", line 679, in easy_install
    #16 506.5     return self.install_item(None, spec, tmpdir, deps, True)
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/command/easy_install.py", line 726, in install_item
    #16 506.5     self.process_distribution(spec, dist, deps)
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/command/easy_install.py", line 771, in process_distribution
    #16 506.5     [requirement], self.local_index, self.easy_install
    #16 506.5   File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 774, in resolve
    #16 506.5     replace_conflicting=replace_conflicting
    #16 506.5   File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 1057, in best_match
    #16 506.5     return self.obtain(req, installer)
    #16 506.5   File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 1069, in obtain
    #16 506.5     return installer(requirement)
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/command/easy_install.py", line 698, in easy_install
    #16 506.5     return self.install_item(spec, dist.location, tmpdir, deps)
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/command/easy_install.py", line 724, in install_item
    #16 506.5     dists = self.install_eggs(spec, download, tmpdir)
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/command/easy_install.py", line 909, in install_eggs
    #16 506.5     return self.build_and_install(setup_script, setup_base)
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/command/easy_install.py", line 1177, in build_and_install
    #16 506.5     self.run_setup(setup_script, setup_base, args)
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/command/easy_install.py", line 1163, in run_setup
    #16 506.5     run_setup(setup_script, args)
    #16 506.5   File "/usr/lib/python3/dist-packages/setuptools/sandbox.py", line 253, in run_setup
    #16 506.5 Traceback (most recent call last):
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Build/Dependencies.py", line 1249, in cythonize_one_helper
    #16 506.5     return cythonize_one(*m)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Build/Dependencies.py", line 1208, in cythonize_one
    #16 506.5     result = compile_single(pyx_file, options, full_module_name=full_module_name)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Main.py", line 727, in compile_single
    #16 506.5     return run_pipeline(source, options, full_module_name)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Main.py", line 515, in run_pipeline
    #16 506.5     err, enddata = Pipeline.run_pipeline(pipeline, source)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Pipeline.py", line 355, in run_pipeline
    #16 506.5     data = run(phase, data)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Pipeline.py", line 335, in run
    #16 506.5     return phase(data)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Pipeline.py", line 52, in generate_pyx_code_stage
    #16 506.5     module_node.process_implementation(options, result)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/ModuleNode.py", line 143, in process_implementation
    #16 506.5     self.generate_c_code(env, options, result)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/ModuleNode.py", line 385, in generate_c_code
    #16 506.5     self.body.generate_function_definitions(env, code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 442, in generate_function_definitions
    #16 506.5     stat.generate_function_definitions(env, code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 442, in generate_function_definitions
    #16 506.5     stat.generate_function_definitions(env, code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 3179, in generate_function_definitions
    #16 506.5     FuncDefNode.generate_function_definitions(self, env, code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 1986, in generate_function_definitions
    #16 506.5     self.generate_function_body(env, code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 1748, in generate_function_body
    #16 506.5     self.body.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 448, in generate_execution_code
    #16 506.5     stat.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/UtilNodes.py", line 326, in generate_execution_code
    #16 506.5     self.body.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 6715, in generate_execution_code
    #16 506.5     self.body.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 448, in generate_execution_code
    #16 506.5     stat.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 6285, in generate_execution_code
    #16 506.5     self.else_clause.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 448, in generate_execution_code
    #16 506.5     stat.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 7228, in generate_execution_code
    #16 506.5     self.body.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 448, in generate_execution_code
    #16 506.5     stat.generate_execution_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/Nodes.py", line 5156, in generate_execution_code
    #16 506.5     self.expr.generate_evaluation_code(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/ExprNodes.py", line 5875, in generate_evaluation_code
    #16 506.5     self.allocate_temp_result(code)
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/ExprNodes.py", line 721, in allocate_temp_result
    #16 506.5     elif not (self.result_is_used or type.is_memoryviewslice or self.is_c_result_required()):
    #16 506.5   File "/usr/local/lib/python3.7/dist-packages/Cython/Compiler/ExprNodes.py", line 5841, in is_c_result_required
    #16 506.5     if not func_type.exception_value or func_type.exception_check == '+':
    #16 506.5 AttributeError: 'BuiltinObjectType' object has no attribute 'exception_value'
    #16 506.5 [ 1/41] Cythonizing pandas/_libs/algos.pyx
    #16 506.5 [ 2/41] Cythonizing pandas/_libs/arrays.pyx
    #16 506.5 [ 3/41] Cythonizing pandas/_libs/groupby.pyx
    #16 506.5 [ 4/41] Cythonizing pandas/_libs/hashing.pyx
    #16 506.5 [ 5/41] Cythonizing pandas/_libs/hashtable.pyx
    #16 506.5 [ 6/41] Cythonizing pandas/_libs/index.pyx
    #16 506.5 [ 7/41] Cythonizing pandas/_libs/indexing.pyx
    #16 506.5 [ 8/41] Cythonizing pandas/_libs/internals.pyx
    #16 506.5 [ 9/41] Cythonizing pandas/_libs/interval.pyx
    #16 506.5 [10/41] Cythonizing pandas/_libs/join.pyx
    #16 506.5 [11/41] Cythonizing pandas/_libs/lib.pyx
    #16 506.5 [12/41] Cythonizing pandas/_libs/missing.pyx
    #16 506.5 [13/41] Cythonizing pandas/_libs/ops.pyx
    #16 506.5 [14/41] Cythonizing pandas/_libs/ops_dispatch.pyx
    #16 506.5 [15/41] Cythonizing pandas/_libs/parsers.pyx
    #16 506.5 [16/41] Cythonizing pandas/_libs/properties.pyx
    #16 506.5 [17/41] Cythonizing pandas/_libs/reduction.pyx
    #16 506.5 [18/41] Cythonizing pandas/_libs/reshape.pyx
    #16 506.5 [19/41] Cythonizing pandas/_libs/sparse.pyx
    #16 506.5 [20/41] Cythonizing pandas/_libs/testing.pyx
    #16 506.5 [21/41] Cythonizing pandas/_libs/tslib.pyx
    #16 506.5 [22/41] Cythonizing pandas/_libs/tslibs/base.pyx
    #16 506.5 [23/41] Cythonizing pandas/_libs/tslibs/ccalendar.pyx
    #16 506.5 [24/41] Cythonizing pandas/_libs/tslibs/conversion.pyx
    #16 506.5 [25/41] Cythonizing pandas/_libs/tslibs/dtypes.pyx
    #16 506.5 [26/41] Cythonizing pandas/_libs/tslibs/fields.pyx
    #16 506.5 [27/41] Cythonizing pandas/_libs/tslibs/nattype.pyx
    #16 506.5 [28/41] Cythonizing pandas/_libs/tslibs/np_datetime.pyx
    #16 506.5 [29/41] Cythonizing pandas/_libs/tslibs/offsets.pyx
    #16 506.5 [30/41] Cythonizing pandas/_libs/tslibs/parsing.pyx
    #16 506.5 [31/41] Cythonizing pandas/_libs/tslibs/period.pyx
    #16 506.5 [32/41] Cythonizing pandas/_libs/tslibs/strptime.pyx
    #16 506.5 [33/41] Cythonizing pandas/_libs/tslibs/timedeltas.pyx
    #16 506.5 [34/41] Cythonizing pandas/_libs/tslibs/timestamps.pyx
    #16 506.5 [35/41] Cythonizing pandas/_libs/tslibs/timezones.pyx
    #16 506.5 [36/41] Cythonizing pandas/_libs/tslibs/tzconversion.pyx
    #16 506.5 [37/41] Cythonizing pandas/_libs/tslibs/vectorized.pyx
    #16 506.5 [38/41] Cythonizing pandas/_libs/window/aggregations.pyx
    #16 506.5 [39/41] Cythonizing pandas/_libs/window/indexers.pyx
    #16 506.5 [40/41] Cythonizing pandas/_libs/writers.pyx
    #16 506.5 [41/41] Cythonizing pandas/io/sas/sas.pyx
    #16 506.6     raise
    #16 506.6   File "/usr/lib/python3.7/contextlib.py", line 130, in __exit__
    #16 506.6     self.gen.throw(type, value, traceback)
    #16 506.6   File "/usr/lib/python3/dist-packages/setuptools/sandbox.py", line 195, in setup_context
    #16 506.6     yield
    #16 506.6   File "/usr/lib/python3.7/contextlib.py", line 130, in __exit__
    #16 506.6     self.gen.throw(type, value, traceback)
    #16 506.6   File "/usr/lib/python3/dist-packages/setuptools/sandbox.py", line 166, in save_modules
    #16 506.6     saved_exc.resume()
    #16 506.6   File "/usr/lib/python3/dist-packages/setuptools/sandbox.py", line 141, in resume
    #16 506.6     six.reraise(type, exc, self._tb)
    #16 506.6   File "/usr/lib/python3/dist-packages/setuptools/_vendor/six.py", line 685, in reraise
    #16 506.6     raise value.with_traceback(tb)
    #16 506.6   File "/usr/lib/python3/dist-packages/setuptools/sandbox.py", line 154, in save_modules
    #16 506.6     yield saved
    #16 506.6   File "/usr/lib/python3/dist-packages/setuptools/sandbox.py", line 195, in setup_context
    #16 506.6     yield
    #16 506.6   File "/usr/lib/python3/dist-packages/setuptools/sandbox.py", line 250, in run_setup
    #16 506.6     _execfile(setup_script, ns)
    #16 506.6   File "/usr/lib/python3/dist-packages/setuptools/sandbox.py", line 45, in _execfile
    #16 506.6     exec(code, globals, locals)
    #16 506.6   File "/tmp/easy_install-jufi3avs/pandas-1.3.4/setup.py", line 650, in <module>
    #16 506.6   File "/tmp/easy_install-jufi3avs/pandas-1.3.4/setup.py", line 423, in maybe_cythonize
    #16 506.6   File "/usr/local/lib/python3.7/dist-packages/Cython/Build/Dependencies.py", line 1093, in cythonize
    #16 506.6     result.get(99999)  # seconds
    #16 506.6   File "/usr/lib/python3.7/multiprocessing/pool.py", line 657, in get
    #16 506.6     raise self._value
    #16 506.6 AttributeError: 'BuiltinObjectType' object has no attribute 'exception_value'
    

    Hope you've encountered this before! If you have some idea of how I can fix it, please let me know.

    Thanks and congrats for the repo :D

    opened by alesolano 5
  • Simulation of replays, adding more pytests, bug fixes

    Hello @glmcdona, @royerk, @StoneT2000, please take a look at this pull request; it includes the following changes (in order of appearance in the compare).

    In summary, it allows replaying episodes downloaded from the Kaggle leaderboard and checks that the states are the same. It adds the corresponding pytest for 7 replays, and fixes some bugs encountered along the way.

    1. Adding a replay argument to class Agent, which is a full episode replay as can be downloaded from https://www.kaggle.com/robga/simulations-episode-scraper-match-downloader. If a replay is present, this default agent will answer with the actions from the replay.
    2. Adding a replay_validate argument to class LuxEnvironment and class MatchController. In the run_to_next_observation method, after each turn it will validate that the game state is exactly identical to the state provided in the replay. This is the main thing in this pull request; it allows strong testing of the engine.
    3. Adding method run_no_learn to class LuxEnvironment, which runs a full episode; it only works when there are no learning agents. It is needed to run two replay agents as a simulation.
    4. A small bugfix in constants.py
    5. In class Game, factoring out a new process_updates function, which can now not only apply updates but also validate them. This list of updates is what we get in the replays, and process_updates is where we check that the map state is exactly the same.
    6. Cosmetic changes in action_from_command. Needed to call it with different arguments, and some more minor changes there.
    7. Changed the default argument {"wood": 0, "uranium": 0, "coal": 0} to None. This is a bugfix: in some cases the same cargo dictionary was shared between a couple of units. Remember, it is not a good idea to use lists and dictionaries as default arguments (see the sketch after this list).
    8. In spawn_city_tile, changed city_ids_found from a set to a list. This was a nasty bug which produced randomness in tests, because sets are not guaranteed to preserve order.
    9. In class Cart's turn method, removed a duplicated cooldown subtraction.
    10. Added 7 replays for tests in folder replays_for_tests.
    11. Added a new pytest family in function test_run_replay.
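
    A minimal illustration of the mutable-default pitfall referenced in item 7 (toy code, not from the repo):

    def make_unit(cargo={"wood": 0, "uranium": 0, "coal": 0}):  # bug: one shared dict
        return {"cargo": cargo}

    a = make_unit()
    b = make_unit()
    a["cargo"]["wood"] += 100
    print(b["cargo"]["wood"])  # 100 -- both units share the same dictionary!

    def make_unit_fixed(cargo=None):  # fix: default to None, build a fresh dict per call
        if cargo is None:
            cargo = {"wood": 0, "uranium": 0, "coal": 0}
        return {"cargo": cargo}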

    Let me know if something looks wrong!

    opened by nosound2 5
  • Add game replay saving, add checkpoint callback to save some replays along with the models

    By default, saves 5 replay matches (itself against itself) every 100K steps during training, alongside the model. Saved to .\models\.

    Adds:

    • Game() ability to write replay outputs.
    • SaveReplayAndModelCallback callback that can be used in training to save model checkpoints alongside replays. Useful for examining the behavior that your model is learning.
    • Change example training script default param n_steps to be the same as batch size.
    • Action sequences now take an additional argument pointing to the current game, for better programmatic actions. I think this feature is unused, but it will unfortunately cause a failure if anyone else is using action sequences.
    opened by glmcdona 4
  • Update README.md

    @nosound2 finished validation of this engine versus the original luxai2021 engine, so marking as complete. Adds a couple of minimal examples of different ways to use this game engine.

    opened by glmcdona 3
  • Update example training agent and script

    Updates:

    1. Split unit and city actions.
    2. Update reward function to a better example that is a delta of reward and scaled reasonably. Fixes issue #83.
    3. Set default example agent to inference on non-deterministic mode. Agents often get stuck when set to deterministic in inference.
    4. Fix bug in unit maps where it wouldn't track nearest unit correctly.
    5. Add multi-environment training command-line arg.
    6. Add multi-environment evaluation metrics logging.
    7. Add single-environment tensorboard game internal metrics logging.
    opened by glmcdona 3
  • SAC model need box action space

    I am trying to use the SAC algorithm for training. When I implemented the SAC model, I got an error and realized that it requires a "box" action space instead of a discrete action space.

    I saw a comment on Kaggle saying that it is supposed to run any of A2C, DDPG, DQN, HER, PPO, SAC, or TD3 right out of the box, so am I missing something important here?

    Thanks Jason
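
    One possible workaround (an editor's sketch, not part of this repo) is a gym ActionWrapper that presents a Box space to the algorithm and maps each continuous vector back to a discrete action by argmax:

    import gym
    import numpy as np
    from gym import spaces

    class BoxToDiscrete(gym.ActionWrapper):
        """Present a Box action space over a Discrete env; pick the action by argmax."""

        def __init__(self, env):
            super().__init__(env)
            self.action_space = spaces.Box(
                low=-1.0, high=1.0, shape=(env.action_space.n,), dtype=np.float32
            )

        def action(self, act):
            # Called on every step(); reduce the vector to a discrete action index.
            return int(np.argmax(act))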

    opened by hokhay 3
  • Improve random map generation speed in training.

    Use the Python random number generator when a seed is not specified. In this case matching the exact same results per seed is not needed, and this increases training speed. Resolves #77. Also shallow-copies the config just in case.

    opened by glmcdona 3
  • Suggest to loosen the dependency on stable-baselines3

    Hi, your project LuxPythonEnvGym (commit id: 55e8ddc15012fd55f17b23aa2e73c919467e43e3) requires "stable-baselines3==1.2.1a2" in its dependencies. After analyzing the source code, we found that the following version of stable-baselines3 can also be suitable, i.e., stable-baselines3 1.2.1a1, since all functions that you use directly (11 APIs: stable_baselines3.common.callbacks.BaseCallback.init, stable_baselines3.ppo.ppo.PPO.learn, stable_baselines3.common.base_class.BaseAlgorithm.save, stable_baselines3.common.utils.get_schedule_fn, stable_baselines3.common.base_class.BaseAlgorithm.predict, stable_baselines3.common.callbacks.EvalCallback.init, stable_baselines3.ppo.ppo.PPO.init, stable_baselines3.common.vec_env.subproc_vec_env.SubprocVecEnv.init, stable_baselines3.common.base_class.BaseAlgorithm.load, stable_baselines3.common.base_class.BaseAlgorithm.set_env, stable_baselines3.common.utils.set_random_seed) or indirectly (propagating to 11 of stable-baselines3's internal APIs and 123 outside APIs) have not been changed in these versions, thus not affecting your usage.

    Therefore, we believe that it is quite safe to loosen your dependency on stable-baselines3 from "stable-baselines3==1.2.1a2" to "stable-baselines3>=1.2.1a1,<=1.2.1a2". This will improve the applicability of LuxPythonEnvGym and reduce the possibility of any further dependency conflicts with other projects.

    May I pull a request to further loosen the dependency on stable-baselines3?

    By the way, could you please tell us whether such an automatic tool for dependency analysis may be potentially helpful for maintaining dependencies easier during your development?

    opened by Agnes-U 0
  • [feature request] callback returning wins (or winrate) vs another agent

    When training with different reward functions it's hard to compare 2 bots. A callback capable of running n games between the current agent and another would prove useful to measure progress.

    I will look into it, but if someone knows how to do that, help is welcome.
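
    An editor's sketch of the shape such a callback could take with stable-baselines3 (play_match is a hypothetical helper you would implement on top of LuxEnvironment, returning 1 for a win):

    from stable_baselines3.common.callbacks import BaseCallback

    class WinRateCallback(BaseCallback):
        """Every eval_freq steps, play n games vs a fixed opponent and log the win rate."""

        def __init__(self, eval_env, n_games=10, eval_freq=100000):
            super().__init__()
            self.eval_env = eval_env
            self.n_games = n_games
            self.eval_freq = eval_freq

        def _on_step(self) -> bool:
            if self.n_calls % self.eval_freq == 0:
                # play_match is a hypothetical helper: runs one game, returns 1 on a win.
                wins = sum(play_match(self.eval_env, self.model) for _ in range(self.n_games))
                self.logger.record("eval/win_rate", wins / self.n_games)
            return True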

    opened by kevinu3d 1
  • [feature request] use legacy kind of agents to run on the new environment

    Use legacy kind of agents to run on the new environment.

    It seems that what you want to achieve is not supported; the structure is different. The main problem as I see it is that the observation state is constructed differently, so the "new" state cannot be directly given to the "legacy" agent. But even if this were resolved, you would still need to write wrappers to fit the structure. Maybe it is worth doing this work. You can open an issue on GitHub.

    https://www.kaggle.com/c/lux-ai-2021/discussion/276419#1535771

    opened by vitoque-git 3
  • [Discussion] The learning design

    I have been thinking about the learning design implemented here, and I just can't resolve two questions for myself. The core function for the learning is the environment step function. The chain of learning is [OBS_UNIT1 -> ACTION1 -> REWARD -> OBS_UNIT2 -> ACTION2 -> OBS_UNIT3 -> ACTION3 ... -> ALL TURN ACTIONS ARE ACTUALLY TAKEN] -> [THE SAME FOR THE NEXT TURN ...]. The questions are:

    1. Less important. Only the first action gets the reward. Doesn't this create significant problems, especially when the number of units per turn is big? Especially if the discount factor gamma is small, but also in general. Even this intermediate reward is delayed for most actions. I wonder how much harder life is for the model because of this. One thing: the ordering of the units to act can be important. I can imagine the model can handle it, but is there an example of multi-unit problems designed like this?

    2. More important. Algorithms like TD(0) and Q-learning, and more involved ones like PPO, all depend for the model update not only on the current state (or state-action pair) but also on the next one. But the next step is a different unit; its observation is unit-dependent, and its value function is completely different and barely related. The process is basically non-Markovian, the states carry heavily incomplete information, and each time it is different incomplete information. Isn't that a no-go? Or am I misunderstanding something major?

    Please share your thought!

    opened by nosound2 23
  • [Feature Request] Example agent for agents that process whole turns instead of single unit actions

    TODO: Add an example agent that takes the game state and returns a list of actions for all units on its team. Bonus if the example implements something like custom RL.

    opened by glmcdona 0
Owner: Geoff McDonald (@glmcdona)