A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more

Surag Nair

Last update: Jan 5, 2023

Related tags

Deep Learning reinforcement-learning deep-learning chainer tensorflow keras pytorch mcts othello gomoku monte-carlo-tree-search gobang alphago tf alphago-zero alpha-zero alphazero self-play

Overview

Alpha Zero General (any game, any framework!)

A simplified, highly flexible, commented and (hopefully) easy to understand implementation of self-play reinforcement learning based on the AlphaGo Zero paper (Silver et al). It is designed to be easy to adapt to any two-player turn-based adversarial game and any deep learning framework of your choice. A sample implementation has been provided for the game of Othello in PyTorch, Keras, TensorFlow and Chainer. An accompanying tutorial can be found here. We also have implementations for GoBang and TicTacToe.

To use a game of your choice, subclass the classes in Game.py and NeuralNet.py and implement their functions. Example implementations for Othello can be found in othello/OthelloGame.py and othello/{pytorch,keras,tensorflow,chainer}/NNet.py.
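As a rough illustration of what such a subclass might look like, here is a minimal, self-contained sketch for a 3x3 TicTacToe-like game. The class name `TinyTicTacToe` is hypothetical, and it deliberately does not import Game.py. The method names getNextState, getCanonicalForm and stringRepresentation appear elsewhere on this page; the remaining names (getInitBoard, getBoardSize, getActionSize, getValidMoves, getGameEnded) and the small draw value are assumptions about the interface and should be checked against Game.py. getSymmetries is omitted for brevity.

```python
import numpy as np


class TinyTicTacToe:
    """Hypothetical 3x3 game with Game.py-style methods (names partly assumed)."""

    n = 3

    def getInitBoard(self):
        # Empty board: 0 = empty, 1 = first player, -1 = second player.
        return np.zeros((self.n, self.n), dtype=int)

    def getBoardSize(self):
        return (self.n, self.n)

    def getActionSize(self):
        # One action per square.
        return self.n * self.n

    def getNextState(self, board, player, action):
        # Apply the move on a copy and hand the turn to the other player.
        nxt = board.copy()
        nxt[divmod(action, self.n)] = player
        return nxt, -player

    def getValidMoves(self, board, player):
        # Binary mask over actions: 1 where the square is still empty.
        return (board.reshape(-1) == 0).astype(int)

    def getGameEnded(self, board, player):
        # 0 while running, +1 / -1 for a win / loss from `player`'s view,
        # and a small nonzero value for a draw (assumed convention).
        lines = list(board) + list(board.T)
        lines += [board.diagonal(), np.fliplr(board).diagonal()]
        for line in lines:
            s = int(np.sum(line))
            if abs(s) == self.n:
                return 1 if s == self.n * player else -1
        return 1e-4 if not (board == 0).any() else 0

    def getCanonicalForm(self, board, player):
        # Flip signs so the player to move always sees their pieces as +1.
        return board * player

    def stringRepresentation(self, board):
        # Hashable key for dictionaries keyed by board state.
        return board.tobytes()
```

A real game would also need a matching NeuralNet.py subclass before it can be trained.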

Coach.py contains the core training loop and MCTS.py performs the Monte Carlo Tree Search. The parameters for the self-play can be specified in main.py. Additional neural network parameters are in othello/{pytorch,keras,tensorflow,chainer}/NNet.py (cuda flag, batch size, epochs, learning rate etc.).

To start training a model for Othello:

python main.py

Choose your framework and game in main.py.

Docker Installation

For easy environment setup, we can use nvidia-docker. Once you have nvidia-docker set up, we can then simply run:

./setup_env.sh

to set up a (default: PyTorch) Jupyter docker container. We can now open a new terminal and enter:

docker exec -ti pytorch_notebook python main.py

Experiments

We trained a PyTorch model for 6x6 Othello (~80 iterations, 100 episodes per iteration and 25 MCTS simulations per turn). This took about 3 days on an NVIDIA Tesla K80. The pretrained model (PyTorch) can be found in pretrained_models/othello/pytorch/. You can play a game against it using pit.py. Below is the performance of the model against a random and a greedy baseline with the number of iterations.

A concise description of our algorithm can be found here.

Contributing

While the current code is fairly functional, we could benefit from the following contributions:

Game logic files for more games that follow the specifications in Game.py, along with their neural networks
Neural networks in other frameworks
Pre-trained models for different game configurations
An asynchronous version of the code: parallel processes for self-play, neural net training and model comparison
Asynchronous MCTS as described in the paper

Contributors and Credits

Shantanu Thakoor and Megha Jhunjhunwala helped with core design and implementation.
Shantanu Kumar contributed TensorFlow and Keras models for Othello.
Evgeny Tyurin contributed rules and a trained model for TicTacToe.
MBoss contributed rules and a model for GoBang.
Jernej Habjan contributed RTS game.
Adam Lawson contributed rules and a trained model for 3D TicTacToe.
Carlos Aguayo contributed rules and a trained model for Dots and Boxes along with a JavaScript implementation.
Robert Ronan contributed rules for Santorini.

Comments

invalid value in division

When I ran main.py under TensorFlow, I also got a runtime warning:

.../othello/tensorflow/NNet.py:103: RuntimeWarning: invalid value encountered in divide
  pi = np.exp(pi) / np.sum(np.exp(pi))

This line computes a softmax. When one value in pi is large, np.exp can overflow to inf, and inf / inf then yields NaN.

We can try the following to avoid it:

x = np.exp(pi - np.max(pi))
pi = x / x.sum()

However, I am not sure whether the warning above was actually caused by overflow.
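To make the overflow argument concrete, the sketch below compares the naive softmax with the max-shifted version on deliberately large inputs. The function name `stable_softmax` and the sample values are illustrative only; whether this is what triggered the warning in NNet.py remains unconfirmed.

```python
import numpy as np


def stable_softmax(logits):
    """Softmax that subtracts the maximum first, so np.exp never sees a large positive value."""
    logits = np.asarray(logits, dtype=np.float64)
    exps = np.exp(logits - np.max(logits))  # largest exponent is exactly 0
    return exps / exps.sum()


# Illustrative inputs, large enough that np.exp overflows in float64.
big = np.array([1000.0, 1001.0, 1002.0])

# Silence the warnings locally so the NaN result can be inspected.
with np.errstate(over="ignore", invalid="ignore"):
    naive = np.exp(big) / np.sum(np.exp(big))  # inf / inf

print(naive)                # every entry is NaN
print(stable_softmax(big))  # a valid probability vector
```

Subtracting the maximum does not change the result mathematically, because the common factor exp(-max) cancels between numerator and denominator.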

Jianxiong

opened by jdongca2003 21
Masking Ps[s]*valids may give an array of zeros

https://github.com/suragnair/alpha-zero-general/blob/a6077c9ee503686eaf5a7b0c513df2a2e33112c1/MCTS.py#L80

In this line we divide by the sum of the initial policy, which was previously masked by the valid moves (valids).

I observe cases where the product Ps[s]*valids is an array of zeros. Then sum(Ps[s]) is also zero, the given line triggers a NumPy "division by zero" warning, and all Ps[s][a] become NaN. NumPy doesn't raise an error, so we go on, and on later visits to state s we get best_act = -1 and then a = -1. We then invoke next_s, next_player = self.game.getNextState(canonicalBoard, 1, a=-1), which is conceptually an illegal operation.

Such cases occur in the pitting phase of the 1st iteration and continue afterwards. When I debug a case, I see that nn.predict() returns only a few nonzero values in Ps[s], none of which coincide with a valid move, so the product is all zeros. Over subsequent iterations the number of such cases gradually decreases.

I think the occurrence of these cases depends on the random number generator, because they are platform dependent. If I run the same program on different physical computers, I may or may not observe them.

The question is: should we detect such cases and try to avoid NaNs in Ps[s]? If a case is detected, we can, for example, make all valid moves equally probable, i.e.
self.Ps[s] = self.Ps[s] + valids
self.Ps[s] /= np.sum(self.Ps[s])
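The detection and fallback proposed above could be sketched as a standalone helper like the one below. The name `mask_and_normalize` is hypothetical and this is not the code in MCTS.py; it only illustrates the idea of falling back to a uniform distribution over valid moves when masking removes all probability mass.

```python
import numpy as np


def mask_and_normalize(policy, valids):
    """Mask a policy vector by valid moves and renormalize without producing NaNs."""
    policy = np.asarray(policy, dtype=np.float64)
    valids = np.asarray(valids, dtype=np.float64)
    if valids.sum() == 0:
        raise ValueError("no valid moves to distribute probability over")

    masked = policy * valids
    total = masked.sum()
    if total > 0:
        return masked / total

    # All probability mass sat on invalid moves: fall back to a uniform
    # distribution over the valid ones (equivalent to adding `valids`
    # to the all-zero masked policy and renormalizing).
    return valids / valids.sum()


# The network puts all mass on moves 0 and 1, but only 2 and 3 are legal.
print(mask_and_normalize([0.5, 0.5, 0.0, 0.0], [0, 0, 1, 1]))
```

Logging each time the fallback branch is taken would also help track whether such cases really become rarer as training progresses.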

opened by evg-tyurin 18
Add 3D TicTacToe

Adds 3D TicTacToe. Also adds pretrained model (although not trained for very long). Not sure that all the code is very clean/well coded so might need checking (seems to work though).

opened by goshawk22 13
Max recursion depth exceeded

Othello TF per below on iteration 1, epoch 10. BTW: With TF less verbosity would be helpful (400 lines per epoch)

Training Net |############################### | (400/403) Data: 0.000s | Batch: 0.031s | Total: 0:00:12 | ETA: 0:00:01 | Loss_pi: 3.4790 |
Training Net |############################### | (401/403) Data: 0.000s | Batch: 0.031s | Total: 0:00:12 | ETA: 0:00:01 | Loss_pi: 3.4789 |
Training Net |############################### | (402/403) Data: 0.000s | Batch: 0.031s | Total: 0:00:12 | ETA: 0:00:01 | Loss_pi: 3.4789 |
Training Net |################################| (403/403) Data: 0.000s | Batch: 0.031s | Total: 0:00:12 | ETA: 0:00:01 | Loss_pi: 3.4789 | Loss_v: 0.382
PITTING AGAINST PREVIOUS VERSION
/home/brian/hitme/bin/alpha-zero-general/MCTS.py:80: RuntimeWarning: invalid value encountered in true_divide
  self.Ps[s] /= np.sum(self.Ps[s]) # renormalize
Traceback (most recent call last):
  File "main.py", line 29, in <module>
    c.learn()
  File "/home/brian/hitme/bin/alpha-zero-general/Coach.py", line 99, in learn
    pwins, nwins, draws = arena.playGames(self.args.arenaCompare)
  File "/home/brian/hitme/bin/alpha-zero-general/Arena.py", line 81, in playGames
    gameResult = self.playGame(verbose=verbose)
  File "/home/brian/hitme/bin/alpha-zero-general/Arena.py", line 46, in playGame
    action = players[curPlayer+1](self.game.getCanonicalForm(board, curPlayer))
  File "/home/brian/hitme/bin/alpha-zero-general/Coach.py", line 98, in <lambda>
    lambda x: np.argmax(nmcts.getActionProb(x, temp=0)), self.game)
  File "/home/brian/hitme/bin/alpha-zero-general/MCTS.py", line 31, in getActionProb
    self.search(canonicalBoard)
  File "/home/brian/hitme/bin/alpha-zero-general/MCTS.py", line 106, in search
    v = self.search(next_s)
  File "/home/brian/hitme/bin/alpha-zero-general/MCTS.py", line 106, in search
    v = self.search(next_s)
  File "/home/brian/hitme/bin/alpha-zero-general/MCTS.py", line 106, in search
    v = self.search(next_s)
  [Previous line repeated 983 more times]
  File "/home/brian/hitme/bin/alpha-zero-general/MCTS.py", line 103, in search
    next_s, next_player = self.game.getNextState(canonicalBoard, 1, a)
  File "/home/brian/hitme/bin/alpha-zero-general/othello/OthelloGame.py", line 31, in getNextState
    b = Board(self.n)
  File "/home/brian/hitme/bin/alpha-zero-general/othello/OthelloLogic.py", line 24, in __init__
    for i in range(self.n):
RecursionError: maximum recursion depth exceeded in comparison

opened by brianprichardson 12
Keras out of GPU memory

Othello Keras after 34 iterations was out of gpu memory. I have an 8GB GTX1070 but limited to per_process_gpu_memory_fraction = 0.4 (about 3.2GB)

Of course, I can run with more, but perhaps there should be some gpu memory size guidelines in the readme, assuming it is not an error.

Caused by op 'batch_normalization_199/FusedBatchNorm', defined at:
  File "main.py", line 29, in <module>
    c.learn()
  File "/home/brian/hitme/bin/alpha-zero-general/Coach.py", line 90, in learn
    pnet = self.nnet.__class__(self.game)
  File "/home/brian/hitme/bin/alpha-zero-general/othello/keras/NNet.py", line 27, in __init__
    self.nnet = onnet(game, args)
  File "/home/brian/hitme/bin/alpha-zero-general/othello/keras/OthelloNNet.py", line 37, in __init__
    h_conv1 = Activation('relu')(BatchNormalization(axis=3)(Conv2D(args.num_channels, 3, padding='same')(x_image))) # batch_size x board_x x board_y x num_channels
  File "/home/brian/hitme/lib/python3.6/site-packages/keras/engine/topology.py", line 617, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/brian/hitme/lib/python3.6/site-packages/keras/layers/normalization.py", line 181, in call
    epsilon=self.epsilon)
  File "/home/brian/hitme/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 1824, in normalize_batch_in_training
    epsilon=epsilon)
  File "/home/brian/hitme/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 1799, in _fused_normalize_batch_in_training
    data_format=tf_data_format)
  File "/home/brian/hitme/lib/python3.6/site-packages/tensorflow/python/ops/nn_impl.py", line 831, in fused_batch_norm
    name=name)
  File "/home/brian/hitme/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 2034, in _fused_batch_norm
    is_training=is_training, name=name)
  File "/home/brian/hitme/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/brian/hitme/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/home/brian/hitme/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[64,512,6,6] [[Node: batch_normalization_199/FusedBatchNorm = FusedBatchNorm[T=DT_FLOAT, data_format="NHWC", epsilon=0.001, is_training=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](conv2d_133/BiasAdd, batch_normalization_199/gamma/read, batch_normalization_199/beta/read, batch_normalization_199/Const_4, batch_normalization_199/Const_4)]] [[Node: loss_33/add/_17985 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_3237_loss_33/add", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

opened by brianprichardson 12
Eliminate bottleneck in `stringRepresentation` [SPEED IMPROVEMENT]
I profiled the code, and this line is a bottleneck because of the string conversion.

https://github.com/suragnair/alpha-zero-general/blob/28331bbc48d96c2fecd0683266f76d92ca33c62d/connect4/Connect4Game.py#L63

def stringRepresentation(self, board):
    return str(self._base_board.with_np_pieces(np_pieces=board))

Another issue is that it sometimes returns '-0.' instead of just '0'.

My suggestion: replace it with something 100 times faster, although not as readable:

def stringRepresentation(self, board):
    return board.astype(int).tostring()

It is true that we then skip the assert contained in Board.__init__, which runs when we call .with_np_pieces, but on the other hand, running it in every iteration may be too much.
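A minimal sketch of the byte-based key is shown below, written as a free function with the hypothetical name `string_representation` so it can run on its own. It uses `tobytes()` rather than `tostring()`, since `tostring()` is a deprecated alias in newer NumPy releases. The integer cast is also what makes 0.0 and -0.0 produce the same key.

```python
import numpy as np


def string_representation(board):
    """Return a compact, hashable key for a board array (sketch, not the repo's code)."""
    # Casting to int first means 0.0 and -0.0 both become 0,
    # so two equal positions never end up with different keys.
    return np.asarray(board).astype(int).tobytes()


a = np.array([[0.0, -0.0], [1.0, -1.0]])
b = np.array([[-0.0, 0.0], [1.0, -1.0]])

# Both arrays describe the same position and map to the same key.
cache = {string_representation(a): "visited"}
print(string_representation(b) in cache)
```

Note that the raw bytes do not encode the array shape, so this key is only safe when all boards in one search share the same dimensions.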
opened by manuelsh 11
Four Player Game

Hi,

I am working on an AI for a four player card game and would like to apply this repo to it. Since you only support two-player games so far, I would like to extend it to the four-player case. Do you think this is possible within reasonable time? If yes, could you please point me to the main points where the changes would have to happen? The getSymmetries() in the Game probably would have to disappear.

Cheers, Joel

opened by JoelNiklaus 10
Coach accepting and rejecting new model
Hi, I may have discovered a possible error at the end of the Coach learn iteration, where it decides whether to discard or keep the new model.

if pwins+nwins > 0 and float(nwins)/(pwins+nwins) < self.args.updateThreshold:
    print('REJECTING NEW MODEL')
else:
    print('ACCEPTING NEW MODEL')

The problem arises when pwins+nwins is zero, i.e. every game ended in a draw. In that case the new model is accepted. I don't think accepting it is a good idea, because all draws tell us nothing about how good the contesting model actually is. A possible solution would be the following:

if pwins+nwins == 0 or float(nwins)/(pwins+nwins) < self.args.updateThreshold:
    print('REJECTING NEW MODEL')
else:
    print('ACCEPTING NEW MODEL')

This still avoids dividing by zero, because the or short-circuits, and it discards a model whose games all ended in draws.
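The proposed rule can be isolated into a small pure function, which makes the all-draws case easy to check. The function name `should_accept` and its parameter names are hypothetical; the sketch follows the proposal in this issue and is not necessarily what Coach.py does.

```python
def should_accept(nwins, pwins, update_threshold):
    """Decide whether the new model replaces the previous one.

    nwins: games won by the new model, pwins: games won by the previous model.
    Draws are not counted in either number.
    """
    decisive = nwins + pwins
    if decisive == 0:
        # Every game was a draw: no evidence the new model is better, so reject.
        return False
    return nwins / decisive >= update_threshold


print(should_accept(nwins=0, pwins=0, update_threshold=0.6))  # all draws
print(should_accept(nwins=7, pwins=3, update_threshold=0.6))  # clear improvement
```

Keeping the decision separate from the printing and checkpoint handling also makes it straightforward to unit test.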
opened by JernejHabjan 10
flips assertion failure
Hi Surag, thanks for sharing this software. When I ran it under the TensorFlow framework, I got the flips assertion failure.

------ITER 19------
Self Play |###################### | (71/100) Eps Time: 1.669s | Total: 0:01:58 | ETA: 0:00:51
Traceback (most recent call last):
  File "main.py", line 30, in <module>
    c.learn()
  File "/home/***/tools/alpha-zero-general/Coach.py", line 78, in learn
    trainExamples += self.executeEpisode()
  File "/home/***/tools/alpha-zero-general/Coach.py", line 46, in executeEpisode
    pi = self.mcts.getActionProb(canonicalBoard, temp=temp)
  File "/home/***/tools/alpha-zero-general/MCTS.py", line 31, in getActionProb
    self.search(canonicalBoard)
  File "/home/***/tools/alpha-zero-general/MCTS.py", line 106, in search
    v = self.search(next_s)
  File "/home/***/tools/alpha-zero-general/MCTS.py", line 106, in search
    v = self.search(next_s)
  File "/home/***/tools/alpha-zero-general/MCTS.py", line 106, in search
    v = self.search(next_s)
  File "/home/***/tools/alpha-zero-general/MCTS.py", line 106, in search
    v = self.search(next_s)
  File "/home/***/tools/alpha-zero-general/MCTS.py", line 103, in search
    next_s, next_player = self.game.getNextState(canonicalBoard, 1, a)
  File "/home/***/tools/alpha-zero-general/othello/OthelloGame.py", line 34, in getNextState
    b.execute_move(move, player)
  File "/home/***/tools/alpha-zero-general/othello/OthelloLogic.py", line 111, in execute_move
    assert len(list(flips))>0
AssertionError

System info: tensorflow-gpu 1.1.0, master branch, head commit:

commit 263eccb2de3ca5eae7d5615e8b5b7d13b481b569
Author: suragnair
Date: Wed Jan 3 12:05:11 2018 +0530

    added dim to pytorch log_softmax (UserWarning)

I wonder whether it is possible for flips to be empty.

Thanks

Jianxiong
opened by jdongca2003 9
input: possibilities instead of situation

I've come up with a method that works much faster than the solutions I see here. I see that with all implementations the current situation on the board is given as input. I imagine that the network always has to interpret that situation first, only then the real thinking begins.

I could not use this method because my game does not involve a board with fixed dimensions. So I came up with something else. I simply skip the first step. I do not feed the network with the current situation on the board but I only give it the possible moves. In the case of legitimate moves, the fields of the person whose turn it is are assigned "1". His opponent's legitimate fields are marked "-1" and all other prohibited fields are marked "0". I get significantly better results! And I can handle many more situations. An important advantage (for me) is that you can handle multiple board dimensions with 1 network.

I myself am a chess player and I see the same distinction between amateurs and professionals. When a professional looks at the board he sees possibilities, when an amateur looks he first sees wooden pieces and only somewhere in the distance does he see possibilities. For that reason alone, he plays worse.

But I understand that this method is not suitable for every game, perhaps not for chess and similar games. However, I am sure that it can be very beneficial for certain games. I suspect it would work for Go with a few changes.

opened by keesdejong 8
Better model/weights

Tried to upload, but too large. After checkpoint 153. Results pitted against pre-trained best (1580):

Arena.playGames |################################| (4097/2048) Eps Time: 3.473s | Total: 3:57:03 | ETA: 0:00:03 (1580, 2516, 0)

opened by brianprichardson 8
Bump certifi from 2022.5.18.1 to 2022.12.7
Bumps certifi from 2022.5.18.1 to 2022.12.7.

Commits

9e9e840 2022.12.07

b81bdb2 2022.09.24

939a28f 2022.09.14

aca828a 2022.06.15.2

de0eae1 Only use importlib.resources's new files() / Traversable API on Python ≥3.11 ...

b8eb5e9 2022.06.15.1

47fb7ab Fix deprecation warning on Python 3.11 (#199)

b0b48e0 fixes #198 -- update link in license

9d514b4 2022.06.15

4151e88 Add py.typed to MANIFEST.in to package in sdist (#196)

See full diff in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR

@dependabot recreate will recreate this PR, overwriting any edits that have been made to it

@dependabot merge will merge this PR after your CI passes on it

@dependabot squash and merge will squash and merge this PR after your CI passes on it

@dependabot cancel merge will cancel a previously requested merge and block automerging

@dependabot reopen will reopen this PR if it is closed

@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually

@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

@dependabot use these labels will set the current labels as the default for future PRs for this repo and language

@dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language

@dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language

@dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the Security Alerts page.

dependencies
opened by dependabot[bot] 0
Bump tensorflow from 2.9.1 to 2.9.3
Bumps tensorflow from 2.9.1 to 2.9.3.

Release notes

Sourced from tensorflow's releases.

TensorFlow 2.9.3

Release 2.9.3

This release introduces several vulnerability fixes:

Fixes an overflow in tf.keras.losses.poisson (CVE-2022-41887)

Fixes a heap OOB failure in ThreadUnsafeUnigramCandidateSampler caused by missing validation (CVE-2022-41880)

Fixes a segfault in ndarray_tensor_bridge (CVE-2022-41884)

Fixes an overflow in FusedResizeAndPadConv2D (CVE-2022-41885)

Fixes a overflow in ImageProjectiveTransformV2 (CVE-2022-41886)

Fixes an FPE in tf.image.generate_bounding_box_proposals on GPU (CVE-2022-41888)

Fixes a segfault in pywrap_tfe_src caused by invalid attributes (CVE-2022-41889)

Fixes a CHECK fail in BCast (CVE-2022-41890)

Fixes a segfault in TensorListConcat (CVE-2022-41891)

Fixes a CHECK_EQ fail in TensorListResize (CVE-2022-41893)

Fixes an overflow in CONV_3D_TRANSPOSE on TFLite (CVE-2022-41894)

Fixes a heap OOB in MirrorPadGrad (CVE-2022-41895)

Fixes a crash in Mfcc (CVE-2022-41896)

Fixes a heap OOB in FractionalMaxPoolGrad (CVE-2022-41897)

Fixes a CHECK fail in SparseFillEmptyRowsGrad (CVE-2022-41898)

Fixes a CHECK fail in SdcaOptimizer (CVE-2022-41899)

Fixes a heap OOB in FractionalAvgPool and FractionalMaxPool(CVE-2022-41900)

Fixes a CHECK_EQ in SparseMatrixNNZ (CVE-2022-41901)

Fixes an OOB write in grappler (CVE-2022-41902)

Fixes a overflow in ResizeNearestNeighborGrad (CVE-2022-41907)

Fixes a CHECK fail in PyFunc (CVE-2022-41908)

Fixes a segfault in CompositeTensorVariantToComponents (CVE-2022-41909)

Fixes a invalid char to bool conversion in printing a tensor (CVE-2022-41911)

Fixes a heap overflow in QuantizeAndDequantizeV2 (CVE-2022-41910)

Fixes a CHECK failure in SobolSample via missing validation (CVE-2022-35935)

Fixes a CHECK fail in TensorListScatter and TensorListScatterV2 in eager mode (CVE-2022-35935)

TensorFlow 2.9.2

Release 2.9.2

This releases introduces several vulnerability fixes:

Fixes a CHECK failure in tf.reshape caused by overflows (CVE-2022-35934)

Fixes a CHECK failure in SobolSample caused by missing validation (CVE-2022-35935)

Fixes an OOB read in Gather_nd op in TF Lite (CVE-2022-35937)

Fixes a CHECK failure in TensorListReserve caused by missing validation (CVE-2022-35960)

Fixes an OOB write in Scatter_nd op in TF Lite (CVE-2022-35939)

Fixes an integer overflow in RaggedRangeOp (CVE-2022-35940)

Fixes a CHECK failure in AvgPoolOp (CVE-2022-35941)

Fixes a CHECK failures in UnbatchGradOp (CVE-2022-35952)

Fixes a segfault TFLite converter on per-channel quantized transposed convolutions (CVE-2022-36027)

Fixes a CHECK failures in AvgPool3DGrad (CVE-2022-35959)

Fixes a CHECK failures in FractionalAvgPoolGrad (CVE-2022-35963)

Fixes a segfault in BlockLSTMGradV2 (CVE-2022-35964)

Fixes a segfault in LowerBound and UpperBound (CVE-2022-35965)

... (truncated)

Changelog

Sourced from tensorflow's changelog.

Release 2.8.4

This release introduces several vulnerability fixes:

Fixes a heap OOB failure in ThreadUnsafeUnigramCandidateSampler caused by missing validation (CVE-2022-41880)

Fixes a segfault in ndarray_tensor_bridge (CVE-2022-41884)

Fixes an overflow in FusedResizeAndPadConv2D (CVE-2022-41885)

Fixes a overflow in ImageProjectiveTransformV2 (CVE-2022-41886)

Fixes an FPE in tf.image.generate_bounding_box_proposals on GPU (CVE-2022-41888)

Fixes a segfault in pywrap_tfe_src caused by invalid attributes (CVE-2022-41889)

Fixes a CHECK fail in BCast (CVE-2022-41890)

Fixes a segfault in TensorListConcat (CVE-2022-41891)

Fixes a CHECK_EQ fail in TensorListResize (CVE-2022-41893)

Fixes an overflow in CONV_3D_TRANSPOSE on TFLite (CVE-2022-41894)

Fixes a heap OOB in MirrorPadGrad (CVE-2022-41895)

Fixes a crash in Mfcc (CVE-2022-41896)

Fixes a heap OOB in FractionalMaxPoolGrad (CVE-2022-41897)

Fixes a CHECK fail in SparseFillEmptyRowsGrad (CVE-2022-41898)

Fixes a CHECK fail in SdcaOptimizer (CVE-2022-41899)

... (truncated)

Commits

a5ed5f3 Merge pull request #58584 from tensorflow/vinila21-patch-2

258f9a1 Update py_func.cc

cd27cfb Merge pull request #58580 from tensorflow-jenkins/version-numbers-2.9.3-24474

3e75385 Update version numbers to 2.9.3

bc72c39 Merge pull request #58482 from tensorflow-jenkins/relnotes-2.9.3-25695

3506c90 Update RELEASE.md

8dcb48e Update RELEASE.md

4f34ec8 Merge pull request #58576 from pak-laura/c2.99f03a9d3bafe902c1e6beb105b2f2417...

6fc67e4 Replace CHECK with returning an InternalError on failing to create python tuple

5dbe90a Merge pull request #58570 from tensorflow/r2.9-7b174a0f2e4

Additional commits viewable in compare view

dependencies
opened by dependabot[bot] 0

mainTafl.py does not work for me

mainTafl.py does not work; to me it looks like an inconsistency in how the board is handled. Compared to Othello, Coach expects getNextState to return an ndarray for trainExamples, but it gets Board objects.

  Traceback (most recent call last):
    File "mainTafl.py", line 34, in <module>
      c.learn()
    File "/content/Coach.py", line 113, in learn
      self.nnet.train(trainExamples)
    File "/content/tafl/pytorch/NNet.py", line 55, in train
      boards = torch.FloatTensor(np.array(boards).astype(np.float64))
  ValueError: setting an array element with a sequence.

opened by rosnu 0

Othello Keras Pretrained Model : UnpicklingError

The pretrained model in https://github.com/suragnair/alpha-zero-general/tree/master/pretrained_models/othello/keras cannot be loaded.

Code:

import torch
model=torch.load("6x6 checkpoint_145.pth.tar",map_location='cpu')

Error Message:

---------------------------------------------------------------------------
UnpicklingError                           Traceback (most recent call last)
Cell In [2], line 2
      1 import torch
----> 2 model=torch.load("6x6 checkpoint_145.pth.tar",map_location='cpu')

File c:\Users\entdi\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py:713, in load(f, map_location, pickle_module, **pickle_load_args)
    711             return torch.jit.load(opened_file)
    712         return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
--> 713 return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)

File c:\Users\entdi\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py:920, in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
    914 if not hasattr(f, 'readinto') and (3, 8, 0) <= sys.version_info < (3, 8, 2):
    915     raise RuntimeError(
    916         "torch.load does not work with file-like objects that do not implement readinto on Python 3.8.0 and 3.8.1. "
    917         f"Received object of type \"{type(f)}\". Please update to Python 3.8.2 or newer to restore this "
    918         "functionality.")
--> 920 magic_number = pickle_module.load(f, **pickle_load_args)
    921 if magic_number != MAGIC_NUMBER:
    922     raise RuntimeError("Invalid magic number; corrupt file?")

UnpicklingError: invalid load key, 'H'.

The models in other folders can be loaded with similar code, but not this one.

Also, why is this a .pth.tar file instead of .h5?

Sorry if this is a naive question.
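One way to investigate a checkpoint that refuses to load is to look at its first bytes instead of trusting the extension. The sketch below is a hypothetical diagnostic helper (`sniff_checkpoint_format` is not part of the repository) that distinguishes three common containers: HDF5, which Keras commonly uses and which starts with a fixed 8-byte signature; zip archives; and binary pickles using protocol 2 or later, which start with the byte 0x80. It makes no claim about what the file in this issue actually contains.

```python
import zipfile

# Fixed 8-byte signature at the start of every HDF5 file.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"


def sniff_checkpoint_format(path):
    """Guess a checkpoint's container format from its contents (diagnostic sketch)."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head == HDF5_MAGIC:
        return "hdf5"      # typically loaded through Keras or h5py, not torch.load
    if zipfile.is_zipfile(path):
        return "zip"       # e.g. a zip-based serialization format
    if head[:1] == b"\x80":
        return "pickle"    # binary pickle, protocol 2 or later
    return "unknown"
```

If the helper reports "unknown", printing the first few dozen bytes of the file is a reasonable next step before trying any loader.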

opened by Dimanjan 0

Bump protobuf from 3.19.4 to 3.19.5
Bumps protobuf from 3.19.4 to 3.19.5.

Release notes

Sourced from protobuf's releases.

Protocol Buffers v3.19.5

C++

Reduce memory consumption of MessageSet parsing

This release addresses a Security Advisory for C++ and Python users

Commits

b464cfb Updating changelog

40859fb Updating version.json and repo version numbers to: 19.5

3b175f1 Merge pull request #10543 from deannagarcia/3.19.x

c05b5f3 Add missing includes

0299c03 Apply patch

0a722f1 Update version.json with "lts": true (#10533)

d5eb60a Merge pull request #10530 from protocolbuffers/deannagarcia-patch-6

6cf1f78 Update version.json

97fc844 Merge pull request #10504 from deannagarcia/3.19.x

29d60a2 Add version file

Additional commits viewable in compare view

dependencies
opened by dependabot[bot] 0

Owner

Surag Nair
