https://github.com/AppliedDataSciencePartners/DeepReinforcementLearning
Main components
- Pits two bots of different versions against each other
- Logger that records sequences of actions and their scores (logging is actually the most interesting part of this codebase)
- Uses graphviz and pydot to output a graph of the neural network architectures
- Model is a CNN followed by batch normalization, then ReLU, then a dense output. The model has value and policy heads over the same trunk, but for some reason the value head has 2 layers while the policy head has 1 (see the model sketch after this list)
- Has a separate memory implementation with long- and short-term memory of the same size. Logs the board state, its id, the chosen action, the action value, and the player turn (also sketched after this list)
- Many global variables are pulled from a central config file that is plain Python code (no separate configuration language). This includes the neural-net architecture. The config holds ints, floats, and dictionaries (see the config sketch after this list)
- Game interface (sketched after this list) which
- Initializes the game
- Reset, step, identities (identities seems to return symmetrically equivalent states, presumably for data augmentation)
- Enumerates the allowed actions
- Checks if game has ended
- Gets the current value of a state
- Renders the board as a text file
- Implementation of the Connect4 game
- Agent
- Simulates MCTS by moving to a leaf, logging it, evaluating the leaf, and then backfilling the value (I think this is something the MCTS tree should handle automatically instead)
- Performs an action with some selection strategy: act takes as input a function that finds the best move by running MCTS simulations, then applies the action to the world and triggers logging code around which action was taken
- Agent also converts the state to something the model file can actually consume
- Replay, which retrains the model from memories, using each memory's past q value as the training target (see the agent sketch after this list)
- It's useful to have a logger for the runs so you can reinspect all the values. The logger also has a central config that enables or disables it at different points
- Loss function sits in its own separate file for some reason (this should arguably be part of the config as well)
- Monte Carlo Tree Search implementation
- Node data structure that keeps track of its neighbors, whether it is a leaf, the player turn, and the board state
- Edge stores the statistics N, W, Q, P: visit count, total action value, mean action value (Q = W/N), and the prior probability from the policy head
- Most of the code is in moveToLeaf
- Q and U values determine which edge to follow (the PUCT selection rule; see the MCTS sketch after this list)
- addNode is one line of code
- Backfill goes back up the tree and updates edge values
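A minimal Keras sketch of the model shape described above, assuming a Connect4-sized board; the layer counts, sizes, and names here are illustrative guesses, not the repo's actual hyperparameters:

```python
# Sketch of a two-headed AlphaZero-style model: conv -> batch norm -> ReLU
# trunk, a 2-layer value head, and a 1-layer policy head.
# All sizes are illustrative, not the repo's real config.
from tensorflow.keras import Input, Model, layers

def conv_bn_relu(x, filters=75, kernel=(4, 4)):
    x = layers.Conv2D(filters, kernel, padding="same")(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)

def build_model(input_shape=(6, 7, 2), n_actions=42):
    board = Input(shape=input_shape)       # Connect4: 6x7 board, 2 planes
    x = conv_bn_relu(conv_bn_relu(board))
    flat = layers.Flatten()(x)

    # Value head: two dense layers, ending in a scalar in [-1, 1]
    v = layers.Dense(20, activation="relu")(flat)
    value = layers.Dense(1, activation="tanh", name="value_head")(v)

    # Policy head: a single dense layer of action logits
    policy = layers.Dense(n_actions, name="policy_head")(flat)

    return Model(inputs=board, outputs=[value, policy])
```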
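The long/short-term memory split might look like this; a sketch assuming equal-capacity deques, with field names guessed from the notes above:

```python
from collections import deque

class Memory:
    """Sketch of the two-stage replay memory (field names are guesses)."""

    def __init__(self, size):
        # Long- and short-term memory share the same capacity
        self.ltmemory = deque(maxlen=size)
        self.stmemory = deque(maxlen=size)

    def commit_stmemory(self, state, state_id, action, action_value, player_turn):
        # Log one position: board state, id, chosen action, its value, whose turn
        self.stmemory.append({
            "state": state, "id": state_id, "action": action,
            "value": action_value, "playerTurn": player_turn,
        })

    def commit_ltmemory(self):
        # At game end, promote short-term entries to long-term and reset
        self.ltmemory.extend(self.stmemory)
        self.stmemory.clear()
```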
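Because the config is plain Python, there is no parsing step: modules just import it and read attributes. A sketch with made-up values:

```python
# config.py -- plain-Python config module (all values illustrative)
EPISODES = 30        # int: self-play games per iteration
MCTS_SIMS = 50       # int: MCTS simulations per move
CPUCT = 1.0          # float: exploration constant in the PUCT formula

# Even the network architecture is data in the config: one dict per layer
HIDDEN_CNN_LAYERS = [
    {"filters": 75, "kernel_size": (4, 4)},
    {"filters": 75, "kernel_size": (4, 4)},
]
```

Consumers are just `import config` followed by `config.MCTS_SIMS`; the tradeoff is that loading the config executes arbitrary code.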
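The game interface bullets above might translate to an abstract class like this; the method names are guesses, not the repo's exact API:

```python
class Game:
    """Sketch of the game interface (method names are guesses)."""

    def reset(self):
        # Initialize/reset the board and return the starting state
        ...

    def step(self, action):
        # Apply an action; return (next_state, value, done, info)
        ...

    def identities(self, state, action_values):
        # Presumably: symmetric equivalents of (state, action_values)
        # for data augmentation, e.g. the mirrored Connect4 board
        ...

    def allowed_actions(self, state):
        # Enumerate the legal moves in this state
        ...

    def game_ended(self, state):
        ...

    def value(self, state):
        # Value of the state from the current player's perspective
        ...

    def render(self, path):
        # Write a text rendering of the board to a file
        ...
```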
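And the agent's act/replay flow, roughly; everything here (helper names, memory fields, the one-hot policy target) is a hypothetical reconstruction from the bullets above, not the repo's code:

```python
import random
import numpy as np

def to_model_input(state):
    # Hypothetical converter: the Agent turns a game state into the
    # array format the model consumes
    return np.asarray(state, dtype=np.float32)

def act(agent, state, n_sims, tau):
    # Run MCTS simulations from the current state, then choose an action
    # from the visit-count distribution (tau is the temperature)
    for _ in range(n_sims):
        agent.simulate(state)        # move to leaf, evaluate, backfill
    pi = agent.action_probabilities(state, tau)   # hypothetical helper
    action = int(np.random.choice(len(pi), p=pi))
    agent.log_action(state, action)               # hypothetical logging hook
    return action

def replay(model, ltmemory, n_actions=42, batch_size=256):
    # Retrain from sampled memories: each entry's stored past value is the
    # value-head target; a one-hot of the chosen action stands in for the
    # policy target (an assumption, not necessarily what the repo does)
    batch = random.sample(list(ltmemory), min(batch_size, len(ltmemory)))
    states = np.array([to_model_input(m["state"]) for m in batch])
    value_targets = np.array([m["value"] for m in batch])
    policy_targets = np.zeros((len(batch), n_actions), dtype=np.float32)
    for i, m in enumerate(batch):
        policy_targets[i, m["action"]] = 1.0
    model.fit(states, [value_targets, policy_targets], epochs=1, verbose=0)
```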
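Finally, a sketch of the edge statistics and the two core MCTS operations, selection (the Q + U rule) and backfill; the node/edge attribute names and `cpuct` default are assumptions, not the repo's exact code:

```python
import math

class Edge:
    """AlphaZero-style edge stats: N visits, W total value,
    Q = W/N mean value, P prior from the policy head."""

    def __init__(self, prior):
        self.N, self.W, self.Q, self.P = 0, 0.0, 0.0, prior

def select_edge(node, cpuct=1.0):
    # moveToLeaf follows the edge maximizing Q + U at every node, where U
    # is an exploration bonus scaled by the prior and parent visit count
    total_n = sum(edge.N for _, edge in node.edges)
    def puct(edge):
        u = cpuct * edge.P * math.sqrt(total_n) / (1 + edge.N)
        return edge.Q + u
    return max(node.edges, key=lambda child_edge: puct(child_edge[1]))

def backfill(path, leaf_value, leaf_player):
    # Walk back up the visited (node, edge) path, updating every edge;
    # the value's sign flips depending on whose turn it was at each node
    for node, edge in reversed(path):
        direction = 1 if node.player_turn == leaf_player else -1
        edge.N += 1
        edge.W += leaf_value * direction
        edge.Q = edge.W / edge.N
```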
https://github.com/tejank10/AlphaGo.jl
Main components
- README shows how to train, with the number of layers and num_games as the adjustable parameters
- Play takes in an environment, the model net, the number of MCTS readouts (num_readouts), and the player turn
- Replay batch manager
- UX rendering takes in a config of a bunch of CSS files; for player-vs-player mode only
- Has board featurizations using pre-AlphaGo Zero techniques, e.g. the number of liberties, among others
- Has bindings for GTP, the Go Text Protocol used to talk to Go engines
- Has helpers for KGS coordinates
- Interface for Web IO
- A resnet stack implementation
- MCTS
- Board represented as an n x N^2 tensor of value estimates, where n is the number of turns and N is the board size (see the sketch at the end)
- The longest code in the repo
- MCTS player that loads a Monte Carlo tree, can pick moves, and manages updates to the tree
- Main loads all the net params and, if they don't exist, trains an MCTS player
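For the n x N^2 board representation above, a quick illustration (in Python/numpy rather than Julia, and with arbitrary sizes):

```python
import numpy as np

N = 19        # the board is N x N points
n_turns = 8   # number of turns kept as rows

# One row per retained turn, one column per board point: flattening each
# N x N board of value estimates into a length-N**2 vector yields the
# n x N^2 tensor described above.
estimates = np.zeros((n_turns, N * N), dtype=np.float32)

# e.g. record this turn's value estimate for the point at row r, col c
r, c = 3, 15
estimates[0, r * N + c] = 0.25
```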