Downloadable data for KataGo's main run as of 2019-06 (run number
g104). To the extent permitted by law, the data for "g104" has been
dedicated to the public domain; see LICENSE.txt for more details.
Trained using area scoring rules only. The final model
b20c256-s447913472-d241840887 is close to the strength of LZ200.
https://github.com/lightvector/KataGo

------------------------------------------------------------------------
models/
------------------------------------------------------------------------
Contains all neural nets from the run that passed gating.

Every neural net version is of the form:

Architecture-NumberOfTrainingSamples-NumberOfDataSamples

For example, "b15c192-s131273984-d101093686" indicates a 15-block,
192-channel neural net, trained on 131273984 data samples (i.e. number
of batches times batch size), with 101093686 cumulative data samples
generated by self-play up to that point in the run.

*/model.config.json - A json file describing the model architecture
parameters. This file is generally passed as an argument to the python
scripts so that they build the model with the right number of blocks
and channels.

*/model.txt.gz - Weights in the format expected by the GTP engine,
selfplay, and the other C++ code. There is no need to unzip it down to
the .txt file itself; just pass the model.txt.gz file directly to the
C++ code for any command line argument asking for a neural net model.
If you do want to look at the weights manually, it is just a gzipped
plain-text format. Although there is no separate documentation of the
text format, roughly it is all the weight tensors of the neural net
written in order as plain-text floats. For details, refer to the
source code at https://github.com/lightvector/KataGo in
python/export_model.py, which is the script that exports this file.

*/saved_model - A directory containing a dump of the model weights in
TensorFlow's format. Note that g104 uses stochastic weight averaging
("SWA") when dumping these values, which saves weights from a model
constructed under a tf.name_scope("swa_model"). So if you want to play
around with the python scripts, you will generally need to pass
-name-scope 'swa_model' so that they build their models under this
name scope and can load these values.

------------------------------------------------------------------------
selfplay/
------------------------------------------------------------------------
Contains all self-play data, organized by the neural net version as of
the time the game was finished (some of the moves of a game might be
from an earlier version, since the neural net might have been switched
out mid-game). Except for the initial random play that did not use a
neural net, every neural net version is of the form:

Architecture-NumberOfTrainingSamples-NumberOfDataSamples

For example, "b15c192-s131273984-d101093686" indicates a 15-block,
192-channel neural net, trained on 131273984 data samples (i.e. number
of batches times batch size), with 101093686 cumulative data samples
generated by self-play up to that point in the run.

*/sgfs/*.sgfs - These are sgf files of the games, concatenated to
greatly reduce the number of individual files. Each individual line of
an ".sgfs" file is a fully valid sgf. The sgfs produced by self-play
never themselves contain newlines, so if you want separate sgf files,
simply split out each line into a separate file.
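
For example, here is a minimal python sketch of splitting a
concatenated .sgfs file into individual .sgf files (the file and
directory names below are hypothetical):

    import os

    def split_sgfs(path, out_dir):
        # Each line of an .sgfs file is one complete, newline-free sgf.
        os.makedirs(out_dir, exist_ok=True)
        with open(path) as f:
            for i, line in enumerate(f):
                line = line.strip()
                if line:
                    out_path = os.path.join(out_dir, "game%07d.sgf" % i)
                    with open(out_path, "w") as out:
                        out.write(line + "\n")

    split_sgfs("example.sgfs", "sgfs-out")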
*/tdata/*.npz
*/vdata/*.npz - These are zipped numpy files of training and
validation data. There is no difference between the two other than
that about 5% of games were simply randomly selected to be validation
data instead of training data. For a description of the npz format,
see:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html
https://docs.scipy.org/doc/numpy/reference/generated/numpy.lib.format.html#module-numpy.lib.format
These files can be loaded directly by numpy's "load" function:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.load.html

------------------------------------------------------------------------
NPZ Contents
------------------------------------------------------------------------
Each npz file contains several numpy arrays. These are:

---------------------------------------------
"binaryInputNCHWPacked"

Binary input features for the neural net in [N,C,(HW)] format, where
the HW axis has been zero-padded from 361 values up to 368 values, and
then packed big-endian-wise so that every binary value occupies a
single bit. To unpack, you can use:
https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.unpackbits.html

Channels are:
0 - On-board or not? Note that board size in the training data is variable.
1,2 - {own,opp} stone
3,4,5 - {1,2,3} liberties
6 - Ko or superko banned locations
7,8 - Unused; these are used for Japanese scoring rules, but g104 only used area scoring rules.
9,10,11,12,13 - Location of the move {1,2,3,4,5} turns ago.
14,15,16 - Stones with 1 liberty in inescapable atari, or stones with 2 liberties that an attacker could put in inescapable atari in a single move, as of {0,1,2} turns ago.
17 - Moves by the current player that put at least one opposing stone in inescapable atari that was not in inescapable atari before.
18,19 - {own,opp} pass-alive stones, and surrounded empty areas whose surrounding stones are all {own,opp} pass-alive.
20,21 - Unused; these are used for Japanese scoring rules, but g104 only used area scoring rules.

For further details, see the function NNInputs::fillRowV4 (g104 used
V4 input features) in the source file cpp/neuralnet/nninputs.cpp in
https://github.com/lightvector/KataGo
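
As an illustration, here is a minimal sketch of unpacking these
features back into [N,C,H,W] form, assuming 19x19 sizing (361 values
padded to 368, i.e. 46 bytes per channel) and a hypothetical filename.
numpy's unpackbits uses big-endian bit order by default, matching the
packing described above:

    import numpy as np

    npz = np.load("example.npz")  # hypothetical filename
    packed = npz["binaryInputNCHWPacked"]  # [N, C, 46] uint8
    bits = np.unpackbits(packed, axis=2)   # [N, C, 368], big-endian bits
    bits = bits[:, :, :361]                # drop the 7 bits of zero padding
    # Board size in the training data is variable; channel 0 is the
    # on-board mask, so off-board entries of smaller boards are zero.
    nchw = bits.reshape(bits.shape[0], bits.shape[1], 19, 19)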
---------------------------------------------
"globalInputNC"

Global input features for the neural net in [N,C] format.

Channels are:
0,1,2,3,4 - Equal to 1.0 if the move {1,2,3,4,5} turns ago was a pass, else 0.0.
5 - Equal to komi/15.0 if white, -komi/15.0 if black.
6 - Equal to 0.0 if using simple ko, 1.0 if using superko. g104 only uses superko.
7 - Equal to -0.5 if using situational superko, 0.5 if using positional superko.
8 - Equal to 1.0 if multi-stone suicide is a legal move, and 0.0 if not. (Single-stone suicide is always treated as illegal, even under simple or situational superko.)
9 - Equal to 0.0 if area scoring, 1.0 if Japanese territory scoring. g104 used only area scoring rules.
10,11 - Unused; these are used for Japanese scoring rules, but g104 only used area scoring rules.
12 - Would a pass end the current game phase? (For area scoring rules, there is only 1 game phase.)
13 - Ranges between -1 and 1, indicating the combined parity of komi and board size.

For further details, see the function NNInputs::fillRowV4 (g104 used
V4 input features) in the source file cpp/neuralnet/nninputs.cpp in
https://github.com/lightvector/KataGo

---------------------------------------------
"policyTargetsNCMove"

The policy target for the neural net in [N,C,Pos] format. This is the
same as NCHW format except that the HW axes have been merged into a
single axis, of length 362 instead of 361, where the extra index 361
indicates the pass move. Unlike Leela Zero data, the values in this
array have NOT been normalized to sum to 1; instead they are simply
the numbers of visits for all the children (with things like
target-pruning applied as postprocessing). Leaving them as integers
enables better compression than turning them into floats, and is
non-lossy since it does not throw away the information of how many
visits were used.

Channel 0 is the policy target. Channel 1 is the opponent policy
target (i.e. the target for the NEXT move, which can be used as an
auxiliary training target for regularization if desired).
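
As a concrete illustration, here is a minimal sketch (filename
hypothetical) of normalizing the raw visit counts into a probability
distribution suitable for use as a training target:

    import numpy as np

    npz = np.load("example.npz")  # hypothetical filename
    targets = npz["policyTargetsNCMove"].astype(np.float32)  # [N, 2, 362]
    visits = targets[:, 0, :]  # channel 0: policy target for this move
    totals = visits.sum(axis=1, keepdims=True)
    # Guard against all-zero rows when normalizing.
    probs = np.divide(visits, totals, out=np.zeros_like(visits),
                      where=totals > 0)
    # probs[:, :361] are board locations; probs[:, 361] is the pass move.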
---------------------------------------------
"globalTargetsNC"

A variety of floating point values in [N,C] format. Brief description
of the channels below:

#Value targets and other metadata, from the perspective of the player to move
#C0-3: Categorical game result: win, loss, noresult, and also score. Draw is encoded as some weighted blend of win and loss.
#C4-7: MCTS win-loss-noresult estimate td-like target, lambda = 35/36, nowFactor = 1/36
#C8-11: MCTS win-loss-noresult estimate td-like target, lambda = 11/12, nowFactor = 1/12
#C12-15: MCTS win-loss-noresult estimate td-like target, lambda = 3/4, nowFactor = 1/4
#C16-19: MCTS win-loss-noresult estimate td-like target, lambda = 0, nowFactor = 1 (no-temporal-averaging MCTS search result)
#C20: Actual final score, from the perspective of the player to move, adjusted for draw utility, zero if C27 is zero.
#C21: MCTS utility variance, 1->4 visits
#C22: MCTS utility variance, 4->16 visits
#C23: MCTS utility variance, 16->64 visits
#C24: MCTS utility variance, 64->256 visits
#C25: Weight multiplier for the row as a whole
#C26: Weight assigned to the policy target
#C27: Weight assigned to the final board ownership target and score distribution and bonus score targets. Most training rows will have this be 1, some will be 0.
#C28: Weight assigned to the next move policy target
#C29-32: Weight assigned to the utility variance targets C21-C24
#C33-35: Unused
#C36-40: Precomputed mask values indicating if we should use historical moves 1-5, if we desire random history masking. 1 means use, 0 means don't use. (This is for if you want to randomly mask out some history on a tiny fraction of training data to ensure that the neural net behaves reasonably when given no history.)
#C41-46: 128-bit hash-like identifier for the game as a whole, shared by rows belonging to the same game. Split into chunks of 22, 22, 20, 22, 22, 20 bits, little-endian style (since floats have > 22 bits of precision).
#C47: Komi, adjusted for draw utility and points costed or paid so far, from the perspective of the player to move.
#C48: 1 if we're in an area-scoring-like phase of the game (area scoring or second encore territory scoring)
#C49: 1 if an earlier neural net started this game, compared to the latest in this data file.
#C50: If positive, an earlier neural net was playing this specific move, compared to the latest in this data file.
#C51: Turn number of the game, zero-indexed.
#C52: Did this game end via hitting the turn limit?
#C53: First turn of this game that was selfplay for training rather than initialization (e.g. handicap stones, random initialization of the starting board position)
#C54: Number of extra moves black got at the start (i.e. handicap games)
#C55-56: Game type, game type source metadata
# 0 = normal self-play game. C56 unused
# 1 = encore-training game. C56 is the starting encore phase
#C57: 0 = normal, 1 = whole game was forked with an experimental move in the opening
#C58: 0 = normal, 1 = training sample was an isolated side position forked off of the main game
#C59: Unused
#C60: Number of visits in the search generating this row, prior to any reduction.
#C61-63: Unused

---------------------------------------------
"scoreDistrN"

Training target for the score distribution belief, with format [N,S].
S indexes the score difference and ranges over [0,842]. The indices
represent score differences (-421.5, -420.5, ..., 420.5, 421.5). The
distribution is stored in a one-hot format, except in the case of
integer komi, in which case it is two-hot, split between the two
adjacent half-integer values. (A sketch of converting this back into
an expected score is given at the end of this file.)

---------------------------------------------
"selfBonusScoreN"

Only relevant for Japanese scoring rules; not used in g104.

---------------------------------------------
"valueTargetsNCHW"

Training target for the ownership prediction of the neural net, with
format [N,C,H,W]. There is currently only 1 channel. -1 indicates
ownership by the opponent, 0 unowned, and 1 ownership by the current
player.
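
Finally, here is the sketch referenced in the "scoreDistrN" section
above: a hedged example of turning a score distribution row back into
an expected score difference. It derives the symmetric half-integer
support from the array's shape rather than hard-coding it, and the
filename is hypothetical:

    import numpy as np

    npz = np.load("example.npz")  # hypothetical filename
    distr = npz["scoreDistrN"].astype(np.float32)  # [N, S] one/two-hot
    totals = distr.sum(axis=1, keepdims=True)
    # Guard against zero-weight rows when normalizing.
    distr = np.divide(distr, totals, out=np.zeros_like(distr),
                      where=totals > 0)
    # Symmetric support centered on zero; for even S these are the
    # half-integer score differences described above.
    s = distr.shape[1]
    support = np.arange(s, dtype=np.float32) - (s - 1) / 2.0
    expected_score = (distr * support).sum(axis=1)  # player-to-move view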