Provided in this directory are bulk-downloadable archives of the rating games, training games, and training data for "kata1", the public run of KataGo hosted at https://katagotraining.org/

See DISCLAIMER.txt for legal disclaimers regarding this data.

Training games are provided in SGF format, and each move is labeled with: the MCTS-estimated white win probability, black win probability, "no result" probability (see "no result" at https://lightvector.github.io/KataGo/rules.html), expected final score in points, visits, and weight. These SGFs are mainly for human viewing, and contain only a tiny fraction of the actual set of targets and information about each turn of a game. For the full details, see the training data.

Rating games are provided in SGF format, and labeled similarly. For rating games, there is no additional detailed data; the game record and the game result between the two players are the data.

Training data is provided in NPZ format (i.e. numpy zipped tensors). Each row is a dictionary of a few fields.

Training data input fields:
The code that generates the two tensors below is at: https://github.com/lightvector/KataGo/blob/v1.12.3/cpp/neuralnet/nninputs.cpp#L2145
Please refer to the source code for details about the exact way different channels are computed.

"binaryInputNCHWPacked" - Spatial inputs to the model in NCHW format (H=19,W=19), except that H and W have been flattened and bitwise packed: the 19x19=361 bits are padded to 368 bits and written as 46 bytes using numpy.packbits(). The C dimension consists of 22 different feature planes containing the stones, liberties, ladder info, etc. For details, refer to the source code linked above.

"globalInputNC" - Global (i.e. non-spatial) inputs to the model in NC format, indicating some history information, the rules, komi, etc. There are 19 global features.

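As a rough illustration of the packing described for "binaryInputNCHWPacked", the sketch below unpacks an array of shape (N, 22, 46) back into (N, 22, 19, 19). It assumes the padding bits sit at the end of each plane and that numpy's default bit order was used when packing; neither is stated explicitly here, so check against the linked source. The helper name unpack_binary_input and the synthetic data are hypothetical and only demonstrate the round trip.

```python
import numpy as np

def unpack_binary_input(packed, board_size=19):
    """Unpack (N, C, 46) uint8 bytes into (N, C, H, W) planes of 0/1 values.

    Assumes trailing padding and numpy's default (big-endian) bit order.
    """
    n, c, _ = packed.shape
    num_cells = board_size * board_size      # 361 for a 19x19 board
    bits = np.unpackbits(packed, axis=2)     # (N, C, 368)
    bits = bits[:, :, :num_cells]            # drop the 7 padding bits
    return bits.reshape(n, c, board_size, board_size)

# Synthetic round trip, standing in for a real NPZ row batch.
rng = np.random.default_rng(0)
planes = rng.integers(0, 2, size=(2, 22, 19, 19), dtype=np.uint8)
packed = np.packbits(planes.reshape(2, 22, 361), axis=2)  # pads 361 -> 368 bits
restored = unpack_binary_input(packed)

print(packed.shape)    # (N, C, bytes per plane)
print(restored.shape)  # (N, C, H, W)
```

With a real file, the packed array would presumably come from numpy.load(path)["binaryInputNCHWPacked"] instead of the synthetic data above.
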
Training data outcome, training target, and metadata fields:
For these fields, please refer to the source code and the comments within the source code here: https://github.com/lightvector/KataGo/blob/v1.11.0/cpp/dataio/trainingwrite.h#L134

"policyTargetsNCMove" - Various policy targets (including auxiliary targets besides the main policy prediction) in NC format, where HW is flattened from 19x19 into 361 and then extended to 362, with the last index corresponding to passing.

"globalTargetsNC" - Various non-spatial targets including game outcome, score, and various exponential-moving-averaged "short-term" versions of these. Also includes the weighting of rows, a hash identifier, and other per-row metadata.

"scoreDistrN" - The score of the game, expressed as a large tensor with one index for every possible score in a large range. It has 0s everywhere except for a 100 at the index corresponding to the final score, or two adjacent entries that sum to 100 if the score fell between those two indices. Indices correspond to half-point outcomes (e.g. ...,-2.5,-1.5,-0.5,0.5,1.5,2.5,...), and draws will generally have two adjacent entries that sum to 100, split according to the utility of a draw (i.e. how many fractional wins a draw counts as for the player).

"valueTargetsNCHW" - Various spatial targets that involve predicting something on every square of the board, such as final ownership and future stone positions.
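
To make the "scoreDistrN" encoding more concrete, the sketch below recovers a score from one row by taking the weighted average over the half-point outcomes. The mapping from index to score is an assumption: it supposes the tensor has even length and is centered so that the two middle entries are -0.5 and +0.5. The length 842 and the helper names score_values and expected_score are hypothetical; confirm the real layout against the linked source before relying on this.

```python
import numpy as np

def score_values(length):
    """Half-point score for each index.

    ASSUMPTION: the two middle indices correspond to -0.5 and +0.5.
    """
    return np.arange(length, dtype=np.float64) - length // 2 + 0.5

def expected_score(score_distr_row):
    """Weighted average of the half-point outcomes in one row."""
    row = np.asarray(score_distr_row, dtype=np.float64)
    return float((row * score_values(len(row))).sum() / row.sum())

LENGTH = 842  # hypothetical row length, for illustration only
values = score_values(LENGTH)

# A decisive game: all 100 units on the single index for a score of 6.5.
decisive = np.zeros(LENGTH, dtype=np.int8)
decisive[np.flatnonzero(values == 6.5)[0]] = 100

# A draw split evenly between the -0.5 and +0.5 entries, as one example
# of two adjacent entries that sum to 100.
draw = np.zeros(LENGTH, dtype=np.int8)
draw[np.flatnonzero(values == -0.5)[0]] = 50
draw[np.flatnonzero(values == 0.5)[0]] = 50

print(expected_score(decisive))
print(expected_score(draw))
```

The even split in the draw example is only one possibility; per the description above, the actual split depends on the utility of a draw.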