Train your own Leela from scratch

To train a neural network to play chess using unsupervised machine learning you need:

  1. A neural net (random weights and biases should be OK).
  2. A game generating process which uses the current neural net.
  3. A training procedure which takes the current neural net as input and outputs an adjusted version of it that is better at predicting.

The next cycle of game generation then uses the adjusted version of the neural net, and so on.
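The whole workflow thus boils down to one loop. Here is a sketch in shell pseudocode, where generate_games and train are placeholders for the real commands (lc0 selfplay and lczero-training's train.py) documented below:

net=random
while true; do
    generate_games --net "$net"       # self-play games using the current net
    net=$(train --games new_games)    # adjusted net, better at predicting
done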

Contributing training games to Leela is very simple; setting up this workflow yourself is rather complicated, which is why I have chosen to document it as I learn it.

The Leela project kindly provides most parts of this infrastructure at https://github.com/LeelaChessZero/lczero-training

These were the steps required on Debian unstable in July 2020.

Pre-requisites

A pre-requisite is a working installation of tensorflow. As of July 2020, current tensorflow is 2.3, and tensorflow >= 2.1 requires CUDA 10.1: the NVIDIA driver, the CUDA runtime library and the CUPTI profiling library.

The Debian packages corresponding to the above are, currently:

apt-get update
apt-get dist-upgrade
apt-get install nvidia-driver libcudart10.1 libcupti10.1

If your dist-upgrade doesn't work, use upgrade instead. The installed packages should then look like this:

# LC_ALL=C dpkg -l nvidia-driver libcudart10.1 libcupti10.1
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                Version      Architecture Description
+++-===================-============-============-====================================================
ii  libcudart10.1:amd64 10.1.243-6   amd64        NVIDIA CUDA Runtime Library
ii  libcupti10.1:amd64  10.1.243-6   amd64        NVIDIA CUDA Profiler Tools Interface runtime library
ii  nvidia-driver       440.100-1    amd64        NVIDIA metapackage

Make python3 the default python.

# update-alternatives --install /usr/bin/python python /usr/bin/python3 2
# update-alternatives --install /usr/bin/python python /usr/bin/python2 1

In this case we don't pin a particular version of python3, since the second-to-last argument above is /usr/bin/python3; the alternative simply follows whatever python3 points at. If you want to use 3.8 specifically, add that as an alternative and make it the default option.
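For example, to prefer python3.8 explicitly (assuming /usr/bin/python3.8 is installed), register it with a higher priority than the alternatives above; the highest priority wins in automatic mode:

# update-alternatives --install /usr/bin/python python /usr/bin/python3.8 3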

Install libraries for python3

apt-get install python3-termcolor python3-wheel python3-protobuf python3-numpy python3-rdkit python3-openbabel python3-grpcio python3-six python3-astor python3-bleach python3-markdown python3-html5lib python3-pip python3-keras-preprocessing python3-werkzeug python3-requests-oauthlib python3-cachetools python3-pyasn1-modules python3-rsa

Try the python3 installation:

Here I have explicitly set python3.8 as the default python:

$ python
Python 3.8.4rc1 (default, Jul  1 2020, 15:31:45)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> exit()

On this box, I just set python3 to be the default, which in turn points at python3.7:

$ python
Python 3.7.4 (default, Aug 21 2019, 16:01:23)
[GCC 9.2.1 20190813] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> exit()

Installing tensorflow

If your CPU supports AVX, go with the binaries from Google; if not, install from source. Check with:

$ grep avx /proc/cpuinfo

If the above returns nothing, you have to compile your own version of tensorflow; otherwise simply use:

# pip3 install tensorflow
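Either way, verify afterwards that tensorflow imports and reports the expected version (2.3.x at the time of writing):

$ python3 -c 'import tensorflow as tf; print(tf.__version__)'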

To compile tensorflow, you need bazel, which is not available from Debian. The easiest way to get bazel is bazelisk, a small launcher that downloads the right bazel version for you. But first install some pre-requisites:

# apt-get install pkg-config zip g++ zlib1g-dev unzip

Get bazelisk as a binary and save it in a directory in your $PATH, for example like this:

PATH=$PATH:~/bin
mkdir -p ~/bin
wget -O ~/bin/bazel https://github.com/bazelbuild/bazelisk/releases/download/v1.5.0/bazelisk-linux-amd64
chmod 755 ~/bin/bazel
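Bazelisk downloads a suitable bazel release the first time it runs; a quick check that the wrapper works:

$ bazel version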

Get the source code of tensorflow and configure the build:

git clone git@github.com:tensorflow/tensorflow.git
cd tensorflow
./configure

When configure asks for the comma-separated list of base paths to look for CUDA libraries and headers, the Debian locations are:

/usr/include/,/usr/lib/x86_64-linux-gnu/,/usr/bin/,/usr/local/cuda/

Set up symlinks so that /usr/local/cuda looks like a normal CUDA installation:

$ ls -l /usr/local/cuda
total 0
lrwxrwxrwx 1 root root 32 14 jul 01.20 bin -> /usr/lib/nvidia-cuda-toolkit/bin
lrwxrwxrwx 1 root root 13 14 jul 01.07 include -> /usr/include/
lrwxrwxrwx 1 root root 25 14 jul 01.06 lib64 -> /usr/lib/x86_64-linux-gnu
lrwxrwxrwx 1 root root 18 14 jul 01.21 nvvm -> /usr/lib/cuda/nvvm

And add nvlink:

cd /usr/local/cuda/bin
ln -s /usr/bin/nvlink .

Then, back in the tensorflow source directory, start the build:

/usr/bin/bazel build //tensorflow/tools/pip_package:build_pip_package

I also needed to generate the C++ protobuf sources for dnn.proto by hand:

protoc --cpp_out=./ tensorflow/stream_executor/dnn.proto
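Once the build succeeds, package the result into a wheel and install it; this is the standard last step of a tensorflow source build, and /tmp/tensorflow_pkg is just a scratch directory:

./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip3 install /tmp/tensorflow_pkg/tensorflow-*.whl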

Compiling tensorflow in a docker environment

docker run --gpus all -it -w /tensorflow -v $PWD:/mnt -e HOST_PERMS="$(id -u):$(id -g)" tensorflow/tensorflow:devel-gpu bash

As non-root

docker run -u $(id -u):$(id -g) --gpus all -it -w /tensorflow -v $PWD:/mnt tensorflow/tensorflow:devel-gpu bash
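Inside the container the build follows the upstream tensorflow docker instructions, roughly like this (the final chown uses the HOST_PERMS variable from the root invocation above, so the wheel on /mnt ends up owned by you):

./configure
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
./bazel-bin/tensorflow/tools/pip_package/build_pip_package /mnt
chown $HOST_PERMS /mnt/tensorflow-*.whl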

On gpu-monster I reuse an existing container: start it, attach, and check that tensorflow sees the GPUs:

docker start 18e24368f7d8
docker attach 18e24368f7d8
export LD_LIBRARY_PATH=/usr/local/cuda/lib64

$ python
>>> import tensorflow as tf
>>> print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

Training

Supervised training (training data generated with another net)

python3 ~/src/lczero-training/tf/train.py --cfg ~/src/lczero-training/tf/configs/example.yaml --output /home/hans/mymodel.txt

./lc0 selfplay -w ~/leela-nets/mymodel.gz --training --games=5 --parallelism=5 --visits=800

Reinforcement learning

Initialise a net

1. Generate training data with the random backend:

./lc0 selfplay --training --games=4000 --parallelism=12 --visits=800 --backend=random

2. Use these games to form the first net:

python3 ~/src/lczero-training/tf/train.py --cfg ~/src/lczero-training/tf/configs/first.yaml --output /home/hans/leela-nets/0001.gz

Improve the net with new training data

1. Generate new training data with the current net:

~/src/lc0-match/build/release/lc0 selfplay -w /home/hans/leela-nets/0001.gz --training --games=4000 --visits=800

As of August 2020, official lc0 training uses these parameters:

/home/hans/lc0-binaries/lc0 selfplay --backend-opts=backend=cudnn --visits=10000 --cpuct=1.32 --cpuct-at-root=1.9 --root-has-own-cpuct-params=true --resign-percentage=4.0 --resign-playthrough=20 --temperature=0.9 --temp-endgame=0.30 --temp-cutoff-move=60 --temp-visit-offset=-0.8 --fpu-strategy=reduction --fpu-value=0.23 --fpu-strategy-at-root=absolute --fpu-value-at-root=1.0 --minimum-kldgain-per-node=0.000030 --black.minimum-kldgain-per-node=0.000048 --policy-softmax-temp=1.4 --resign-wdlstyle=true --noise-epsilon=0.1 --noise-alpha=0.12 --sticky-endgames=true --moves-left-max-effect=0.2 --moves-left-threshold=0.0 --moves-left-slope=0.008 --moves-left-quadratic-factor=1.0 --moves-left-constant-factor=0.0 --training=true --weights=/home/hans/.cache/lc0/client-cache/353fc719885e523a45133ee43cd349be8506031624b1f5197e1fb9401917f67d

On a box with six GPUs that becomes something like:

~/src/lc0/build/release/lc0 selfplay -w ~/mnt/gpu-master/leela-nets/0001.gz --training --games=4000 --backend-opts="(backend=cudnn,gpu=0),(backend=cudnn,gpu=1),(backend=cudnn,gpu=2),(backend=cudnn,gpu=3),(backend=cudnn,gpu=4),(backend=cudnn,gpu=5)" --parallelism=12 --visits=10000 --cpuct=1.32 --cpuct-at-root=1.9 --root-has-own-cpuct-params=true --resign-percentage=4.0 --resign-playthrough=20 --temperature=0.9 --temp-endgame=0.30 --temp-cutoff-move=60 --temp-visit-offset=-0.8 --fpu-strategy=reduction --fpu-value=0.23 --fpu-strategy-at-root=absolute --fpu-value-at-root=1.0 --minimum-kldgain-per-node=0.000030 --black.minimum-kldgain-per-node=0.000048 --policy-softmax-temp=1.4 --resign-wdlstyle=true --noise-epsilon=0.1 --noise-alpha=0.12 --sticky-endgames=true --moves-left-max-effect=0.2 --moves-left-threshold=0.0 --moves-left-slope=0.008 --moves-left-quadratic-factor=1.0 --moves-left-constant-factor=0.0

2. Train the current net

python3 ~/src/lczero-training/tf/train.py --cfg ~/src/lczero-training/tf/configs/second.yaml --output /home/hans/leela-nets/0002.gz
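From here the cycle just repeats: selfplay with 0002.gz, train 0003.gz, and so on. A minimal sketch of automating it, assuming one yaml config per iteration (0002.yaml, 0003.yaml, ... each pointing train.py at the latest selfplay data; the numbering scheme is my own):

#!/bin/bash
# Alternate selfplay data generation and training, producing nets 0002..0005.
prev=0001
for i in 0002 0003 0004 0005; do
    ~/src/lc0/build/release/lc0 selfplay -w ~/leela-nets/$prev.gz \
        --training --games=4000 --visits=800
    python3 ~/src/lczero-training/tf/train.py \
        --cfg ~/src/lczero-training/tf/configs/$i.yaml \
        --output ~/leela-nets/$i.gz
    prev=$i
done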
