TensorFlow, TensorBoard, and Docker

Docker

Docker comes in two flavors: Enterprise Edition and Community Edition. When in doubt, pick the latter: Docker's business model is the freemium model familiar from the Linux world, and the free Community Edition covers everything needed here. The installation process is straightforward, and post-installation steps are only necessary if you want to manage Docker as a non-root user.
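
For reference, the documented post-installation steps amount to creating a docker group and adding your user to it; a minimal sketch:

# Allow the current user to run docker without sudo
sudo groupadd docker
sudo usermod -aG docker $USER
# Log out and back in for the new group membership to take effect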

At the moment, Docker can only limit the CPU and memory a container uses (an example follows the snippet below); it has no built-in support for GPUs. This is where nvidia-docker comes in: a thin wrapper on top of Docker that acts as a drop-in replacement for the docker CLI. nvidia-docker only modifies the behavior of the create and run commands; all other commands are passed straight through to docker. The modified commands automatically detect and configure the GPUs. Nvidia also provides CUDA images with the CUDA Toolkit preinstalled. The only prerequisite for running a CUDA container is the Nvidia driver.

# Test nvidia-smi with the latest official CUDA image
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
# Test nvidia-smi using only the first two GPUs
docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0,1 --rm nvidia/cuda nvidia-smi
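
The CPU and memory limits mentioned above are set with ordinary docker run flags; a minimal sketch with illustrative values:

# Cap the container at two CPUs and four gigabytes of memory
docker run --runtime=nvidia --cpus=2 --memory=4g --rm nvidia/cuda nvidia-smi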

When an operator executes docker run, the container process is isolated: it has its own file system, its own networking, and its own process tree, all separate from the host. The nvidia runtime was registered during the installation of nvidia-docker. --rm instructs Docker to automatically remove the container when it exits. nvidia/cuda is the image the container derives from; container images are typically hosted on Docker Hub or Docker Store. -e sets environment variables before the container is created; in the case of the CUDA images, NVIDIA_VISIBLE_DEVICES=all is the default setting.
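
Without --rm, an exited container lingers until it is removed by hand; a sketch of the manual cleanup (the container name cuda-test is chosen arbitrarily here):

docker run --runtime=nvidia --name cuda-test nvidia/cuda nvidia-smi
docker ps -a        # the exited container is still listed
docker rm cuda-test # remove it manually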

TensorFlow

Assuming the Nvidia driver and the CUDA Toolkit are installed, using TensorFlow with Docker amounts to launching one of two containers:

# Run TensorFlow programs in a Jupyter notebook without GPU support
docker run -it -p 8888:8888 gcr.io/tensorflow/tensorflow
# Run TensorFlow programs in a Jupyter notebook with GPU support
nvidia-docker run -it -p 8888:8888 gcr.io/tensorflow/tensorflow:latest-gpu

-it instructs Docker to allocate a pseudo-TTY connected to the container's stdin, yielding an interactive session. -p publishes a container port to the host in the form host_port:container_port.
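
Because -p takes the form host_port:container_port, the host side is free to differ; a sketch:

# Expose the notebook on host port 8080 instead of 8888
docker run -it -p 8080:8888 gcr.io/tensorflow/tensorflow
# The notebook is then reachable at http://localhost:8080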

Capture TensorFlow Console Output

Currently TensorFlow does not have a way to redirect its console output to the Jupyter notebook, because its C++ core writes to the process-level file descriptors rather than to Python's sys streams. The following function is a workaround built on the with statement context manager: it temporarily redirects the underlying file descriptor to a temporary file.

import contextlib
import os
import tempfile

@contextlib.contextmanager
def capture(fd='stdout'):
    # Map the requested stream name to its OS-level file descriptor
    # (1 for stdout, 2 for stderr).
    fd, stream = (2, 'stderr') if fd == 'stderr' else (1, 'stdout')

    t = tempfile.NamedTemporaryFile(delete=False)
    print('Piping {} to {}\n'.format(stream, t.name))

    # Save the original descriptor, then point it at the temporary file.
    previous_fd = os.dup(fd)
    os.dup2(t.fileno(), fd)

    try:
        yield t
    finally:
        # Restore the original descriptor even if the body raised.
        os.dup2(previous_fd, fd)
        os.close(previous_fd)
        t.close()
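
A usage sketch (a hypothetical notebook cell; assumes a TF 1.x session whose device-discovery messages go to the C++-level stderr):

import tensorflow as tf

# Capture the C++-level stderr output that Jupyter cannot display
with capture('stderr') as log:
    with tf.Session() as sess:  # device messages land in the temp file
        sess.run(tf.constant('hello'))

# Replay the captured output inside the notebook
with open(log.name) as f:
    print(f.read())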

TensorBoard

Running TensorBoard requires a little bit more effort:

<docker_command> run -d --name <container_name> -e PASSWORD=<your_desired_pw> -p 8888:8888 -p 6006:6006 <image>
# Launch TensorBoard inside the running container
docker exec -it <container_name> bash
tensorboard --logdir /tmp/tensorflow/logs

-d instructs Docker to run the container as a background process and print the container ID. --name assigns a name to the container instead of relying on Docker's automatic name generator. docker exec runs the desired command in a running container. PASSWORD=<your_desired_pw> is an undocumented feature of the TensorFlow image that replaces Jupyter's security token with a password of your choosing. With the second -p mapping in place, TensorBoard is reachable at http://localhost:6006.
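
TensorBoard only has something to display once the program writes summaries into the log directory; a minimal TF 1.x sketch, assuming the /tmp/tensorflow/logs path used above:

import tensorflow as tf

# A single scalar summary tracked across steps (illustrative values)
loss = tf.placeholder(tf.float32, name='loss')
summary_op = tf.summary.scalar('loss', loss)

with tf.Session() as sess:
    writer = tf.summary.FileWriter('/tmp/tensorflow/logs', sess.graph)
    for step in range(100):
        summary = sess.run(summary_op, feed_dict={loss: 1.0 / (step + 1)})
        writer.add_summary(summary, step)
    writer.close()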