# Conda Environment YAML for running TensorFlow on GPU
Getting TensorFlow to work on a GPU can be tricky, but conda makes it relatively easy. Here’s a configuration that I find works for TensorFlow 2.11 with CUDA 11.7:
environment.yml
```yaml
name: tensorflow
channels:
  - defaults
  - nvidia/label/cuda-11.7.1
dependencies:
  - python=3.9
  - cudatoolkit=11.7
  - cudnn=8.1.0
  - cuda-nvcc
  - pip
  - pip:
      - tensorflow==2.11.0
variables:
  LD_LIBRARY_PATH: "'$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/'"
  XLA_FLAGS: "'--xla_gpu_cuda_data_dir=$CONDA_PREFIX/lib/'"
```
This assumes you have a few prerequisites:

- a Linux machine (including WSL2)
- an NVIDIA GPU with appropriate drivers installed (which you can check by running `nvidia-smi`)
- conda installed (for example through miniconda, or alternatively mamba, a faster drop-in replacement)
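You can quickly verify the last two from a shell:

```sh
nvidia-smi        # should print a table with your GPU and driver version
conda --version   # confirms conda (or mamba) is on the PATH
```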
Then to install it you can run the following at a shell (you can swap the name `tensorflow` for whatever you like):
```sh
TFC_ENV_NAME=tensorflow
conda env remove -n "$TFC_ENV_NAME"
conda env create -f environment.yml -n "$TFC_ENV_NAME"
# extract the prefix of the new environment from conda's env listing
TFC_CONDA_PREFIX=$(conda info --envs | grep -Po "$TFC_ENV_NAME\K .*" | sed 's: ::g')
# XLA expects libdevice under <cuda_data_dir>/nvvm/libdevice/
mkdir -p "$TFC_CONDA_PREFIX/lib/nvvm/libdevice/"
ln -s "$TFC_CONDA_PREFIX/lib/libdevice.10.bc" "$TFC_CONDA_PREFIX/lib/nvvm/libdevice/"
```
and now you should be able to run training Python code whenever you `conda activate tensorflow`.
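As a quick check that the GPU is visible inside the new environment (a one-liner using TensorFlow's device-listing API):

```sh
conda activate tensorflow
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```

It should print at least one `PhysicalDevice` with `device_type='GPU'`.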
The rest of this article will discuss how this all goes together.
## Basic setup
Let’s start with a very simple `environment.yml` that installs Python 3.9 and uses `pip` to install TensorFlow. Instead of using `pip` we could use the Anaconda TensorFlow conda package, but that tends to be an earlier version.
environment.yml
```yaml
name: tensorflow
dependencies:
  - python=3.9
  - pip
  - pip:
      - tensorflow==2.11.0
```
We can use this environment with TensorFlow, but it only runs on the CPU, as can be verified with this code (the assertion fails because no GPU is visible):
test_gpu.py
```python
import tensorflow as tf

# fails with an AssertionError when TensorFlow cannot see a GPU
assert tf.config.list_physical_devices('GPU')
```
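To try it out (the environment name comes from the YAML above):

```sh
conda env create -f environment.yml
conda activate tensorflow
python test_gpu.py   # raises AssertionError: no GPU is visible yet
```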
## Adding GPU Support
The TensorFlow official documentation says:

> Miniconda is the recommended approach for installing TensorFlow with GPU support. It creates a separate environment to avoid changing any installed software in your system. This is also the easiest way to install the required software especially for the GPU setup.
>
> …
>
> **GPU Setup**
>
> First install the NVIDIA GPU driver if you have not. You can use the following command to verify it is installed.
>
> ```sh
> nvidia-smi
> ```
>
> Then install CUDA and cuDNN with conda.
>
> ```sh
> conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0
> ```
>
> Configure the system paths. You can do it with following command everytime your start a new terminal after activating your conda environment.
>
> ```sh
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/
> ```
>
> For your convenience it is recommended that you automate it with the following commands. The system paths will be automatically configured when you activate this conda environment.
>
> ```sh
> mkdir -p $CONDA_PREFIX/etc/conda/activate.d
> echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/' > $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
> ```
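One wrinkle with this approach is that the `activate.d` script never undoes its change. A matching `deactivate.d` script could restore the old value; a sketch (these scripts are my own illustration, not part of the quoted docs):

```sh
mkdir -p $CONDA_PREFIX/etc/conda/activate.d $CONDA_PREFIX/etc/conda/deactivate.d
# save the old value on activation, then extend it
cat > $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh <<'EOF'
export OLD_LD_LIBRARY_PATH=$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/
EOF
# restore it on deactivation
cat > $CONDA_PREFIX/etc/conda/deactivate.d/env_vars.sh <<'EOF'
export LD_LIBRARY_PATH=$OLD_LD_LIBRARY_PATH
unset OLD_LD_LIBRARY_PATH
EOF
```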
Rather than running the install commands we can simply add the dependencies to our `environment.yml`. Note that you can also use a more recent version of CUDA provided your GPU is compatible with it; I used the more recent 11.7 instead.

Setting the library path is a bit more complex. As suggested we could use an `activate.d/env_vars.sh` script, but it would be better to declare it in our `environment.yml`. Instead of using the conda activate scripts we can set environment variables under the `variables` key, which has the added benefit that any changed variables are reset when the environment is deactivated. However, `variables` only lets us set an environment variable, whereas we want to append to one. We can hack around this using the fact that conda's implementation uses the shell to run `export {envvar}='{value}'`: by including single quotes inside our value we perform a shell injection that gets the shell to expand the variables.
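To see why the quoting trick works, here is roughly what happens at activation time (a simplified sketch of the command conda emits, not its actual source):

```sh
# conda activation effectively runs:  export LD_LIBRARY_PATH='<value>'
# With our value, the embedded single quotes close and reopen the quoting,
# leaving the variable references unquoted so the shell expands them:
export LD_LIBRARY_PATH=''$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/''
echo "$LD_LIBRARY_PATH"   # ...existing entries...:<env prefix>/lib/
```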
environment.yml
```yaml
name: tensorflow
channels:
  - defaults
dependencies:
  - python=3.9
  - cudatoolkit=11.7
  - cudnn=8.1.0
  - pip
  - pip:
      - tensorflow==2.11.0
variables:
  LD_LIBRARY_PATH: "'$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/'"
```
Now the GPU test passes, but when we try to train a model it fails.
## Enabling XLA
Let’s try a very simple training example:
test_train.py
```python
import numpy as np
from keras.models import Sequential
from keras import layers

X_train = np.array([[0.2], [0.7]])
y_train = np.array([0, 1])
input_dim = X_train.shape[1]

model = Sequential()
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=1, verbose=0)
```
When we try to run it we get an error like:
```
InternalError: Graph execution error:
...
Node: 'StatefulPartitionedCall'
libdevice not found at ./libdevice.10.bc
	 [[{{node StatefulPartitionedCall}}]] [Op:__inference_train_function_2088]
```
Some more digging shows this `libdevice` library is related to XLA, an optimizing compiler that Keras apparently uses automatically. We’ll need to install some additional libraries associated with it.
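As a quick experiment you can tell XLA where to look for CUDA with the `--xla_gpu_cuda_data_dir` flag, although this alone won’t help until `libdevice` itself is present in the environment:

```sh
# XLA searches <data_dir>/nvvm/libdevice/ for libdevice.10.bc
export XLA_FLAGS="--xla_gpu_cuda_data_dir=$CONDA_PREFIX/lib/"
python test_train.py
```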
## Installing ptxas
The error is related to a component of the NVIDIA CUDA Compiler, which NVIDIA provides as a conda package. We just have to add the channel `nvidia/label/cuda-11.7.1` corresponding to the version of `cudatoolkit` we specified, and add `cuda-nvcc` to the dependencies.
environment.yml
```yaml
name: tensorflow
channels:
  - defaults
  - nvidia/label/cuda-11.7.1
dependencies:
  - python=3.9
  - cudatoolkit=11.7
  - cudnn=8.1.0
  - cuda-nvcc
  - pip
  - pip:
      - tensorflow==2.11.0
variables:
  LD_LIBRARY_PATH: "'$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/'"
  XLA_FLAGS: "'--xla_gpu_cuda_data_dir=$CONDA_PREFIX/lib/'"
```
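Before rerunning the training script we can sanity-check the pieces. This is just a sketch: `ptxas` is installed by `cuda-nvcc`, and passing `jit_compile=True` to `tf.function` forces a function through XLA, which exercises both `ptxas` and `libdevice`:

```sh
conda activate tensorflow
which ptxas && ptxas --version
python - <<'EOF'
import tensorflow as tf
# forcing XLA compilation fails if libdevice or ptxas cannot be found
f = tf.function(lambda x: x * 2.0, jit_compile=True)
print(f(tf.constant([1.0, 2.0])))
EOF
```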
Then finally the training runs successfully: we’ve got a working GPU setup with TensorFlow. However, there’s one more improvement we can make to the setup to enable faster inference.
## Making inference faster with TensorRT
When running the test code there are a bunch of warnings about TensorRT, a way of making inference faster.
```
2022-12-14 23:27:03.152268: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: ...
2022-12-14 23:27:03.152353: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: ...
2022-12-14 23:27:03.152358: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
```
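One place to start digging is checking whether the dynamic loader can find the libraries at all (a sketch; the warnings above are looking for the version 7 sonames):

```sh
# search the system loader cache and the conda environment for libnvinfer
ldconfig -p | grep libnvinfer
find "$CONDA_PREFIX/lib" -name 'libnvinfer*' 2>/dev/null
```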
I still haven’t managed to install these successfully; it would be great to know how to solve this.