Tensorflow: Cuda compute capability 3.0. The minimum required Cuda capability is 3.5


Solution 1

I installed TensorFlow r1.8, which recommends CUDA 9.0. My card is a GTX 650M with CUDA compute capability 3.0, and it now works like a charm. The OS is Ubuntu 18.04. Detailed steps are below:

Installing dependencies

I have included ffmpeg and some related packages for my OpenCV 3.4 compilation; skip them if you do not need them. Run the commands below:

sudo apt-get update 
sudo apt-get dist-upgrade -y
sudo apt-get autoremove -y
sudo apt-get upgrade
sudo add-apt-repository ppa:jonathonf/ffmpeg-3 -y
sudo apt-get update
sudo apt-get install build-essential -y
sudo apt-get install ffmpeg -y
sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev -y
sudo apt-get install python-dev libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev -y
sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev -y
sudo apt-get install libxvidcore-dev libx264-dev -y
sudo apt-get install unzip qtbase5-dev python-dev python3-dev python-numpy python3-numpy -y
sudo apt-get install libopencv-dev libgtk-3-dev libdc1394-22 libdc1394-22-dev libjpeg-dev libpng12-dev libtiff5-dev libjasper-dev -y
sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libxine2-dev libgstreamer0.10-dev libgstreamer-plugins-base0.10-dev -y
sudo apt-get install libv4l-dev libtbb-dev libfaac-dev libmp3lame-dev libopencore-amrnb-dev libopencore-amrwb-dev libtheora-dev -y
sudo apt-get install libvorbis-dev libxvidcore-dev v4l-utils vtk6 -y
sudo apt-get install liblapacke-dev libopenblas-dev libgdal-dev checkinstall -y
sudo apt-get install libgtk-3-dev -y
sudo apt-get install libatlas-base-dev gfortran -y
sudo apt-get install qt-sdk -y
sudo apt-get install python2.7-dev python3.5-dev python-tk -y
sudo apt-get install cython libgflags-dev -y
sudo apt-get install tesseract-ocr -y
sudo apt-get install tesseract-ocr-eng -y 
sudo apt-get install tesseract-ocr-ell -y
sudo apt-get install gstreamer1.0-python3-plugin-loader -y
sudo apt-get install libdc1394-22-dev -y
sudo apt-get install openjdk-8-jdk
sudo apt-get install pkg-config zip g++-6 gcc-6 zlib1g-dev unzip  git
sudo wget https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
sudo pip install -U pip
sudo pip install -U numpy
sudo pip install -U pandas
sudo pip install -U wheel
sudo pip install -U six
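The list above installs a few packages more than once. That is harmless, but it makes the list harder to maintain. A minimal sketch for spotting the repeats, assuming the commands are available as plain strings (the function name repeated_packages and the short sample list are only for illustration):

```python
import shlex
from collections import Counter


def repeated_packages(commands):
    """Return package names that occur more than once across apt-get install lines."""
    counts = Counter()
    for line in commands:
        words = shlex.split(line)
        # Only look at "apt-get install" lines; skip update, upgrade, pip, etc.
        if "apt-get" not in words or "install" not in words:
            continue
        after_install = words[words.index("install") + 1:]
        # Options such as -y start with a dash; everything else is a package name.
        counts.update(w for w in after_install if not w.startswith("-"))
    return sorted(name for name, n in counts.items() if n > 1)


# A shortened sample taken from the commands above.
commands = [
    "sudo apt-get update",
    "sudo apt-get install cmake git libgtk2.0-dev pkg-config -y",
    "sudo apt-get install unzip qtbase5-dev python-dev -y",
    "sudo apt-get install pkg-config zip g++-6 gcc-6 zlib1g-dev unzip git",
]
print(repeated_packages(commands))
```

Feeding it the full list from this section should report every package that is named in more than one install line.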

Installing the nvidia driver

Run the below commands:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-390 -y

Reboot, then check the driver status (nvidia-smi, which is installed with the driver, is the usual tool for this). It should list the GPU and the driver version.

[image: driver status output]

gcc-6 and g++-6 checks.

gcc-6 and g++-6 are required for CUDA 9.0. Run the commands below:

cd /usr/bin 
sudo rm -rf gcc gcc-ar gcc-nm gcc-ranlib g++
sudo ln -s gcc-6 gcc
sudo ln -s gcc-ar-6 gcc-ar
sudo ln -s gcc-nm-6 gcc-nm
sudo ln -s gcc-ranlib-6 gcc-ranlib
sudo ln -s g++-6 g++
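After relinking, it is worth confirming that the default gcc really is version 6 before starting a long build. A small sketch, assuming gcc is on PATH and using its -dumpversion option (the helper names are mine, not part of any tool):

```python
import subprocess


def major_version(dumpversion_output):
    """Extract the major version from `gcc -dumpversion` output such as '6' or '6.5.0'."""
    return int(dumpversion_output.strip().split(".")[0])


def default_gcc_major(compiler="gcc"):
    """Ask the compiler found on PATH for its version; return None if it cannot be run."""
    try:
        result = subprocess.run(
            [compiler, "-dumpversion"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return major_version(result.stdout)


found = default_gcc_major()
if found is None:
    print("gcc could not be run")
elif found == 6:
    print("default gcc is version 6")
else:
    print("default gcc is version", found, "and not 6")
```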

Installing CUDA 9.0

Go to https://developer.nvidia.com/cuda-90-download-archive. Select options: Linux->x86_64->Ubuntu->17.04->deb(local). Download the main file and the two patches. Run the commands below:

sudo dpkg -i cuda-repo-ubuntu1704-9-0-local_9.0.176-1_amd64.deb
sudo apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

Navigate to the first patch on your PC and double-click it; it will execute automatically. Do the same for the second patch.

Add the two lines below to your ~/.bashrc file and reboot:

export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
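The ${VAR:+:${VAR}} form in these exports appends a colon and the old value only when the variable is already set and non-empty. That avoids a dangling colon, which would leave an empty entry in the search path. The same rule as a Python sketch (prepend_path is a hypothetical helper used only to illustrate the expansion):

```python
def prepend_path(new_dir, current):
    """Mimic NEW${VAR:+:${VAR}}: add ':' + current only if current is set and non-empty."""
    return new_dir + (":" + current if current else "")


# Variable already has a value: the new directory goes in front.
print(prepend_path("/usr/local/cuda-9.0/bin", "/usr/bin:/bin"))

# Variable unset or empty: no trailing colon is produced.
print(prepend_path("/usr/local/cuda-9.0/lib64", ""))
```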

Installing cudnn 7.1.4 for CUDA 9.0

Download the tar file from https://developer.nvidia.com/cudnn and extract it to your Downloads folder. The download requires an NVIDIA developer login (free sign-up). Run the commands below:

cd ~/Downloads/cudnn-9.0-linux-x64-v7.1/cuda
sudo cp include/* /usr/local/cuda/include/
sudo cp lib64/libcudnn.so.7.1.4 lib64/libcudnn_static.a /usr/local/cuda/lib64/
cd /usr/lib/x86_64-linux-gnu
sudo ln -s libcudnn.so.7.1.4 libcudnn.so.7
sudo ln -s libcudnn.so.7 libcudnn.so
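The two ln -s commands above follow the usual shared-library pattern: the fully versioned file gets a major-version link, and the major-version link gets an unversioned one. A sketch that derives those link pairs from a file name, so the names cannot drift apart (symlink_chain is an illustrative helper, not part of cuDNN or any tool):

```python
def symlink_chain(versioned_name):
    """Return (target, link_name) pairs: full version -> major version -> unversioned."""
    base, _, version = versioned_name.partition(".so.")
    if not version:
        raise ValueError("expected a name like libfoo.so.1.2.3")
    major = version.split(".")[0]
    major_name = f"{base}.so.{major}"
    return [(versioned_name, major_name), (major_name, f"{base}.so")]


# Print the commands that would be run in the directory holding the library.
for target, link_name in symlink_chain("libcudnn.so.7.1.4"):
    print("sudo ln -s", target, link_name)
```

The links have to be created in the same directory as the file they point to, or the relative targets will not resolve.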

Installing NCCL 2.2.12 for CUDA 9.0

Download the tar file from https://developer.nvidia.com/nccl and extract it to your Downloads folder. The download requires an NVIDIA developer login (free sign-up). Run the commands below:

sudo mkdir -p /usr/local/cuda/nccl/lib /usr/local/cuda/nccl/include
cd ~/Downloads/nccl-repo-ubuntu1604-2.2.12-ga-cuda9.0_1-1_amd64/
sudo cp *.txt /usr/local/cuda/nccl
sudo cp include/*.h /usr/include/
sudo cp lib/libnccl.so.2.1.15 lib/libnccl_static.a /usr/lib/x86_64-linux-gnu/
sudo ln -s /usr/include/nccl.h /usr/local/cuda/nccl/include/nccl.h
cd /usr/lib/x86_64-linux-gnu
sudo ln -s libnccl.so.2.1.15 libnccl.so.2
sudo ln -s libnccl.so.2 libnccl.so
for i in libnccl*; do sudo ln -s /usr/lib/x86_64-linux-gnu/$i /usr/local/cuda/nccl/lib/$i; done

Install Bazel (the recommended manual installation of Bazel worked; for reference: https://docs.bazel.build/versions/master/install-ubuntu.html#install-with-installer-ubuntu)

Download "bazel-0.13.1-installer-darwin-x86_64.sh" from https://github.com/bazelbuild/bazel/releases. Run the commands below:

chmod +x bazel-0.13.1-installer-darwin-x86_64.sh
./bazel-0.13.1-installer-darwin-x86_64.sh --user
export PATH="$PATH:$HOME/bin"

Compiling Tensorflow

We will compile with CUDA, with XLA JIT (oh yeah), and with jemalloc as malloc support, so answer yes to those three questions. Run the commands below and answer the configuration prompts as shown:

git clone https://github.com/tensorflow/tensorflow
cd tensorflow
git checkout r1.8
./configure
You have bazel 0.13.0 installed.
Please specify the location of python. [Default is /usr/bin/python]:
Please input the desired Python library path to use.  Default is [/usr/local/lib/python2.7/dist-packages]
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: y
jemalloc as malloc support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
No Hadoop File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: n
No Apache Kafka Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with XLA JIT support? [y/N]: y
XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with GDR support? [y/N]: n
No GDR support will be enabled for TensorFlow.
Do you wish to build TensorFlow with VERBS support? [y/N]: n
No VERBS support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]:
Please specify the location where CUDA 9.1 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7.1.4
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Do you wish to build TensorFlow with TensorRT support? [y/N]: n
No TensorRT support will be enabled for TensorFlow.
Please specify the NCCL version you want to use. [Leave empty to default to NCCL 1.3]: 2.2.12
Please specify the location where NCCL 2 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:/usr/local/cuda/nccl
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.0]
Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/x86_64-linux-gnu-gcc-7]: /usr/bin/gcc-6
Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
 --config=mkl          # Build with MKL support.

 --config=monolithic   # Config for mostly static monolithic build.

Configuration finished
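The compute capability prompt is the step that matters for older cards. As far as I can tell from the error in the question, the runtime ignores any GPU whose capability is lower than the lowest one the binary was built for. A sketch of that comparison, under that assumption (both function names are made up for illustration and are not TensorFlow APIs):

```python
def parse_capability(text):
    """Turn a string such as '3.0' into a comparable (major, minor) tuple."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def device_is_usable(device_cc, built_ccs):
    """True if the device capability is at least the lowest capability in the build list.

    built_ccs is the comma-separated list entered at the configure prompt.
    Assumption: the runtime rejects devices below the lowest built capability.
    """
    lowest = min(parse_capability(cc) for cc in built_ccs.split(","))
    return parse_capability(device_cc) >= lowest


# A build made for 3.5 and 5.2 would ignore a 3.0 card.
print(device_is_usable("3.0", "3.5,5.2"))

# A build configured with 3.0, as in the transcript above, accepts it.
print(device_is_usable("3.0", "3.0"))
```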

Now compile TensorFlow with the command below. The build uses a lot of RAM and takes a long time. The "--local_resources 2048,.5,1.0" option limits the build so that it works with 2 GB of RAM; you can remove it if you have plenty of RAM.

bazel build --config=opt --config=cuda --local_resources 2048,.5,1.0 //tensorflow/tools/pip_package:build_pip_package
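In the Bazel releases of that era, the three values of --local_resources are, as I understand it, the available RAM in MB, the number of CPU cores, and an I/O share where 1.0 is an average workstation. A small sketch that assembles the build command from those numbers (bazel_build_command is a hypothetical helper; the defaults mirror the command above):

```python
def bazel_build_command(ram_mb=None, cpu_cores=0.5, io_share=1.0):
    """Return the bazel build command as an argument list.

    ram_mb=None leaves out --local_resources so Bazel picks its own limits.
    """
    cmd = ["bazel", "build", "--config=opt", "--config=cuda"]
    if ram_mb is not None:
        cmd += ["--local_resources", f"{ram_mb},{cpu_cores},{io_share}"]
    cmd.append("//tensorflow/tools/pip_package:build_pip_package")
    return cmd


# Limited build for a machine with about 2 GB of RAM.
print(" ".join(bazel_build_command(ram_mb=2048)))

# Unrestricted build for a machine with plenty of RAM.
print(" ".join(bazel_build_command()))
```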

Once the compilation has completed, the output should confirm that the build succeeded.

[image: successful build output]

Build the wheel file by running the command below:

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

Install the generated wheel file using pip:

sudo pip install /tmp/tensorflow_pkg/tensorflow*.whl

You can now run TensorFlow and list the available devices, for example from an IPython terminal.

[image: device listing in an IPython terminal]

Solution 2

In Anaconda, tensorflow-gpu=1.12 with cudatoolkit=9.0 is compatible with GPUs that have compute capability 3.0. Here are the commands for creating a new environment and installing the necessary libraries for 3.0 GPUs.

conda create -n tf-gpu
conda activate tf-gpu
conda install tensorflow-gpu=1.12
conda install cudatoolkit=9.0

Then you can try it as follows.

>python
import tensorflow as tf
tf.Session()

Here is my output

name: GeForce GT 650M major: 3 minor: 0 memoryClockRate(GHz): 0.95
pciBusID: 0000:01:00.0
totalMemory: 3.94GiB freeMemory: 3.26GiB
2019-12-09 13:26:11.753591: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-12-09 13:26:12.050152: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-09 13:26:12.050199: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2019-12-09 13:26:12.050222: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2019-12-09 13:26:12.050481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2989 MB memory) -> physical GPU (device: 0, name: GeForce GT 650M, pci bus id: 0000:01:00.0, compute capability: 3.0)
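If you want to check this from a script instead of reading the log by eye, the "Created TensorFlow device" line can be parsed. A minimal sketch, assuming the log format shown in my output (the regular expression and the function name created_devices are mine and may need adjusting for other TensorFlow versions):

```python
import re

# Matches the device description at the end of a "Created TensorFlow device" line.
DEVICE_PATTERN = re.compile(
    r"name: (?P<name>[^,]+), pci bus id: (?P<bus>[^,]+), "
    r"compute capability: (?P<major>\d+)\.(?P<minor>\d+)"
)


def created_devices(log_text):
    """Return (name, major, minor) for every GPU description found in the log text."""
    return [
        (m["name"], int(m["major"]), int(m["minor"]))
        for m in DEVICE_PATTERN.finditer(log_text)
    ]


sample = (
    "Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 "
    "with 2989 MB memory) -> physical GPU (device: 0, name: GeForce GT 650M, "
    "pci bus id: 0000:01:00.0, compute capability: 3.0)"
)
print(created_devices(sample))
```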

Enjoy !

Solution 3

Thank you for making your WHL available! I am now finally able to work with TF when I was fighting for days just to compile it (without success), as my laptop only supports Compute 3.0. I was not able to compile with your instructions on a fresh install of Ubuntu 18.04, and wanted to point out a couple of things:

  • In your 'Dependencies' section, libjasper is no longer available independently, ffmpeg is no longer available from the repository you have listed, and libtiff5-dev is no longer available (I think there is a new version of this). I know this is mostly for the OpenCV stuff, which I use too. You also have a couple of packages repeated, like git and unzip.
  • In your 'Nvidia Driver' section, I don't think that driver is available from the repository. At least I couldn't pull it. With your built WHL file I am using the 418 driver from the Nvidia website, and that seems to be working well.
  • In your 'Installing cudnn 7.1.4 for CUDA 9.0' section, you 'cd /usr/lib/x86_64-linux-gnu', but the files are in /usr/local/cuda. Is this correct? I'm guessing the links would at least have to be told to point back to the cuda folder.
  • In section 'Installing NCCL 2.2.12 for CUDA 9.0' you are using 2.2.12, but your command lines all reference 2.1.15.
  • In your Bazel install section, you say to use the Bazel Darwin installer, but I think this is for Mac. I think you need the Bazel Linux installer.
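The NCCL point is easy to catch mechanically. Here is a rough sketch that compares the version embedded in each library file name with the version the section claims to install (both helper names are hypothetical, and the file list is taken from the commands in the answer):

```python
import re


def version_in_filename(filename):
    """Return the version suffix of a name like libnccl.so.2.1.15, or None if absent."""
    match = re.search(r"\.so\.(\d+(?:\.\d+)*)$", filename)
    return match.group(1) if match else None


def mismatches(stated_version, filenames):
    """Return file names whose embedded version disagrees with stated_version.

    A shorter version such as '2' is accepted when it is a prefix of the stated one.
    """
    bad = []
    for name in filenames:
        found = version_in_filename(name)
        if found is None:
            continue  # unversioned files such as libnccl_static.a
        if found != stated_version and not stated_version.startswith(found + "."):
            bad.append(name)
    return bad


files = ["libnccl.so.2.1.15", "libnccl.so.2", "libnccl.so", "libnccl_static.a"]
print(mismatches("2.2.12", files))
```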

Thanks again for all of your work on this!

P.S. I was able to get this to build by checking out TensorFlow 1.12 and following these instructions, with keras_applications and keras_preprocessing installed via pip, using CUDA 9.2, cuDNN 7.1.4, NCCL 2.2.13, and Bazel 0.15.0. Some have pointed out that CUDA 9.0 can't be compiled against with gcc-6/g++-6. Apparently 9.2 can.

Solution 4

@Taako, so sorry for this late response. I did not save the wheel file from the compilation shown above. However, here is a new one for TensorFlow 1.9. I hope it helps. Please note the versions used for the build:

Tensorflow: 1.9
CUDA Toolkit: 9.2
CUDNN: 7.1.4
NCCL: 2.2.13

Below is the link to the wheel file: wheel file

Solution 5

For Tensorflow 2.1.0

I was able to manage it on Windows by compiling the source for TF 2.1.0. The TF 2.2.0 build failed because of XLA reasons, even with all XLA flags disabled for Bazel. Also be wary of using more recent Python versions: I was getting some weird errors in the prebuilt pip package using Python 3.8, so I used Python 3.6 to resolve that.

A word of warning - a few hours after the build finished and I started using the library, a simple model training that lasted only a few seconds worked just fine, but training for a basic convolutional network failed after 0 or 1 epochs due to CUDA errors. Your mileage may vary.

Author: Abhijay Ghildyal

Updated on July 17, 2022

Comments

  • Abhijay Ghildyal
    Abhijay Ghildyal almost 2 years

    I am installing tensorflow from source (documentation).

    Cuda driver version:

    nvcc: NVIDIA (R) Cuda compiler driver
    Cuda compilation tools, release 7.5, V7.5.17
    

    When I ran the following command :

    bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu
    

    it gave me the following error :

    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
    I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:118] Found device 0 with properties: 
    name: GeForce GT 640
    major: 3 minor: 0 memoryClockRate (GHz) 0.9015
    pciBusID 0000:05:00.0
    Total memory: 2.00GiB
    Free memory: 1.98GiB
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:138] DMA: 0 
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:148] 0:   Y 
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    F tensorflow/cc/tutorials/example_trainer.cc:128] Check failed: ::tensorflow::Status::OK() == (session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs)) (OK vs. Invalid argument: Cannot assign a device to node 'Cast': Could not satisfy explicit device specification '/gpu:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
         [[Node: Cast = Cast[DstT=DT_FLOAT, SrcT=DT_INT32, _device="/gpu:0"](Const)]])
    F tensorflow/cc/tutorials/example_trainer.cc:128] Check failed: ::tensorflow::Status::OK() == (session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs)) (OK vs. Invalid argument: Cannot assign a device to node 'Cast': Could not satisfy explicit device specification '/gpu:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
         [[Node: Cast = Cast[DstT=DT_FLOAT, SrcT=DT_INT32, _device="/gpu:0"](Const)]])
    F tensorflow/cc/tutorials/example_trainer.cc:128] Check failed: ::tensorflow::Status::OK() == (session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs)) (OK vs. Invalid argument: Cannot assign a device to node 'Cast': Could not satisfy explicit device specification '/gpu:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
         [[Node: Cast = Cast[DstT=DT_FLOAT, SrcT=DT_INT32, _device="/gpu:0"](Const)]])
    F tensorflow/cc/tutorials/example_trainer.cc:128] Check failed: ::tensorflow::Status::OK() == (session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs)) (OK vs. Invalid argument: Cannot assign a device to node 'Cast': Could not satisfy explicit device specification '/gpu:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
         [[Node: Cast = Cast[DstT=DT_FLOAT, SrcT=DT_INT32, _device="/gpu:0"](Const)]])
    Aborted (core dumped)
    

    Will I need a different gpu to run this?

    • Peter Hawkins
      Peter Hawkins over 7 years
      You need to specify compute capability 3.0 support when configuring Tensorflow. See: tensorflow.org/versions/r0.10/get_started/os_setup.html and github.com/tensorflow/tensorflow/issues/25
    • Abhijay Ghildyal
      Abhijay Ghildyal over 7 years
      I configured using TF_UNOFFICIAL_SETTING=1 ./configure and then after bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer I ran bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu. It still gives me the same error
    • Peter Hawkins
      Peter Hawkins over 7 years
      Did you explicitly request compute capability 3.0 support when running ./configure?
    • Abhijay Ghildyal
      Abhijay Ghildyal over 7 years
      It runs beautifully now. Thanks a ton!
  • Abhijay Ghildyal
    Abhijay Ghildyal almost 6 years
    Thanks, Manoj. It explains the TensorFlow installation very well. It'll be good for future reference.
  • Taako
    Taako almost 6 years
    @Manoj Kumar Das can you upload your .whl file for this compile? I'd really appreciate it
  • rayryeng
    rayryeng over 5 years
    I also have built a wheel for Tensorflow 1.12, CUDNN 7.2.1, NCCL: 2.2.13. You can send me a message in the MATLAB and Octave chat room if you need to get a hold of me: chat.stackoverflow.com/rooms/81987/chatlab-and-talktave
  • Kiryl Bielašeŭski
    Kiryl Bielašeŭski over 4 years
    Thank you, I spent so much time with dependencies and drivers on my old laptop with GT 750M but Conda solved my problem.
  • Mehdi
    Mehdi about 4 years
    guys, is it possible to compile TF2 for windows for cuda compatibility 3.0? There are some tuts for compiling TF1.x
  • Chris
    Chris almost 4 years
    @Mehdi I was able to manage it on Windows by compiling the source for TF2.1.0 . The TF 2.2.0 build failed because of XLA reasons, even with all XLA flags disabled for bazel. Also be wary of using more recent Python versions - I was getting some weird errors in the prebuilt pip package using Python 3.8, so I used Python 3.6 to resolve that.
  • Mehdi
    Mehdi almost 4 years
    @Chris, Is it possible for you to share your build, please?
  • user145453
    user145453 over 2 years
    Conda solved it, too. Older NVIDIA cards seem to work with specific lower tensorflow-gpu versions with corresponding lower dependency packages' versions.