How to install libcusolver.so.11

Solution 1

"Can someone tell me what I am doing wrong?"

Nothing.

As noted in the comments, there is no version 11 of cuSOLVER in the CUDA 11.0 release; the toolkit ships libcusolver.so.10. There is plainly some logic built into bazel which automatically derives the names of the component libraries from the major version of the toolkit it detects. That logic is not correct for the CUDA toolkit you have. I would raise this as a bug with the developers of bazel. You might be able to override it explicitly in some way, but I can't tell you how.
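
To see the mismatch on your own machine, you can list the versioned library names the toolkit actually installed. This is only a minimal sketch: `list_cuda_sonames` is a hypothetical helper name, and the default directory is the one shown in the error message in the question, which may differ on your system.

```shell
# Hypothetical helper: print the versioned names of a few CUDA libraries
# found in one directory (default taken from the question's error message).
list_cuda_sonames() {
    lib_dir=${1:-/usr/local/cuda/lib64}
    for lib in cudart cublas cufft cusolver; do
        for f in "$lib_dir"/lib"$lib".so.*; do
            # An unmatched glob stays literal, so check that the file exists.
            if [ -e "$f" ]; then
                basename "$f"
            fi
        done
    done
}
```

On a CUDA 11.0 install, `list_cuda_sonames` would be expected to show libcusolver.so.10 next to libcudart.so.11 and libcublas.so.11, matching what the comments describe.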

Solution 2

If you want a concrete workaround, find libcusolver.so.10 on your machine and create a symbolic link named libcusolver.so.11 that points to it.

The following command solved the issue for me:

sudo ln -s /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.10 /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.11

Credit to: https://github.com/tensorflow/tensorflow/issues/43947
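
A slightly more defensive variant of the command above checks that the real library exists and refuses to overwrite anything already in place. It is a sketch under assumptions: `link_cusolver` is a hypothetical helper name, and the directory you pass must be the one that actually holds libcusolver.so.10 on your machine.

```shell
# Hypothetical helper: create libcusolver.so.11 as a symlink to
# libcusolver.so.10 inside the directory given as the first argument.
link_cusolver() {
    lib_dir=$1
    src="$lib_dir/libcusolver.so.10"
    dst="$lib_dir/libcusolver.so.11"

    # Stop if the real library is not in this directory.
    if [ ! -e "$src" ]; then
        echo "not found: $src" >&2
        return 1
    fi

    # Leave an existing file or link untouched.
    if [ -e "$dst" ] || [ -L "$dst" ]; then
        echo "already present: $dst" >&2
        return 0
    fi

    # Relative target: the link keeps working if the directory is moved.
    ln -s libcusolver.so.10 "$dst"
}
```

Run as root, `link_cusolver /usr/local/cuda-11.0/targets/x86_64-linux/lib` has the same effect as the command above, except that the link target is stored as a relative name.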


Author: puk

Updated on June 04, 2022

Comments

  • puk
    puk almost 2 years

    I am trying to install TensorFlow, but it is asking for libcusolver.so.11 and I only have libcusolver.so.10. Can someone tell me what I am doing wrong?

    Here are my Ubuntu, nvidia and CUDA versions

    $ uname -a
    Linux *****-dev-01 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
    
    $ nvidia-smi --query-gpu=gpu_name --format=csv|tail -n 1
    GeForce GTX 1650
    
    $ nvcc --version
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2020 NVIDIA Corporation
    Built on Thu_Jun_11_22:26:38_PDT_2020
    Cuda compilation tools, release 11.0, V11.0.194
    Build cuda_11.0_bu.TC445_37.28540450_0
    

    Here is how I am building tensorflow

    $ git clone https://github.com/tensorflow/tensorflow.git
    $ cd ./tensorflow
    $ git checkout tags/v2.2.0
    $ ./configure
    $ bazel build --config=v2 --config=cuda --config=monolithic --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-msse4.1 --copt=-msse4.2 --copt=-Wno-sign-compare //tensorflow:libtensorflow_cc.so
    

    Here is the error I am receiving

    ERROR: An error occurred during the fetch of repository 'local_config_cuda':
        Traceback (most recent call last):
         File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 1210
             _create_local_cuda_repository(<1 more arguments>)
         File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
             _find_libs(repository_ctx, <2 more arguments>)
         File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
             _check_cuda_libs(repository_ctx, <2 more arguments>)
         File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
             execute(repository_ctx, <1 more arguments>)
         File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/remote_config/common.bzl", line 208, in execute
             fail(<1 more arguments>)
     Repository command failed
     No library found under: /usr/local/cuda/lib64/libcusolver.so.11
     ERROR: Skipping '//tensorflow:libtensorflow_cc.so': no such package '@local_config_cuda//cuda': Traceback (most recent call last):
         File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 1210
             _create_local_cuda_repository(<1 more arguments>)
         File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
             _find_libs(repository_ctx, <2 more arguments>)
         File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
             _check_cuda_libs(repository_ctx, <2 more arguments>)
         File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
             execute(repository_ctx, <1 more arguments>)
         File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/remote_config/common.bzl", line 208, in execute
             fail(<1 more arguments>)
     Repository command failed
     No library found under: /usr/local/cuda/lib64/libcusolver.so.11
     WARNING: Target pattern parsing failed.
     ERROR: no such package '@local_config_cuda//cuda': Traceback (most recent call last):
         File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 1210
             _create_local_cuda_repository(<1 more arguments>)
         File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
             _find_libs(repository_ctx, <2 more arguments>)
         File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
             _check_cuda_libs(repository_ctx, <2 more arguments>)
         File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
             execute(repository_ctx, <1 more arguments>)
         File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/remote_config/common.bzl", line 208, in execute
             fail(<1 more arguments>)
     Repository command failed
     No library found under: /usr/local/cuda/lib64/libcusolver.so.11
     INFO: Elapsed time: 1.998s
     INFO: 0 processes.
     FAILED: Build did NOT complete successfully (0 packages loaded)
         currently loading: tensorflow
    
    • Robert Crovella
      Robert Crovella almost 4 years
      There is currently no libcusolver.so.11 from NVIDIA. The latest available CUDA 11 Linux install actually installs libcusolver.so, libcusolver.so.10, and libcusolver.so.10.5.0.218 in /usr/local/cuda/lib64. This is in spite of the fact that, for example, the libcudart installed there is libcudart.so.11 and the libcublas is libcublas.so.11 (whereas libcufft is also libcufft.so.10). So this is rather unusual and may be tripping up your build process. I'm not really familiar with how bazel does this, but if it is attempting to link against libcusolver.so.11, that is broken.
    • Robert Crovella
      Robert Crovella almost 4 years
      See here for documented confirmation. When I say "broken", I mean that if bazel is looking for libcusolver.so.11, then either bazel is broken or something you fed to bazel by way of configuration broke it. As a workaround, you might want to switch to CUDA 10.2, since there are certainly TF builds that have been made against CUDA 10.2.
    • Robert Crovella
      Robert Crovella almost 4 years
      Another alternative would be to switch to the latest NGC TF container, which has TF using CUDA 11. Or perhaps you need to update to a newer TF branch and a newer bazel to pick up some fixes for this.
    • MSI
      MSI over 2 years
      @CherryDT I am facing the same kind of error: "Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory". What's the problem here?
  • mdiener
    mdiener almost 3 years
    If this still does not work, try additionally "export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
  • Geoffrey Anderson
    Geoffrey Anderson over 2 years
    This answer is good. It should be the official answer.
  • Ufos
    Ufos over 2 years
    Weirdest stuff ever: this did not help on its own. I then also needed to symlink it into my virtual environment: sudo ln -s /usr/local/cuda/targets/x86_64-linux/lib/libcusolver.so.11 .venv/lib/python3.9/site-packages/tensorflow/python/libcusolver.so.11
  • Yan Varakin
    Yan Varakin over 2 years
    Ufos, thank you, your extended method also worked for me. It took a lot of time before I solved this.
  • profPlum
    profPlum over 2 years
    Has this bug been fixed yet?
  • Alessio Mora
    Alessio Mora about 2 years
    Thank you, Ufos. I had the same problem when using TF in a virtual environment on a remote host (the trick explained in Aleksey's answer only worked directly on the host machine). To make it work I also had to use Ufos's suggestion.
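
The LD_LIBRARY_PATH line in mdiener's comment relies on the `${VAR:+...}` parameter expansion, which only appends the old value, with its leading colon, when the variable is already set and non-empty. A small sketch of the same idiom follows; `prepend_ld_path` is a hypothetical helper name, and the directory you pass is whatever holds the CUDA libraries on your system.

```shell
# Hypothetical helper: put one directory at the front of LD_LIBRARY_PATH.
prepend_ld_path() {
    # ${LD_LIBRARY_PATH:+:...} expands to ":<old value>" only when the
    # variable is set and non-empty, so no stray trailing ":" is produced
    # (the dynamic loader treats an empty entry as the current directory).
    LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
    export LD_LIBRARY_PATH
}
```

For example, `prepend_ld_path /usr/local/cuda-11.0/lib64` would have the same effect as the export in that comment.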