How to install libcusolver.so.11
Solution 1
Can someone tell me what I am doing wrong
Nothing.
As noted in the comments, the CUDA 11.0 release does not include a version 11 of cuSolver; it ships libcusolver.so.10. There is plainly some logic in the bazel build that automatically derives the names of the component libraries from the major version of the toolkit it detects. That logic is not correct for the CUDA toolkit you have. I would raise this as a bug with the developers. You might be able to override the library version explicitly in some way, but I can't tell you how.
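To make the mismatch concrete, here is a minimal sketch of the kind of naming rule described above. It is not the actual build code: the functions expected_soname and shipped_soname are hypothetical, and the shipped versions are simply the ones reported in the comments on this question for CUDA 11.0.

```shell
# Hypothetical rule: lib<name>.so.<toolkit major version>.
expected_soname() {
    printf 'lib%s.so.%s\n' "$1" "$2"
}

# Versions shipped with CUDA 11.0, as reported in the comments on this question.
shipped_soname() {
    case "$1" in
        cudart|cublas) echo "lib$1.so.11" ;;
        cusolver)      echo "lib$1.so.10" ;;
        *)             return 1 ;;
    esac
}

# Compare what the naive rule asks for with what the toolkit provides.
for lib in cudart cublas cusolver; do
    want="$(expected_soname "$lib" 11)"
    have="$(shipped_soname "$lib")"
    if [ "$want" = "$have" ]; then
        echo "$lib: ok ($have)"
    else
        echo "$lib: mismatch (build expects $want, toolkit ships $have)"
    fi
done
```

Under these assumptions only cusolver is reported as a mismatch, which matches the "No library found under: /usr/local/cuda/lib64/libcusolver.so.11" error in the question.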
Solution 2
If you want a concrete solution, find libcusolver.so.10 on your machine and create a symbolic link named libcusolver.so.11 that points to it.
The following command solved the issue for me:
sudo ln -s /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.10 /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.11
Credit to: https://github.com/tensorflow/tensorflow/issues/43947
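A slightly more defensive variant of the same workaround might look like the sketch below. The helper name link_soname is hypothetical, and it creates a relative link rather than the absolute one used above. It refuses to act if the source library is missing and leaves an existing target alone. The demonstration runs against a scratch directory so that nothing under /usr/local is touched.

```shell
# Create <dir>/<new_name> as a symlink to <old_name>, but only when it is safe.
link_soname() {
    dir="$1"; old_name="$2"; new_name="$3"
    if [ ! -e "$dir/$old_name" ]; then
        echo "missing: $dir/$old_name" >&2
        return 1
    fi
    if [ -e "$dir/$new_name" ] || [ -L "$dir/$new_name" ]; then
        echo "already present: $dir/$new_name" >&2
        return 0
    fi
    # Relative target, so the link stays valid if the directory is moved.
    ln -s "$old_name" "$dir/$new_name"
}

# Dry run against a scratch directory instead of the real CUDA tree.
demo_dir="$(mktemp -d)"
touch "$demo_dir/libcusolver.so.10"
link_soname "$demo_dir" libcusolver.so.10 libcusolver.so.11
ls -l "$demo_dir"
```

To apply it to a real installation you would pass the CUDA library directory instead of the scratch directory and run it with sufficient privileges (for example in a root shell), since /usr/local/cuda-11.0 is normally not writable by ordinary users.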
puk
Updated on June 04, 2022
Comments
-
puk almost 2 years
I am trying to install TensorFlow, but it is asking for libcusolver.so.11 and I only have libcusolver.so.10. Can someone tell me what I am doing wrong?
Here are my Ubuntu, NVIDIA and CUDA versions:
$ uname -a
Linux *****-dev-01 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ nvidia-smi --query-gpu=gpu_name --format=csv | tail -n 1
GeForce GTX 1650
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Thu_Jun_11_22:26:38_PDT_2020
Cuda compilation tools, release 11.0, V11.0.194
Build cuda_11.0_bu.TC445_37.28540450_0
Here is how I am building tensorflow
$ git clone https://github.com/tensorflow/tensorflow.git
$ cd ./tensorflow
$ git checkout tags/v2.2.0
$ ./configure
$ bazel build --config=v2 --config=cuda --config=monolithic --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-msse4.1 --copt=-msse4.2 --copt=-Wno-sign-compare //tensorflow:libtensorflow_cc.so
Here is the error I am receiving
ERROR: An error occurred during the fetch of repository 'local_config_cuda':
  Traceback (most recent call last):
    File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 1210
      _create_local_cuda_repository(<1 more arguments>)
    File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
      _find_libs(repository_ctx, <2 more arguments>)
    File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
      _check_cuda_libs(repository_ctx, <2 more arguments>)
    File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
      execute(repository_ctx, <1 more arguments>)
    File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/remote_config/common.bzl", line 208, in execute
      fail(<1 more arguments>)
Repository command failed
No library found under: /usr/local/cuda/lib64/libcusolver.so.11
ERROR: Skipping '//tensorflow:libtensorflow_cc.so': no such package '@local_config_cuda//cuda':
  Traceback (most recent call last):
    File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 1210
      _create_local_cuda_repository(<1 more arguments>)
    File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
      _find_libs(repository_ctx, <2 more arguments>)
    File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
      _check_cuda_libs(repository_ctx, <2 more arguments>)
    File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
      execute(repository_ctx, <1 more arguments>)
    File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/remote_config/common.bzl", line 208, in execute
      fail(<1 more arguments>)
Repository command failed
No library found under: /usr/local/cuda/lib64/libcusolver.so.11
WARNING: Target pattern parsing failed.
ERROR: no such package '@local_config_cuda//cuda':
  Traceback (most recent call last):
    File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 1210
      _create_local_cuda_repository(<1 more arguments>)
    File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
      _find_libs(repository_ctx, <2 more arguments>)
    File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
      _check_cuda_libs(repository_ctx, <2 more arguments>)
    File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
      execute(repository_ctx, <1 more arguments>)
    File "/home/********/Documents/foo/.temp_install_dir/tensorflow/tensorflow/third_party/remote_config/common.bzl", line 208, in execute
      fail(<1 more arguments>)
Repository command failed
No library found under: /usr/local/cuda/lib64/libcusolver.so.11
INFO: Elapsed time: 1.998s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
    currently loading: tensorflow
-
Robert Crovella almost 4 years
There is currently no libcusolver.so.11 from NVIDIA. The latest available CUDA 11 Linux install actually installs libcusolver.so, libcusolver.so.10, and libcusolver.so.10.5.0.218 in /usr/local/cuda/lib64. This is in spite of the fact that, for example, the libcudart installed there is libcudart.so.11 and the libcublas is libcublas.so.11 (whereas libcufft is also libcufft.so.10). So this is rather unusual and may be tripping up your build process. I'm not really familiar with how bazel does this, but if it is attempting to link against libcusolver.so.11, that is broken.
Robert Crovella almost 4 years
See here for documented confirmation. When I say "broken", I mean that if bazel is looking for libcusolver.so.11, then either bazel is broken or something you fed to bazel by way of configuration broke it. As a workaround, you might want to switch to CUDA 10.2, since there are certainly TF versions that have been built against CUDA 10.2.
-
Robert Crovella almost 4 years
Another alternative would be to switch to the latest NGC TF container, which has TF using CUDA 11. Or perhaps you need to update to a newer TF branch and a newer bazel to pick up some fixes for this.
-
MSI over 2 years
@CherryDT I am facing the same error: "Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory". What's the problem here?
-
mdiener almost 3 years
If this still does not work, additionally try "export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
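The ${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}} part of that command appends a colon and the old value only when the variable is already set and non-empty, so no stray trailing colon is left behind (an empty entry in LD_LIBRARY_PATH is treated as the current directory). A small sketch of the same expansion, using a hypothetical helper prepend_path and a made-up existing entry /opt/other/lib:

```shell
# Prepend a directory to a colon-separated path list without a trailing colon.
prepend_path() {
    new_dir="$1"; current="$2"
    # ${current:+:$current} expands to ":<current>" only if current is non-empty.
    printf '%s%s\n' "$new_dir" "${current:+:$current}"
}

prepend_path /usr/local/cuda-11.0/lib64 ""
prepend_path /usr/local/cuda-11.0/lib64 /opt/other/lib
```

The first call prints just the CUDA directory; the second prints the CUDA directory followed by a colon and the existing entry.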
-
Geoffrey Anderson over 2 years
This answer is good. It should be the official answer.
-
Ufos over 2 years
Weirdest stuff ever: this did not help on its own. I then also needed to symlink it into my virtual environment:
# sudo ln -s /usr/local/cuda/targets/x86_64-linux/lib/libcusolver.so.11 .venv/lib/python3.9/site-packages/tensorflow/python/libcusolver.so.11
-
Yan Varakin over 2 years
Ufos, thank you, your extended method also worked for me. It took me a lot of time before I solved this.
-
profPlum over 2 years
Has this bug been fixed yet?
-
Alessio Mora about 2 years
Thank you, Ufos. I had the same problem when using TF in a virtual environment on a remote host (the trick explained in Aleksey's answer only worked directly on the host machine). To make it work, I also had to use Ufos's suggestion.