
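Setting up DeepSpeech in a conda environment on Ubuntu

Preface

Many of the papers I have been reading lately use DeepSpeech, which is arguably one of the best ASR projects you can currently get your hands on. Getting it configured on a GPU server took some twists and turns, so here is a record of the pitfalls I hit.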

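Main text

Everything below was done inside a Docker container.

The default configuration of the Docker image is:
CUDA 11.1 + Python 3.6 + torch 1.10.0.dev20210622+cu11.1 + Keras 2.4.3 + TensorFlow 2.5.0

As you can see, CUDA 11.1 comes preinstalled.

I followed the official documentation: https://deepspeech.readthedocs.io/en/r0.9/?badge=latest

First, create a new virtual environment. Since Conda is available, there is no need for virtualenv or venv.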

conda create -n deepspeech python=3.8

Next, activate the virtual environment:

conda activate deepspeech

Then, following the documentation, install DeepSpeech (I tried conda first):

conda install deepspeech-gpu

This produced an error:

PackagesNotFoundError: The following packages are not available from current channels:

- deepspeech-gpu

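A friend with a lot of experience setting up environments told me not to mix pip and conda, but conda does not have this package. After searching for a long while I had to fall back on pip (the command in the official documentation uses pip as well).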

pip3 install deepspeech

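Quick and painless. What surprised me is that TensorFlow did not need to be installed separately (the "TensorFlow: v2.3.0" line in the later output suggests the wheel bundles its own TensorFlow runtime). The output was as follows.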

Collecting deepspeech
Downloading deepspeech-0.9.3-cp38-cp38-manylinux1_x86_64.whl (9.2 MB)
|████████████████████████████████| 9.2 MB 416 kB/s
Collecting numpy>=1.17.3
Downloading numpy-1.21.3-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB)
|████████████████████████████████| 15.7 MB 24.2 MB/s
Installing collected packages: numpy, deepspeech
Successfully installed deepspeech-0.9.3 numpy-1.21.3
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

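Then set up the pre-trained model and the sample data as described in the documentation. I will not go into detail here. If the download fails on the server, download the files on the SSH client machine and copy them over.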
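For reference, the release files can also be fetched from the command line. The sketch below assumes the GitHub release URL layout used in the 0.9 documentation; `ds_release_url` and `fetch_release_files` are hypothetical helper names, not part of DeepSpeech.

```shell
#!/usr/bin/env bash
# Build the download URL for a DeepSpeech release asset.
# Assumption: assets live under the mozilla/DeepSpeech GitHub releases.
ds_release_url() {
    local version="$1" file="$2"
    printf 'https://github.com/mozilla/DeepSpeech/releases/download/v%s/%s\n' "$version" "$file"
}

# Download the model, the scorer and the sample audio for one version.
fetch_release_files() {
    local version="$1" file
    for file in "deepspeech-$version-models.pbmm" \
                "deepspeech-$version-models.scorer" \
                "audio-$version.tar.gz"; do
        curl -LO "$(ds_release_url "$version" "$file")" || return 1
    done
    tar xf "audio-$version.tar.gz"   # unpacks the sample WAV files into audio/
}

# Usage (needs network access):
#   fetch_release_files 0.9.3
```

If the server cannot reach GitHub, the same files can be downloaded on the client and copied over with scp or sftp.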

deepspeech --model deepspeech-0.9.3-models.pbmm --scorer deepspeech-0.9.3-models.scorer --audio audio/2830-3980-0043.wav

Running this as a first ASR test gives the following output.

Loading model from file deepspeech-0.9.3-models.pbmm
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.3-0-gf2e9c85
2021-10-28 07:40:01.471050: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Loaded model in 0.024s.
Loading scorer from files deepspeech-0.9.3-models.scorer
Loaded scorer in 0.00022s.
Running inference.
experience proves this
Inference took 1.575s for 1.975s audio file.

No problems: the transcription comes out correctly. But we want GPU acceleration.

pip3 install deepspeech-gpu
# output follows
Collecting deepspeech-gpu
Downloading deepspeech_gpu-0.9.3-cp38-cp38-manylinux1_x86_64.whl (22.3 MB)
|████████████████████████████████| 22.3 MB 698 kB/s
Requirement already satisfied: numpy>=1.17.3 in /root/miniconda3/envs/deepspeech/lib/python3.8/site-packages (from deepspeech-gpu) (1.21.3)
Installing collected packages: deepspeech-gpu
Successfully installed deepspeech-gpu-0.9.3
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

Run the transcription again.

2021-10-28 07:53:06.561665: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-10-28 07:53:06.561703: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Loading model from file deepspeech-0.9.3-models.pbmm
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.3-0-gf2e9c85
2021-10-28 07:53:06.786411: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-10-28 07:53:06.792126: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
# (some GPU information omitted)
2021-10-28 07:53:07.255877: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-10-28 07:53:07.256068: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory
2021-10-28 07:53:07.257113: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2021-10-28 07:53:07.257399: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2021-10-28 07:53:07.257546: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory
2021-10-28 07:53:07.257690: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcusparse.so.10'; dlerror: libcusparse.so.10: cannot open shared object file: No such file or directory
2021-10-28 07:53:07.257829: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory
2021-10-28 07:53:07.257847: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
# (some GPU information omitted)
Loaded model in 0.492s.
Loading scorer from files deepspeech-0.9.3-models.scorer
Loaded scorer in 0.000226s.
Running inference.
experience proves this
Inference took 1.785s for 1.975s audio file.
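The warnings above are easier to act on as a plain list of missing libraries. A minimal sketch, assuming the log format shown in this post; `missing_cuda_libs` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Read a TensorFlow log on stdin and print each shared library that
# failed to load, one per line, without duplicates.
missing_cuda_libs() {
    grep -o "Could not load dynamic library '[^']*'" \
        | cut -d"'" -f2 \
        | LC_ALL=C sort -u
}

# Usage: merge stderr into stdout so the log lines reach the pipe.
#   deepspeech --model deepspeech-0.9.3-models.pbmm \
#       --scorer deepspeech-0.9.3-models.scorer \
#       --audio audio/2830-3980-0043.wav 2>&1 | missing_cuda_libs
```

For the run above this lists libcublas.so.10, libcudart.so.10.1, libcudnn.so.7, libcusolver.so.10 and libcusparse.so.10.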

Hmm... in short, a whole series of libraries is missing. Most of the libraries listed above belong to CUDA 10.1, so let's install CUDA 10.1 and see.

conda install cudatoolkit=10.1 -c nvidia

The download is a bit over 300 MB. Once it is installed, test again.

# only the key output is shown
2021-10-28 07:58:37.492337: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-10-28 07:58:37.494255: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2021-10-28 07:58:37.496059: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2021-10-28 07:58:37.496379: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2021-10-28 07:58:37.498280: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2021-10-28 07:58:37.499158: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2021-10-28 07:58:37.499378: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory
2021-10-28 07:58:37.499395: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...

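Most of the libraries now load, but 'libcudnn.so.7' is still missing... A quick search suggested that it is a CUDA 9.0 library file... at this point I was silently screaming. (Strictly speaking, libcudnn.so.7 belongs to cuDNN 7, which is distributed separately from the CUDA toolkit and was built for several CUDA versions, not only 9.0.)

OK, then let's install CUDA 9.0.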

# I also tried installing cudnn, still holding out hope, but it did not help
conda install cudnn -c nvidia
# the command that matters is the one below
conda install cudatoolkit=9.0 -c nvidia

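After the install finished, I discovered that cudatoolkit versions overwrite each other... once 9.0 was in, the 10.1 libraries could no longer be loaded.

So I installed 10.1 again.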

conda install cudatoolkit=10.1 -c nvidia

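Following the trail to the install location, I eventually found the CUDA library files under /root/miniconda3/envs/deepspeech/lib/.

I found that the DNN library that came with CUDA 10.1 was libcudnn.so.8.0.4. That gave me an idea: what if I created a link file named libcudnn.so.7 that points to libcudnn.so.8.0.4?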
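The post does not show the exact command used for the link, so the following is only a sketch of one way to do it. `link_lib_alias` is a hypothetical helper; it refuses to touch an existing file. Note that a link like this only satisfies the loader's file-name lookup and says nothing about whether the two library versions are actually compatible.

```shell
#!/usr/bin/env bash
# Create ALIAS as a relative symlink to TARGET inside LIB_DIR.
# Fails if TARGET is missing or ALIAS already exists.
link_lib_alias() {
    local lib_dir="$1" target="$2" alias_name="$3"
    if [ ! -e "$lib_dir/$target" ]; then
        echo "missing target: $lib_dir/$target" >&2
        return 1
    fi
    if [ -e "$lib_dir/$alias_name" ] || [ -L "$lib_dir/$alias_name" ]; then
        echo "refusing to overwrite: $lib_dir/$alias_name" >&2
        return 1
    fi
    ln -s "$target" "$lib_dir/$alias_name"
}

# Usage, with the path and file names from this post:
#   link_lib_alias /root/miniconda3/envs/deepspeech/lib libcudnn.so.8.0.4 libcudnn.so.7
```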

Then I ran the transcription once more.

The result... it hung. Its last few lines of output before dying were:

2021-10-28 08:39:04.099199: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-10-28 08:39:04.102018: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2021-10-28 08:39:04.103797: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2021-10-28 08:39:04.104111: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2021-10-28 08:39:04.106148: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2021-10-28 08:39:04.107205: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2021-10-28 08:39:04.107373: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2021-10-28 08:39:04.111394: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0, 1, 2

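I waited a few minutes and nothing else happened. CTRL-C had no effect either.

Very annoying... So I pressed CTRL-Z to suspend it, looked up the process ID and killed it in one go. The kill went through this time, with no need to force it. Thinking it over, this is probably a version-compatibility problem: the DNN library on the CUDA 10.1 side (the file behind libcudnn.so.7) is noticeably smaller than the CUDA 9.0 one, so maybe some things are simply gone?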
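The suspend, look up and kill sequence can be wrapped in a small helper. This is a minimal sketch rather than what was typed at the time; `stop_process` is a hypothetical name. It sends SIGTERM first and only falls back to SIGKILL if the process is still around after a grace period.

```shell
#!/usr/bin/env bash
# Stop a process politely, force-killing it only as a last resort.
# Usage: stop_process PID [GRACE_SECONDS]
stop_process() {
    local pid="$1" grace="${2:-5}" i=0
    kill -TERM "$pid" 2>/dev/null || return 0    # already gone
    # A process suspended with CTRL-Z only acts on SIGTERM once it is
    # continued; SIGCONT is harmless if it was running anyway.
    kill -CONT "$pid" 2>/dev/null || true
    while [ "$i" -lt "$grace" ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # exited after SIGTERM
        sleep 1
        i=$((i + 1))
    done
    kill -KILL "$pid" 2>/dev/null || true        # last resort
}

# Usage: after CTRL-Z, `jobs -l` prints the PID of the suspended job.
#   stop_process 51805
```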

So I copied the DNN library out of the CUDA 9.0 install, and then ran:

conda install cudatoolkit=10.1 -c nvidia

The upgrade took about a second. Very fast.

Then I copied libcudnn.so.7 into the library directory. Moment of truth: run the transcription command once more.

2021-10-28 08:41:49.454505: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-10-28 08:41:49.456171: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2021-10-28 08:41:49.457685: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2021-10-28 08:41:49.457960: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2021-10-28 08:41:49.459548: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2021-10-28 08:41:49.460527: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
[1] 51805 bus error (core dumped) deepspeech --model deepspeech-0.9.3-models.pbmm --scorer --audio

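Confused question marks all around.

I really did not want to deal with this any more. What on earth is going on with DeepSpeech?

I searched for a long time without finding a solution, and I have had a lot on my plate lately, so I am leaving this as an open item for now...

Just when the mountains and rivers seem to leave no road ahead, another village appears among shady willows and bright flowers.

Most of the material I could find online was no help. In some of it, the CUDA version that DeepSpeech matched at the time was still 10.0, and even 10.1 did not work.

Then it hit me: the real problem is that the material is too old. So I restricted the Google search to the past year, and found this.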

https://qiita.com/dauuricus/items/63f5604080846ca25868

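This is a Japanese tech blogging site, and I suddenly noticed that the author uses the same DeepSpeech version as I do (0.9.3). Chrome translate, go!

It turned out that he ran it on Google Colab. OK, let's try that. Notebook file link: https://gist.github.com/dauuricus/d4551587784838d5a5fd8ef568970d32#file-deepspeech-ipynb

Open Google Colab at https://colab.research.google.com/ and upload the downloaded ipynb file.

Shift + Enter, n times.

It actually ran, which left me puzzled: the libraries and CUDA on Colab should all be recent versions, so why does DeepSpeech work there?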

whereis libcudnn.so.7

So I searched for this library with the command above, and found it here:

libcudnn.so: /usr/lib/x86_64-linux-gnu/libcudnn.so.7 /usr/lib/x86_64-linux-gnu/libcudnn.so.8 /usr/lib/x86_64-linux-gnu/libcudnn.so

Very direct: cuDNN 8 simply stands in for 7.

That made me think: this Docker image already has CUDA 11.1 installed, so maybe its cuDNN library can be used? So:

❯ cd /usr/lib/x86_64-linux-gnu/
❯ ls libcudnn.so -li
12345678 lrwxrwxrwx 1 root root 29 Dec 15 2020 libcudnn.so -> /etc/alternatives/libcudnn_so
❯ ls /etc/alternatives/libcudnn_so -li
12345679 lrwxrwxrwx 1 root root 39 Dec 15 2020 /etc/alternatives/libcudnn_so -> /usr/lib/x86_64-linux-gnu/libcudnn.so.8
❯ ls libcudnn.so.8 -li
12345677 lrwxrwxrwx 1 root root 17 Nov 6 2020 libcudnn.so.8 -> libcudnn.so.8.0.5
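The same chain can be followed without running ls at every hop. A minimal sketch; `link_chain` is a hypothetical helper that prints each symlink hop until it reaches a regular file. If only the final file matters, `readlink -f` gives it in one step.

```shell
#!/usr/bin/env bash
# Print every hop of a symlink chain as "link -> target".
# Relative targets are resolved against the directory of the link.
link_chain() {
    local path="$1" target hops=0
    while [ -L "$path" ] && [ "$hops" -lt 40 ]; do   # cap guards against loops
        target=$(readlink "$path")
        case "$target" in
            /*) ;;                                    # absolute target
            *)  target="$(dirname "$path")/$target" ;;
        esac
        printf '%s -> %s\n' "$path" "$target"
        path="$target"
        hops=$((hops + 1))
    done
}

# Usage, with the path from this post:
#   link_chain /usr/lib/x86_64-linux-gnu/libcudnn.so
#   readlink -f /usr/lib/x86_64-linux-gnu/libcudnn.so
```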

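Good. Next, copy libcudnn.so.8.0.5 from this path into the lib directory of the virtual environment. (I actually ran more commands than the one shown here, because I also made backups.)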

cp libcudnn.so.8.0.5 /root/miniconda3/envs/deepspeech/lib/libcudnn.so.7
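The backup commands are not shown in the post. One way to combine backup and copy is sketched below; `install_lib_copy` and the `.bak` suffix are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Copy SRC to DEST, first moving any existing DEST aside as DEST.bak.
install_lib_copy() {
    local src="$1" dest="$2"
    if [ -e "$dest" ] || [ -L "$dest" ]; then
        mv -- "$dest" "$dest.bak" || return 1   # keep the old file or link
    fi
    cp -- "$src" "$dest"                         # cp follows a symlinked SRC
}

# Usage, with the paths from this post:
#   install_lib_copy /usr/lib/x86_64-linux-gnu/libcudnn.so.8.0.5 \
#       /root/miniconda3/envs/deepspeech/lib/libcudnn.so.7
```

Restoring the previous state is then a matter of moving the `.bak` file back over the copy.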

Run the command again.

❯ deepspeech --model deepspeech-0.9.3-models.pbmm --scorer deepspeech-0.9.3-models.scorer --audio audio/2830-3980-0043.wav
2021-10-28 10:05:39.482388: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Loading model from file deepspeech-0.9.3-models.pbmm
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.3-0-gf2e9c85
2021-10-28 10:05:39.654008: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-10-28 10:05:39.658194: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2021-10-28 10:05:40.081253: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
# (some GPU information omitted)
2021-10-28 10:05:40.082526: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-10-28 10:05:40.084675: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2021-10-28 10:05:40.086519: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2021-10-28 10:05:40.086845: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2021-10-28 10:05:40.088980: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2021-10-28 10:05:40.089999: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2021-10-28 10:05:40.090156: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2021-10-28 10:05:40.093871: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0, 1, 2
2021-10-28 10:05:41.060113: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-28 10:05:41.060163: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0 1 2
2021-10-28 10:05:41.060189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N N N
2021-10-28 10:05:41.060196: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 1: N N Y
2021-10-28 10:05:41.060205: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 2: N Y N
# (some GPU information omitted)
Loaded model in 1.57s.
Loading scorer from files deepspeech-0.9.3-models.scorer
Loaded scorer in 0.000257s.
Running inference.
2021-10-28 10:05:41.278758: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
experience proves this
Inference took 0.699s for 1.975s audio file.

Success: DeepSpeech now runs on the GPU.
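To compare runs, the "Inference took" line can be turned into a real-time factor, that is, processing time divided by audio duration (lower is faster). A minimal sketch, assuming the output format shown in this post; `real_time_factor` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Read deepspeech output on stdin and print the real-time factor
# (inference seconds / audio seconds) with two decimals.
real_time_factor() {
    # "$3 + 0" makes awk read the leading number of a field like "1.575s".
    LC_ALL=C awk '/^Inference took/ { printf "%.2f\n", ($3 + 0) / ($5 + 0) }'
}

# Usage: merge stderr into stdout so the timing line reaches the pipe.
#   deepspeech --model deepspeech-0.9.3-models.pbmm \
#       --scorer deepspeech-0.9.3-models.scorer \
#       --audio audio/2830-3980-0043.wav 2>&1 | real_time_factor
```

With the numbers in this post, the first CPU run (1.575 s for 1.975 s of audio) comes out at about 0.80 and the GPU run (0.699 s) at about 0.35, so the GPU run is roughly 2.25 times faster on this short clip.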

ohhhh

If anything else needs adding later, I will come back and fill it in.