Frequently Asked Questions

Q1 Undefined symbol

  • undefined symbol: _ZN9mindspore5trace15GetDebugInfoStrERKSt10shared_ptrINS_9DebugInfoEERKSsNS_13SourceLineTipE
Python 3.7.16 (default, Jan 17 2023, 22:20:44)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mindspore
>>> import mindspore_lite
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/miniconda3/envs/xxx/lib/python3.7/site-packages/mindspore_lite/__init__.py", line 26, in <module>
    from mindspore_lite.context import Context
  File "/root/miniconda3/envs/xxx/lib/python3.7/site-packages/mindspore_lite/context.py", line 22, in <module>
    from mindspore_lite.lib import _c_lite_wrapper
ImportError: xxxx/mindspore-lite-2.2.0-linux-x64/tools/converter/lib/libmindspore_converter.so: undefined symbol: _ZN9mindspore5trace15GetDebugInfoStrERKSt10shared_ptrINS_9DebugInfoEERKSsNS_13SourceLineTipE
  • undefined symbol: _ZN9mindspore12label_manage23GetGlobalTraceLabelTypeEv
Python 3.7.16 (default, Jan 17 2023, 22:20:44)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mindspore_lite
>>> import mindspore
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/miniconda3/envs/xxx/lib/python3.7/site-packages/mindspore/__init__.py", line 18, in <module>
    from mindspore.run_check import run_check
  File "/root/miniconda3/envs/xxx/lib/python3.7/site-packages/mindspore/run_check/__init__.py", line 17, in <module>
    from ._check_version import check_version_and_env_config
  File "/root/miniconda3/envs/xxx/lib/python3.7/site-packages/mindspore/run_check/_check_version.py", line 29, in <module>
    from mindspore._c_expression import MSContext, ms_ctx_param
ImportError: xxxx/mindspore-lite-2.2.0-linux-x64/tools/converter/lib/libmindspore_converter.so: undefined symbol: _ZN9mindspore12label_manage23GetGlobalTraceLabelTypeEv
  • undefined symbol: _ZNK9mindspore6kernel15KernelBuildInfo8ToStringEv
[WARNING] LITE(20788,7f897f04ff40,converter_lite):2023-10-19-07:24:10.858.973 [mindspore/lite/tools/optimizer/common/format_utils.cc:385] ConvertAbstractFormatShape] abstract must be a tensor, but got: ValueAny.
[WARNING] LITE(20788,7f897f04ff40,converter_lite):2023-10-19-07:24:10.858.998 [mindspore/lite/tools/optimizer/common/gllo_utils.cc:1071] GenTransposeNode] Convertabstract failed for node: args0_nh2nc
[WARNING] LITE(20788,7f897f04ff40,converter_lite):2023-10-19-07:24:11.035.069 [mindspore/lite/src/extendrt/cxx_api/dlutils.h:124] DLSopen] dlopen /xxx/mindspore/tools/converter/lib/libascend_pass_plugin.so failed, error: /xxx/mindspore/tools/converter/lib/libmslite_shared_lib.so: undefined symbol: _ZNK9mindspore6kernel15KernelBuildInfo8ToStringEv

[ERROR] LITE(20788,7f897f04ff40,converter_lite):2023-10-19-07:24:11.035.121 [mindspore/lite/tools/converter/adapter/acl/plugin/acl_pass_plugin.cc:86] CreateAclPassInner] DLSopen failed, so path: /xxx/mindspore-lite-2.2.0.20231019-linux-x64/tools/converter/lib/libascend_pass_plugin.so, ret: dlopen /xxx/mindspore/tools/converter/lib/libascend_pass_plugin.so failed, error: /xxx/mindspore/tools/converter/lib/libmslite_shared_lib.so: undefined symbol: _ZNK9mindspore6kernel15KernelBuildInfo8ToStringEv

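The undefined-symbol errors above are caused by a mismatch among the mindspore Python whl package, the mindspore_lite Python whl package, and the mindspore_lite tar package. Using the MindSpore download page and the MindSpore Lite download page, check that

  • mindspore and mindspore_lite have the same version, for example both 2.2.0;
  • the mindspore_lite whl package and the mindspore_lite tar package have the same version, for example both 2.2.0;
  • the mindspore_lite whl package and the mindspore_lite tar package are both cloud-side builds.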
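
The first two checks in the list can be scripted. The following is a minimal sketch, not an official tool: it assumes Python 3.8+ (for `importlib.metadata`), assumes the distributions are registered as `mindspore` and `mindspore-lite`, and assumes that agreement on the first three version components is what "same version" means here.

```python
import re
from importlib import metadata  # standard library, Python 3.8+


def release_tuple(version: str) -> tuple:
    """Return the first three numeric components, e.g. '2.2.0.20231019' -> (2, 2, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad short versions such as '2.2'
        parts.append(0)
    return tuple(parts)


def versions_match(a: str, b: str) -> bool:
    """Assumption: packages are considered matching when major.minor.patch agree."""
    return release_tuple(a) == release_tuple(b)


def installed_version(dist_name: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to what `pip list` shows.
    ms = installed_version("mindspore")
    lite = installed_version("mindspore-lite")
    print("mindspore:", ms, "| mindspore_lite:", lite)
    if ms and lite and not versions_match(ms, lite):
        print("Version mismatch: install matching packages.")
```

The tar package version still has to be compared by hand, since it is not registered with pip.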

For example, for an Ascend environment on linux x86_64, the following package combination is suitable

Q2 Errors about Ascend .so libraries not being found

  • dlopen mindspore_lite/lib/libascend_kernel_plugin.so, No such file or directory

Error message

File "/home/xxx/miniconda3/envs/yyy/lib/python3.8/site-packages/mindspore_lite/model.py", line 95, in warpper
  return func(*args, **kwargs)
File "/home/xxx/miniconda3/envs/yyy/lib/python3.8/site-packages/mindspore_lite/model.py", line 235, in build_from_file
  raise RuntimeError(f"build_from_file failed! Error is {ret.ToString()}")
RuntimeError: build_from_file failed! Error is Common error code.
[WARNING] ME(15411,7f07f56be100,python):2023-10-16-00:51:42.509.780 [mindspore/lite/src/extendrt/cxx_api/dlutils.h:124] DLSopen] dlopen /home/xxx/miniconda3/envs/yyy/lib/python3.8/site-packages/mindspore_lite/lib/libascend_kernel_plugin.so failed, error: libacl_cblas.so: cannot open shared object file: No such file or directory
[ERROR] ME(15411,7f07f56be100,python):2023-10-16-00:51:42.509.877 [mindspore/lite/src/extendrt/kernel/ascend/plugin/ascend_allocator_plugin.cc:70] Register] DLSopen failed, so path: /home/xxx/miniconda3/envs/yyy/lib/python3.8/site-packages/mindspore_lite/lib/libascend_kernel_plugin.so, func name: CreateAclAllocator. err: dlopen /home/xxx/miniconda3/envs/yyy/lib/python3.8/site-packages/mindspore_lite/lib/libascend_kernel_plugin.so failed, error: libacl_cblas.so: cannot open shared object file: No such file or directory
[ERROR] ME(15411,7f07f56be100, python):2023-10-16-00:51:42.509.893 [mindspore/lite/src/extendrt/infer_session.cc:66] HandleContext] failed register ascend allocator plugin.
...
raise RuntimeError(f"build_from_file failed! Error is {ret.ToString()}")
RuntimeError: build_from_file failed! Error is Common error code.

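Solution

This error occurs because libascend_kernel_plugin.so from the mindspore_lite tar package has not been added to the LD_LIBRARY_PATH environment variable. To fix it:

  1. Check whether the mindspore_lite **cloud-side inference toolkit** is installed. If it is not, download the Ascend cloud-side tar.gz package and whl package from the toolkit tar.gz and whl download links and install them; see the mindspore lite installation guide for details.

  2. Locate the mindspore_lite installation path, for example /your_path_to/mindspore-lite, and cd into it.

  3. Search for libascend_kernel_plugin.so with find ./ -name libascend_kernel_plugin.so; the library is found at

./runtime/lib/libascend_kernel_plugin.so
  4. Add that directory to the environment variable, where LITE_HOME is the installation path from step 2
export LD_LIBRARY_PATH=$LITE_HOME/runtime/lib:$LD_LIBRARY_PATH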
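
The search-and-export steps can also be done from Python. This is a minimal sketch under stated assumptions: `find_library_dirs` and `export_line` are hypothetical helper names, and `/your_path_to/mindspore-lite` is the placeholder path used in the steps, not a real location.

```python
import os


def find_library_dirs(root: str, lib_name: str) -> list:
    """Walk root and return every directory that contains lib_name."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if lib_name in filenames:
            hits.append(os.path.abspath(dirpath))
    return sorted(hits)


def export_line(lib_dir: str) -> str:
    """Build the shell line that prepends lib_dir to LD_LIBRARY_PATH."""
    return f"export LD_LIBRARY_PATH={lib_dir}:$LD_LIBRARY_PATH"


if __name__ == "__main__":
    # Placeholder install root; replace it with your own path.
    root = "/your_path_to/mindspore-lite"
    for lib_dir in find_library_dirs(root, "libascend_kernel_plugin.so"):
        print(export_line(lib_dir))
```

The printed lines can be pasted into the shell or a profile script; the sketch itself does not change the environment.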
  • Load dynamic library: libmindspore_ascend.so.2 failed. liboptiling.so: cannot open shared object file: No such file or directory
python -c "import mindspore;mindspore.set_context(device_target='Ascend');mindspore.run_check()"
[WARNING] ME(60105:139813744211776,MainProcess):2023-10-25-08:14:33.640.411 [mindspore/run_check/_check_version.py:348] Using custom Ascend AI software package (Ascend Data Center Solution) path, package version checking is skipped. Please make sure Ascend AI software package (Ascend Data Center Solution) version is supported. For details, refer to the installation guidelines https://www.mindspore.cn/install
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/xxx/py37/lib/python3.7/site-packages/mindspore/_checkparam.py", line 1313, in wrapper
  return func(*args, **kwargs)
File "/xxx/py37/lib/python3.7/site-packages/mindspore/context.py", line 1456, in set_context
  ctx.set_device_target(kwargs['device_target'])
File "/xxx/py37/lib/python3.7/site-packages/mindspore/context.py", line 381, in set_device_target
  self.set_param(ms_ctx_param.device_target, target)
File "/xxx/py37/lib/python3.7/site-packages/mindspore/context.py", line 175, in set_param
  self._context_handle.set_param(param, value)
RuntimeError: Unsupported device target Ascend. This process only supports one of the ['CPU']. Please check whether the Ascend environment is installed and configured correctly. and check whether current mindspore wheel package was built with "-e Ascend". For details, please refer to "Device load error message".

----------------------------------------------------
- Device load error message:
----------------------------------------------------
Load dynamic library: libmindspore_ascend.so.2 failed. liboptiling.so: cannot open shared object file: No such file or directory
Load dynamic library: libmindspore_ascend.so.1 failed. liboptiling.so: cannot open shared object file: No such file or directory

----------------------------------------------------
...

This error occurs because liboptiling.so has not been added to the LD_LIBRARY_PATH environment variable. To fix it:

  1. Check whether CANN is installed. If it is not, follow the guide for installing the Ascend AI processor software package to install CANN.

  2. Locate the CANN installation path, for example /your_path_to/cann, and cd into it.

  3. Search for liboptiling.so with find ./ -name liboptiling.so; the library is found at

    ./CANN-7.0/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/x86_64/liboptiling.so
    ./CANN-7.0/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64/liboptiling.so
    ./CANN-7.0/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/minios/aarch64/liboptiling.so
    
  4. Add the directory that matches your platform to the environment variable. For x86_64:

    export LD_LIBRARY_PATH=$ASCEND_HOME/CANN-7.0/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/x86_64/:$LD_LIBRARY_PATH
    

    The installation has succeeded when the following message appears

    The result of multiplication calculation is correct. MindSpore has been installed on platform [Ascend] successfully!
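
Because find returns one liboptiling.so per platform, the right one has to be chosen for the current machine. A minimal sketch of that choice follows; `pick_for_arch` is a hypothetical helper, and it assumes the CANN directory layout shown in the find output above (an OS directory followed by an architecture directory).

```python
import platform


def pick_for_arch(paths, machine=None, os_name="linux"):
    """Keep the paths whose components include both os_name and the CPU architecture."""
    machine = machine or platform.machine()  # e.g. 'x86_64' or 'aarch64'
    picked = []
    for path in paths:
        parts = path.split("/")
        if machine in parts and os_name in parts:
            picked.append(path)
    return picked


candidates = [
    "./CANN-7.0/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/x86_64/liboptiling.so",
    "./CANN-7.0/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/aarch64/liboptiling.so",
    "./CANN-7.0/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/minios/aarch64/liboptiling.so",
]

if __name__ == "__main__":
    # With no explicit machine, the architecture of the current host is used.
    print(pick_for_arch(candidates))
```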
    
  • Load dynamic library: libmindspore_ascend.so.2 failed. libaicpu_ascend_engine.so: cannot open shared object file: No such file or directory
RuntimeError: Unsupported device target Ascend. This process only supports one of the ['CPU']. Please check whether the Ascend environment is installed and configured correctly. and check whether current mindspore wheel package was built with "-e Ascend". For details, please refer to "Device load error message".

----------------------------------------------------
- Device load error message:
----------------------------------------------------
Load dynamic library: libmindspore_ascend.so.2 failed. libaicpu_ascend_engine.so: cannot open shared object file: No such file or directory
Load dynamic library: libmindspore_ascend.so.1 failed. libaicpu_ascend_engine.so: cannot open shared object file: No such file or directory

----------------------------------------------------
...

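This error occurs because libaicpu_ascend_engine.so has not been added to the LD_LIBRARY_PATH environment variable. To fix it:

  1. Check whether CANN is installed. If it is not, follow the guide for installing the Ascend AI processor software package to install CANN.

  2. Locate the CANN installation path, for example /your_path_to/cann, and cd into it.

  3. Search for libaicpu_ascend_engine.so with find ./ -name libaicpu_ascend_engine.so; the library is found at

    ./CANN-7.0/x86_64-linux/lib64/plugin/opskernel/libaicpu_ascend_engine.so
    ./CANN-7.0/compiler/lib64/plugin/opskernel/libaicpu_ascend_engine.so
    ./latest/x86_64-linux/lib64/plugin/opskernel/libaicpu_ascend_engine.so
    
  4. Add the directory to the environment variable. For x86_64: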

    export LD_LIBRARY_PATH=$ASCEND_HOME/CANN-7.0/compiler/lib64/plugin/opskernel/:$LD_LIBRARY_PATH
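
After exporting, it can be useful to confirm that the library is now reachable through LD_LIBRARY_PATH. The sketch below is an illustration only: `library_on_path` is a hypothetical helper, and it inspects LD_LIBRARY_PATH alone, whereas the dynamic loader also consults other locations such as the ld.so cache.

```python
import os


def library_on_path(lib_name, ld_library_path=None):
    """Return the first LD_LIBRARY_PATH directory that contains lib_name, else None."""
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    for directory in ld_library_path.split(os.pathsep):
        # Empty entries appear when the variable is unset or has stray separators.
        if directory and os.path.isfile(os.path.join(directory, lib_name)):
            return directory
    return None


if __name__ == "__main__":
    for name in ("libaicpu_ascend_engine.so", "liboptiling.so"):
        print(name, "->", library_on_path(name))
```

A result of None for a library named in a "cannot open shared object file" message suggests the export has not taken effect in the current shell.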
    

Q3 Ascend Error Message E39999

  • Error 1
----------------------------------------------------
- Ascend Error Message:
----------------------------------------------------
E39999: Inner Error!
E39999 TsdOpen failed. devId=0, tdt error=31[FUNC:PrintfTsdError][FILE:runtime.cc][LINE:2060]
     TraceBack (most recent call last):
     Start aicpu executor failed, retCode=0x7020009 devId=0[FUNC:DeviceRetain][FILE:runtime.cc][LINE:2698]
     Check param failed, dev can not be NULL![FUNC:PrimaryContextRetain][FILE:runtime.cc][LINE:2544]
     Check param failed, ctx can not be NULL![FUNC:PrimaryContextRetain][FILE:runtime.cc][LINE:2571]
     Check param failed, context can not be null.[FUNC:NewDevice][FILE:api_impl.cc][LINE:1899]
     New device failed, retCode=0x7010006[FUNC:SetDevice][FILE:api_impl.cc][LINE:1922]
     rtSetDevice execute failed, reason=[device retain error][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:50]
     open device 0 failed, runtime result = 507033.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]

(Please search "Ascend Error Message" at https://www.mindspore.cn for error code description)
  • Error 2
----------------------------------------------------
- Ascend Error Message:
----------------------------------------------------
E39999: Inner Error!
E39999 tsd client wait response fail, device response code[1]. load aicpu ops package failed, device[0], host pid[5653], error stack:
[TSDaemon] checksum aicpu package failed, ret=103, [tsd_common.cpp:2242:SaveProcessConfig]17580
Check head tag failed, ret=279, [package_worker.cpp:537:VerifyAicpuPackage]2369
Verify Aicpu package failed, srcPath[/home/HwHiAiUser/aicpu_kernels/vf0_5653_Ascend310P-aicpu_syskernels.tar.gz]..[package_worker.cpp:567:DecompressionAicpuPackage]2369
Decompression AicpuPackage [/home/HwHiAiUser/aicpu_kernels/vf0_5653_Ascend310P-aicpu_syskernels.tar.gz] failed, [package_worker.cpp:218:LoadAICPUPackageForProcessMode]2369
Load aicpu package path[/home/HwHiAiUser/hdcd/device0/] fileName[ 5653_Ascend310P-aicpu_syskernels.tar.gz] failed, [inotify_watcher.cpp:311:HandleEvent]2369
[TSDaemon] load aicpu ops package failed, device[0], host pid[5653], [tsd_common.cpp:2054:CheckAndHandleTimeout]2374
[FUNC:WaitRsp][FILE:process_mode_manager.cpp][LINE:270]
  TraceBack (most recent call last):
  TsdOpen failed. devId=0, tdt error=31[FUNC:PrintfTsdError][FILE:runtime.cc][LINE:2060]
  Start aicpu executor failed, retCode=0x7020009 devId=0[FUNC:DeviceRetain][FILE:runtime.cc][LINE:2698]
  Check param failed, dev can not be NULL![FUNC:PrimaryContextRetain][FILE:runtime.cc][LINE:2544]
  Check param failed, ctx can not be NULL![FUNC:PrimaryContextRetain][FILE:runtime.cc][LINE:2571]
  Check param failed, context can not be null.[FUNC:NewDevice][FILE:api_impl.cc][LINE:1893]
  New device failed, retCode=0x7010006[FUNC:SetDevice][FILE:api_impl.cc][LINE:1916]
  rtSetDevice execute failed, reason=[device retain error][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:50]
  open device 0 failed, runtime result = 507033.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
(Please search "Ascend Error Message" at https://www.mindspore.cn for error code description)

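Possible causes

  • The driver version does not match CANN.

  • The environment variables are not configured correctly, so aicpu fails to start. Try adding the following to the environment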

export ASCEND_OPP_PATH=${ASCEND_HOME}/latest/opp
export ASCEND_AICPU_PATH=${ASCEND_OPP_PATH}/..
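
The second export points at the parent of the opp directory. A minimal sketch that mirrors both lines and shows what the `..` resolves to; `aicpu_env` is a hypothetical helper and `/usr/local/Ascend` is only an assumed example value for ASCEND_HOME.

```python
import os


def aicpu_env(ascend_home):
    """Mirror the two export lines: the OPP path, and its parent as the AICPU path."""
    opp = os.path.join(ascend_home, "latest", "opp")
    # '<opp>/..' normalizes to '<ascend_home>/latest'.
    aicpu = os.path.normpath(os.path.join(opp, ".."))
    return {"ASCEND_OPP_PATH": opp, "ASCEND_AICPU_PATH": aicpu}


if __name__ == "__main__":
    # Assumed example location; use your own ASCEND_HOME.
    for key, value in aicpu_env("/usr/local/Ascend").items():
        print(f"export {key}={value}")
```

The returned dictionary can also be merged into `os.environ` before launching a child process, which has the same effect as the exports for that process.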

Q4 acl open device 0 failed

Inference may trigger acl open device 0 failed, for example

benchmark --modelFile=dbnet_mobilenetv3_lite.mindir --device=Ascend --inputShapes='1,3,736,1280' --loopCount=100 --warmUpLoopCount=10
ModelPath = dbnet_mobilenetv3_lite.mindir
ModelType = MindIR
InDatapath =
GroupInfoFile =
ConfigFilepath =
InDataType = bin
LoopCount = 100
DeviceType = Ascend
AccuracyThreshold = 0.5
CosineDistanceThreshold = -1.1
WarmUpLoopCount = 10
NumThreads = 2  InterOpParallelNum = 1
Fp16Priority = 0   EnableParallel = 0
calibDataPath =
EnableGLTexture = 0
cpuBindMode = HIGHER_CPU
CalibDataType = FLOAT
Resize Dims: 1 3 736 1280
start unified benchmark run
[ERROR] ME(26748,7f6c73867fc0,benchmark):2023-10-26-09:51:54.833.515 [mindspore/lite/src/extendrt/kernel/ascend/model/model_infer.cc:59] Init] Acl open device 0 failed.
[ERROR] ME(26748,7f6c73867fc0,benchmark):2023-10-26-09:51:54.833.573 [mindspore/lite/src/extendrt/kernel/ascend/src/custom_ascend_kernel.cc:141] Init] Model infer init failed.
[ERROR] ME(26748,7f6c73867fc0,benchmark):2023-10-26-09:51:54.833.604 [mindspore/lite/src/extendrt/session/single_op_session.cc:198] BuildCustomAscendKernelImpl] kernel init failed CustomAscend
[ERROR] ME(26748,7f6c73867fc0,benchmark):2023-10-26-09:51:54.833.669 [mindspore/lite/src/extendrt/session/single_op_session.cc:220] BuildCustomAscendKernel] Build ascend kernel failed for node: custom_0
[ERROR] ME(26748,7f6c73867fc0,benchmark):2023-10-26-09:51:54.833.699 [mindspore/lite/src/extendrt/session/single_op_session.cc:302] CompileGraph] Failed to Build custom ascend kernel
[ERROR] ME(26748,7f6c73867fc0,benchmark):2023-10-26-09:51:54.833.727 [mindspore/lite/src/extendrt/cxx_api/model/model_impl.cc:413] BuildByBufferImpl] compile graph failed.
[ERROR] ME(26748,7f6c73867fc0,benchmark):2023-10-26-09:51:54.835.590 [mindspore/lite/tools/benchmark/benchmark_unified_api.cc:1256] CompileGraph] ms_model_.Build failed while running
[ERROR] ME(26748,7f6c73867fc0,benchmark):2023-10-26-09:51:54.835.627 [mindspore/lite/tools/benchmark/benchmark_unified_api.cc:1325] RunBenchmark] Compile graph failed.
[ERROR] ME(26748,7f6c73867fc0,benchmark):2023-10-26-09:51:54.835.662 [mindspore/lite/tools/benchmark/run_benchmark.cc:78] RunBenchmark] Run Benchmark dbnet_mobilenetv3_lite.mindir Failed : -1
ms_model_.Build failed while running Run Benchmark dbnet_mobilenetv3_lite.mindir Failed : -1

The environment variables are missing the acllib-related library settings. Add the following

export NPU_HOST_LIB=$ASCEND_HOME/latest/acllib/lib64/stub
export DDK_PATH=$ASCEND_HOME/latest
export LD_LIBRARY_PATH=$ASCEND_HOME/latest/acllib/lib64:$LD_LIBRARY_PATH
export ASCEND_AICPU_PATH=$ASCEND_HOME/latest/x86_64-linux
export LD_LIBRARY_PATH=$ASCEND_HOME/latest/x86_64-linux/lib64:$LD_LIBRARY_PATH
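
When several export lines touch LD_LIBRARY_PATH, it is easy to drop entries that were set earlier, such as the directories added in Q2. The sketch below shows one way to extend a path string while keeping existing entries and avoiding duplicates; `prepend_paths` is a hypothetical helper and the directories in the example are placeholders.

```python
def prepend_paths(current, new_dirs):
    """Prepend new_dirs to a colon-separated path string, dropping empties and duplicates."""
    ordered = []
    for directory in list(new_dirs) + current.split(":"):
        if directory and directory not in ordered:
            ordered.append(directory)
    return ":".join(ordered)


if __name__ == "__main__":
    import os

    # Placeholder value; substitute your own ASCEND_HOME.
    ascend_home = os.environ.get("ASCEND_HOME", "/your_path_to/ascend")
    wanted = [
        ascend_home + "/latest/x86_64-linux/lib64",
        ascend_home + "/latest/acllib/lib64",
    ]
    print(prepend_paths(os.environ.get("LD_LIBRARY_PATH", ""), wanted))
```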

Q5 Installing mindocr dependencies fails on Windows

On Windows, run

git clone git@gitee.com:mindspore-lab/mindocr.git
cd mindocr
pip install -e .

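Installing lanms fails with the following error (the Chinese text reads "The system cannot find the file specified.")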

FileNotFoundError: [WinError 2] 系统找不到指定的文件。

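lanms does not appear to support Windows; https://github.com/argman/EAST/ still has open issues about Windows support. Linux is recommended. On Windows, consider installing lanms-neo in place of lanms. The installation may fail with the following error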

Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting lanms-neo==1.0.2
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/7b/fe/beff7e7e4455cb9f69c5734897ca8552a57f6423b062ec86b2ebc1d79c0d/lanms_neo-1.0.2.tar.gz (39 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: lanms-neo
  Building wheel for lanms-neo (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for lanms-neo (pyproject.toml) did not run successfully.
   exit code: 1
  ╰─> [10 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build\lib.win-amd64-cpython-37
      creating build\lib.win-amd64-cpython-37\lanms
      copying lanms\__init__.py -> build\lib.win-amd64-cpython-37\lanms
      running build_ext
      building 'lanms._C' extension
      error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for lanms-neo
Failed to build lanms-neo
ERROR: Could not build wheels for lanms-neo, which is required to install pyproject.toml-based projects

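Install the development tools (the Microsoft C++ Build Tools named in the error), then run pip install lanms-neo

Remove the lanms dependency from requirements.txt, then run pip install -r requirements.txt to complete the installation.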
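
Removing the lanms entry can be done by hand or with a few lines of Python. This is a minimal sketch, assuming the entry is a plain requirement line such as `lanms` or `lanms>=...`; `drop_requirement` is a hypothetical helper and the sample lines are illustrative, not the actual contents of the mindocr requirements.txt.

```python
import re


def drop_requirement(lines, package):
    """Return requirement lines with entries for exactly `package` removed."""
    # The name must be followed by end of line or a specifier character,
    # so that 'lanms' does not also match 'lanms-neo'.
    pattern = re.compile(
        rf"^\s*{re.escape(package)}\s*($|[=<>!~\[;#])", re.IGNORECASE
    )
    return [line for line in lines if not pattern.match(line)]


if __name__ == "__main__":
    sample = ["numpy>=1.17", "lanms>=1.0.2", "lanms-neo==1.0.2", "shapely"]
    print(drop_requirement(sample, "lanms"))
```

To apply it to a file, read the lines of requirements.txt, pass them through the function, and write the result back.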

Q6 RuntimeError: The device address type is wrong: type name in address:CPU, type name in context:Ascend

  • Export triggers RuntimeError: The device address type is wrong: type name in address:CPU, type name in context:Ascend, for example
[WARNING] ME(18680:139900608063296,MainProcess):2023-10-31-12:31:20.141.25 [mindspore/run_check/_check_version.py:348] Using custom Ascend AI software package (Ascend Data Center Solution) path, package version checking is skipped. Please make sure Ascend AI software package (Ascend Data Center Solution) version is supported. For details, refer to the installation guidelines https://www.mindspore.cn/install
[WARNING] ME(18680:139900608063296,MainProcess):2023-10-31-12:31:20.143.96 [mindspore/run_check/_check_version.py:460] Can not find the tbe operator implementation(need by mindspore-ascend). Please check whether the Environment Variable PYTHONPATH is set. For details, refer to the installation guidelines: https://www.mindspore.cn/install
[WARNING] ME(18680:139900608063296,MainProcess):2023-10-31-12:31:20.144.71 [mindspore/run_check/_check_version.py:466] Can not find driver so(need by mindspore-ascend). Please check whether the Environment Variable LD_LIBRARY_PATH is set. For details, refer to the installation guidelines: https://www.mindspore.cn/install
Traceback (most recent call last):
  File "tools/export.py", line 173, in <module>
    export(**vars(args))
  File "tools/export.py", line 73, in export
    net = build_model(model_cfg, pretrained=True, amp_level=amp_level)
  File "/xxx/mindocr/mindocr/models/builder.py", line 52, in build_model
    network = create_fn(**kwargs)
  File "/xxx/mindocr/mindocr/models/rec_svtr.py", line 122, in svtr_tiny_ch
    model = SVTR(model_config)
  File "/xxx/mindocr/mindocr/models/rec_svtr.py", line 26, in __init__
    BaseModel.__init__(self, config)
  File "/xxx/mindocr/mindocr/models/base_model.py", line 34, in __init__
    self.backbone = build_backbone(backbone_name, **config.backbone)
  File "/xxx/mindocr/mindocr/models/backbones/builder.py", line 48, in build_backbone
    backbone = backbone_class(**kwargs)
  File "/xxx/mindocr/mindocr/models/backbones/rec_svtr.py", line 486, in __init__
    ops.zeros((1, num_patches, embed_dim[0]), ms.float32)
  File "/xxx/py37/lib/python3.7/site-packages/mindspore/ops/function/array_func.py", line 1039, in zeros
    output = zero_op(size, value)
  File "/xxx/py37/lib/python3.7/site-packages/mindspore/ops/primitive.py", line 314, in __call__
    return _run_op(self, self.name, args)
  File "/xxx/py37/lib/python3.7/site-packages/mindspore/ops/primitive.py", line 913, in _run_op
    stub = _pynative_executor.run_op_async(obj, op_name, args)
  File "/xxx/py37/lib/python3.7/site-packages/mindspore/common/api.py", line 1186, in run_op_async
    return self._executor.run_op_async(*args)
RuntimeError: The device address type is wrong: type name in address:CPU, type name in context:Ascend

----------------------------------------------------
- C++ Call Stack: (For framework developers)
----------------------------------------------------
mindspore/ccsrc/plugin/device/ascend/hal/hardware/ge_device_res_manager.cc:72 AllocateMemory
  • A computation in MindSpore Ascend mode fails and triggers RuntimeError: The device address type is wrong: type name in address:CPU, type name in context:Ascend, for example
Python 3.7.16 (default, Jan 17 2023, 22:20:44)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import mindspore as ms
[WARNING] ME(44720:140507814819648,MainProcess):2023-11-01-03:01:38.884.384 [mindspore/run_check/_check_version.py:348] Using custom Ascend AI software package (Ascend Data Center Solution) path, package version checking is skipped. Please make sure Ascend AI software package (Ascend Data Center Solution) version is supported. For details, refer to the installation guidelines https://www.mindspore.cn/install
[WARNING] ME(44720:140507814819648,MainProcess):2023-11-01-03:01:38.884.675 [mindspore/run_check/_check_version.py:466] Can not find driver so(need by mindspore-ascend). Please check whether the Environment Variable LD_LIBRARY_PATH is set. For details, refer to the installation guidelines: https://www.mindspore.cn/install
>>> import mindspore.ops as ops
>>> ms.set_context(device_target="Ascend")
>>> ms.run_check()
MindSpore version:  2.2.0.20231025
The result of multiplication calculation is correct, MindSpore has been installed on platform [Ascend] successfully!
>>> x = ms.Tensor(np.ones([1,3,3,4]).astype(np.float32))
>>> y = ms.Tensor(np.ones([1,3,3,4]).astype(np.float32))
>>> print(ops.add(x, y))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/miniconda3/envs/py37/lib/python3.7/site-packages/mindspore/common/_stub_tensor.py", line 49, in fun
    return method(*arg, **kwargs)
  File "/root/miniconda3/envs/py37/lib/python3.7/site-packages/mindspore/common/tensor.py", line 493, in __str__
    return str(self.asnumpy())
  File "/root/miniconda3/envs/py37/lib/python3.7/site-packages/mindspore/common/tensor.py", line 964, in asnumpy
    return Tensor_.asnumpy(self)
RuntimeError: The device address type is wrong: type name in address:CPU, type name in context:Ascend

----------------------------------------------------
- C++ Call Stack: (For framework developers)
----------------------------------------------------
mindspore/ccsrc/plugin/device/ascend/hal/hardware/ge_device_res_manager.cc:72 AllocateMemory

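The problems above arise because computation in MindSpore Ascend mode may not be supported on Ascend 310/Ascend 310P3. Check on the MindSpore download page that the installed version matches the mindspore version and the hardware platform, or you can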

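Q7 Model conversion issues

  • When calling converter_lite to convert a model to a device-side mindir model, it reports SetGraphInputShape] Failed to find input xxx in input_shape yyy:xxxxxxxxxxx

For example, when converting dbnet_resnet50.mindir to a device-side mindir model, config.txt is set to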

[ascend_context]
input_format=NCHW
input_shape=args0:[1,3,736,1280]

and the following command is run

converter_lite --saveType=MINDIR --fmk=MINDIR --optimize=ascend_oriented --modelFile=dbnet_resnet50.mindir --outputFile=dbnet_resnet50_lite --configFile=config.txt

The following error is reported

[ERROR] LITE(30860,7f579d3f4f40,converter_lite):2023-11-10-03:19:29.005.385 [mindspore/lite/tools/converter/adapter/acl/src/acl_pass_impl.cc:756] SetGraphInputShape] Failed to find input x in input_shape args0:1,3,736,1280
[ERROR] LITE(30860,7f579d3f4f40,converter_lite):2023-11-10-03:19:29.005.416 [mindspore/lite/tools/converter/adapter/acl/src/acl_pass_impl.cc:773] ConvertGraphToOm] Failed to set graph input shape
[ERROR] LITE(30860,7f579d3f4f40,converter_lite):2023-11-10-03:19:29.005.427 [mindspore/lite/tools/converter/adapter/acl/src/acl_pass_impl.cc:862] BuildGraph] Convert graph  to om failed.
[ERROR] LITE(30860,7f579d3f4f40,converter_lite):2023-11-10-03:19:29.005.439 [mindspore/lite/tools/converter/adapter/acl/src/acl_pass_impl.cc:1320] Run] Build graph failed.
[ERROR] LITE(30860,7f579d3f4f40,converter_lite):2023-11-10-03:19:29.005.450 [mindspore/lite/tools/converter/adapter/acl/acl_pass.cc:42] Run] Acl pass impl run failed.
[ERROR] LITE(30860,7f579d3f4f40,converter_lite):2023-11-10-03:19:29.005.461 [mindspore/lite/tools/converter/anf_transform.cc:472] RunConvertPass] Acl pass failed.
[ERROR] LITE(30860,7f579d3f4f40,converter_lite):2023-11-10-03:19:29.005.476 [mindspore/lite/tools/converter/anf_transform.cc:660] RunPass] Run convert pass failed.
[ERROR] LITE(30860,7f579d3f4f40,converter_lite):2023-11-10-03:19:29.005.486 [mindspore/lite/tools/converter/anf_transform.cc:754] TransformFuncGraph] Proc online transform failed.
[ERROR] LITE(30860,7f579d3f4f40,converter_lite):2023-11-10-03:19:29.005.555 [mindspore/lite/tools/converter/anf_transform.cc:855] Transform] optimizer failed.
[ERROR] LITE(30860,7f579d3f4f40,converter_lite):2023-11-10-03:19:29.005.564 [mindspore/lite/tools/converter/converter_funcgraph.cc:471] Optimize] Transform anf graph failed.
[ERROR] LITE(30860,7f579d3f4f40,converter_lite):2023-11-10-03:19:29.006.118 [mindspore/lite/tools/converter/converter.cc:1029] HandleGraphCommon] Optimize func graph failed: -2 NULL pointer returned.
[ERROR] LITE(30860,7f579d3f4f40,converter_lite):2023-11-10-03:19:29.013.133 [mindspore/lite/tools/converter/converter.cc:979] Convert] Handle graph failed: -2 NULL pointer returned.
[ERROR] LITE(30860,7f579d3f4f40,converter_lite):2023-11-10-03:19:29.013.150 [mindspore/lite/tools/converter/converter.cc:1166] RunConverter] Convert model failed
[ERROR] LITE(30860,7f579d3f4f40,converter_lite):2023-11-10-03:19:29.013.163 [mindspore/lite/tools/converter/cxx_api/converter.cc:348] Convert] Convert model failed, ret=NULL pointer returned.
ERROR [mindspore/lite/tools/converter/converter_lite/main.cc:104] main] Convert failed. Ret: NULL pointer returned.
Convert failed. Ret: NULL pointer returned.

This problem is caused by a mismatched variable name. Note the line

Failed to find input x in input_shape args0:1,3,736,1280

It shows that the input name args0 in config.txt does not match the model's input name x. Change args0 to x in config.txt to fix it.
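
The same diagnosis can be automated by reading the expected name out of the log line. The sketch below is an illustration under stated assumptions: `parse_input_mismatch` and `fix_config` are hypothetical helpers, the message format is taken from the log above, and only a config with a single named input is handled.

```python
import re

# Matches: "Failed to find input <expected> in input_shape <configured>:<dims>"
ERROR_RE = re.compile(r"Failed to find input (\S+) in input_shape (\S+?):")


def parse_input_mismatch(log_line):
    """Return (name the model expects, name found in the config), or None."""
    match = ERROR_RE.search(log_line)
    return (match.group(1), match.group(2)) if match else None


def fix_config(config_text, expected, configured):
    """Rename the input on the input_shape= line of a converter config."""
    pattern = re.compile(
        rf"^(input_shape\s*=\s*){re.escape(configured)}:", re.MULTILINE
    )
    return pattern.sub(lambda m: m.group(1) + expected + ":", config_text)


if __name__ == "__main__":
    line = ("SetGraphInputShape] Failed to find input x "
            "in input_shape args0:1,3,736,1280")
    config = "[ascend_context]\ninput_format=NCHW\ninput_shape=args0:[1,3,736,1280]\n"
    expected, configured = parse_input_mismatch(line)
    print(fix_config(config, expected, configured))
```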

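  • When calling converter_lite to convert a model to MindSpore Lite MindIR, it reports Can't find OpAdapter for LSTM

After exporting the model with export.py in the Lite inference environment, converter_lite is run on the exported model, for example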

converter_lite \
  --saveType=MINDIR \
  --fmk=MINDIR \
  --optimize=ascend_oriented \
  --modelFile=./models/rec/CRNN/VGG7/crnn_vgg7.mindir \
  --outputFile=./models/rec/CRNN/VGG7/crnn_vgg7_lite \
  --configFile=./config.txt

The following error is reported:

[WARNING] GE_ADPT(837950,7feb6e13bf40,converter_lite):2024-10-26-07:37:40.545.361 [mindspore/ccsrc/transform/graph_ir/utils.cc:59] FindAdapter] Can't find OpAdapter for LSTM
[ERROR] GE_ADPT(837950,7feb6e13bf40,converter_lite):2024-10-26-07:37:40.545.393 [mindspore/ccsrc/transform/graph_ir/convert.cc:4040] ConvertCNode] Cannot get adapter for Default/neck-RNNEncoder/seq_encoder-LSTM/rnn-_DynamicLSTMCPUGPU/LSTM-op90
[ERROR] GE_ADPT(837950,7feb6e13bf40,converter_lite):2024-10-26-07:37:40.545.437 [mindspore/ccsrc/transform/graph_ir/convert.cc:1034] ConvertAllNode] Failed to convert node: @391_390_1_mindocr_models_base_model_BaseModel_construct_24_1:nout{[0]: ValueNode<Primitive> LSTM, [1]: nout, [2]: nout, [3]: nout, [4]: nout}.
[ERROR] GE_ADPT(837950,7feb6e13bf40,converter_lite):2024-10-26-07:37:40.545.457 [mindspore/ccsrc/transform/graph_ir/convert.cc:1034] ConvertAllNode] Failed to convert node: ValueNode<Primitive> TupleGetItem.
[ERROR] GE_ADPT(837950,7feb6e13bf40,converter_lite):2024-10-26-07:37:40.545.561 [mindspore/ccsrc/transform/graph_ir/convert.cc:1034] ConvertAllNode] Failed to convert node: @391_390_1_mindocr_models_base_model_BaseModel_construct_24_1:nout{[0]: ValueNode<Primitive> TupleGetItem, [1]: nout, [2]: ValueNode<Int64Imm> 0}.
[ERROR] GE_ADPT(837950,7feb6e13bf40,converter_lite):2024-10-26-07:37:40.545.582 [mindspore/ccsrc/transform/graph_ir/convert.cc:1034] ConvertAllNode] Failed to convert node: ValueNode<Primitive> ReverseV2.
[ERROR] GE_ADPT(837950,7feb6e13bf40,converter_lite):2024-10-26-07:37:40.545.667 [mindspore/ccsrc/transform/graph_ir/convert.cc:1034] ConvertAllNode] Failed to convert node: @391_390_1_mindocr_models_base_model_BaseModel_construct_24_1:nout{[0]: ValueNode<Primitive> ReverseV2, [1]: nout}.
[ERROR] GE_ADPT(837950,7feb6e13bf40,converter_lite):2024-10-26-07:37:40.545.734 [mindspore/ccsrc/transform/graph_ir/convert.cc:1034] ConvertAllNode] Failed to convert node: @391_390_1_mindocr_models_base_model_BaseModel_construct_24_1:param_neck.seq_encoder.bias_hh_l0.

In this case, export the model with export.py in an Ascend training environment, then use converter_lite in the Lite inference environment to convert the exported .mindir to MindSpore Lite MindIR.

  • Exporting a model with export.py reports RuntimeError: Load op info form json config failed, version: Ascend310

Exporting the model with export.py in the Lite inference environment, for example by running:

python tools/export.py \
      --model_name_or_config configs/det/fcenet/fce_icdar15.yaml \
      --data_shape 736 1280 \
      --local_ckpt_path ./fcenet_resnet50-43857f7f.ckpt

The following error is reported:

  [ERROR] KERNEL(849474,7f7571d68740,python):2024-10-26-08:44:27.998.221 [mindspore/ccsrc/kernel/oplib/op_info_utils.cc:179] LoadOpInfoJson] Get op info json suffix path failed, soc_version: Ascend310
  [ERROR] KERNEL(849474,7f7571d68740,python):2024-10-26-08:44:27.998.362 [mindspore/ccsrc/kernel/oplib/op_info_utils.cc:118] GenerateOpInfos] Load op info json failed, version: Ascend310
  [ERROR] ANALYZER(849474,7f7571d68740,python):2024-10-26-08:44:30.168.028 [mindspore/ccsrc/pipeline/jit/ps/static_analysis/async_eval_result.cc:70] HandleException] Exception happened, check the information as below.
  RuntimeError: Load op info form json config failed, version: Ascend310

  ----------------------------------------------------
  - C++ Call Stack: (For framework developers)
  ----------------------------------------------------
  mindspore/ccsrc/plugin/device/ascend/hal/device/ascend_kernel_runtime.cc:320 Init

In this case, export the model with export.py in an Ascend training environment.

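  • Inference mistakenly uses a cloud-side mindir model and reports Save ge model to buffer failed.

For example, if the det model mistakenly uses a cloud-side mindir model during inference, the following error is thrown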

[ERROR] ME(43138,7f02bddd9740,python3):2023-11-10-03:40:45.206.120 [mindspore/ccsrc/cxx_api/model/acl/model_converter.cc:200] operator()] Save ge model to buffer failed.
[ERROR] ME(43138,7f02bddd9740,python3):2023-11-10-03:40:45.206.157 [mindspore/ccsrc/cxx_api/model/model_converter_utils/multi_process.cc:118] ParentProcess] Parent process process failed
[ERROR] ME(43123,7f02bddd9740,python3):2023-11-10-03:40:45.277.253 [mindspore/ccsrc/cxx_api/model/acl/model_converter.cc:200] operator()] Save ge model to buffer failed.
[ERROR] ME(43123,7f02bddd9740,python3):2023-11-10-03:40:45.277.292 [mindspore/ccsrc/cxx_api/model/model_converter_utils/multi_process.cc:118] ParentProcess] Parent process process failed
[ERROR] ME(43138,7f02bddd9740,python3):2023-11-10-03:40:46.235.224 [mindspore/ccsrc/cxx_api/model/acl/model_converter.cc:251] LoadMindIR] Convert MindIR model to OM model failed
[ERROR] LITE(43138,7f02bddd9740,python3):2023-11-10-03:40:46.235.280 [mindspore/lite/tools/converter/adapter/acl/src/acl_pass_impl.cc:781] ConvertGraphToOm] Model converter load mindir failed.
[ERROR] LITE(43138,7f02bddd9740,python3):2023-11-10-03:40:46.235.307 [mindspore/lite/tools/converter/adapter/acl/src/acl_pass_impl.cc:862] BuildGraph] Convert graph  to om failed.
[ERROR] LITE(43138,7f02bddd9740,python3):2023-11-10-03:40:46.235.332 [mindspore/lite/tools/converter/adapter/acl/src/acl_pass_impl.cc:1320] Run] Build graph failed.
[ERROR] LITE(43138,7f02bddd9740,python3):2023-11-10-03:40:46.235.359 [mindspore/lite/tools/converter/adapter/acl/acl_pass.cc:42] Run] Acl pass impl run failed.
[ERROR] LITE(43138,7f02bddd9740,python3):2023-11-10-03:40:46.235.388 [mindspore/lite/tools/converter/anf_transform.cc:472] RunConvertPass] Acl pass failed.
[ERROR] LITE(43138,7f02bddd9740,python3):2023-11-10-03:40:46.235.430 [mindspore/lite/tools/converter/anf_transform.cc:660] RunPass] Run convert pass failed.
[ERROR] LITE(43138,7f02bddd9740,python3):2023-11-10-03:40:46.235.453 [mindspore/lite/tools/converter/anf_transform.cc:754] TransformFuncGraph] Proc online transform failed.
[ERROR] LITE(43138,7f02bddd9740,python3):2023-11-10-03:40:46.235.673 [mindspore/lite/tools/converter/anf_transform.cc:855] Transform] optimizer failed.
[ERROR] LITE(43138,7f02bddd9740,python3):2023-11-10-03:40:46.235.698 [mindspore/lite/tools/converter/converter_funcgraph.cc:471] Optimize] Transform anf graph failed.
[ERROR] ME(43138,7f02bddd9740,python3):2023-11-10-03:40:46.238.216 [mindspore/lite/src/extendrt/convert/runtime_convert.cc:214] RuntimeConvert] Convert model failed
[ERROR] ME(43138,7f02bddd9740,python3):2023-11-10-03:40:46.238.270 [mindspore/lite/src/extendrt/cxx_api/model/model_impl.cc:507] ConvertGraphOnline] Failed to converter graph
[ERROR] ME(43138,7f02bddd9740,python3):2023-11-10-03:40:46.238.351 [mindspore/lite/src/extendrt/cxx_api/model/model_impl.cc:395] BuildByBufferImpl] convert graph failed.
[ERROR] MINDOCR(43138:139649752078144,Process-1:18):2023-11-10-03:40:46.255.926 [src/parallel/framework/module_base.py:38] DetInferNode init failed: build_from_file failed! Error is Common error code.
Process Process-1:18:
Traceback (most recent call last):
  File "/root/miniconda3/envs/py37/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/root/miniconda3/envs/py37/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/mindocr/deploy/py_infer/src/parallel/framework/module_base.py", line 39, in process_handler
    raise error
  File "/home/mindocr/deploy/py_infer/src/parallel/framework/module_base.py", line 34, in process_handler
    params = self.init_self_args()
  File "/home/mindocr/deploy/py_infer/src/parallel/module/detection/det_infer_node.py", line 12, in init_self_args
    self.text_detector.init(preprocess=False, model=True, postprocess=False)
  File "/home/mindocr/deploy/py_infer/src/infer/infer_base.py", line 29, in init
    self._init_model()
  File "/home/mindocr/deploy/py_infer/src/infer/infer_det.py", line 22, in _init_model
    device_id=self.args.device_id,
  File "/home/mindocr/deploy/py_infer/src/core/model/model.py", line 15, in __init__
    self.model = _INFER_BACKEND_MAP[backend](**kwargs)
  File "/home/mindocr/deploy/py_infer/src/core/model/backend/lite_model.py", line 16, in __init__
    super().__init__(model_path, device, device_id)
  File "/home/mindocr/deploy/py_infer/src/core/model/backend/model_base.py", line 28, in __init__
    self._init_model()
  File "/home/mindocr/deploy/py_infer/src/core/model/backend/lite_model.py", line 33, in _init_model
    self.model.build_from_file(self.model_path, mslite.ModelType.MINDIR, context)
  File "/root/miniconda3/envs/py37/lib/python3.7/site-packages/mindspore_lite/model.py", line 95, in warpper
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/py37/lib/python3.7/site-packages/mindspore_lite/model.py", line 235, in build_from_file
    raise RuntimeError(f"build_from_file failed! Error is {ret.ToString()}")
RuntimeError: build_from_file failed! Error is Common error code.
[ERROR] ME(43123,7f02bddd9740,python3):2023-11-10-03:40:46.305.698 [mindspore/ccsrc/cxx_api/model/acl/model_converter.cc:251] LoadMindIR] Convert MindIR model to OM model failed
[ERROR] LITE(43123,7f02bddd9740,python3):2023-11-10-03:40:46.305.755 [mindspore/lite/tools/converter/adapter/acl/src/acl_pass_impl.cc:781] ConvertGraphToOm] Model converter load mindir failed.
[ERROR] LITE(43123,7f02bddd9740,python3):2023-11-10-03:40:46.305.782 [mindspore/lite/tools/converter/adapter/acl/src/acl_pass_impl.cc:862] BuildGraph] Convert graph  to om failed.
[ERROR] LITE(43123,7f02bddd9740,python3):2023-11-10-03:40:46.305.807 [mindspore/lite/tools/converter/adapter/acl/src/acl_pass_impl.cc:1320] Run] Build graph failed.
[ERROR] LITE(43123,7f02bddd9740,python3):2023-11-10-03:40:46.305.834 [mindspore/lite/tools/converter/adapter/acl/acl_pass.cc:42] Run] Acl pass impl run failed.
[ERROR] LITE(43123,7f02bddd9740,python3):2023-11-10-03:40:46.305.864 [mindspore/lite/tools/converter/anf_transform.cc:472] RunConvertPass] Acl pass failed.
[ERROR] LITE(43123,7f02bddd9740,python3):2023-11-10-03:40:46.305.904 [mindspore/lite/tools/converter/anf_transform.cc:660] RunPass] Run convert pass failed.
[ERROR] LITE(43123,7f02bddd9740,python3):2023-11-10-03:40:46.305.928 [mindspore/lite/tools/converter/anf_transform.cc:754] TransformFuncGraph] Proc online transform failed.
[ERROR] LITE(43123,7f02bddd9740,python3):2023-11-10-03:40:46.306.162 [mindspore/lite/tools/converter/anf_transform.cc:855] Transform] optimizer failed.
[ERROR] LITE(43123,7f02bddd9740,python3):2023-11-10-03:40:46.306.188 [mindspore/lite/tools/converter/converter_funcgraph.cc:471] Optimize] Transform anf graph failed.
[ERROR] ME(43123,7f02bddd9740,python3):2023-11-10-03:40:46.308.599 [mindspore/lite/src/extendrt/convert/runtime_convert.cc:214] RuntimeConvert] Convert model failed
[ERROR] ME(43123,7f02bddd9740,python3):2023-11-10-03:40:46.308.646 [mindspore/lite/src/extendrt/cxx_api/model/model_impl.cc:507] ConvertGraphOnline] Failed to converter graph
[ERROR] ME(43123,7f02bddd9740,python3):2023-11-10-03:40:46.308.712 [mindspore/lite/src/extendrt/cxx_api/model/model_impl.cc:395] BuildByBufferImpl] convert graph failed.
[ERROR] MINDOCR(43123:139649752078144,Process-1:17):2023-11-10-03:40:46.324.506 [src/parallel/framework/module_base.py:38] DetPreNode init failed: build_from_file failed! Error is Common error code.
Process Process-1:17:
Traceback (most recent call last):
  File "/root/miniconda3/envs/py37/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/root/miniconda3/envs/py37/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/mindocr/deploy/py_infer/src/parallel/framework/module_base.py", line 39, in process_handler
    raise error
  File "/home/mindocr/deploy/py_infer/src/parallel/framework/module_base.py", line 34, in process_handler
    params = self.init_self_args()
  File "/home/mindocr/deploy/py_infer/src/parallel/module/detection/det_pre_node.py", line 13, in init_self_args
    self.text_detector.init(preprocess=True, model=False, postprocess=False)
  File "/home/mindocr/deploy/py_infer/src/infer/infer_base.py", line 29, in init
    self._init_model()
  File "/home/mindocr/deploy/py_infer/src/infer/infer_det.py", line 22, in _init_model
    device_id=self.args.device_id,
  File "/home/mindocr/deploy/py_infer/src/core/model/model.py", line 15, in __init__
    self.model = _INFER_BACKEND_MAP[backend](**kwargs)
  File "/home/mindocr/deploy/py_infer/src/core/model/backend/lite_model.py", line 16, in __init__
    super().__init__(model_path, device, device_id)
  File "/home/mindocr/deploy/py_infer/src/core/model/backend/model_base.py", line 28, in __init__
    self._init_model()
  File "/home/mindocr/deploy/py_infer/src/core/model/backend/lite_model.py", line 33, in _init_model
    self.model.build_from_file(self.model_path, mslite.ModelType.MINDIR, context)
  File "/root/miniconda3/envs/py37/lib/python3.7/site-packages/mindspore_lite/model.py", line 95, in warpper
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/py37/lib/python3.7/site-packages/mindspore_lite/model.py", line 235, in build_from_file
    raise RuntimeError(f"build_from_file failed! Error is {ret.ToString()}")
RuntimeError: build_from_file failed! Error is Common error code.

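Possible causes

  • A cloud-side MindIR model is used without first being converted into a device-side MindIR model
  • The version of the converter_lite tool does not match the mindspore_lite version used for inference. For example, a device-side MindIR model converted with converter_lite 2.2 is used for inference under mindspore_lite 2.1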
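
The version mismatch described above can be caught before inference with a small check. The following is a minimal sketch, assuming that matching major.minor numbers is the right compatibility criterion (the source only gives the 2.2 versus 2.1 example); the function names are hypothetical and the version strings must be supplied by you.

```python
def major_minor(version: str) -> tuple:
    """Return (major, minor) parsed from a version string such as '2.2.0'."""
    parts = version.strip().split(".")
    return int(parts[0]), int(parts[1])


def versions_compatible(converter_version: str, runtime_version: str) -> bool:
    """True when both versions share the same major.minor release.

    Assumption: major.minor equality is treated as "consistent" here.
    """
    return major_minor(converter_version) == major_minor(runtime_version)


# The mismatching pair from the example above: converter_lite 2.2, mindspore_lite 2.1.
print(versions_compatible("2.2.0", "2.1.0"))  # False
print(versions_compatible("2.2.0", "2.2.0"))  # True
```

If the check fails, re-convert the model with a converter_lite release that matches the installed mindspore_lite.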

Q8 Inference-related problems

  • Running inference with deploy/py_infer/infer.py reports TypeError: unhashable type: 'numpy.ndarray'. The full error is:
[ERROR] MINDOCR(51913:140354674829120,Process-1:28):2023-11-10-06:52:34.304.673 [src/parallel/framework/module_base.py:66] ERROR occurred in RecPostNode module for test.jpg: unhashable type: 'numpy.ndarray'.
Traceback (most recent call last):
  File "/home/mindocr/deploy/py_infer/src/parallel/framework/module_base.py", line 62, in call_process
    self.process(send_data)
  File "/home/mindocr/deploy/py_infer/src/parallel/module/recognition/rec_post_node.py", line 24, in process
    output = self.text_recognizer.postprocess(data["pred"], batch)
  File "/home/mindocr/deploy/py_infer/src/infer/infer_rec.py", line 132, in postprocess
    return self.postprocess_ops(pred)
  File "/home/mindocr/deploy/py_infer/src/data_process/postprocess/builder.py", line 32, in __call__
    return self._ops_func(*args, **kwargs)
  File "/home/mindocr/mindocr/postprocess/rec_postprocess.py", line 153, in __call__
    raw_chars = [[self.character[idx] for idx in pred_indices[b]] for b in range(pred_indices.shape[0])]
  File "/home/mindocr/mindocr/postprocess/rec_postprocess.py", line 153, in <listcomp>
    raw_chars = [[self.character[idx] for idx in pred_indices[b]] for b in range(pred_indices.shape[0])]
  File "/home/mindocr/mindocr/postprocess/rec_postprocess.py", line 153, in <listcomp>
    raw_chars = [[self.character[idx] for idx in pred_indices[b]] for b in range(pred_indices.shape[0])]
TypeError: unhashable type: 'numpy.ndarray'

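This error means the shape of the model output is wrong. Check the following:

  • Use the right model. For example, passing a detection model to --rec_model_path triggers this error;
  • Use an inference model (not a training model), converted to a device-side MindIR model with the converter_lite tool.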
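
The traceback shows the lookup self.character[idx] failing because idx is an array rather than a scalar, which happens when the predicted indices have an extra dimension. The sketch below reproduces that mechanism with plain nested lists instead of NumPy arrays; it assumes the character table is a dict keyed by integer index and that valid predictions are 2-D (batch, sequence length). decode_indices is a hypothetical helper, not MindOCR code.

```python
def decode_indices(character, pred_indices):
    """Map a batch of index sequences to characters, mirroring the lookup in the traceback.

    `character` is assumed to be a dict {index: char}; `pred_indices` is a
    nested list standing in for a 2-D (batch, seq_len) array.
    """
    decoded = []
    for b, row in enumerate(pred_indices):
        for idx in row:
            if not isinstance(idx, int):
                # A non-scalar idx is what makes the dict lookup fail as "unhashable".
                raise ValueError(
                    f"sample {b}: expected scalar indices, got {type(idx).__name__}; "
                    "the model output probably has an extra dimension"
                )
        decoded.append([character[idx] for idx in row])
    return decoded


character = {0: "", 1: "a", 2: "b"}

# 2-D indices: every idx is a scalar, so the lookup succeeds.
print(decode_indices(character, [[1, 2, 0]]))  # [['a', 'b', '']]

# 3-D indices: each idx is itself a sequence, so the check rejects it.
try:
    decode_indices(character, [[[1, 2], [0, 1]]])
except ValueError as err:
    print(err)
```

With real NumPy output, checking that the index array's ndim equals 2 before postprocessing serves the same purpose.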

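Q9 DBNet training speed is lower than expected

When training DBNet-series networks (including DBNet MobileNetV3, DBNet ResNet-18, DBNet ResNet-50, and DBNet++ ResNet-50) with the following command, the training frame rate is lower than expected. For example, DBNet MobileNetV3 trains at only 80 fps on Ascend 910A, short of the expected 100 fps.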

python tools/train.py -c configs/det/dbnet/db_mobilenetv3_icdar15.yaml

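DBNet's data preprocessing is relatively complex, so if the training server's CPU has weak single-core performance, data preprocessing can become the performance bottleneck.

Solutions

  1. Try setting train.dataset.use_minddata and eval.dataset.use_minddata to True in the configuration file. MindOCR will then use MindSpore MindData to run part of the data preprocessing steps: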

    ...
    train:
      ckpt_save_dir: './tmp_det'
      dataset_sink_mode: True
      dataset:
        type: DetDataset
        dataset_root: /data/ocr_datasets
        data_dir: ic15/det/train/ch4_training_images
        label_file: ic15/det/train/det_gt.txt
        sample_ratio: 1.0
        use_minddata: True                          <-- set this option
    ...
    eval:
      ckpt_load_path: tmp_det/best.ckpt
      dataset_sink_mode: False
      dataset:
        type: DetDataset
        dataset_root: /data/ocr_datasets
        data_dir: ic15/det/test/ch4_test_images
        label_file: ic15/det/test/det_gt.txt
        sample_ratio: 1.0
        use_minddata: True                          <-- set this option
    ...
    
  2. If the training server has many CPU cores, try increasing train.loader.num_workers in the configuration file to raise the number of data-prefetching threads:

    ...
    train:
      ...
      loader:
        shuffle: True
        batch_size: 10
        drop_remainder: True
        num_workers: 12                             <-- set this option
    ...
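
If the configuration is patched from a script rather than edited by hand, the three options can be set by dotted path. This is a minimal sketch that works on a plain dict standing in for an already loaded YAML config; set_by_path is a hypothetical helper, and the starting values in cfg are made up for illustration.

```python
def set_by_path(cfg: dict, dotted_key: str, value) -> None:
    """Set cfg['a']['b']['c'] = value for dotted_key 'a.b.c', creating levels as needed."""
    *parents, leaf = dotted_key.split(".")
    node = cfg
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


# Stand-in for a config already loaded from YAML (loading is not shown here).
cfg = {
    "train": {"dataset": {"use_minddata": False}, "loader": {"num_workers": 8}},
    "eval": {"dataset": {"use_minddata": False}},
}

set_by_path(cfg, "train.dataset.use_minddata", True)
set_by_path(cfg, "eval.dataset.use_minddata", True)
set_by_path(cfg, "train.loader.num_workers", 12)  # 12 is the value used in the config excerpt

print(cfg["train"])
print(cfg["eval"])
```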
    

Q10 Errors related to libgomp-d22c30c5.so.1.0.0

Running mindocr may report the following error:

ImportError: /root/mindocr_env/lib/python3.8/site-packages/sklearn/__check_build/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0: cannot allocate memory in static TLS block

Try the following steps:

  • Search for libgomp-d22c30c5.so.1.0.0 under the Python installation path:

cd /root/mindocr_env/lib/python3.8
find ~ -name libgomp-d22c30c5.so.1.0.0

    The search returns the following result:

/root/mindocr_env/lib/python3.8/site-packages/scikit_learn.libs/libgomp-d22c30c5.so.1.0.0

  • Add the path of the .so file to the LD_PRELOAD environment variable:

export LD_PRELOAD=/root/mindocr_env/lib/python3.8/site-packages/scikit_learn.libs/libgomp-d22c30c5.so.1.0.0:$LD_PRELOAD
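
The search and the export line can also be produced from Python, which avoids hard-coding the hash in the file name. This is a minimal sketch using only the standard library; the glob pattern libgomp-*.so* and both helper names are assumptions, and the script only prints the export line for you to run in your shell.

```python
import sys
from pathlib import Path


def find_shared_libs(root, pattern="libgomp-*.so*"):
    """Recursively collect files under `root` whose names match `pattern`."""
    return sorted(str(p) for p in Path(root).rglob(pattern) if p.is_file())


def ld_preload_export(lib_path):
    """Build the shell line that prepends `lib_path` to LD_PRELOAD."""
    return f"export LD_PRELOAD={lib_path}:$LD_PRELOAD"


if __name__ == "__main__":
    # sys.prefix is the root of the active Python environment.
    for lib in find_shared_libs(sys.prefix):
        print(ld_preload_export(lib))
```

Run it with the same interpreter that fails to import sklearn, so that sys.prefix points at the affected environment.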

Q11 Data pipeline error when training abinet on an lmdb dataset

Training abinet on an lmdb dataset reports the following data pipeline error:

mindocr.data.rec_lmdb_dataset WARNING - Error occurred during preprocess.
 Exception thrown from dataset pipeline. Refer to 'Dataset Pipeline Error Message'.

------------------------------------------------------------------
- Dataset Pipeline Error Message:
------------------------------------------------------------------
[ERROR] No cast for the specified DataType was found.

------------------------------------------------------------------
- C++ Call Stack: (For framework developers)
------------------------------------------------------------------
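mindspore/ccsrc/minddata/dataset/kernels/py_func_op.cc(143).

Try the following steps to fix it:

  • Locate the installation path of the mindspore package
  • Open the file mindspore/dataset/transforms/transform.py
  • Go to line 93, where you will find the following code: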
93        if key in EXECUTORS_LIST:
94            # get the executor by process id and thread id
95            executor = EXECUTORS_LIST[key]
96            # remove the old transform which in executor and update the new transform
97            executor.UpdateOperation(self.parse())
98        else:
99            # create a new executor by process id and thread_id
100           executor = cde.Execute(self.parse())
101           # add the executor the global EXECUTORS_LIST
102           EXECUTORS_LIST[key] = executor
  • Replace line 97 with executor = cde.Execute(self.parse()), which gives the following code:
93        if key in EXECUTORS_LIST:
94            # get the executor by process id and thread id
95            executor = EXECUTORS_LIST[key]
96            # remove the old transform which in executor and update the new transform
97            executor = cde.Execute(self.parse())
98        else:
99            # create a new executor by process id and thread_id
100           executor = cde.Execute(self.parse())
101           # add the executor the global EXECUTORS_LIST
102           EXECUTORS_LIST[key] = executor
  • Save the file and run training again
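
The first two steps (finding the package path and the file to edit) can be done from Python without importing mindspore itself. This is a minimal sketch using importlib.util.find_spec from the standard library; locate_package_file is a hypothetical helper, and it prints None when the package is not installed in the current environment.

```python
import importlib.util
from pathlib import Path


def locate_package_file(package: str, relative_path: str):
    """Return the absolute path of `relative_path` inside an installed package, or None."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None  # not installed, or a single-file module rather than a package
    for location in spec.submodule_search_locations:
        candidate = Path(location) / relative_path
        if candidate.is_file():
            return str(candidate)
    return None


# Path of the file to patch, relative to the mindspore package root.
print(locate_package_file("mindspore", "dataset/transforms/transform.py"))
```

The line numbers quoted above refer to one particular MindSpore release, so confirm that the code around line 93 matches before editing.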

Q12 Runtime error when training dbnet on the synthtext dataset

Training dbnet on the synthtext dataset reports the following runtime error:

Traceback (most recent call last):
  ...
  File "/root/archiconda3/envs/Python380/lib/python3.8/site-packages/mindspore/common/api.py", line 1608, in _exec_pip
    return self.graph_executor(args, phase)
RuntimeError: Run task for graph:kernel_graph_1 error! The details reger to 'Ascend Error Message'

Try upgrading CANN to 7.1.

Q13 Errors when installing seqeval

Running pip install -r requirements.txt reports the following error:

Collecting seqeval>=1.2.2 (from -r requirements.txt (line 19))
  Downloading http://mirrors.aliyun.com/pypi/packages/9d/2d/233c79d5b4e5ab1dbf111242299153f3caddddbb691219f363ad55ce783d/seqeval-1.2.2.tar.gz (43 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 43.6/43.6 kB 181.0 kB/s eta 0:00:00
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
   exit code: 1
  ╰─> [48 lines of output]
      /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/setuptools/__init__.py:80: _DeprecatedInstaller: setuptools.installer and fetch_build_eggs are deprecated.
      !!

              ********************************************************************************
              Requirements should be satisfied by a PEP 517 installer.
              If you are using pip, you can try `pip install --use-pep517`.
              ********************************************************************************

      !!
        dist.fetch_build_eggs(dist.setup_requires)
      WARNING: The repository located at mirrors.aliyun.com is not a trusted or secure host and is being ignored. If this repository is available via HTTPS we recommend you use HTTPS instead, otherwise you may silence this warning and allow it anyway with '--trusted-host mirrors.aliyun.com'.
      ERROR: Could not find a version that satisfies the requirement setuptools_scm (from versions: none)
      ERROR: No matching distribution found for setuptools_scm
      Traceback (most recent call last):
        File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/setuptools/installer.py", line 101, in _fetch_build_egg_no_warn
          subprocess.check_call(cmd)
        File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/subprocess.py", line 373, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['/home/ma-user/anaconda3/envs/MindSpore/bin/python3.9', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', '/tmp/tmpusgt0k69', '--quiet', 'setuptools_scm']' returned non-zero exit status 1.

      The above exception was the direct cause of the following exception:

      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-m2kqztlz/seqeval_da00f708dc0e483b92cd18083513d5e7/setup.py", line 27, in <module>
          setup(
        File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/setuptools/__init__.py", line 102, in setup
          _install_setup_requires(attrs)
        File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/setuptools/__init__.py", line 75, in _install_setup_requires
          _fetch_build_eggs(dist)
        File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/setuptools/__init__.py", line 80, in _fetch_build_eggs
          dist.fetch_build_eggs(dist.setup_requires)
        File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/setuptools/dist.py", line 636, in fetch_build_eggs
          return _fetch_build_eggs(self, requires)
        File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/setuptools/installer.py", line 38, in _fetch_build_eggs
          resolved_dists = pkg_resources.working_set.resolve(
        File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/pkg_resources/__init__.py", line 829, in resolve
          dist = self._resolve_dist(
        File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/pkg_resources/__init__.py", line 865, in _resolve_dist
          dist = best[req.key] = env.best_match(
        File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/pkg_resources/__init__.py", line 1135, in best_match
          return self.obtain(req, installer)
        File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/pkg_resources/__init__.py", line 1147, in obtain
          return installer(requirement)
        File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/setuptools/installer.py", line 103, in _fetch_build_egg_no_warn
          raise DistutilsError(str(e)) from e
      distutils.errors.DistutilsError: Command '['/home/ma-user/anaconda3/envs/MindSpore/bin/python3.9', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', '/tmp/tmpusgt0k69', '--quiet', 'setuptools_scm']' returned non-zero exit status 1.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.

Try the following steps to fix it:

  • Upgrade setuptools: pip3 install --upgrade setuptools
  • Upgrade setuptools_scm: pip3 install --upgrade setuptools_scm
  • Install seqeval: pip3 install seqeval -i https://pypi.tuna.tsinghua.edu.cn/simple
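
The log shows the build failing while fetching setuptools_scm, so it can help to confirm that the build requirements are visible to the interpreter before retrying. This is a minimal sketch using the standard library; missing_modules is a hypothetical helper and only checks whether a module can be found, not its version.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of `names` that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# setuptools_scm is the build requirement named in the log above.
missing = missing_modules(["setuptools", "setuptools_scm"])
if missing:
    print("install first:", ", ".join(missing))
else:
    print("build requirements present; seqeval can be installed")
```

Run it with the interpreter that pip3 belongs to; otherwise the result describes a different environment.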

Q14 Errors when installing lanms

Installing lanms reports:

ImportError: Python version mismatch: module was compiled for version 3.8, while the interpreter is running version 3.7.

This problem may be caused by multiple python3 environments coexisting. You can resolve it with the following steps:

  • Run pip3 install lanms -i https://pypi.tuna.tsinghua.edu.cn/simple to obtain the download link of lanms-1.0.2.tar.gz (for example https://pypi.tuna.tsinghua.edu.cn/packages/96/c0/50dc2c857ed060e907adaef31184413a7706e475c322236d346382e45195/lanms-1.0.2.tar.gz)
  • Download lanms-1.0.2.tar.gz from that link and run tar -zxvf lanms-1.0.2.tar.gz to extract it
  • cd lanms-1.0.2
  • Edit the Makefile: on lines 1 and 2, replace python3-config with python3.7-config, which gives the following:

CXXFLAGS = -I include  -std=c++11 -O3 $(shell python3.7-config --cflags)
LDFLAGS = $(shell python3.7-config --ldflags)
...

Save the Makefile so that the build matches the python 3.7 environment.

  • Run python setup.py install to install lanms
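
To decide which versioned config script to write into the Makefile, you can ask the interpreter that will import lanms. This is a minimal sketch using the standard library; config_tool_name is a hypothetical helper, and it assumes the script follows the pythonX.Y-config naming used in the steps above.

```python
import shutil
import sys


def config_tool_name(version_info=sys.version_info) -> str:
    """Name of the versioned python-config script for an interpreter version."""
    return f"python{version_info[0]}.{version_info[1]}-config"


tool = config_tool_name()
print("interpreter:", sys.executable)
print("expected config script:", tool)
print("found on PATH:", shutil.which(tool))  # None means the script is not on PATH
```

If the script is not found on PATH, the Makefile change alone will not work until the matching Python development files are available.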