The LIC2022 baseline source code runs fine on AI Studio. When run locally, train_query and infer_dial complete without errors; only infer_query fails, with the error below.
paddlepaddle:2.2.2
cuda:11.2
cudnn:8.2
$ sh ./scripts/local/job.sh ./projects/lic2022/conf/query_infer.conf
2022-04-18 15:40:25,456-INFO: [topology.py:169:__init__] HybridParallelInfo: rank_id: 0, mp_degree: 1, sharding_degree: 1, pp_degree: 1, dp_degree: 1, mp_group: [0], sharding_group: [0], pp_group: [0], dp_group: [0], check/clip group: [0]
W0418 15:40:25.456908 14688 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.6, Runtime API Version: 11.2
W0418 15:40:25.472528 14688 device_context.cc:465] device: 0, cuDNN Version: 8.2.
[WARN] Using constant learning rate because of warmup_steps is not positive while using NoamScheduler.
Loading parameters from ./projects/lic2022/model_zoo/query_finetune.pdparams.
Loading has done!
Traceback (most recent call last):
File "./knover/scripts/infer.py", line 140, in <module>
infer(args)
File "./knover/scripts/infer.py", line 83, in infer
predictions = task.infer_step(model, data)
File "e:\jupyternotebookproject\lic2022\knover\knover\core\task.py", line 46, in infer_step
predictions = model.infer_step(inputs)
File "e:\jupyternotebookproject\lic2022\knover\knover\core\model.py", line 508, in infer_step
predictions = self._model(*inputs, mode="infer")
File "e:\jupyternotebookproject\lic2022\knover\knover\core\model.py", line 180, in __call__
outputs = self.infer_step(inputs)
File "e:\jupyternotebookproject\lic2022\knover\knover\core\model.py", line 170, in infer_step
predictions = self.infer(inputs, outputs)
File "e:\jupyternotebookproject\lic2022\knover\knover\models\unified_transformer.py", line 297, in infer
outputs = self.generator(self, inputs, outputs)
File "e:\jupyternotebookproject\lic2022\knover\knover\modules\generator.py", line 163, in __call__
state = self._update_state(state, probs)
File "e:\jupyternotebookproject\lic2022\knover\knover\modules\generator.py", line 390, in _update_state
state["predictions"] = paddle.concat([state["predictions"], pred], axis=1)
File "E:\software\Anaconda3\envs\Knover\lib\site-packages\paddle\tensor\manipulation.py", line 345, in concat
return paddle.fluid.layers.concat(input=x, axis=axis, name=name)
File "E:\software\Anaconda3\envs\Knover\lib\site-packages\paddle\fluid\layers\tensor.py", line 327, in concat
return _C_ops.concat(input, 'axis', axis)
ValueError: (InvalidArgument) Tensor holds the wrong type, it holds int, but desires to be int64_t.
[Hint: Expected valid == true, but received valid:0 != true:1.] (at ../paddle/fluid/framework/tensor_impl.h:33)
[operator < concat > error]
INFO 2022-04-18 15:40:36,201 launch_utils.py:341] terminate all the procs
ERROR 2022-04-18 15:40:36,201 launch_utils.py:604] ABORT!!! Out of all 1 trainers, the trainer process with rank=[0] was aborted. Please check its log.
INFO 2022-04-18 15:40:39,210 launch_utils.py:341] terminate all the procs
INFO 2022-04-18 15:40:39,210 launch.py:311] Local processes completed.
exit_code=0
[[ 0 != 0 ]]
exit 0
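A possible cause, not confirmed by the log: the traceback shows `paddle.concat` in `_update_state` rejecting an input that holds `int` (int32) where `int64_t` is expected. On Windows, NumPy 1.x defaults to a 32-bit integer when no dtype is given, which could explain why the same code works on AI Studio but fails locally. The sketch below uses NumPy only to illustrate forcing an explicit int64 dtype before concatenation. The helper `as_int64` and the sample arrays are hypothetical and not part of Knover. On the Paddle side, the analogous step would be something like `paddle.cast(pred, "int64")` before the `paddle.concat` call, assuming `pred` is the int32 tensor.

```python
import numpy as np


def as_int64(values):
    """Return `values` as a NumPy array with an explicit int64 dtype."""
    return np.asarray(values, dtype=np.int64)


# Simulate token ids created with a 32-bit integer dtype, as can happen on
# Windows with NumPy 1.x when the dtype is left implicit.
pred_int32 = np.array([[5], [7]], dtype=np.int32)

# Simulate the already accumulated predictions, stored as int64.
predictions_int64 = np.zeros((2, 1), dtype=np.int64)

# Cast explicitly so both inputs share the same dtype before concatenating.
pred_fixed = as_int64(pred_int32)
merged = np.concatenate([predictions_int64, pred_fixed], axis=1)
```

Printing the dtype of `state["predictions"]` and `pred` just before line 390 of `generator.py` would show which of the two is int32 and where the cast belongs.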