Could anyone tell me what is causing the error below?
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
/root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/io/image.py:11: UserWarning: Failed to load image Python extension: /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/image.so: undefined symbol: _ZNK3c106IValue23reportToTensorTypeErrorEv
warn(f"Failed to load image Python extension: {e}")
/root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/io/image.py:11: UserWarning: Failed to load image Python extension: /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/image.so: undefined symbol: _ZNK3c106IValue23reportToTensorTypeErrorEv
warn(f"Failed to load image Python extension: {e}")
01/21 05:42:18 AM Added key: store_based_barrier_key:1 to store for rank: 1
01/21 05:42:18 AM Added key: store_based_barrier_key:1 to store for rank: 0
01/21 05:42:18 AM Training in distributed mode with multiple processes, 1 GPU per process. Process 0, total 2.
01/21 05:42:18 AM Training in distributed mode with multiple processes, 1 GPU per process. Process 1, total 2.
01/21 05:42:20 AM Model slimmable_mbnet_v1_bn_uniform created, param count: 7676204
01/21 05:42:20 AM Data processing configuration for current model + dataset:
01/21 05:42:20 AM input_size: (3, 224, 224)
01/21 05:42:20 AM interpolation: bicubic
01/21 05:42:20 AM mean: (0.485, 0.456, 0.406)
01/21 05:42:20 AM std: (0.229, 0.224, 0.225)
01/21 05:42:20 AM crop_pct: 0.875
01/21 05:42:20 AM NVIDIA APEX not installed. AMP off.
01/21 05:42:21 AM Using torch DistributedDataParallel. Install NVIDIA Apex for Apex DDP.
01/21 05:42:21 AM Scheduled epochs: 40
01/21 05:42:21 AM Training folder does not exist at: images/train
01/21 05:42:21 AM Training folder does not exist at: images/train
Killing subprocess 239
Killing subprocess 240
Traceback (most recent call last):
  File "/root/anaconda3/envs/0108/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/root/anaconda3/envs/0108/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/anaconda3/envs/0108/lib/python3.6/site-packages/torch/distributed/launch.py", line 340, in <module>
    main()
  File "/root/anaconda3/envs/0108/lib/python3.6/site-packages/torch/distributed/launch.py", line 326, in main
    sigkill_handler(signal.SIGTERM, None)  # not coming back
  File "/root/anaconda3/envs/0108/lib/python3.6/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
    raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/root/anaconda3/envs/0108/bin/python', '-u', 'train.py', '--local_rank=1', 'images', '-c', './configs/mobilenetv1_bn_uniform_reset_bn.yml']' returned non-zero exit status 1.
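
The last message each worker logs before being killed is "Training folder does not exist at: images/train", so the CalledProcessError from the launcher may just be reporting that exit. A minimal sketch for checking the data path before launching, assuming train.py looks for a train sub-folder under the positional data argument (images in the command above). The helper name missing_split_dirs is hypothetical and not part of the repository:

```python
import os


def missing_split_dirs(data_root, splits=("train",)):
    """Return the expected split sub-folders that are absent under data_root.

    Hypothetical helper: the split names are an assumption based on the
    log message "Training folder does not exist at: images/train".
    """
    return [
        os.path.join(data_root, split)
        for split in splits
        if not os.path.isdir(os.path.join(data_root, split))
    ]


if __name__ == "__main__":
    # "images" is the positional data path taken from the failing command;
    # it is resolved relative to the directory the launcher is started from.
    for path in missing_split_dirs("images"):
        print("missing:", os.path.abspath(path))
```

If this prints a path, either run the launcher from the directory that contains images/train or pass the absolute dataset path in place of images.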