Hi,
I have been trying to run training on a single GPU with:
-> bash tools/dist_train.sh configs/trans10kv2/trans2seg/trans2seg_medium.yaml 1
The error stack is below. Could you please help?
Many thanks.
+ CONFIG=configs/trans10kv2/trans2seg/trans2seg_medium.yaml
+ GPUS=8
++ dirname tools/dist_train.sh
+ python -m torch.distributed.launch --nproc_per_node=8 tools/train.py --config-file configs/trans10kv2/trans2seg/trans2seg_medium.yaml 1
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
Traceback (most recent call last):
  File "tools/train.py", line 20, in <module>
    from segmentron.solver.loss import get_segmentation_loss
  File "/home/paperspace/Trans2Seg/segmentron/solver/loss.py", line 9, in <module>
    from ..models.pointrend import point_sample
ModuleNotFoundError: No module named 'segmentron.models.pointrend'
Traceback (most recent call last):
  File "tools/train.py", line 20, in <module>
    from segmentron.solver.loss import get_segmentation_loss
  File "/home/paperspace/Trans2Seg/segmentron/solver/loss.py", line 9, in <module>
    from ..models.pointrend import point_sample
ModuleNotFoundError: No module named 'segmentron.models.pointrend'
Traceback (most recent call last):
  File "tools/train.py", line 20, in <module>
    from segmentron.solver.loss import get_segmentation_loss
  File "/home/paperspace/Trans2Seg/segmentron/solver/loss.py", line 9, in <module>
    from ..models.pointrend import point_sample
ModuleNotFoundError: No module named 'segmentron.models.pointrend'
Traceback (most recent call last):
  File "tools/train.py", line 20, in <module>
    from segmentron.solver.loss import get_segmentation_loss
  File "/home/paperspace/Trans2Seg/segmentron/solver/loss.py", line 9, in <module>
    from ..models.pointrend import point_sample
ModuleNotFoundError: No module named 'segmentron.models.pointrend'
Traceback (most recent call last):
  File "tools/train.py", line 20, in <module>
    from segmentron.solver.loss import get_segmentation_loss
  File "/home/paperspace/Trans2Seg/segmentron/solver/loss.py", line 9, in <module>
    from ..models.pointrend import point_sample
ModuleNotFoundError: No module named 'segmentron.models.pointrend'
Traceback (most recent call last):
  File "tools/train.py", line 20, in <module>
    from segmentron.solver.loss import get_segmentation_loss
  File "/home/paperspace/Trans2Seg/segmentron/solver/loss.py", line 9, in <module>
    from ..models.pointrend import point_sample
ModuleNotFoundError: No module named 'segmentron.models.pointrend'
Traceback (most recent call last):
  File "tools/train.py", line 20, in <module>
    from segmentron.solver.loss import get_segmentation_loss
  File "/home/paperspace/Trans2Seg/segmentron/solver/loss.py", line 9, in <module>
    from ..models.pointrend import point_sample
ModuleNotFoundError: No module named 'segmentron.models.pointrend'
Traceback (most recent call last):
  File "tools/train.py", line 20, in <module>
    from segmentron.solver.loss import get_segmentation_loss
  File "/home/paperspace/Trans2Seg/segmentron/solver/loss.py", line 9, in <module>
    from ..models.pointrend import point_sample
ModuleNotFoundError: No module named 'segmentron.models.pointrend'
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/paperspace/Torch/lib/python3.8/site-packages/torch/distributed/launch.py", line 263, in <module>
    main()
  File "/home/paperspace/Torch/lib/python3.8/site-packages/torch/distributed/launch.py", line 258, in main
    raise subprocess.CalledProcessError(returncode=process.returncode,
subprocess.CalledProcessError: Command '['/home/paperspace/Torch/bin/python', '-u', 'tools/train.py', '--local_rank=7', '--config-file', 'configs/trans10kv2/trans2seg/trans2seg_medium.yaml', '1']' returned non-zero exit status 1.