Hello, when trying to train the model myself, I ran into the following error:
Traceback (most recent call last):
  File ".../site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File ".../GOF_NeurIPS2021/train.py", line 340, in train
    scaler.step(optimizer_G)
  File ".../site-packages/torch/cuda/amp/grad_scaler.py", line 337, in step
    assert len(optimizer_state["found_inf_per_device"]) > 0, "No inf checks were recorded for this optimizer."
The environment is the same as in requirements.txt (by the way, should the package name mcubes be PyMCubes?).
I tried commenting out that assert line in grad_scaler.py. Training runs now, but the results do not seem to converge (the output is still random noise after around 30000 steps).
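As far as I understand GradScaler, this assertion fires when none of the optimizer's parameters has a gradient at the time scaler.step is called, for example when the scaled loss's backward() never reaches those parameters. That would also explain why skipping the assert gives no convergence. Below is a minimal sketch of a check that could be run just before the failing call. count_params_with_grad is a hypothetical helper name (not part of this repo or PyTorch), and it only assumes the standard optimizer.param_groups layout:

```python
def count_params_with_grad(optimizer):
    """Return (with_grad, total) over all parameters managed by `optimizer`.

    Relies only on `optimizer.param_groups` and each parameter's `.grad`
    attribute, i.e. the usual torch.optim.Optimizer layout (assumed here).
    """
    with_grad = 0
    total = 0
    for group in optimizer.param_groups:
        for p in group["params"]:
            total += 1
            # A parameter that backward() never reached keeps grad == None.
            if p.grad is not None:
                with_grad += 1
    return with_grad, total


# Hypothetical placement in train.py, right before the failing line:
#   n, total = count_params_with_grad(optimizer_G)
#   print(f"optimizer_G: {n}/{total} parameters have gradients")
#   scaler.step(optimizer_G)
```

If this reports 0 parameters with gradients for optimizer_G, the problem would be upstream of the scaler (the generator loss is not producing gradients) rather than in grad_scaler.py itself.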
Any help would be appreciated!