I have a problem when training on the CULane dataset. The seg_loss is very large at the beginning (about 0.97) and becomes 1.0000 at step 3001, where it stays. I don't know why this happens. Could you help me solve this problem?
The only change I made was to delete the validation process when training on CULane.
The log is as follows:
2021-06-04 17:23:20,933 - resa - INFO - epoch: 0 step: 501 lr: 0.0025 seg_loss: 0.9741 exist_loss: 0.0574 data: 0.0
2021-06-04 17:25:35,022 - resa - INFO - epoch: 0 step: 1001 lr: 0.0050 seg_loss: 0.9215 exist_loss: 0.0538 data: 0.
2021-06-04 17:27:49,330 - resa - INFO - epoch: 0 step: 1501 lr: 0.0074 seg_loss: 0.8975 exist_loss: 0.0536 data: 0.
2021-06-04 17:30:03,792 - resa - INFO - epoch: 0 step: 2001 lr: 0.0099 seg_loss: 0.9029 exist_loss: 0.0558 data: 0.
2021-06-04 17:32:18,565 - resa - INFO - epoch: 0 step: 2501 lr: 0.0123 seg_loss: 0.9990 exist_loss: 0.0552 data: 0.
2021-06-04 17:34:30,550 - resa - INFO - epoch: 0 step: 3001 lr: 0.0147 seg_loss: 1.0000 exist_loss: 0.0562 data: 0.
2021-06-04 17:36:42,521 - resa - INFO - epoch: 0 step: 3501 lr: 0.0171 seg_loss: 1.0000 exist_loss: 0.0559 data: 0.
2021-06-04 17:38:55,629 - resa - INFO - epoch: 0 step: 4001 lr: 0.0195 seg_loss: 1.0000 exist_loss: 0.0549 data: 0.
2021-06-04 17:41:07,304 - resa - INFO - epoch: 0 step: 4501 lr: 0.0218 seg_loss: 1.0000 exist_loss: 0.0545 data: 0.
2021-06-04 17:43:19,887 - resa - INFO - epoch: 0 step: 5001 lr: 0.0242 seg_loss: 1.0000 exist_loss: 0.0572 data: 0.
2021-06-04 17:45:32,259 - resa - INFO - epoch: 0 step: 5501 lr: 0.0241 seg_loss: 1.0000 exist_loss: 0.0560 data: 0.
2021-06-04 17:47:45,136 - resa - INFO - epoch: 0 step: 6001 lr: 0.0240 seg_loss: 1.0000 exist_loss: 0.0572 data: 0.
2021-06-04 17:49:57,754 - resa - INFO - epoch: 0 step: 6501 lr: 0.0239 seg_loss: 1.0000 exist_loss: 0.0528 data: 0.
2021-06-04 17:52:11,383 - resa - INFO - epoch: 0 step: 7001 lr: 0.0238 seg_loss: 1.0000 exist_loss: 0.0566 data: 0.
2021-06-04 17:54:24,219 - resa - INFO - epoch: 0 step: 7501 lr: 0.0237 seg_loss: 1.0000 exist_loss: 0.0565 data: 0.
2021-06-04 17:56:37,749 - resa - INFO - epoch: 0 step: 8001 lr: 0.0236 seg_loss: 1.0000 exist_loss: 0.0520 data: 0.
2021-06-04 17:58:50,918 - resa - INFO - epoch: 0 step: 8501 lr: 0.0236 seg_loss: 1.0000 exist_loss: 0.0554 data: 0.
2021-06-04 18:01:05,131 - resa - INFO - epoch: 0 step: 9001 lr: 0.0235 seg_loss: 1.0000 exist_loss: 0.0578 data: 0.
2021-06-04 18:03:18,808 - resa - INFO - epoch: 0 step: 9501 lr: 0.0234 seg_loss: 1.0000 exist_loss: 0.0551 data: 0.
2021-06-04 18:05:32,230 - resa - INFO - epoch: 0 step: 10001 lr: 0.0233 seg_loss: 1.0000 exist_loss: 0.0547 data: 0
2021-06-04 18:07:45,673 - resa - INFO - epoch: 0 step: 10501 lr: 0.0232 seg_loss: 1.0000 exist_loss: 0.0583 data: 0
2021-06-04 18:09:59,493 - resa - INFO - epoch: 0 step: 11001 lr: 0.0231 seg_loss: 1.0000 exist_loss: 0.0552 data: 0
2021-06-04 18:10:28,531 - resa - INFO - epoch: 0 step: 11110 lr: 0.0231 seg_loss: 1.0000 exist_loss: 0.0582 data: 0
2021-06-04 18:10:29,854 - resa - INFO - epoch: 1 step: 11111 lr: 0.0231 seg_loss: 1.0000 exist_loss: 0.0569 data: 0
2021-06-04 18:12:44,115 - resa - INFO - epoch: 1 step: 11611 lr: 0.0230 seg_loss: 1.0000 exist_loss: 0.0572 data: 0
2021-06-04 18:14:58,217 - resa - INFO - epoch: 1 step: 12111 lr: 0.0229 seg_loss: 1.0000 exist_loss: 0.0521 data: 0
2021-06-04 18:17:13,108 - resa - INFO - epoch: 1 step: 12611 lr: 0.0229 seg_loss: 1.0000 exist_loss: 0.0525 data: 0
2021-06-04 18:19:28,087 - resa - INFO - epoch: 1 step: 13111 lr: 0.0228 seg_loss: 1.0000 exist_loss: 0.0525 data: 0
2021-06-04 18:21:42,435 - resa - INFO - epoch: 1 step: 13611 lr: 0.0227 seg_loss: 1.0000 exist_loss: 0.0548 data: 0
2021-06-04 18:23:57,255 - resa - INFO - epoch: 1 step: 14111 lr: 0.0226 seg_loss: 1.0000 exist_loss: 0.0517 data: 0
2021-06-04 18:26:12,311 - resa - INFO - epoch: 1 step: 14611 lr: 0.0225 seg_loss: 1.0000 exist_loss: 0.0545 data: 0
2021-06-04 18:28:26,983 - resa - INFO - epoch: 1 step: 15111 lr: 0.0224 seg_loss: 1.0000 exist_loss: 0.0541 data: 0
2021-06-04 18:30:42,254 - resa - INFO - epoch: 1 step: 15611 lr: 0.0223 seg_loss: 1.0000 exist_loss: 0.0572 data: 0
2021-06-04 18:32:57,022 - resa - INFO - epoch: 1 step: 16111 lr: 0.0223 seg_loss: 1.0000 exist_loss: 0.0539 data: 0
2021-06-04 18:35:12,342 - resa - INFO - epoch: 1 step: 16611 lr: 0.0222 seg_loss: 1.0000 exist_loss: 0.0523 data: 0
2021-06-04 18:37:26,588 - resa - INFO - epoch: 1 step: 17111 lr: 0.0221 seg_loss: 1.0000 exist_loss: 0.0534 data: 0
2021-06-04 18:39:41,054 - resa - INFO - epoch: 1 step: 17611 lr: 0.0220 seg_loss: 0.9999 exist_loss: 0.0555 data: 0
2021-06-04 18:41:55,410 - resa - INFO - epoch: 1 step: 18111 lr: 0.0219 seg_loss: 1.0000 exist_loss: 0.0558 data: 0
2021-06-04 18:44:09,407 - resa - INFO - epoch: 1 step: 18611 lr: 0.0218 seg_loss: 1.0000 exist_loss: 0.0560 data: 0
2021-06-04 18:46:23,547 - resa - INFO - epoch: 1 step: 19111 lr: 0.0218 seg_loss: 1.0000 exist_loss: 0.0509 data: 0
2021-06-04 18:48:37,960 - resa - INFO - epoch: 1 step: 19611 lr: 0.0217 seg_loss: 1.0000 exist_loss: 0.0538 data: 0
2021-06-04 18:50:52,882 - resa - INFO - epoch: 1 step: 20111 lr: 0.0216 seg_loss: 1.0000 exist_loss: 0.0511 data: 0
2021-06-04 18:53:07,808 - resa - INFO - epoch: 1 step: 20611 lr: 0.0215 seg_loss: 0.9999 exist_loss: 0.0565 data: 0
2021-06-04 18:55:19,656 - resa - INFO - epoch: 1 step: 21111 lr: 0.0214 seg_loss: 1.0000 exist_loss: 0.0548 data: 0
2021-06-04 18:57:31,602 - resa - INFO - epoch: 1 step: 21611 lr: 0.0213 seg_loss: 1.0000 exist_loss: 0.0541 data: 0
2021-06-04 18:59:44,497 - resa - INFO - epoch: 1 step: 22111 lr: 0.0212 seg_loss: 1.0000 exist_loss: 0.0561 data: 0
2021-06-04 19:00:13,252 - resa - INFO - epoch: 1 step: 22220 lr: 0.0212 seg_loss: 1.0000 exist_loss: 0.0573 data: 0
2021-06-04 19:00:14,471 - resa - INFO - epoch: 2 step: 22221 lr: 0.0212 seg_loss: 1.0000 exist_loss: 0.0575 data: 0
2021-06-04 19:02:28,169 - resa - INFO - epoch: 2 step: 22721 lr: 0.0211 seg_loss: 0.9999 exist_loss: 0.0528 data: 0
2021-06-04 19:04:42,306 - resa - INFO - epoch: 2 step: 23221 lr: 0.0210 seg_loss: 1.0000 exist_loss: 0.0573 data: 0
2021-06-04 19:06:56,522 - resa - INFO - epoch: 2 step: 23721 lr: 0.0210 seg_loss: 1.0000 exist_loss: 0.0523 data: 0
2021-06-04 19:09:10,937 - resa - INFO - epoch: 2 step: 24221 lr: 0.0209 seg_loss: 1.0000 exist_loss: 0.0551 data: 0
2021-06-04 19:11:25,091 - resa - INFO - epoch: 2 step: 24721 lr: 0.0208 seg_loss: 1.0000 exist_loss: 0.0555 data: 0
2021-06-04 19:13:39,234 - resa - INFO - epoch: 2 step: 25221 lr: 0.0207 seg_loss: 1.0000 exist_loss: 0.0557 data: 0
2021-06-04 19:15:53,421 - resa - INFO - epoch: 2 step: 25721 lr: 0.0206 seg_loss: 1.0000 exist_loss: 0.0538 data: 0
2021-06-04 19:18:08,001 - resa - INFO - epoch: 2 step: 26221 lr: 0.0205 seg_loss: 1.0000 exist_loss: 0.0580 data: 0
2021-06-04 19:20:22,586 - resa - INFO - epoch: 2 step: 26721 lr: 0.0204 seg_loss: 1.0000 exist_loss: 0.0567 data: 0
2021-06-04 19:22:37,095 - resa - INFO - epoch: 2 step: 27221 lr: 0.0204 seg_loss: 1.0000 exist_loss: 0.0550 data: 0
2021-06-04 19:24:51,392 - resa - INFO - epoch: 2 step: 27721 lr: 0.0203 seg_loss: 1.0000 exist_loss: 0.0547 data: 0
2021-06-04 19:27:05,857 - resa - INFO - epoch: 2 step: 28221 lr: 0.0202 seg_loss: 1.0000 exist_loss: 0.0526 data: 0
2021-06-04 19:29:20,278 - resa - INFO - epoch: 2 step: 28721 lr: 0.0201 seg_loss: 1.0000 exist_loss: 0.0557 data: 0
2021-06-04 19:31:34,059 - resa - INFO - epoch: 2 step: 29221 lr: 0.0200 seg_loss: 1.0000 exist_loss: 0.0525 data: 0
2021-06-04 19:33:47,738 - resa - INFO - epoch: 2 step: 29721 lr: 0.0199 seg_loss: 1.0000 exist_loss: 0.0534 data: 0
2021-06-04 19:36:01,522 - resa - INFO - epoch: 2 step: 30221 lr: 0.0198 seg_loss: 1.0000 exist_loss: 0.0553 data: 0
2021-06-04 19:38:15,620 - resa - INFO - epoch: 2 step: 30721 lr: 0.0197 seg_loss: 1.0000 exist_loss: 0.0551 data: 0
2021-06-04 19:40:30,060 - resa - INFO - epoch: 2 step: 31221 lr: 0.0197 seg_loss: 1.0000 exist_loss: 0.0548 data: 0
2021-06-04 19:42:44,289 - resa - INFO - epoch: 2 step: 31721 lr: 0.0196 seg_loss: 1.0000 exist_loss: 0.0547 data: 0
2021-06-04 19:44:58,634 - resa - INFO - epoch: 2 step: 32221 lr: 0.0195 seg_loss: 1.0000 exist_loss: 0.0527 data: 0
2021-06-04 19:47:11,064 - resa - INFO - epoch: 2 step: 32721 lr: 0.0194 seg_loss: 1.0000 exist_loss: 0.0527 data: 0
2021-06-04 19:49:23,852 - resa - INFO - epoch: 2 step: 33221 lr: 0.0193 seg_loss: 1.0000 exist_loss: 0.0540 data: 0
2021-06-04 19:49:52,867 - resa - INFO - epoch: 2 step: 33330 lr: 0.0193 seg_loss: 1.0000 exist_loss: 0.0565 data: 0
2021-06-04 19:49:54,094 - resa - INFO - epoch: 3 step: 33331 lr: 0.0193 seg_loss: 1.0000 exist_loss: 0.0555 data: 0
2021-06-04 19:52:07,179 - resa - INFO - epoch: 3 step: 33831 lr: 0.0192 seg_loss: 1.0000 exist_loss: 0.0549 data: 0
2021-06-04 19:54:20,207 - resa - INFO - epoch: 3 step: 34331 lr: 0.0191 seg_loss: 1.0000 exist_loss: 0.0500 data: 0
2021-06-04 19:56:33,132 - resa - INFO - epoch: 3 step: 34831 lr: 0.0190 seg_loss: 1.0000 exist_loss: 0.0512 data: 0
2021-06-04 19:58:46,082 - resa - INFO - epoch: 3 step: 35331 lr: 0.0189 seg_loss: 1.0000 exist_loss: 0.0560 data: 0
2021-06-04 20:00:59,093 - resa - INFO - epoch: 3 step: 35831 lr: 0.0189 seg_loss: 1.0000 exist_loss: 0.0498 data: 0
2021-06-04 20:03:12,343 - resa - INFO - epoch: 3 step: 36331 lr: 0.0188 seg_loss: nan exist_loss: nan data: 0.0486
2021-06-04 20:05:26,001 - resa - INFO - epoch: 3 step: 36831 lr: 0.0187 seg_loss: nan exist_loss: nan data: 0.0477
2021-06-04 20:07:39,841 - resa - INFO - epoch: 3 step: 37331 lr: 0.0186 seg_loss: nan exist_loss: nan data: 0.0492
2021-06-04 20:09:53,744 - resa - INFO - epoch: 3 step: 37831 lr: 0.0185 seg_loss: nan exist_loss: nan data: 0.0483
2021-06-04 20:12:08,046 - resa - INFO - epoch: 3 step: 38331 lr: 0.0184 seg_loss: nan exist_loss: nan data: 0.0481
2021-06-04 20:14:22,881 - resa - INFO - epoch: 3 step: 38831 lr: 0.0183 seg_loss: nan exist_loss: nan data: 0.0487
2021-06-04 20:16:36,676 - resa - INFO - epoch: 3 step: 39331 lr: 0.0183 seg_loss: nan exist_loss: nan data: 0.0480
2021-06-04 20:18:50,936 - resa - INFO - epoch: 3 step: 39831 lr: 0.0182 seg_loss: nan exist_loss: nan data: 0.0473
2021-06-04 20:21:03,928 - resa - INFO - epoch: 3 step: 40331 lr: 0.0181 seg_loss: nan exist_loss: nan data: 0.0493
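
To make the log easier to discuss, here is a minimal sketch that pulls the step, lr, and loss values out of lines like the ones above and reports the first step where seg_loss saturates at 1.0 and the first step where it becomes nan. The names `parse_log` and `first_step` are hypothetical helpers written for this post, not part of the resa code, and the regular expression assumes the exact line format shown in the log.

```python
import math
import re

# Assumed line format, taken from the log above:
# "... step: 3001 lr: 0.0147 seg_loss: 1.0000 exist_loss: 0.0562 ..."
LINE_RE = re.compile(
    r"step: (\d+) lr: ([\d.]+) seg_loss: (\S+) exist_loss: (\S+)"
)


def parse_log(text):
    """Return a list of (step, lr, seg_loss, exist_loss) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            # float("nan") parses the literal "nan" printed by the logger.
            rows.append((int(m.group(1)), float(m.group(2)),
                         float(m.group(3)), float(m.group(4))))
    return rows


def first_step(rows, predicate):
    """Return (step, lr) of the first row whose seg_loss satisfies predicate."""
    for step, lr, seg_loss, _exist_loss in rows:
        if predicate(seg_loss):
            return step, lr
    return None


# Three lines copied from the log above.
sample = """\
2021-06-04 17:32:18,565 - resa - INFO - epoch: 0 step: 2501 lr: 0.0123 seg_loss: 0.9990 exist_loss: 0.0552 data: 0.
2021-06-04 17:34:30,550 - resa - INFO - epoch: 0 step: 3001 lr: 0.0147 seg_loss: 1.0000 exist_loss: 0.0562 data: 0.
2021-06-04 20:03:12,343 - resa - INFO - epoch: 3 step: 36331 lr: 0.0188 seg_loss: nan exist_loss: nan data: 0.0486
"""

rows = parse_log(sample)
saturated = first_step(rows, lambda s: s >= 1.0)  # nan compares False here
went_nan = first_step(rows, math.isnan)
print("seg_loss first reaches 1.0 at (step, lr):", saturated)
print("seg_loss first becomes nan at (step, lr):", went_nan)
```

On the full log this gives the learning rate at the moment the loss saturates (during the warm-up ramp) and at the moment it turns into nan, which may help narrow down whether the learning rate is involved.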