[2022-09-14 22:57:53,096] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 4294967296, reducing to 4294967296
[2022-09-14 22:57:53,096] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 4294967296, reducing to 4294967296
[2022-09-14 22:57:53,096] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 4294967296, reducing to 4294967296
[2022-09-14 22:57:53,096] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 4294967296, reducing to 4294967296
[2022-09-14 22:57:53,096] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 4294967296, reducing to 4294967296
[2022-09-14 22:57:53,096] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 4294967296, reducing to 4294967296
[2022-09-14 22:57:53,096] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 4294967296, reducing to 4294967296
[2022-09-14 22:57:53,097] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 4294967296, reducing to 4294967296
[2022-09-14 22:57:55,597] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 4294967296, reducing to 2147483648.0
[2022-09-14 22:57:55,597] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 4294967296, reducing to 2147483648.0
[2022-09-14 22:57:55,597] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 4294967296, reducing to 2147483648.0
[2022-09-14 22:57:55,597] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 4294967296, reducing to 2147483648.0
[2022-09-14 22:57:55,597] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 4294967296, reducing to 2147483648.0
[2022-09-14 22:57:55,597] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 4294967296, reducing to 2147483648.0
[2022-09-14 22:57:55,597] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 4294967296, reducing to 2147483648.0
[2022-09-14 22:57:55,597] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 4294967296, reducing to 2147483648.0
[2022-09-14 22:57:57,418] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 2147483648.0, reducing to 1073741824.0
[2022-09-14 22:57:57,418] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 2147483648.0, reducing to 1073741824.0
[2022-09-14 22:57:57,418] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 2147483648.0, reducing to 1073741824.0
[2022-09-14 22:57:57,418] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 2147483648.0, reducing to 1073741824.0
[2022-09-14 22:57:57,419] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 2147483648.0, reducing to 1073741824.0
[2022-09-14 22:57:57,419] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 2147483648.0, reducing to 1073741824.0
[2022-09-14 22:57:57,419] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 2147483648.0, reducing to 1073741824.0
[2022-09-14 22:57:57,419] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 2147483648.0, reducing to 1073741824.0
[2022-09-14 22:57:59,286] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 1073741824.0, reducing to 536870912.0
[2022-09-14 22:57:59,286] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 1073741824.0, reducing to 536870912.0
[2022-09-14 22:57:59,286] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 1073741824.0, reducing to 536870912.0
[2022-09-14 22:57:59,287] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 1073741824.0, reducing to 536870912.0
[2022-09-14 22:57:59,287] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 1073741824.0, reducing to 536870912.0
[2022-09-14 22:57:59,287] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 1073741824.0, reducing to 536870912.0
[2022-09-14 22:57:59,287] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 1073741824.0, reducing to 536870912.0
[2022-09-14 22:57:59,287] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 1073741824.0, reducing to 536870912.0
[2022-09-14 22:58:01,089] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 536870912.0, reducing to 268435456.0
[2022-09-14 22:58:01,089] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 536870912.0, reducing to 268435456.0
[2022-09-14 22:58:01,089] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 536870912.0, reducing to 268435456.0
[2022-09-14 22:58:01,089] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 536870912.0, reducing to 268435456.0
[2022-09-14 22:58:01,089] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 536870912.0, reducing to 268435456.0
[2022-09-14 22:58:01,089] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 536870912.0, reducing to 268435456.0
[2022-09-14 22:58:01,089] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 536870912.0, reducing to 268435456.0
[2022-09-14 22:58:01,089] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 536870912.0, reducing to 268435456.0
[2022-09-14 22:58:03,610] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 268435456.0, reducing to 134217728.0
[2022-09-14 22:58:03,610] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 268435456.0, reducing to 134217728.0
[2022-09-14 22:58:03,610] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 268435456.0, reducing to 134217728.0
[2022-09-14 22:58:03,610] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 268435456.0, reducing to 134217728.0
[2022-09-14 22:58:03,610] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 268435456.0, reducing to 134217728.0
[2022-09-14 22:58:03,610] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 268435456.0, reducing to 134217728.0
[2022-09-14 22:58:03,610] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 268435456.0, reducing to 134217728.0
[2022-09-14 22:58:03,610] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 268435456.0, reducing to 134217728.0
[2022-09-14 22:58:04,962] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 134217728.0, reducing to 67108864.0
[2022-09-14 22:58:04,963] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 134217728.0, reducing to 67108864.0
[2022-09-14 22:58:04,963] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 134217728.0, reducing to 67108864.0
[2022-09-14 22:58:04,963] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 134217728.0, reducing to 67108864.0
[2022-09-14 22:58:04,963] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 134217728.0, reducing to 67108864.0
[2022-09-14 22:58:04,963] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 134217728.0, reducing to 67108864.0
[2022-09-14 22:58:04,963] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 134217728.0, reducing to 67108864.0
[2022-09-14 22:58:04,963] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 134217728.0, reducing to 67108864.0
[2022-09-14 22:58:07,493] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 67108864.0, reducing to 33554432.0
[2022-09-14 22:58:07,493] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 67108864.0, reducing to 33554432.0
[2022-09-14 22:58:07,493] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 67108864.0, reducing to 33554432.0
[2022-09-14 22:58:07,493] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 67108864.0, reducing to 33554432.0
[2022-09-14 22:58:07,494] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 67108864.0, reducing to 33554432.0
[2022-09-14 22:58:07,494] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 67108864.0, reducing to 33554432.0
[2022-09-14 22:58:07,494] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 67108864.0, reducing to 33554432.0
[2022-09-14 22:58:07,494] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 67108864.0, reducing to 33554432.0
[2022-09-14 22:58:10,152] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 33554432.0, reducing to 16777216.0
[2022-09-14 22:58:10,152] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 33554432.0, reducing to 16777216.0
[2022-09-14 22:58:10,152] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 33554432.0, reducing to 16777216.0
[2022-09-14 22:58:10,152] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 33554432.0, reducing to 16777216.0
[2022-09-14 22:58:10,152] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 33554432.0, reducing to 16777216.0
[2022-09-14 22:58:10,152] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 33554432.0, reducing to 16777216.0
[2022-09-14 22:58:10,152] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 33554432.0, reducing to 16777216.0
[2022-09-14 22:58:10,152] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 33554432.0, reducing to 16777216.0
[2022-09-14 22:58:11,943] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 16777216.0, reducing to 8388608.0
[2022-09-14 22:58:11,943] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 16777216.0, reducing to 8388608.0
[2022-09-14 22:58:11,943] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 16777216.0, reducing to 8388608.0
[2022-09-14 22:58:11,943] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 16777216.0, reducing to 8388608.0
[2022-09-14 22:58:11,943] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 16777216.0, reducing to 8388608.0
[2022-09-14 22:58:11,943] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 16777216.0, reducing to 8388608.0
[2022-09-14 22:58:11,943] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 16777216.0, reducing to 8388608.0
[2022-09-14 22:58:11,943] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 16777216.0, reducing to 8388608.0
[2022-09-14 22:58:13,765] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 8388608.0, reducing to 4194304.0
[2022-09-14 22:58:13,765] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 8388608.0, reducing to 4194304.0
[2022-09-14 22:58:13,765] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 8388608.0, reducing to 4194304.0
[2022-09-14 22:58:13,765] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 8388608.0, reducing to 4194304.0
[2022-09-14 22:58:13,765] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 8388608.0, reducing to 4194304.0
[2022-09-14 22:58:13,765] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 8388608.0, reducing to 4194304.0
[2022-09-14 22:58:13,765] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 8388608.0, reducing to 4194304.0
[2022-09-14 22:58:13,765] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 8388608.0, reducing to 4194304.0
[2022-09-14 22:58:16,346] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 4194304.0, reducing to 2097152.0
[2022-09-14 22:58:16,346] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 4194304.0, reducing to 2097152.0
[2022-09-14 22:58:16,346] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 4194304.0, reducing to 2097152.0
[2022-09-14 22:58:16,346] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 4194304.0, reducing to 2097152.0
[2022-09-14 22:58:16,346] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 4194304.0, reducing to 2097152.0
[2022-09-14 22:58:16,346] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 4194304.0, reducing to 2097152.0
[2022-09-14 22:58:16,346] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 4194304.0, reducing to 2097152.0
[2022-09-14 22:58:16,346] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 4194304.0, reducing to 2097152.0
[2022-09-14 22:58:18,958] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 2097152.0, reducing to 1048576.0
[2022-09-14 22:58:18,958] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 2097152.0, reducing to 1048576.0
[2022-09-14 22:58:18,958] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 2097152.0, reducing to 1048576.0
[2022-09-14 22:58:18,958] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 2097152.0, reducing to 1048576.0
[2022-09-14 22:58:18,958] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 2097152.0, reducing to 1048576.0
[2022-09-14 22:58:18,958] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 2097152.0, reducing to 1048576.0
[2022-09-14 22:58:18,958] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 2097152.0, reducing to 1048576.0
[2022-09-14 22:58:18,959] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 2097152.0, reducing to 1048576.0
[2022-09-14 22:58:20,798] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 1048576.0, reducing to 524288.0
[2022-09-14 22:58:20,798] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 1048576.0, reducing to 524288.0
[2022-09-14 22:58:20,798] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 1048576.0, reducing to 524288.0
[2022-09-14 22:58:20,799] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 1048576.0, reducing to 524288.0
[2022-09-14 22:58:20,799] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 1048576.0, reducing to 524288.0
[2022-09-14 22:58:20,799] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 1048576.0, reducing to 524288.0
[2022-09-14 22:58:20,799] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 1048576.0, reducing to 524288.0
[2022-09-14 22:58:20,799] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 1048576.0, reducing to 524288.0
[2022-09-14 22:58:25,414] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 524288.0, reducing to 262144.0
[2022-09-14 22:58:25,414] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 524288.0, reducing to 262144.0
[2022-09-14 22:58:25,414] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 524288.0, reducing to 262144.0
[2022-09-14 22:58:25,414] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 524288.0, reducing to 262144.0
[2022-09-14 22:58:25,414] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 524288.0, reducing to 262144.0
[2022-09-14 22:58:25,414] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 524288.0, reducing to 262144.0
[2022-09-14 22:58:25,414] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 524288.0, reducing to 262144.0
[2022-09-14 22:58:25,414] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 524288.0, reducing to 262144.0
[2022-09-14 22:58:28,171] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 4 Skipping step. Attempted loss scale: 262144.0, reducing to 131072.0
[2022-09-14 22:58:28,171] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 6 Skipping step. Attempted loss scale: 262144.0, reducing to 131072.0
[2022-09-14 22:58:28,171] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 1 Skipping step. Attempted loss scale: 262144.0, reducing to 131072.0
[2022-09-14 22:58:28,172] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 3 Skipping step. Attempted loss scale: 262144.0, reducing to 131072.0
[2022-09-14 22:58:28,172] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 2 Skipping step. Attempted loss scale: 262144.0, reducing to 131072.0
[2022-09-14 22:58:28,172] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 0 Skipping step. Attempted loss scale: 262144.0, reducing to 131072.0
[2022-09-14 22:58:28,172] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 5 Skipping step. Attempted loss scale: 262144.0, reducing to 131072.0
[2022-09-14 22:58:28,172] [INFO] [stage2.py:1387:step] [deepspeed] fp16 dynamic loss scale overflow! Rank 7 Skipping step. Attempted loss scale: 262144.0, reducing to 131072.0
Traceback (most recent call last):
RuntimeError: CUDA out of memory. Tried to allocate 4.63 GiB (GPU 6; 31.75 GiB total capacity; 20.85 GiB already allocated; 4.46 GiB free; 25.71 GiB reserved in total by PyTorch)