I ran the following command: python projects/SWINTS/train_net.py --num-gpus 4 --config-file projects/SWINTS/configs/SWINTS-swin-pretrain.yaml
On 4 A100 80G GPUs with batch size 24, the rec loss stays at around 6-7 throughout training.
[08/03 14:38:52] detectron2 INFO: Rank of current process: 0. World size: 4
[08/03 14:38:53] detectron2 INFO: Environment info:
sys.platform linux
Python 3.8.1 (default, Jan 8 2020, 22:29:32) [GCC 7.3.0]
numpy 1.23.1
detectron2 0.4 @/home/SwinTextSpotter/detectron2
Compiler GCC 7.5
CUDA compiler CUDA 11.3
detectron2 arch flags 6.1
DETECTRON2_ENV_MODULE
PyTorch 1.10.0+cu113 @/miniconda/lib/python3.8/site-packages/torch
PyTorch debug build False
GPU available True
GPU 0,1,2,3 NVIDIA A100 80GB PCIe (arch=8.0)
CUDA_HOME /usr/local/cuda
Pillow 9.2.0
torchvision 0.11.0+cu113 @/miniconda/lib/python3.8/site-packages/torchvision
torchvision arch flags 3.5, 5.0, 6.0, 7.0, 7.5, 8.0, 8.6
fvcore 0.1.5.post20220512
iopath 0.1.7
cv2 4.6.0
PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- LAPACK is enabled (usually provided by MKL)
- NNPACK is enabled
- CPU capability usage: AVX512
- CUDA Runtime 11.3
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
- CuDNN 8.2
- Magma 2.5.2
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
[08/03 14:38:53] detectron2 INFO: Command line arguments: Namespace(config_file='projects/SWINTS/configs/SWINTS-swin-pretrain.yaml', dist_url='tcp://127.0.0.1:49152', eval_only=False, machine_rank=0, num_gpus=4, num_machines=1, opts=[], resume=False)
[08/03 14:38:53] detectron2 INFO: Contents of args.config_file=projects/SWINTS/configs/SWINTS-swin-pretrain.yaml:
BASE: "Base-SWINTS_swin.yaml"
MODEL:
WEIGHTS: "ckpt/swin_imagenet_pretrain.pth"
SWINTS:
NUM_PROPOSALS: 300
NUM_CLASSES: 2
DATASETS:
TRAIN: ("totaltext_train","icdar_2015_train","icdar_2013_train","icdar_2017_validation_mlt","icdar_2017_mlt","icdar_curvesynthtext_train1","icdar_curvesynthtext_train2",)
TEST: ("totaltext_test",)
SOLVER:
STEPS: (120000,140000)
MAX_ITER: 150000
CHECKPOINT_PERIOD: 5000
INPUT:
FORMAT: "RGB"
[08/03 14:38:53] detectron2 INFO: Running with full config:
CUDNN_BENCHMARK: False
DATALOADER:
ASPECT_RATIO_GROUPING: True
FILTER_EMPTY_ANNOTATIONS: True
NUM_WORKERS: 4
REPEAT_THRESHOLD: 0.0
SAMPLER_TRAIN: TrainingSampler
DATASETS:
PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
PROPOSAL_FILES_TEST: ()
PROPOSAL_FILES_TRAIN: ()
TEST: ('totaltext_test',)
TRAIN: ('totaltext_train', 'icdar_2015_train', 'icdar_2013_train', 'icdar_2017_validation_mlt', 'icdar_2017_mlt', 'icdar_curvesynthtext_train1', 'icdar_curvesynthtext_train2')
GLOBAL:
HACK: 1.0
INPUT:
CROP:
CROP_INSTANCE: False
ENABLED: True
SIZE: [0.1, 0.1]
TYPE: relative_range
FORMAT: RGB
MASK_FORMAT: polygon
MAX_SIZE_TEST: 1824
MAX_SIZE_TRAIN: 1600
MIN_SIZE_TEST: 1000
MIN_SIZE_TRAIN: (640, 672, 704, 736, 768, 800, 832, 864, 896)
MIN_SIZE_TRAIN_SAMPLING: choice
RANDOM_FLIP: horizontal
MODEL:
ANCHOR_GENERATOR:
ANGLES: [[-90, 0, 90]]
ASPECT_RATIOS: [[0.5, 1.0, 2.0]]
NAME: DefaultAnchorGenerator
OFFSET: 0.0
SIZES: [[32, 64, 128, 256, 512]]
BACKBONE:
FREEZE_AT: -1
NAME: build_swint_fpn_backbone
DEVICE: cuda
FPN:
FUSE_TYPE: sum
IN_FEATURES: ['stage2', 'stage3', 'stage4', 'stage5']
NORM:
OUT_CHANNELS: 256
TOP_LEVELS: 2
KEYPOINT_ON: False
LOAD_PROPOSALS: False
MASK_ON: True
META_ARCHITECTURE: SWINTS
PANOPTIC_FPN:
COMBINE:
ENABLED: True
INSTANCES_CONFIDENCE_THRESH: 0.5
OVERLAP_THRESH: 0.5
STUFF_AREA_LIMIT: 4096
INSTANCE_LOSS_WEIGHT: 1.0
PIXEL_MEAN: [123.675, 116.28, 103.53]
PIXEL_STD: [58.395, 57.12, 57.375]
PROPOSAL_GENERATOR:
MIN_SIZE: 0
NAME: RPN
REC_HEAD:
BATCH_SIZE: 128
NUM_CLASSES: 107
POOLER_RESOLUTION: (28, 28)
RESOLUTION: (32, 32)
RESNETS:
DEFORM_MODULATED: False
DEFORM_NUM_GROUPS: 1
DEFORM_ON_PER_STAGE: [False, False, False, False]
DEPTH: 50
NORM: FrozenBN
NUM_GROUPS: 1
OUT_FEATURES: ['res4']
RES2_OUT_CHANNELS: 256
RES5_DILATION: 1
STEM_OUT_CHANNELS: 64
STRIDE_IN_1X1: True
WIDTH_PER_GROUP: 64
RETINANET:
BBOX_REG_LOSS_TYPE: smooth_l1
BBOX_REG_WEIGHTS: (1.0, 1.0, 1.0, 1.0)
FOCAL_LOSS_ALPHA: 0.25
FOCAL_LOSS_GAMMA: 2.0
IN_FEATURES: ['p3', 'p4', 'p5', 'p6', 'p7']
IOU_LABELS: [0, -1, 1]
IOU_THRESHOLDS: [0.4, 0.5]
NMS_THRESH_TEST: 0.5
NORM:
NUM_CLASSES: 80
NUM_CONVS: 4
PRIOR_PROB: 0.01
SCORE_THRESH_TEST: 0.05
SMOOTH_L1_LOSS_BETA: 0.1
TOPK_CANDIDATES_TEST: 1000
ROI_BOX_CASCADE_HEAD:
BBOX_REG_WEIGHTS: ((10.0, 10.0, 5.0, 5.0), (20.0, 20.0, 10.0, 10.0), (30.0, 30.0, 15.0, 15.0))
IOUS: (0.5, 0.6, 0.7)
ROI_BOX_HEAD:
BBOX_REG_LOSS_TYPE: smooth_l1
BBOX_REG_LOSS_WEIGHT: 1.0
BBOX_REG_WEIGHTS: (10.0, 10.0, 5.0, 5.0)
CLS_AGNOSTIC_BBOX_REG: False
CONV_DIM: 256
FC_DIM: 1024
NAME:
NORM:
NUM_CONV: 0
NUM_FC: 0
POOLER_RESOLUTION: 7
POOLER_SAMPLING_RATIO: 2
POOLER_TYPE: ROIAlignV2
SMOOTH_L1_BETA: 0.0
TRAIN_ON_PRED_BOXES: False
ROI_HEADS:
BATCH_SIZE_PER_IMAGE: 512
IN_FEATURES: ['p2', 'p3', 'p4', 'p5']
IOU_LABELS: [0, 1]
IOU_THRESHOLDS: [0.5]
NAME: Res5ROIHeads
NMS_THRESH_TEST: 0.5
NUM_CLASSES: 80
POSITIVE_FRACTION: 0.25
PROPOSAL_APPEND_GT: True
SCORE_THRESH_TEST: 0.05
ROI_KEYPOINT_HEAD:
CONV_DIMS: (512, 512, 512, 512, 512, 512, 512, 512)
LOSS_WEIGHT: 1.0
MIN_KEYPOINTS_PER_IMAGE: 1
NAME: KRCNNConvDeconvUpsampleHead
NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: True
NUM_KEYPOINTS: 17
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_TYPE: ROIAlignV2
ROI_MASK_HEAD:
CLS_AGNOSTIC_MASK: False
CONV_DIM: 256
NAME: MaskRCNNConvUpsampleHead
NORM:
NUM_CONV: 0
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_TYPE: ROIAlignV2
RPN:
BATCH_SIZE_PER_IMAGE: 256
BBOX_REG_LOSS_TYPE: smooth_l1
BBOX_REG_LOSS_WEIGHT: 1.0
BBOX_REG_WEIGHTS: (1.0, 1.0, 1.0, 1.0)
BOUNDARY_THRESH: -1
CONV_DIMS: [-1]
HEAD_NAME: StandardRPNHead
IN_FEATURES: ['res4']
IOU_LABELS: [0, -1, 1]
IOU_THRESHOLDS: [0.3, 0.7]
LOSS_WEIGHT: 1.0
NMS_THRESH: 0.7
POSITIVE_FRACTION: 0.5
POST_NMS_TOPK_TEST: 1000
POST_NMS_TOPK_TRAIN: 2000
PRE_NMS_TOPK_TEST: 6000
PRE_NMS_TOPK_TRAIN: 12000
SMOOTH_L1_BETA: 0.0
SEM_SEG_HEAD:
COMMON_STRIDE: 4
CONVS_DIM: 128
IGNORE_VALUE: 255
IN_FEATURES: ['p2', 'p3', 'p4', 'p5']
LOSS_WEIGHT: 1.0
NAME: SemSegFPNHead
NORM: GN
NUM_CLASSES: 54
SWINT:
APE: False
DEPTHS: [2, 2, 6, 2]
DROP_PATH_RATE: 0.2
EMBED_DIM: 96
MLP_RATIO: 4
NUM_HEADS: [3, 6, 12, 24]
OUT_FEATURES: ['stage2', 'stage3', 'stage4', 'stage5']
WINDOW_SIZE: 7
SWINTS:
ACTIVATION: relu
ALPHA: 0.25
CLASS_WEIGHT: 2.0
DEEP_SUPERVISION: True
DIM_DYNAMIC: 64
DIM_FEEDFORWARD: 2048
DROPOUT: 0.0
GAMMA: 2.0
GIOU_WEIGHT: 2.0
HIDDEN_DIM: 256
IOU_LABELS: [0, 1]
IOU_THRESHOLDS: [0.5]
L1_WEIGHT: 5.0
MASK_DIM: 60
MASK_WEIGHT: 2.0
NHEADS: 8
NO_OBJECT_WEIGHT: 0.1
NUM_CLASSES: 2
NUM_CLS: 3
NUM_DYNAMIC: 2
NUM_HEADS: 6
NUM_MASK: 3
NUM_PROPOSALS: 300
NUM_REG: 3
PATH_COMPONENTS: ./projects/SWINTS/LME/coco_2017_train_class_agnosticTrue_whitenTrue_sigmoidTrue_60_siz28.npz
PRIOR_PROB: 0.01
REC_WEIGHT: 1.0
TEST_NUM_PROPOSALS: 100
WEIGHTS: ckpt/swin_imagenet_pretrain.pth
OUTPUT_DIR: ./output
SEED: 40244023
SOLVER:
AMP:
ENABLED: False
BACKBONE_MULTIPLIER: 1.0
BASE_LR: 7.5e-05
BIAS_LR_FACTOR: 1.0
CHECKPOINT_PERIOD: 5000
CLIP_GRADIENTS:
CLIP_TYPE: full_model
CLIP_VALUE: 1.0
ENABLED: True
NORM_TYPE: 2.0
GAMMA: 0.1
IMS_PER_BATCH: 24
LR_SCHEDULER_NAME: WarmupMultiStepLR
MAX_ITER: 150000
MOMENTUM: 0.9
NESTEROV: False
OPTIMIZER: ADAMW
REFERENCE_WORLD_SIZE: 0
STEPS: (120000, 140000)
WARMUP_FACTOR: 0.01
WARMUP_ITERS: 1000
WARMUP_METHOD: linear
WEIGHT_DECAY: 0.0001
WEIGHT_DECAY_BIAS: 0.0001
WEIGHT_DECAY_NORM: 0.0
TEST:
AUG:
ENABLED: False
FLIP: True
MAX_SIZE: 4000
MIN_SIZES: (400, 500, 600, 700, 800, 900, 1000, 1100, 1200)
DETECTIONS_PER_IMAGE: 100
EVAL_PERIOD: 100000
EXPECTED_RESULTS: []
INFERENCE_TH_TEST: 0.4
KEYPOINT_OKS_SIGMAS: []
PRECISE_BN:
ENABLED: False
NUM_ITER: 200
USE_NMS_IN_TSET: True
VERSION: 2
VIS_PERIOD: 0
[08/03 14:38:53] detectron2 INFO: Full config saved to ./output/config.yaml
[08/03 14:38:55] d2.engine.defaults INFO: Model:
[08/03 14:38:56] d2.data.datasets.coco INFO: Loaded 1255 images in COCO format from datasets/totaltext/totaltext_train.json
[08/03 14:38:56] d2.data.datasets.coco INFO: Loaded 1000 images in COCO format from datasets/icdar2015/icdar_2015_train.json
[08/03 14:38:56] d2.data.datasets.coco INFO: Loaded 229 images in COCO format from datasets/icdar2013/annotations/icdar_2013.json
[08/03 14:38:56] d2.data.datasets.coco INFO: Loaded 1797 images in COCO format from datasets/mlt2017/annotations/icdar_2017_validation_mlt.json
[08/03 14:38:57] d2.data.datasets.coco INFO: Loaded 7160 images in COCO format from datasets/mlt2017/annotations/icdar_2017_mlt.json
[08/03 14:39:18] d2.data.datasets.coco INFO: Loading datasets/syntext1/annotations/syntext_word_eng.json takes 21.03 seconds.
[08/03 14:39:19] d2.data.datasets.coco INFO: Loaded 94723 images in COCO format from datasets/syntext1/annotations/syntext_word_eng.json
[08/03 14:39:35] d2.data.datasets.coco INFO: Loading datasets/syntext2/annotations/ecms_v1_maxlen25.json takes 7.61 seconds.
[08/03 14:39:35] d2.data.datasets.coco INFO: Loaded 54327 images in COCO format from datasets/syntext2/annotations/ecms_v1_maxlen25.json
[08/03 14:39:36] d2.data.build INFO: Removed 2230 images with no usable annotations. 158261 images left.
[08/03 14:39:42] d2.data.build INFO: Distribution of instances among all 1 categories:
| category | #instances |
|:----------:|:-------------|
| text | 1868890 |
| | |
[08/03 14:39:42] d2.data.build INFO: Using training sampler TrainingSampler
[08/03 14:39:42] d2.data.common INFO: Serializing 158261 elements to byte tensors and concatenating them all ...
[08/03 14:39:48] d2.data.common INFO: Serialized dataset takes 460.55 MiB
[08/03 14:39:50] fvcore.common.checkpoint INFO: [Checkpointer] Loading from ckpt/swin_imagenet_pretrain.pth ...
[08/03 14:39:51] d2.engine.train_loop INFO: Starting training from iteration 0
[08/04 14:53:46 d2.utils.events]: eta: 1 day, 14:59:42 iter: 56279 total_loss: 16.8 loss_ce: 0.1334 loss_giou: 0.29 loss_bbox: 0.1194 loss_feat: 0.5878 loss_dice: 0.119 loss_rec: 6.447 loss_ce_0: 0.6154 loss_giou_0: 0.7695 loss_bbox_0: 0.313 loss_feat_0: 1.204 loss_dice_0: 0.2254 loss_ce_1: 0.3727 loss_giou_1: 0.3973 loss_bbox_1: 0.1591 loss_feat_1: 0.8152 loss_dice_1: 0.1459 loss_ce_2: 0.31 loss_giou_2: 0.3258 loss_bbox_2: 0.1341 loss_feat_2: 0.6907 loss_dice_2: 0.1311 loss_ce_3: 0.2053 loss_giou_3: 0.3006 loss_bbox_3: 0.122 loss_feat_3: 0.6204 loss_dice_3: 0.1216 loss_ce_4: 0.1435 loss_giou_4: 0.2943 loss_bbox_4: 0.1182 loss_feat_4: 0.6 loss_dice_4: 0.1188 time: 1.5465 data_time: 0.0930 lr: 7.5e-05 max_mem: 66803M
[08/04 14:54:16 d2.utils.events]: eta: 1 day, 14:59:52 iter: 56299 total_loss: 17.43 loss_ce: 0.1425 loss_giou: 0.2743 loss_bbox: 0.1041 loss_feat: 0.5848 loss_dice: 0.1222 loss_rec: 6.74 loss_ce_0: 0.6106 loss_giou_0: 0.7541 loss_bbox_0: 0.3087 loss_feat_0: 1.213 loss_dice_0: 0.2271 loss_ce_1: 0.378 loss_giou_1: 0.3861 loss_bbox_1: 0.1526 loss_feat_1: 0.8113 loss_dice_1: 0.1538 loss_ce_2: 0.3031 loss_giou_2: 0.3231 loss_bbox_2: 0.1218 loss_feat_2: 0.6802 loss_dice_2: 0.1366 loss_ce_3: 0.2059 loss_giou_3: 0.2907 loss_bbox_3: 0.1099 loss_feat_3: 0.6195 loss_dice_3: 0.127 loss_ce_4: 0.1465 loss_giou_4: 0.2804 loss_bbox_4: 0.1056 loss_feat_4: 0.5916 loss_dice_4: 0.123 time: 1.5464 data_time: 0.0578 lr: 7.5e-05 max_mem: 66803M
[08/04 14:54:47 d2.utils.events]: eta: 1 day, 14:59:22 iter: 56319 total_loss: 16.86 loss_ce: 0.1324 loss_giou: 0.2875 loss_bbox: 0.1126 loss_feat: 0.5857 loss_dice: 0.1232 loss_rec: 6.414 loss_ce_0: 0.6167 loss_giou_0: 0.7572 loss_bbox_0: 0.3231 loss_feat_0: 1.195 loss_dice_0: 0.2153 loss_ce_1: 0.3705 loss_giou_1: 0.3863 loss_bbox_1: 0.1542 loss_feat_1: 0.8011 loss_dice_1: 0.1507 loss_ce_2: 0.2973 loss_giou_2: 0.32 loss_bbox_2: 0.126 loss_feat_2: 0.6879 loss_dice_2: 0.1377 loss_ce_3: 0.1973 loss_giou_3: 0.2947 loss_bbox_3: 0.1173 loss_feat_3: 0.6282 loss_dice_3: 0.1278 loss_ce_4: 0.1405 loss_giou_4: 0.2881 loss_bbox_4: 0.113 loss_feat_4: 0.5965 loss_dice_4: 0.1255 time: 1.5464 data_time: 0.0538 lr: 7.5e-05 max_mem: 66803M
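
To check whether loss_rec really is flat over the whole run rather than only in the three lines pasted above, the values can be pulled out of the log and inspected as a series. The following is a minimal sketch, assuming the log lines keep the `iter: N ... loss_rec: X` layout shown above; `extract_loss_rec` and `METRIC_RE` are hypothetical names used for illustration and are not part of detectron2 or SwinTextSpotter. The sample lines are abbreviated copies of the log entries above.

```python
import re

# Hypothetical helper for illustration only; not a detectron2 or SwinTextSpotter API.
# Matches the iteration number and the loss_rec value on a single log line.
METRIC_RE = re.compile(r"iter:\s+(\d+).*?loss_rec:\s+([0-9.]+)")


def extract_loss_rec(lines):
    """Return (iteration, loss_rec) pairs for every line that contains both fields."""
    pairs = []
    for line in lines:
        match = METRIC_RE.search(line)
        if match:
            pairs.append((int(match.group(1)), float(match.group(2))))
    return pairs


# Abbreviated copies of the three log lines from this report.
sample = [
    "[08/04 14:53:46 d2.utils.events]: iter: 56279 total_loss: 16.8 loss_rec: 6.447 lr: 7.5e-05",
    "[08/04 14:54:16 d2.utils.events]: iter: 56299 total_loss: 17.43 loss_rec: 6.74 lr: 7.5e-05",
    "[08/04 14:54:47 d2.utils.events]: iter: 56319 total_loss: 16.86 loss_rec: 6.414 lr: 7.5e-05",
]

for iteration, loss_rec in extract_loss_rec(sample):
    print(iteration, loss_rec)
```

If the full training log was saved to a file, passing the open file object to `extract_loss_rec` instead of `sample` would give the complete series, which makes it easier to see whether loss_rec ever dropped below the 6-7 range earlier in training.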