Hi, I was wondering how many epochs you trained the nuScenes -> KITTI baseline (without SN or ROS) for to get the 3D AP of 17.92 reported in the paper. I trained the secondiou_old_anchor.yaml cfg file for 3 epochs (the same as for ROS and SN) and only got the following results:
Car AP@0.70, 0.70, 0.70:
bbox AP:60.1853, 48.0934, 48.1308
bev AP:31.2696, 27.1278, 26.6632
3d AP:4.5068, 3.2667, 3.3441
aos AP:42.78, 34.73, 35.04
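To make the gap to the paper explicit, here is a minimal sketch that parses the result lines above and subtracts each 3D AP value from the reported 17.92. The function name `parse_kitti_ap` is my own and not part of the repo, and I am not assuming which difficulty level the paper's number refers to, so the gap is computed for all three columns (KITTI order: easy, moderate, hard).

```python
import re

# Result lines copied from the evaluation output above.
RAW = """\
bbox AP:60.1853, 48.0934, 48.1308
bev AP:31.2696, 27.1278, 26.6632
3d AP:4.5068, 3.2667, 3.3441
aos AP:42.78, 34.73, 35.04
"""


def parse_kitti_ap(text):
    """Hypothetical helper: map each metric name to its (easy, moderate, hard) APs."""
    results = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\w+)\s+AP:\s*(.+)", line)
        if match:
            values = tuple(float(v) for v in match.group(2).split(","))
            results[match.group(1)] = values
    return results


ap = parse_kitti_ap(RAW)
reported_3d = 17.92  # 3D AP quoted from the paper in this post
gaps = [reported_3d - value for value in ap["3d"]]
print("3D AP gap to reported value (easy, moderate, hard):", gaps)
```

Whichever column the paper reports, the gap is more than 13 AP points, which is why I suspect a setup difference rather than run-to-run noise.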
The infos were generated with the original OpenPCDet repo. Do I need to regenerate them for KITTI and nuScenes?
I also noticed, when evaluating the nuScenes -> KITTI models for SN and ROS, that they were only trained for 3 epochs. I re-trained on the nuScenes dataset with the exact same secondiou_old_anchor_ros.yaml and could not reproduce the result from the model zoo. Update: I have re-generated the infos with the ST3D repo, and it gives lower performance for secondiou_old_anchor_ros.yaml.
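In case it helps narrow this down, below is a rough sketch of how the two sets of info files could be compared. It assumes each info `.pkl` is a pickled list of per-frame dicts, which is my understanding of the format but not something I have verified for both repos. The helper names and the example paths are hypothetical.

```python
import pickle


def load_infos(path):
    """Load an info file; assumed to be a pickled list of per-frame dicts."""
    with open(path, "rb") as f:
        return pickle.load(f)


def diff_infos(infos_a, infos_b):
    """Compare frame counts and the union of top-level keys of two info lists."""
    keys_a = set().union(*(info.keys() for info in infos_a))
    keys_b = set().union(*(info.keys() for info in infos_b))
    return {
        "num_frames": (len(infos_a), len(infos_b)),
        "only_in_a": sorted(keys_a - keys_b),
        "only_in_b": sorted(keys_b - keys_a),
    }


# Example usage (paths are placeholders, not the real file names):
# report = diff_infos(load_infos("openpcdet_infos.pkl"), load_infos("st3d_infos.pkl"))
# print(report)
```

A difference in frame count or in the available keys would point to the info generation step rather than the training config.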
Training was done with the following command:
python train.py --cfg_file cfgs/da-nuscenes-kitti_models/secondiou/secondiou_old_anchor_ros.yaml --batch_size 4 --epochs 3 --extra_tag st3d_infos
The cfg file secondiou_old_anchor_ros.yaml is shown below. The dataset config files were unchanged.
```yaml
CLASS_NAMES: ['car']

DATA_CONFIG:
    _BASE_CONFIG_: cfgs/dataset_configs/da_nuscenes_kitti_dataset.yaml
    MAX_SWEEPS: 1
    PRED_VELOCITY: False
    BALANCED_RESAMPLING: False
    SHIFT_COOR: [0.0, 0.0, 1.8]
    DATA_AUGMENTOR:
        DISABLE_AUG_LIST: ['normalize_object_size']
        AUG_CONFIG_LIST:
            - NAME: random_object_scaling
              SCALE_UNIFORM_NOISE: [0.75, 1.0]
            - NAME: normalize_object_size
              SIZE_RES: [-0.75, -0.34, -0.2]
            - NAME: random_world_flip
              ALONG_AXIS_LIST: ['x', 'y']
            - NAME: random_world_rotation
              WORLD_ROT_ANGLE: [-0.3925, 0.3925]
            - NAME: random_world_scaling
              WORLD_SCALE_RANGE: [0.95, 1.05]

DATA_CONFIG_TAR:
    _BASE_CONFIG_: cfgs/dataset_configs/da_kitti_dataset.yaml
    TARGET: True
    FOV_POINTS_ONLY: False
    CLASS_NAMES: ['Car']
    SHIFT_COOR: [0.0, 0.0, 1.6]

MODEL:
    NAME: SECONDNetIoU
    VFE:
        NAME: MeanVFE
    BACKBONE_3D:
        NAME: VoxelBackBone8x
    MAP_TO_BEV:
        NAME: HeightCompression
        NUM_BEV_FEATURES: 256
    BACKBONE_2D:
        NAME: BaseBEVBackbone
        LAYER_NUMS: [5, 5]
        LAYER_STRIDES: [1, 2]
        NUM_FILTERS: [128, 256]
        UPSAMPLE_STRIDES: [1, 2]
        NUM_UPSAMPLE_FILTERS: [256, 256]
    DENSE_HEAD:
        NAME: AnchorHeadSingle
        CLASS_AGNOSTIC: False
        USE_DIRECTION_CLASSIFIER: True
        DIR_OFFSET: 0.78539
        DIR_LIMIT_OFFSET: 0.0
        NUM_DIR_BINS: 2
        ANCHOR_GENERATOR_CONFIG: [
            {
                'class_name': 'car',
                'anchor_sizes': [[4.2, 2.0, 1.6]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [0],
                'align_center': False,
                'feature_map_stride': 8,
                'matched_threshold': 0.55,
                'unmatched_threshold': 0.4
            }
        ]
        TARGET_ASSIGNER_CONFIG:
            NAME: AxisAlignedTargetAssigner
            POS_FRACTION: -1.0
            SAMPLE_SIZE: 512
            NORM_BY_NUM_EXAMPLES: False
            MATCH_HEIGHT: False
            BOX_CODER: ResidualCoder
        LOSS_CONFIG:
            LOSS_WEIGHTS: {
                'cls_weight': 1.0,
                'loc_weight': 2.0,
                'dir_weight': 0.2,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
            }
    ROI_HEAD:
        NAME: SECONDHead
        CLASS_AGNOSTIC: True
        SHARED_FC: [256, 256]
        IOU_FC: [256, 256]
        DP_RATIO: 0.3
        NMS_CONFIG:
            TRAIN:
                NMS_TYPE: nms_gpu
                MULTI_CLASSES_NMS: False
                NMS_PRE_MAXSIZE: 9000
                NMS_POST_MAXSIZE: 512
                NMS_THRESH: 0.8
            TEST:
                NMS_TYPE: nms_gpu
                MULTI_CLASSES_NMS: False
                NMS_PRE_MAXSIZE: 1024
                NMS_POST_MAXSIZE: 100
                NMS_THRESH: 0.7
        ROI_GRID_POOL:
            GRID_SIZE: 7
            IN_CHANNEL: 512
            DOWNSAMPLE_RATIO: 8
        TARGET_CONFIG:
            BOX_CODER: ResidualCoder
            ROI_PER_IMAGE: 128
            FG_RATIO: 0.5
            SAMPLE_ROI_BY_EACH_CLASS: True
            CLS_SCORE_TYPE: raw_roi_iou
            CLS_FG_THRESH: 0.75
            CLS_BG_THRESH: 0.25
            CLS_BG_THRESH_LO: 0.1
            HARD_BG_RATIO: 0.8
            REG_FG_THRESH: 0.55
        LOSS_CONFIG:
            IOU_LOSS: BinaryCrossEntropy
            LOSS_WEIGHTS: {
                'rcnn_iou_weight': 1.0,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
            }
    POST_PROCESSING:
        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
        SCORE_THRESH: 0.1
        OUTPUT_RAW_SCORE: False
        EVAL_METRIC: kitti
        NMS_CONFIG:
            MULTI_CLASSES_NMS: False
            NMS_TYPE: nms_gpu
            NMS_THRESH: 0.01
            NMS_PRE_MAXSIZE: 4096
            NMS_POST_MAXSIZE: 500

OPTIMIZATION:
    OPTIMIZER: adam_onecycle
    LR: 0.003
    WEIGHT_DECAY: 0.01
    MOMENTUM: 0.9
    MOMS: [0.95, 0.85]
    PCT_START: 0.4
    DIV_FACTOR: 10
    DECAY_STEP_LIST: [35, 45]
    LR_DECAY: 0.1
    LR_CLIP: 0.0000001
    LR_WARMUP: False
    WARMUP_EPOCH: 1
    GRAD_NORM_CLIP: 10
```
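Since the whole one-cycle schedule is squeezed into 3 epochs, the learning rate at a given step depends on the total number of iterations, so the batch size and GPU count affect it. Here is a simplified sketch of a cosine one-cycle schedule using the LR, DIV_FACTOR and PCT_START values from the config above. This is a generic re-implementation for illustration and may differ in detail from the repo's adam_onecycle scheduler; in particular the `final_div` value is an assumption on my side.

```python
import math


def cosine_anneal(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)


def one_cycle_lr(step, total_steps, lr_max=0.003, div_factor=10.0,
                 pct_start=0.4, final_div=1e4):
    """Generic one-cycle LR; final_div is an assumed value, not from the config."""
    low_lr = lr_max / div_factor
    warm_steps = int(total_steps * pct_start)
    if step < warm_steps:
        # Ramp up from lr_max / div_factor to lr_max.
        return cosine_anneal(low_lr, lr_max, step / warm_steps)
    # Anneal down from lr_max towards a very small final LR.
    remaining = max(1, total_steps - warm_steps)
    return cosine_anneal(lr_max, low_lr / final_div, (step - warm_steps) / remaining)


# Example: total_steps = epochs * iterations_per_epoch (placeholder numbers).
total_steps = 3 * 1000
for step in (0, total_steps // 5, int(total_steps * 0.4), total_steps):
    print(step, one_cycle_lr(step, total_steps))
```

With these settings the LR starts at 0.0003, peaks at 0.003 after 40% of the steps, and then decays, so a different number of iterations per epoch changes where in the cycle each sample is seen.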