Hello, I have some questions about implementation details.
The HR-LR pairs were generated with the down-sampling code provided in BasicSR. The training data was DF2K (900 DIV2K + 2650 Flickr2K) and the test data was Set5.
I ran the following command to prune the EDSR_16_256 model to EDSR_16_48; compared with the official command, only the pruning ratio and the save path were modified.
Prune from 256 to 48, pr=0.8125, x2, ASSL
```shell
python main.py --model LEDSR --scale 2 --patch_size 96 --ext sep --dir_data /home/notebook/data/group_cpfs/wurongyuan/data/data \
  --data_train DF2K --data_test DF2K --data_range 1-3550/3551-3555 --chop --save_results --n_resblocks 16 --n_feats 256 \
  --method ASSL --wn --stage_pr [0-1000:0.8125] --skip_layers *mean*,*tail* \
  --same_pruned_wg_layers model.head.0,model.body.16,*body.2 --reg_upper_limit 0.5 --reg_granularity_prune 0.0001 \
  --update_reg_interval 20 --stabilize_reg_interval 43150 --pre_train pretrained_models/LEDSR_F256R16BIX2_DF2K_M311.pt \
  --same_pruned_wg_criterion reg --save main/SR/LEDSR_F256R16BIX2_DF2K_ASSL_0.8125_RGP0.0001_RUL0.5_Pretrain_06011101
```
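For anyone checking the flags: as I read them, `--stage_pr [0-1000:0.8125]` prunes 81.25% of the 256 channels, leaving 48, and the regularization grows by `--reg_granularity_prune` every `--update_reg_interval` iterations until it reaches `--reg_upper_limit`. A quick sanity check of that arithmetic (the schedule interpretation is my own assumption, not something confirmed by the authors):

```python
# Sanity-check the pruning-ratio arithmetic from the command above.
n_feats = 256
pr = 0.8125                      # --stage_pr [0-1000:0.8125]
kept = round(n_feats * (1 - pr))
print(kept)                      # 48, matching EDSR_16_48

# Assumed ASSL schedule: reg += granularity every `interval` iterations
# until it hits the upper limit (my reading of the flags, not verified).
granularity = 0.0001             # --reg_granularity_prune
interval = 20                    # --update_reg_interval
upper_limit = 0.5                # --reg_upper_limit
iters_to_limit = round(upper_limit / granularity) * interval
print(iters_to_limit)            # 100000 iterations before reg saturates
```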
Results:
- model_just_finished_prune ---> 33.739 dB
- fine-tuning after 1 epoch ---> 37.781 dB
- fine-tuning after 756 epochs ---> 37.940 dB
There is still a gap between the result I obtained with the official code (37.940 dB) and the result reported in the paper (38.12 dB), so I must have overlooked some details.
I also compared against the L1-norm method provided in the code.
Prune from 256 to 48, pr=0.8125, x2, L1
```shell
python main.py --model LEDSR --scale 2 --patch_size 96 --ext sep --dir_data /home/notebook/data/group_cpfs/wurongyuan/data/data \
  --data_train DF2K --data_test DF2K --data_range 1-3550/3551-3555 --chop --save_results --n_resblocks 16 --n_feats 256 \
  --method L1 --wn --stage_pr [0-1000:0.8125] --skip_layers *mean*,*tail* \
  --same_pruned_wg_layers model.head.0,model.body.16,*body.2 --reg_upper_limit 0.5 --reg_granularity_prune 0.0001 \
  --update_reg_interval 20 --stabilize_reg_interval 43150 --pre_train pretrained_models/LEDSR_F256R16BIX2_DF2K_M311.pt \
  --same_pruned_wg_criterion reg --save main/SR/LEDSR_F256R16BIX2_DF2K_L1_0.8125_06011101
```
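My understanding of the L1-norm criterion is that it simply ranks filters by the sum of their absolute weights and keeps the top-k, with no regularization phase; here is a minimal toy sketch of that idea (not the repo's actual implementation):

```python
# Toy L1-norm filter selection: keep the k filters with the largest L1 norm.
def l1_prune_indices(filters, keep):
    """filters: list of per-filter weight lists; returns the indices of the
    `keep` filters with the largest L1 norm (sum of absolute weights)."""
    norms = [sum(abs(w) for w in f) for f in filters]
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    return sorted(ranked[:keep])

# Example: 4 toy filters, keep 2; filters 1 and 3 have the largest norms.
filters = [[0.1, -0.2], [1.0, -1.5], [0.05, 0.0], [2.0, 0.3]]
print(l1_prune_indices(filters, 2))  # [1, 3]
```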
Results:
- model_just_finished_prune ---> 13.427 dB
- fine-tuning after 1 epoch ---> 33.202 dB
- fine-tuning after 756 epochs ---> 37.933 dB
At this pruning ratio (256 -> 48), the difference between the results of the L1-norm method and those of ASSL seems negligible.
Is there something I missed? Looking forward to your reply! >-<