FcaNet: Frequency Channel Attention Networks
PyTorch implementation of the paper "FcaNet: Frequency Channel Attention Networks".
Simplest usage
Models pretrained on ImageNet can be loaded directly through torch.hub, without any configuration or installation:
import torch
# Pick one of the following variants.
model = torch.hub.load('cfzd/FcaNet', 'fca34', pretrained=True)
model = torch.hub.load('cfzd/FcaNet', 'fca50', pretrained=True)
model = torch.hub.load('cfzd/FcaNet', 'fca101', pretrained=True)
model = torch.hub.load('cfzd/FcaNet', 'fca152', pretrained=True)
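Once a model is loaded, running it is ordinary PyTorch inference. The following is a minimal sketch rather than code from this repository: `load_fcanet` and `predict_top1` are hypothetical helper names, and the 224x224 input size in the usage note is an assumption based on the usual ImageNet setup.

```python
import torch


def load_fcanet(name="fca50"):
    """Fetch a pretrained FcaNet through torch.hub (needs network access on first call)."""
    return torch.hub.load("cfzd/FcaNet", name, pretrained=True)


@torch.no_grad()
def predict_top1(model, images):
    """Return the arg-max class index for each image in an (N, 3, H, W) batch."""
    model.eval()  # use running BatchNorm statistics instead of batch statistics
    logits = model(images)
    return logits.argmax(dim=1)
```

Under these assumptions, `predict_top1(load_fcanet("fca50"), torch.randn(1, 3, 224, 224))` would return a tensor holding one predicted class index. Real images would first need the preprocessing used during training, which this sketch does not cover.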
Install
Please see INSTALL.md
Models
Classification models on ImageNet
Because the models were trained in FP16 and are provided in FP32, the evaluation results differ slightly from the reported results (by at most -0.06/+0.05 percentage points).
Model | Reported | Evaluation Results | Link |
---|---|---|---|
FcaNet34 | 75.07 | 75.02 | GoogleDrive/BaiduDrive(code:m7v8) |
FcaNet50 | 78.52 | 78.57 | GoogleDrive/BaiduDrive(code:mgkk) |
FcaNet101 | 79.64 | 79.63 | GoogleDrive/BaiduDrive(code:8t0j) |
FcaNet152 | 80.08 | 80.02 | GoogleDrive/BaiduDrive(code:5yeq) |
Detection and instance segmentation models on COCO
Model | Backbone | AP | AP50 | AP75 | Link |
---|---|---|---|---|---|
Faster RCNN | FcaNet50 | 39.0 | 61.1 | 42.3 | GoogleDrive/BaiduDrive(code:q15c) |
Faster RCNN | FcaNet101 | 41.2 | 63.3 | 44.6 | GoogleDrive/BaiduDrive(code:pgnx) |
Mask RCNN | Fca50 (det / seg) | 40.3 / 36.2 | 62.0 / 58.6 | 44.1 / 38.1 | GoogleDrive/BaiduDrive(code:d9rn) |
Training
Please see launch_training_classification.sh and launch_training_detection.sh for training on ImageNet and COCO, respectively.
Testing
Please see launch_eval_classification.sh and launch_eval_detection.sh for testing on ImageNet and COCO, respectively.
FAQ
Since the paper was uploaded to arXiv, many academic peers have asked us: the proposed DCT basis can be viewed as a simple tensor, so why not learn the tensor directly? Why use DCT instead of a learnable tensor? Wouldn't a learnable tensor be better than DCT?
Our concrete answer is: the proposed DCT is better than the learnable tensor, counter-intuitive as that may be.
Method | ImageNet Top-1 Acc | Link |
---|---|---|
Learnable tensor, random initialization | 77.914 | GoogleDrive/BaiduDrive(code:p2hl) |
Learnable tensor, DCT initialization | 78.352 | GoogleDrive/BaiduDrive(code:txje) |
Fixed tensor, random initialization | 77.742 | GoogleDrive/BaiduDrive(code:g5t9) |
Fixed tensor, DCT initialization (Ours) | 78.574 | GoogleDrive/BaiduDrive(code:mgkk) |
To verify these results, select the corresponding type of tensor in L73-L83 of model/layer.py, uncomment it, and train the whole network.
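To make the four rows of the comparison concrete, here is a minimal sketch of how such a tensor could be fixed or learnable, and initialized randomly or from a DCT basis. It is not the code in model/layer.py: `dct_filter`, `FrequencyPool`, and the `freqs` argument are hypothetical names, the basis is the standard orthonormal 2D DCT-II, and uniform random initialization is an assumption.

```python
import math

import torch
from torch import nn


def dct_filter(height, width, u, v):
    """Return the (height, width) orthonormal 2D DCT-II basis for frequency pair (u, v)."""
    def basis_1d(size, freq):
        pos = torch.arange(size, dtype=torch.float32)
        values = torch.cos(math.pi * freq * (pos + 0.5) / size) / math.sqrt(size)
        return values if freq == 0 else values * math.sqrt(2)

    # The 2D basis is the outer product of the vertical and horizontal 1D bases.
    return torch.outer(basis_1d(height, u), basis_1d(width, v))


class FrequencyPool(nn.Module):
    """Weighted spatial pooling: out[n, c] = sum over (h, w) of x[n, c, h, w] * weight[c, h, w]."""

    def __init__(self, channels, height, width, learnable, dct_init, freqs=((0, 0),)):
        super().__init__()
        if dct_init:
            # Split the channels evenly across the chosen frequency pairs.
            assert channels % len(freqs) == 0, "channels must divide evenly across freqs"
            per_freq = channels // len(freqs)
            weight = torch.cat(
                [dct_filter(height, width, u, v).repeat(per_freq, 1, 1) for u, v in freqs]
            )
        else:
            # Assumed random scheme: uniform values in [0, 1).
            weight = torch.rand(channels, height, width)

        if learnable:
            self.weight = nn.Parameter(weight)  # updated by the optimizer
        else:
            self.register_buffer("weight", weight)  # saved with the model, never trained

    def forward(self, x):
        # Broadcast the (C, H, W) weight over the batch and sum out the spatial axes.
        return (x * self.weight).sum(dim=(2, 3))
```

The four table rows then correspond to the four combinations of `learnable` and `dct_init`, with `learnable=False, dct_init=True` matching the "Ours" row. With the lowest frequency pair (0, 0) every weight equals 1/sqrt(height * width), so the pooling reduces to a scaled global average pooling.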
TODO
- Object detection models
- Instance segmentation models
- Fix the incorrect results of detection models
- Make switching between configs easier