PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

Overview

PySlowFast

PySlowFast is an open-source video understanding codebase from FAIR that provides state-of-the-art video classification models with efficient training.

Introduction

The goal of PySlowFast is to provide a high-performance, lightweight PyTorch codebase with state-of-the-art video backbones for video understanding research on different tasks (classification, detection, etc.). It is designed to support rapid implementation and evaluation of novel video research ideas. PySlowFast includes implementations of the following backbone network architectures:

  • SlowFast
  • Slow
  • C2D
  • I3D
  • Non-local Network
  • X3D

License

PySlowFast is released under the Apache 2.0 license.

Model Zoo and Baselines

We provide a large set of baseline results and trained models available for download in the PySlowFast Model Zoo.

Installation

Please find installation instructions for PyTorch and PySlowFast in INSTALL.md. You may follow the instructions in DATASET.md to prepare the datasets.

Quick Start

Follow the example in GETTING_STARTED.md to start working with video models in PySlowFast.

Visualization Tools

We offer a range of visualization tools for the train/eval/test processes, for model analysis, and for running inference with trained models. More information is available in Visualization Tools.

Contributors

PySlowFast is written and maintained by Haoqi Fan, Yanghao Li, Bo Xiong, Wan-Yen Lo, Christoph Feichtenhofer.

Citing PySlowFast

If you find PySlowFast useful in your research, please use the following BibTeX entry for citation.

@misc{fan2020pyslowfast,
  author =       {Haoqi Fan and Yanghao Li and Bo Xiong and Wan-Yen Lo and
                  Christoph Feichtenhofer},
  title =        {PySlowFast},
  howpublished = {\url{https://github.com/facebookresearch/slowfast}},
  year =         {2020}
}
Comments
  • Flopcount mismatch and bad results on UCF101 compared to GluonCV implementation

    I tried to train SlowFast 4x16, Res50 version on UCF101 and found this. a) The flop count for this model is displayed as follows:

    [INFO: misc.py:  100]: Params: 33,791,021
    [INFO: misc.py:  101]: Mem: 261.55078125 MB
    [WARNING: flop_count.py:  120]: Skipped operation aten::clone 664 time(s)
    [WARNING: flop_count.py:  120]: Skipped operation aten::add_ 111 time(s)
    [WARNING: flop_count.py:  120]: Skipped operation aten::mul_ 330 time(s)
    [WARNING: flop_count.py:  120]: Skipped operation aten::eq 110 time(s)
    [WARNING: flop_count.py:  120]: Skipped operation aten::batch_norm 110 time(s)
    [WARNING: flop_count.py:  120]: Skipped operation aten::relu_ 102 time(s)
    [WARNING: flop_count.py:  120]: Skipped operation aten::max_pool3d 4 time(s)
    [WARNING: flop_count.py:  120]: Skipped operation aten::cat 5 time(s)
    [WARNING: flop_count.py:  120]: Skipped operation aten::add 32 time(s)
    [WARNING: flop_count.py:  120]: Skipped operation aten::avg_pool3d 2 time(s)
    [WARNING: flop_count.py:  120]: Skipped operation aten::permute 1 time(s)
    [WARNING: flop_count.py:  120]: Skipped operation aten::dropout 1 time(s)
    [WARNING: flop_count.py:  120]: Skipped operation aten::Int 1 time(s)
    [INFO: misc.py:  103]: FLOPs: 27.63797632 GFLOPs
    

    However, in the original paper, for the 4x16, R50 version, the flopcount should be 36.1 GFLOPs.

    b) The best result I can get when training SlowFast 4x16, R50 on UCF101 from scratch is 73.49 top-1 accuracy and 88.82 top-5 accuracy after 512 epochs, whereas with the GluonCV implementation the same model trained for only 64 epochs reaches as high as 92 top-1 accuracy.

    Combined with the GFLOP count mismatch, is it reasonable to say there could be some error in the model?

    opened by huang-ziyuan 34
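A rough sanity check on the FLOP gap reported above: convolutional cost grows roughly with the spatial area of the input, so two counts taken at different crop sizes are not directly comparable. The sketch below rescales the logged figure by the area ratio. The crop sizes (224 for the log, 256 for the comparison) are assumptions for illustration and are not stated in the issue; `scale_flops` is a hypothetical helper, not part of PySlowFast.

```python
def scale_flops(gflops, from_size, to_size):
    """Rescale a FLOP count by the ratio of spatial areas, (to_size / from_size) ** 2.

    This is an approximation: it assumes cost is proportional to input area.
    """
    return gflops * (to_size / from_size) ** 2


measured = 27.63797632  # GFLOPs taken from the log in the issue above

# ASSUMPTION: the log was produced with 224x224 crops and the comparison
# figure refers to 256x256 crops. Neither is confirmed by the issue text.
rescaled = scale_flops(measured, from_size=224, to_size=256)
print(f"{rescaled:.2f} GFLOPs at the assumed 256x256 crop")
```

Under these assumptions the rescaled value lands close to the 36.1 GFLOPs quoted from the paper, which suggests checking the crop size before concluding that the model definition differs. It does not explain the accuracy gap.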
  • Getting nan prediction in demo_net.py with AVA on custom video

    When running python tools/run_net.py --cfg /home/ubuntu/slowfast/configs/AVA/c2/SLOWFAST_32x2_R101_50_50.yaml I am getting nan predictions in https://github.com/facebookresearch/SlowFast/blob/master/tools/demo_net.py#L242

    • Python version - 3.6.10 (3.7.7 did not work either)
    • Pytorch - 1.5.1
    • Torchvision - 0.6.1
    • Cuda - 10.0
    • Detectron2 - 0.2

    Here is the config:

    TRAIN:
      ENABLE: False
      DATASET: ava
      BATCH_SIZE: 16
      EVAL_PERIOD: 1
      CHECKPOINT_PERIOD: 1
      AUTO_RESUME: True
      CHECKPOINT_FILE_PATH: "/home/ubuntu/SLOWFAST_32x2_R101_50_50.pkl"
      CHECKPOINT_TYPE: pytorch
    DATA:
      NUM_FRAMES: 32
      SAMPLING_RATE: 2
      TRAIN_JITTER_SCALES: [256, 320]
      TRAIN_CROP_SIZE: 224
      TEST_CROP_SIZE: 256
      INPUT_CHANNEL_NUM: [3, 3]
    DETECTION:
      ENABLE: True
      ALIGNED: False
    AVA:
      BGR: False
      DETECTION_SCORE_THRESH: 0.8
      TEST_PREDICT_BOX_LISTS: ["person_box_67091280_iou90/ava_detection_val_boxes_and_labels.csv"]
    SLOWFAST:
      ALPHA: 4
      BETA_INV: 8
      FUSION_CONV_CHANNEL_RATIO: 2
      FUSION_KERNEL_SZ: 5
    RESNET:
      ZERO_INIT_FINAL_BN: True
      WIDTH_PER_GROUP: 64
      NUM_GROUPS: 1
      DEPTH: 101
      TRANS_FUNC: bottleneck_transform
      STRIDE_1X1: False
      NUM_BLOCK_TEMP_KERNEL: [[3, 3], [4, 4], [6, 6], [3, 3]]
      SPATIAL_DILATIONS: [[1, 1], [1, 1], [1, 1], [2, 2]]
      SPATIAL_STRIDES: [[1, 1], [2, 2], [2, 2], [1, 1]]
    NONLOCAL:
      LOCATION: [[[], []], [[], []], [[6, 13, 20], []], [[], []]]
      GROUP: [[1, 1], [1, 1], [1, 1], [1, 1]]
      INSTANTIATION: dot_product
      POOL: [[[2, 2, 2], [2, 2, 2]], [[2, 2, 2], [2, 2, 2]], [[2, 2, 2], [2, 2, 2]], [[2, 2, 2], [2, 2, 2]]]
    BN:
      USE_PRECISE_STATS: False
      NUM_BATCHES_PRECISE: 200
    SOLVER:
      MOMENTUM: 0.9
      WEIGHT_DECAY: 1e-7
      OPTIMIZING_METHOD: sgd
    MODEL:
      NUM_CLASSES: 80
      ARCH: slowfast
      MODEL_NAME: SlowFast
      LOSS_FUNC: bce
      DROPOUT_RATE: 0.5
      HEAD_ACT: sigmoid
    TEST:
      ENABLE: False
      DATASET: ava
      BATCH_SIZE: 8
    DATA_LOADER:
      NUM_WORKERS: 2
      PIN_MEMORY: True
    DEMO:
      ENABLE: True
      LABEL_FILE_PATH: "./demo/AVA/ava.names"
      DATA_SOURCE: "/home/ubuntu/gestures_dataset_right_wave_15.mp4"
      DISPLAY_WIDTH: 640
      DISPLAY_HEIGHT: 480
      DETECTRON2_OBJECT_DETECTION_MODEL_CFG: "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"
      DETECTRON2_OBJECT_DETECTION_MODEL_WEIGHTS: "detectron2://COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/model_final_280758.pkl"
    NUM_GPUS: 1
    NUM_SHARDS: 1
    RNG_SEED: 0
    OUTPUT_DIR: .

    And here is part of the logs:

    /home/ubuntu/slowfast/slowfast/models/head_helper.py:111: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      assert out.shape[2] == 1
    /home/ubuntu/detectron2_repo/detectron2/layers/roi_align.py:105: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      assert rois.dim() == 2 and rois.size(1) == 5
    /home/ubuntu/slowfast/slowfast/models/head_helper.py:111: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      assert out.shape[2] == 1
    /home/ubuntu/detectron2_repo/detectron2/layers/roi_align.py:105: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
      assert rois.dim() == 2 and rois.size(1) == 5
    [WARNING: flop_count.py: 63]: Skipped operation aten::batch_norm 215 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::batch_norm 215 time(s)
    [WARNING: flop_count.py: 63]: Skipped operation aten::relu_ 204 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::relu_ 204 time(s)
    [WARNING: flop_count.py: 63]: Skipped operation aten::max_pool3d 7 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::max_pool3d 7 time(s)
    [WARNING: flop_count.py: 63]: Skipped operation aten::add 69 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::add 69 time(s)
    [WARNING: flop_count.py: 63]: Skipped operation aten::div 3 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::div 3 time(s)
    [WARNING: flop_count.py: 63]: Skipped operation aten::avg_pool3d 2 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::avg_pool3d 2 time(s)
    [WARNING: flop_count.py: 63]: Skipped operation prim::PythonOp 2 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation prim::PythonOp 2 time(s)
    [WARNING: flop_count.py: 63]: Skipped operation aten::max_pool2d 2 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::max_pool2d 2 time(s)
    [WARNING: flop_count.py: 63]: Skipped operation aten::dropout 1 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::dropout 1 time(s)
    [WARNING: flop_count.py: 63]: Skipped operation aten::sigmoid 1 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::sigmoid 1 time(s)
    [INFO: misc.py: 160]: Flops: 146.54916608 G
    [07/11 21:40:19][INFO] slowfast.utils.misc: 160: Flops: 146.54916608 G
    [WARNING: activation_count.py: 54]: Skipped operation aten::batch_norm 215 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::batch_norm 215 time(s)
    [WARNING: activation_count.py: 54]: Skipped operation aten::relu_ 204 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::relu_ 204 time(s)
    [WARNING: activation_count.py: 54]: Skipped operation aten::max_pool3d 7 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::max_pool3d 7 time(s)
    [WARNING: activation_count.py: 54]: Skipped operation aten::add 69 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::add 69 time(s)
    [WARNING: activation_count.py: 54]: Skipped operation aten::einsum 6 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::einsum 6 time(s)
    [WARNING: activation_count.py: 54]: Skipped operation aten::div 3 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::div 3 time(s)
    [WARNING: activation_count.py: 54]: Skipped operation aten::avg_pool3d 2 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::avg_pool3d 2 time(s)
    [WARNING: activation_count.py: 54]: Skipped operation prim::PythonOp 2 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation prim::PythonOp 2 time(s)
    [WARNING: activation_count.py: 54]: Skipped operation aten::max_pool2d 2 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::max_pool2d 2 time(s)
    [WARNING: activation_count.py: 54]: Skipped operation aten::dropout 1 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::dropout 1 time(s)
    [WARNING: activation_count.py: 54]: Skipped operation aten::sigmoid 1 time(s)
    [07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::sigmoid 1 time(s)
    [INFO: misc.py: 165]: Activations: 293.60136 M
    [07/11 21:40:19][INFO] slowfast.utils.misc: 165: Activations: 293.60136 M
    [INFO: misc.py: 168]: nvidia-smi
    [07/11 21:40:19][INFO] slowfast.utils.misc: 168: nvidia-smi

    Please feel free to ask any additional information if needed.

    question 
    opened by Serhii-Tiurin 16
  • Run error

    I use the command python ./tools/run_net.py --cfg ./configs/Kinetics/C2D_8x8_R50.yaml DATA.PATH_TO_DATA_DIR ~/data/kinetics400/ NUM_GPUS 2 TRAIN.BATCH_SIZE 16 to train a model, but it fails with the following error:

    RuntimeError: Caught RuntimeError in DataLoader worker process 0
    RuntimeError: Failed to fetch video after 10 retries

    I am very confused. Thanks.

    opened by onlyonewater 16
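One thing worth ruling out for a "Failed to fetch video" error is that the list file points at videos that are not on disk. The sketch below is a standalone check, not part of PySlowFast. It assumes each non-empty line of the list file has the form "<video_path> <label>" separated by whitespace, and that relative paths are resolved against a data directory; `find_missing_videos` and both parameters are hypothetical names.

```python
import os


def find_missing_videos(list_path, data_dir=""):
    """Return the video paths named in `list_path` that do not exist on disk.

    ASSUMPTION: each non-empty line is "<video_path> <label>", split on
    whitespace, so paths containing spaces are not handled.
    """
    missing = []
    with open(list_path, encoding="utf-8") as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            video_path = os.path.join(data_dir, fields[0])
            if not os.path.isfile(video_path):
                missing.append(video_path)
    return missing
```

Calling `find_missing_videos("train.csv", "/path/to/videos")` with your own paths returns an empty list when every listed file is present. Files that exist but cannot be decoded would need a separate check.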
  • AVA benchmark with provided models and configs

    Hi, thanks for your great codebase. Could you release the performance numbers for the configs in the AVA folder?

    --configs/
           AVA/
               C2D_8x8_R50_SHORT.yaml
               SLOWFAST_32x2_R50_SHORT.yaml
    
    documentation 
    opened by tonysy 14
  • demo Error

    Hi, when I run python tools/run_net.py --cfg configs/Kinetics/c2/SLOWFAST_8x8_R50.yaml DATA.PATH_TO_DATA_DIR I get the error below. It may be a problem with the test.csv I downloaded; could you provide it to me?

    Traceback (most recent call last):
      File "/home/vcaadmin/miniconda3/envs/slowfast/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
        fn(i, *args)
      File "/data/gao/slowfast/slowfast/utils/multiprocessing.py", line 50, in run
        func(cfg)
      File "/data/gao/sf/SlowFast-master/tools/test_net.py", line 160, in test
        test_loader = loader.construct_loader(cfg, "test")
      File "/data/gao/slowfast/slowfast/datasets/loader.py", line 80, in construct_loader
        dataset = build_dataset(dataset_name, cfg, split)
      File "/data/gao/slowfast/slowfast/datasets/build.py", line 32, in build_dataset
        return DATASET_REGISTRY.get(name)(cfg, split)
      File "/data/gao/slowfast/slowfast/datasets/kinetics.py", line 74, in __init__
        self._construct_loader()
      File "/data/gao/slowfast/slowfast/datasets/kinetics.py", line 92, in _construct_loader
        assert len(path_label.split()) == 2
    AssertionError

    thanks a lot!!

    question 
    opened by gtgtgt1117 12
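The assertion in the traceback above, `len(path_label.split()) == 2`, fails for any line of the list file that does not split into exactly two whitespace-separated fields. A minimal sketch for locating such lines follows. It mirrors only that one check; `find_malformed_lines` is a hypothetical helper and the file name is whatever list file you pass in.

```python
def find_malformed_lines(list_path, expected_fields=2):
    """Return (line_number, text) for lines that do not split into
    `expected_fields` whitespace-separated fields.

    Blank lines, comma-separated headers and paths containing spaces are
    all reported, since each would fail the same check.
    """
    bad = []
    with open(list_path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if len(line.split()) != expected_fields:
                bad.append((lineno, line.rstrip("\n")))
    return bad
```

An empty result means every line has the "<path> <label>" shape the assertion expects. A non-empty result gives the line numbers to fix.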
  • Running error

    Dear author,

    I am trying to evaluate your algorithm's performance on the AVA dataset using the command line "python tools/run_net.py --cfg configs/AVA/c2/SLOWFAST_64x2_R101_50_50.yaml DATA.PATH_TO_DATA_DIR data/ava/ TEST.CHECKPOINT_FILE_PATH model/SLOWFAST_64x2_R101_50_50.pkl TRAIN.ENABLE False NUM_GPUS 1". I got the following error:

    Traceback (most recent call last):
      File "tools/run_net.py", line 152, in <module>
        main()
      File "tools/run_net.py", line 147, in main
        test(cfg=cfg)
      File "/media/workspace_karl/slowfast/tools/test_net.py", line 138, in test
        convert_from_caffe2=cfg.TEST.CHECKPOINT_TYPE == "caffe2",
      File "/media/workspace_karl/slowfast/slowfast/utils/checkpoint.py", line 186, in load_checkpoint
        caffe2_checkpoint = pickle.load(f, encoding="latin1")
    _pickle.UnpicklingError: unpickling stack underflow

    I wonder if you could point out what is going wrong.

    Many Thanks!

    opened by yangsusanyang 10
  • UnicodeDecodeError when loading the pre-trained model

    Thank you for your great work!

    When I tried to load the model you provided here, I encountered the following error:

    Traceback (most recent call last):
      File "tools/run_net.py", line 152, in <module>
        main()
      File "tools/run_net.py", line 128, in main
        train(cfg=cfg)
      File "/home/ubuntu/projects/SlowFast/tools/train_net.py", line 217, in train
        convert_from_caffe2=cfg.TRAIN.CHECKPOINT_TYPE == "caffe2",
      File "/home/ubuntu/projects/SlowFast/slowfast/utils/checkpoint.py", line 212, in load_checkpoint
        checkpoint = torch.load(path_to_checkpoint, map_location="cpu")
      File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.7/site-packages/torch/serialization.py", line 426, in load
        return _load(f, map_location, pickle_module, **pickle_load_args)
      File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.7/site-packages/torch/serialization.py", line 603, in _load
        magic_number = pickle_module.load(f, **pickle_load_args)
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd4 in position 1: invalid continuation byte
    

    This probably happened because the file was decoded with a different encoding from the one used to write it. I would appreciate it if you could explain why this happens and how to solve it.

    Thank you for your consideration!

    opened by Shumpei-Kikuta 10
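The encoding mismatch suspected above can be reproduced with the standard library alone. The Caffe2 branch of load_checkpoint, visible in an earlier traceback on this page, calls `pickle.load(f, encoding="latin1")`, while the traceback in this issue goes through `torch.load`. The sketch below uses a tiny hand-written pickle containing a Python 2 style byte string to show the difference the `encoding` argument makes. The byte string is illustrative data, not a real checkpoint, and whether the checkpoint in this issue is such a file is an assumption.

```python
import pickle

# Hand-written protocol-2 pickle holding a Python 2 style str with the
# non-ASCII bytes 0xd4 0xe9 (illustrative data, not a real checkpoint).
PY2_STYLE_PICKLE = b"\x80\x02U\x02\xd4\xe9q\x00."


def load_py2_pickle(data):
    """Load a pickle, decoding Python 2 byte strings as latin1.

    latin1 maps every byte value to a character, so decoding cannot fail.
    """
    return pickle.loads(data, encoding="latin1")


try:
    pickle.loads(PY2_STYLE_PICKLE)  # default encoding is ASCII
except UnicodeDecodeError as err:
    print("default load failed:", err)

print("latin1 load succeeded:", repr(load_py2_pickle(PY2_STYLE_PICKLE)))
```

If the downloaded model is in the Caffe2 pickle format, one thing to try is setting the checkpoint type option so that the loader takes the `pickle.load(..., encoding="latin1")` branch instead of `torch.load`.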
  • X3D is slower than slowfast

    I compared the speed of X3D_L and SLOWFAST_8x8_R50 on the Kinetics-400 dataset. Surprisingly, although X3D_L has fewer GFLOPs (24.8 vs 65.7), it runs about 3x slower than SLOWFAST_8x8_R50.

    Is there anything important that needs to be done to speed up X3D_L inference?

    opened by gooners1886 8
  • Unexpected bus error encountered in worker.

    I tried to run SLOWFAST_8x8_R50 inference on a small amount of Kinetics-400 test data (only 5 videos) on a Google Cloud Compute Engine machine with 8 K80 GPUs (each with 12 GB of GPU memory), but it fails with an unexpected error: ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).

    Can someone help with this error? I simply don't know what went wrong.

    question 
    opened by dixonhsiao 8
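Since the error message itself points at shared memory, a quick first step is to see how much space is free there. The sketch below reads free space with the standard library. It assumes a Linux host where shared memory is mounted at /dev/shm; the helper names and any threshold you pass are hypothetical, not values recommended by PySlowFast.

```python
import os
import shutil


def free_gib(path="/dev/shm"):
    """Return the free space at `path` in GiB."""
    return shutil.disk_usage(path).free / 2**30


def has_enough_shm(min_gib, path="/dev/shm"):
    """True if at least `min_gib` GiB are free at `path`.

    `min_gib` is a caller-chosen threshold, not a documented requirement.
    """
    return free_gib(path) >= min_gib


# Only report when the assumed mount point actually exists on this host.
if os.path.isdir("/dev/shm"):
    print(f"/dev/shm free: {free_gib():.2f} GiB")
```

If the reported value is small, reducing DATA_LOADER.NUM_WORKERS or enlarging the shared memory mount (for example, the shm size of a container) are common things to try.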
  • Training Speed of AVA

    Hi, thanks for sharing your implementation for AVA. Would it be possible to add a trained model for the "configs/AVA/SLOWFAST_32x2_R50_SHORT.yaml" settings?

    I am trying to fine-tune on the AVA dataset with pre-trained Kinetics weights using SLOWFAST_32x2_R50_SHORT.yaml.

    I observed that training one epoch takes around 3.5 hours on four 2080Ti GPUs (batch size 24). Training 20 epochs will take around 3 days.

    enhancement 
    opened by hzhang57 8
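The "around 3 days" estimate above follows directly from the per-epoch time. A small worked version, using only the numbers quoted in the issue (`training_days` is a hypothetical helper):

```python
def training_days(hours_per_epoch, epochs):
    """Total wall-clock training time in days, assuming a constant epoch time."""
    return hours_per_epoch * epochs / 24


# Figures from the issue: 3.5 hours per epoch, 20 epochs.
total_hours = 3.5 * 20
print(f"{total_hours:.0f} hours, about {training_days(3.5, 20):.1f} days")
```

This ignores evaluation and checkpointing time, so the real run may take somewhat longer.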
  • Reproduction and comparison to video-nonlocal-net

    Hi, Thank you for making this code available.

    I am trying to reproduce the results from your code. I got 72.6% top-1 accuracy for I3D with Kinetics/I3D_8x8_R50.yaml (https://github.com/facebookresearch/SlowFast/blob/master/configs/Kinetics/I3D_8x8_R50.yaml) on 8 GPUs, trained for 196 epochs without modifying any other parameter.

    I suppose the gap comes from the fact that my copy of the Kinetics dataset is missing about 600 validation and 9K training videos. Could you comment on this?

    Another question is about comparing the results of this repo with the video-nonlocal-net repo (https://github.com/facebookresearch/video-nonlocal-net). For instance, I3D_8x8_R50 has an accuracy of 73.5 here and 73.5 there, but the initialisation here is random while there (the video-nonlocal-net repo) ImageNet initialisation is used. However, the number of epochs there is much smaller (~100 epochs).

    What makes random initialisation work as well as ImageNet initialisation, as in the video-nl-net repo? Is it just the number of epochs, or something else? How does one compare them?

    Looking forward to your answers, Gurkirt

    opened by gurkirt 8
  • Reproduce AVA results of MAE_ST

    Hi there,

    Could authors share the configs that you used to produce the AVA v2.2 results in the Masked Autoencoders As Spatiotemporal Learners paper?

    Throughout the repo I cannot find any related configs for ViT. The hyperparameters mentioned in the paper (https://arxiv.org/pdf/2205.09113.pdf, Appendix A, Table 6) seem unreasonable to me: with batch size 128, the learning rate is 7.2 for ViT-L with the SGD optimizer.

    Thanks

    opened by yuanliangzhe 0
  • RuntimeError: Distributed package doesn't have NCCL built in

    I am getting this error while training the SlowFast model with Kinetics data on my local machine.

    Could you suggest how to resolve this issue?

    Thank you.

    opened by Bhavik-Ardeshna 0
  • INSTALL.md is not up-to-date. Here is an updated version in my SlowFast fork

    Hello, the INSTALL.md file is not up to date and has the following issues:

    • torchvision with ffmpeg support (video_probe_from_memory errors),
    • Pytorchvideo: pip install lacks some necessary functions,
    • sklearn not supported anymore, scikit-learn can be explicitly installed,
    • python version: python 3.9 not supported for building torchvision with ffmpeg,
    • order of installation steps is confusing, etc.

    I included more details about the above problems and their solutions in my own SlowFast fork in the INSTALL.md file here. I hope these changes in the installation steps help you build and work with SlowFast easier.

    opened by alpargun 0
  • NumPy array not writable error in slowfast/datasets/decoder.py when using torchvision backend. "np.array" typecasting required

    When using torchvision backend I receive the error:

    /home/alp/Desktop/slowfast-with-torchvision/SlowFast/slowfast/datasets/decoder.py:273: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1666642991888/work/torch/csrc/utils/tensor_numpy.cpp:199.)

    I am using

    • pytorch=1.13.0
    • torchvision=0.14.0
    • ffmpeg=4.2

    To fix this problem, the line should be changed to: video_tensor = torch.from_numpy(np.frombuffer(np.array(video_handle), dtype=np.uint8))

    After making the above change, I no longer receive the warning.

    opened by alpargun 0
  • maskfeat mvit-s fine-tuning on k400 config

    Hi, I am trying to reproduce MViT-S fine-tuning on K400 and found that the SOLVER section of the config file (configs/masked_ssl/k400_MVITv2_S_16x4_FT.yaml) differs from the paper: the config file warms up for 20 epochs with a maximum of 100 epochs, whereas the paper warms up for 5 epochs and trains for 200 epochs. Which one should I follow? Another question: MViT-L is fine-tuned at larger spatial sizes of 312 and 352. Are you using a higher-resolution version of the K400 dataset (larger than short side 256)?

    opened by cir7 0
  • pyav decode leads memory leaks issue when 'duration' is none

    Hi, I followed the changes in #541 to decode videos with the pyav backend. When I train with 'pyav' and the 'duration' parameter is None (the condition at decoder.py line 436), memory usage grows rapidly to the machine limit and the training thread gets stuck.

    As a workaround I simply raise an error when duration is None and drop those videos, after which training works well. However, I do not know why this happens, and it would be better if I could use these videos for training. Can someone help? Thanks!

    opened by wnzhyee 1
Owner
Meta Research
QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.

null 152 Jan 2, 2023
LaneDet is an open source lane detection toolbox based on PyTorch that aims to pull together a wide variety of state-of-the-art lane detection models

LaneDet is an open source lane detection toolbox based on PyTorch that aims to pull together a wide variety of state-of-the-art lane detection models. Developers can reproduce these SOTA methods and build their own methods.

TuZheng 405 Jan 4, 2023
PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

PaddlePaddle Vision Transformers State-of-the-art Visual Transformer and MLP Models for PaddlePaddle PaddlePaddle Visual Transformers (PaddleViT or

null 1k Dec 28, 2022
LWCC: A LightWeight Crowd Counting library for Python that includes several pretrained state-of-the-art models.

LWCC: A LightWeight Crowd Counting library for Python LWCC is a lightweight crowd counting framework for Python. It wraps four state-of-the-art models

Matija Teršek 39 Dec 28, 2022
TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

TorchMultimodal (Alpha Release) Introduction TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

Meta Research 663 Jan 6, 2023
Implementation of NÜWA, state of the art attention network for text to video synthesis, in Pytorch

NÜWA - Pytorch (wip) Implementation of NÜWA, state of the art attention network for text to video synthesis, in Pytorch. This repository will be popul

Phil Wang 463 Dec 28, 2022
A FAIR dataset of TCV experimental results for validating edge/divertor turbulence models.

TCV-X21 validation for divertor turbulence simulations Quick links Intro Welcome to TCV-X21. We're glad you've found us! This repository is designed t

null 0 Dec 18, 2021
Code for reproducing our analysis in the paper titled: Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency

Image Crop Analysis This is a repo for the code used for reproducing our Image Crop Analysis paper as shared on our blog post. If you plan to use this

Twitter Research 239 Jan 2, 2023
This repository contains the source code and data for reproducing results of Deep Continuous Clustering paper

Deep Continuous Clustering Introduction This is a Pytorch implementation of the DCC algorithms presented in the following paper (paper): Sohil Atul Sh

Sohil Shah 197 Nov 29, 2022
Repository for reproducing `Model-Based Robust Deep Learning`

Model-Based Robust Deep Learning (MBRDL) In this repository, we include the code necessary for reproducing the code used in Model-Based Robust Deep Le

Alex Robey 16 Sep 19, 2022
Pytorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

StackGAN-v2 StackGAN-v1: Tensorflow implementation StackGAN-v1: Pytorch implementation Inception score evaluation Pytorch implementation for reproduci

Han Zhang 809 Dec 16, 2022
Reproducing code of hair style replacement method from Barbershorp.

Barbershorp Reproducing code of hair style replacement method from Barbershorp. Also reproduces II2S, an improved version of Image2StyleGAN. Requireme

null 1 Dec 24, 2021
PyTorch framework, for reproducing experiments from the paper Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks

Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks. Code, based on the PyTorch framework, for reprodu

Asaf 3 Dec 27, 2022
State of the Art Neural Networks for Deep Learning

pyradox This python library helps you with implementing various state of the art neural networks in a totally customizable fashion using Tensorflow 2

Ritvik Rastogi 60 May 29, 2022
Code for paper "A Critical Assessment of State-of-the-Art in Entity Alignment" (https://arxiv.org/abs/2010.16314)

A Critical Assessment of State-of-the-Art in Entity Alignment This repository contains the source code for the paper A Critical Assessment of State-of

Max Berrendorf 16 Oct 14, 2022
State of the art Semantic Sentence Embeddings

Contrastive Tension State of the art Semantic Sentence Embeddings Published Paper · Huggingface Models · Report Bug Overview This is the official code

Fredrik Carlsson 88 Dec 30, 2022
tsai is an open-source deep learning package built on top of Pytorch & fastai focused on state-of-the-art techniques for time series classification, regression and forecasting.

Time series Timeseries Deep Learning Pytorch fastai - State-of-the-art Deep Learning with Time Series and Sequences in Pytorch / fastai

timeseriesAI 2.8k Jan 8, 2023
Deep Text Search is an AI-powered multilingual text search and recommendation engine with state-of-the-art transformer-based multilingual text embedding (50+ languages).

Deep Text Search - AI Based Text Search & Recommendation System Deep Text Search is an AI-powered multilingual text search and recommendation engine w

null 19 Sep 29, 2022
State-of-the-art data augmentation search algorithms in PyTorch

MuarAugment Description MuarAugment is a package providing the easiest way to a state-of-the-art data augmentation pipeline. How to use You can instal

null 43 Dec 12, 2022