PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

Meta Research

Last update: Jan 3, 2023

Related tags

Deep Learning SlowFast

Overview

PySlowFast

PySlowFast is an open source video understanding codebase from FAIR that provides state-of-the-art video classification models with efficient training. This repository includes implementations of the following methods:

Introduction

The goal of PySlowFast is to provide a high-performance, lightweight PyTorch codebase with state-of-the-art video backbones for video understanding research on different tasks (classification, detection, etc.). It is designed to support rapid implementation and evaluation of novel video research ideas. PySlowFast includes implementations of the following backbone network architectures:

SlowFast
Slow
C2D
I3D
Non-local Network
X3D

Updates

We now support Multiscale Vision Transformers on Kinetics and ImageNet. See projects/mvit for more information.
We now support PyTorchVideo models and datasets. See projects/pytorchvideo for more information.
We now support X3D Models. See projects/x3d for more information.
We now support Multigrid Training for efficiently training video models. See projects/multigrid for more information.
PySlowFast is released in conjunction with our ICCV 2019 Tutorial.

License

PySlowFast is released under the Apache 2.0 license.

Model Zoo and Baselines

We provide a large set of baseline results and trained models available for download in the PySlowFast Model Zoo.

Installation

Please find installation instructions for PyTorch and PySlowFast in INSTALL.md. You may follow the instructions in DATASET.md to prepare the datasets.

Quick Start

Follow the example in GETTING_STARTED.md to start working with video models in PySlowFast.

Visualization Tools

We offer a range of visualization tools for the train/eval/test processes, for model analysis, and for running inference with trained models. More information is available at Visualization Tools.

Contributors

PySlowFast is written and maintained by Haoqi Fan, Yanghao Li, Bo Xiong, Wan-Yen Lo, Christoph Feichtenhofer.

Citing PySlowFast

If you find PySlowFast useful in your research, please use the following BibTeX entry for citation.

@misc{fan2020pyslowfast,
  author =       {Haoqi Fan and Yanghao Li and Bo Xiong and Wan-Yen Lo and
                  Christoph Feichtenhofer},
  title =        {PySlowFast},
  howpublished = {\url{https://github.com/facebookresearch/slowfast}},
  year =         {2020}
}

Comments

Flopcount mismatch and bad results on UCF101 compared to GluonCV implementation

I tried to train the SlowFast 4x16, R50 model on UCF101 and found the following. a) The FLOP count for this model is displayed as follows:

[INFO: misc.py:  100]: Params: 33,791,021
[INFO: misc.py:  101]: Mem: 261.55078125 MB
[WARNING: flop_count.py:  120]: Skipped operation aten::clone 664 time(s)
[WARNING: flop_count.py:  120]: Skipped operation aten::add_ 111 time(s)
[WARNING: flop_count.py:  120]: Skipped operation aten::mul_ 330 time(s)
[WARNING: flop_count.py:  120]: Skipped operation aten::eq 110 time(s)
[WARNING: flop_count.py:  120]: Skipped operation aten::batch_norm 110 time(s)
[WARNING: flop_count.py:  120]: Skipped operation aten::relu_ 102 time(s)
[WARNING: flop_count.py:  120]: Skipped operation aten::max_pool3d 4 time(s)
[WARNING: flop_count.py:  120]: Skipped operation aten::cat 5 time(s)
[WARNING: flop_count.py:  120]: Skipped operation aten::add 32 time(s)
[WARNING: flop_count.py:  120]: Skipped operation aten::avg_pool3d 2 time(s)
[WARNING: flop_count.py:  120]: Skipped operation aten::permute 1 time(s)
[WARNING: flop_count.py:  120]: Skipped operation aten::dropout 1 time(s)
[WARNING: flop_count.py:  120]: Skipped operation aten::Int 1 time(s)
[INFO: misc.py:  103]: FLOPs: 27.63797632 GFLOPs

However, in the original paper, the FLOP count for the 4x16, R50 version is 36.1 GFLOPs.
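One way to put numbers on this comparison is to parse the log itself. The sketch below is only an illustration: it reads a shortened copy of the log above, lists which operator types the counter skipped, and computes the ratio between the reported total and the 36.1 GFLOPs quoted from the paper. The function name `summarize` is hypothetical, and the sketch does not establish whether the skipped operators explain the gap.

```python
import re

# Shortened copy of the log quoted above.
LOG = """\
[WARNING: flop_count.py:  120]: Skipped operation aten::batch_norm 110 time(s)
[WARNING: flop_count.py:  120]: Skipped operation aten::max_pool3d 4 time(s)
[INFO: misc.py:  103]: FLOPs: 27.63797632 GFLOPs
"""

SKIPPED = re.compile(r"Skipped operation (\S+) (\d+) time\(s\)")
FLOPS = re.compile(r"FLOPs: ([\d.]+) GFLOPs")


def summarize(log_text):
    """Return ({operator: times skipped}, reported GFLOPs or None)."""
    skipped = {op: int(count) for op, count in SKIPPED.findall(log_text)}
    match = FLOPS.search(log_text)
    gflops = float(match.group(1)) if match else None
    return skipped, gflops


skipped, gflops = summarize(LOG)
PAPER_GFLOPS = 36.1  # figure quoted in the report above
ratio = gflops / PAPER_GFLOPS
print(skipped)
print(f"counted {gflops:.2f} GFLOPs, {ratio:.1%} of the paper figure")
```

The skipped operators are not included in the reported total, so the two numbers may not be measuring exactly the same thing.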

b) The best result I can get when training SlowFast 4x16, R50 on UCF101 from scratch is 73.49 top-1 accuracy and 88.82 top-5 accuracy, and that is after 512 epochs. With the GluonCV implementation, the same model trained for only 64 epochs reaches as high as 92 top-1 accuracy.

Combined with the GFLOP count mismatch, is it reasonable to suspect an error in the model?

opened by huang-ziyuan 34

Getting nan prediction in demo_net.py with AVA on custom video

When running python tools/run_net.py --cfg /home/ubuntu/slowfast/configs/AVA/c2/SLOWFAST_32x2_R101_50_50.yaml I am getting NaN predictions in https://github.com/facebookresearch/SlowFast/blob/master/tools/demo_net.py#L242

Python version - 3.6.10 (3.7.7 did not work either)
PyTorch - 1.5.1
Torchvision - 0.6.1
CUDA - 10.0
Detectron2 - 0.2

Here is the config:

TRAIN:
  ENABLE: False
  DATASET: ava
  BATCH_SIZE: 16
  EVAL_PERIOD: 1
  CHECKPOINT_PERIOD: 1
  AUTO_RESUME: True
  CHECKPOINT_FILE_PATH: "/home/ubuntu/SLOWFAST_32x2_R101_50_50.pkl"
  CHECKPOINT_TYPE: pytorch
DATA:
  NUM_FRAMES: 32
  SAMPLING_RATE: 2
  TRAIN_JITTER_SCALES: [256, 320]
  TRAIN_CROP_SIZE: 224
  TEST_CROP_SIZE: 256
  INPUT_CHANNEL_NUM: [3, 3]
DETECTION:
  ENABLE: True
  ALIGNED: False
AVA:
  BGR: False
  DETECTION_SCORE_THRESH: 0.8
  TEST_PREDICT_BOX_LISTS: ["person_box_67091280_iou90/ava_detection_val_boxes_and_labels.csv"]
SLOWFAST:
  ALPHA: 4
  BETA_INV: 8
  FUSION_CONV_CHANNEL_RATIO: 2
  FUSION_KERNEL_SZ: 5
RESNET:
  ZERO_INIT_FINAL_BN: True
  WIDTH_PER_GROUP: 64
  NUM_GROUPS: 1
  DEPTH: 101
  TRANS_FUNC: bottleneck_transform
  STRIDE_1X1: False
  NUM_BLOCK_TEMP_KERNEL: [[3, 3], [4, 4], [6, 6], [3, 3]]
  SPATIAL_DILATIONS: [[1, 1], [1, 1], [1, 1], [2, 2]]
  SPATIAL_STRIDES: [[1, 1], [2, 2], [2, 2], [1, 1]]
NONLOCAL:
  LOCATION: [[[], []], [[], []], [[6, 13, 20], []], [[], []]]
  GROUP: [[1, 1], [1, 1], [1, 1], [1, 1]]
  INSTANTIATION: dot_product
  POOL: [[[2, 2, 2], [2, 2, 2]], [[2, 2, 2], [2, 2, 2]], [[2, 2, 2], [2, 2, 2]], [[2, 2, 2], [2, 2, 2]]]
BN:
  USE_PRECISE_STATS: False
  NUM_BATCHES_PRECISE: 200
SOLVER:
  MOMENTUM: 0.9
  WEIGHT_DECAY: 1e-7
  OPTIMIZING_METHOD: sgd
MODEL:
  NUM_CLASSES: 80
  ARCH: slowfast
  MODEL_NAME: SlowFast
  LOSS_FUNC: bce
  DROPOUT_RATE: 0.5
  HEAD_ACT: sigmoid
TEST:
  ENABLE: False
  DATASET: ava
  BATCH_SIZE: 8
DATA_LOADER:
  NUM_WORKERS: 2
  PIN_MEMORY: True
DEMO:
  ENABLE: True
  LABEL_FILE_PATH: "./demo/AVA/ava.names"
  DATA_SOURCE: "/home/ubuntu/gestures_dataset_right_wave_15.mp4"
  DISPLAY_WIDTH: 640
  DISPLAY_HEIGHT: 480
  DETECTRON2_OBJECT_DETECTION_MODEL_CFG: "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"
  DETECTRON2_OBJECT_DETECTION_MODEL_WEIGHTS: "detectron2://COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/model_final_280758.pkl"
NUM_GPUS: 1
NUM_SHARDS: 1
RNG_SEED: 0
OUTPUT_DIR: .

And here is part of the logs:

/home/ubuntu/slowfast/slowfast/models/head_helper.py:111: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert out.shape[2] == 1
/home/ubuntu/detectron2_repo/detectron2/layers/roi_align.py:105: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert rois.dim() == 2 and rois.size(1) == 5
/home/ubuntu/slowfast/slowfast/models/head_helper.py:111: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert out.shape[2] == 1
/home/ubuntu/detectron2_repo/detectron2/layers/roi_align.py:105: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert rois.dim() == 2 and rois.size(1) == 5
[WARNING: flop_count.py: 63]: Skipped operation aten::batch_norm 215 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::batch_norm 215 time(s)
[WARNING: flop_count.py: 63]: Skipped operation aten::relu_ 204 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::relu_ 204 time(s)
[WARNING: flop_count.py: 63]: Skipped operation aten::max_pool3d 7 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::max_pool3d 7 time(s)
[WARNING: flop_count.py: 63]: Skipped operation aten::add 69 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::add 69 time(s)
[WARNING: flop_count.py: 63]: Skipped operation aten::div 3 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::div 3 time(s)
[WARNING: flop_count.py: 63]: Skipped operation aten::avg_pool3d 2 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::avg_pool3d 2 time(s)
[WARNING: flop_count.py: 63]: Skipped operation prim::PythonOp 2 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation prim::PythonOp 2 time(s)
[WARNING: flop_count.py: 63]: Skipped operation aten::max_pool2d 2 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::max_pool2d 2 time(s)
[WARNING: flop_count.py: 63]: Skipped operation aten::dropout 1 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::dropout 1 time(s)
[WARNING: flop_count.py: 63]: Skipped operation aten::sigmoid 1 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.flop_count: 63: Skipped operation aten::sigmoid 1 time(s)
[INFO: misc.py: 160]: Flops: 146.54916608 G
[07/11 21:40:19][INFO] slowfast.utils.misc: 160: Flops: 146.54916608 G
[WARNING: activation_count.py: 54]: Skipped operation aten::batch_norm 215 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::batch_norm 215 time(s)
[WARNING: activation_count.py: 54]: Skipped operation aten::relu_ 204 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::relu_ 204 time(s)
[WARNING: activation_count.py: 54]: Skipped operation aten::max_pool3d 7 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::max_pool3d 7 time(s)
[WARNING: activation_count.py: 54]: Skipped operation aten::add 69 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::add 69 time(s)
[WARNING: activation_count.py: 54]: Skipped operation aten::einsum 6 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::einsum 6 time(s)
[WARNING: activation_count.py: 54]: Skipped operation aten::div 3 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::div 3 time(s)
[WARNING: activation_count.py: 54]: Skipped operation aten::avg_pool3d 2 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::avg_pool3d 2 time(s)
[WARNING: activation_count.py: 54]: Skipped operation prim::PythonOp 2 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation prim::PythonOp 2 time(s)
[WARNING: activation_count.py: 54]: Skipped operation aten::max_pool2d 2 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::max_pool2d 2 time(s)
[WARNING: activation_count.py: 54]: Skipped operation aten::dropout 1 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::dropout 1 time(s)
[WARNING: activation_count.py: 54]: Skipped operation aten::sigmoid 1 time(s)
[07/11 21:40:19][WARNING] fvcore.nn.activation_count: 54: Skipped operation aten::sigmoid 1 time(s)
[INFO: misc.py: 165]: Activations: 293.60136 M
[07/11 21:40:19][INFO] slowfast.utils.misc: 165: Activations: 293.60136 M
[INFO: misc.py: 168]: nvidia-smi
[07/11 21:40:19][INFO] slowfast.utils.misc: 168: nvidia-smi

Please feel free to ask for any additional information if needed.
question

opened by Serhii-Tiurin 16
Run error

I use the command python ./tools/run_net.py --cfg ./configs/Kinetics/C2D_8x8_R50.yaml DATA.PATH_TO_DATA_DIR ~/data/kinetics400/ NUM_GPUS 2 TRAIN.BATCH_SIZE 16 to train a model, but it fails with the following error:

RuntimeError: Caught RuntimeError in DataLoader worker process 0
RuntimeError: Failed to fetch video after 10 retries
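The message says the loader gave up on a video after repeated attempts. A reasonable first check, offered here only as a sketch, is whether every path in the list file resolves to a non-empty file. The helper name `find_bad_videos`, the file name `train.csv`, and the assumption that each line starts with a video path are all illustrative and are not taken from the PySlowFast code.

```python
import os
import tempfile


def find_bad_videos(list_path, root=""):
    """Return (path, reason) pairs for listed videos that are missing or empty."""
    bad = []
    with open(list_path) as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue  # ignore blank lines
            video = os.path.join(root, fields[0])
            if not os.path.isfile(video):
                bad.append((video, "missing"))
            elif os.path.getsize(video) == 0:
                bad.append((video, "empty"))
    return bad


# Self-contained demonstration with temporary placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "good.mp4"), "wb") as f:
        f.write(b"\x00" * 16)  # placeholder bytes, not a real video
    open(os.path.join(tmp, "empty.mp4"), "wb").close()
    list_path = os.path.join(tmp, "train.csv")  # hypothetical list file
    with open(list_path, "w") as f:
        f.write("good.mp4 0\nempty.mp4 1\nabsent.mp4 2\n")
    report = find_bad_videos(list_path, root=tmp)

for video, reason in report:
    print(reason, os.path.basename(video))
```

A file that exists and is non-empty can still fail to decode, so a clean report from this check does not rule out codec or backend problems.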

I am very confused. Thanks.

opened by onlyonewater 16
AVA benchmark with provided models and configs
Hi, thanks for your great codebase. Could you release the performance numbers for the configs in the AVA folder?

configs/AVA/
  C2D_8x8_R50_SHORT.yaml
  SLOWFAST_32x2_R50_SHORT.yaml
documentation
opened by tonysy 14
demo Error

Hi, when I run python tools/run_net.py --cfg configs/Kinetics/c2/SLOWFAST_8x8_R50.yaml DATA.PATH_TO_DATA_DIR I get the error below. It may be a problem with the test.csv I downloaded. Could you provide that file?

Traceback (most recent call last):
  File "/home/vcaadmin/miniconda3/envs/slowfast/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/data/gao/slowfast/slowfast/utils/multiprocessing.py", line 50, in run
    func(cfg)
  File "/data/gao/sf/SlowFast-master/tools/test_net.py", line 160, in test
    test_loader = loader.construct_loader(cfg, "test")
  File "/data/gao/slowfast/slowfast/datasets/loader.py", line 80, in construct_loader
    dataset = build_dataset(dataset_name, cfg, split)
  File "/data/gao/slowfast/slowfast/datasets/build.py", line 32, in build_dataset
    return DATASET_REGISTRY.get(name)(cfg, split)
  File "/data/gao/slowfast/slowfast/datasets/kinetics.py", line 74, in __init__
    self._construct_loader()
  File "/data/gao/slowfast/slowfast/datasets/kinetics.py", line 92, in _construct_loader
    assert len(path_label.split()) == 2
AssertionError
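The failing assertion, len(path_label.split()) == 2, implies that every line of the list file must split on whitespace into exactly two fields. The sketch below applies the same test to a few made-up lines to show which kinds of entries would trip it. The helper name `check_list_lines` and the sample paths are hypothetical.

```python
def check_list_lines(lines):
    """Return (line_number, line) for entries that do not split into exactly two fields."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if len(line.split()) != 2:
            problems.append((number, line.rstrip("\n")))
    return problems


sample = [
    "videos/abseiling/clip_0001.mp4 0\n",      # path and label: passes
    "videos/air drumming/clip_0002.mp4 1\n",   # space inside the path: 3 fields
    "videos/archery/clip_0003.mp4,2\n",        # comma-separated: 1 field
    "\n",                                      # blank line: 0 fields
]

problems = check_list_lines(sample)
for number, line in problems:
    print(f"line {number}: {line!r}")
```

Running the same check over the real test.csv would point at the first offending line instead of a bare AssertionError.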

thanks a lot!!
question

opened by gtgtgt1117 12
Running error

Dear author,

I am trying to evaluate your algorithm's performance on the AVA dataset using the command line "python tools/run_net.py --cfg configs/AVA/c2/SLOWFAST_64x2_R101_50_50.yaml DATA.PATH_TO_DATA_DIR data/ava/ TEST.CHECKPOINT_FILE_PATH model/SLOWFAST_64x2_R101_50_50.pkl TRAIN.ENABLE False NUM_GPUS 1". I got the following error:

Traceback (most recent call last):
  File "tools/run_net.py", line 152, in <module>
    main()
  File "tools/run_net.py", line 147, in main
    test(cfg=cfg)
  File "/media/workspace_karl/slowfast/tools/test_net.py", line 138, in test
    convert_from_caffe2=cfg.TEST.CHECKPOINT_TYPE == "caffe2",
  File "/media/workspace_karl/slowfast/slowfast/utils/checkpoint.py", line 186, in load_checkpoint
    caffe2_checkpoint = pickle.load(f, encoding="latin1")
_pickle.UnpicklingError: unpickling stack underflow
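An UnpicklingError of this kind means the bytes handed to pickle.load are not a valid pickle stream. One possible cause, stated here as an assumption and not a diagnosis, is a truncated download or a saved web page in place of the weights. The sketch below guesses a file's type from its first bytes; `sniff_checkpoint` is a hypothetical helper, and the signatures it checks are the general pickle, zip and markup prefixes, nothing specific to PySlowFast.

```python
def sniff_checkpoint(head: bytes) -> str:
    """Guess what kind of file a checkpoint is from its first few bytes."""
    if not head:
        return "empty file"
    if head.startswith(b"PK\x03\x04"):
        # Zip container, the format written by recent torch.save versions.
        return "zip archive"
    if head[:1] == b"\x80" and len(head) > 1:
        # Pickle protocol 2 and later start with 0x80 followed by the protocol number.
        return f"binary pickle (protocol {head[1]})"
    if head.lstrip()[:1] == b"<":
        # HTML or XML, for example an error page saved under the model's file name.
        return "markup text"
    return "unknown"


print(sniff_checkpoint(b"\x80\x02}q\x00"))
print(sniff_checkpoint(b"<!DOCTYPE html>"))
print(sniff_checkpoint(b""))
```

To inspect a real file, read a short prefix in binary mode, for example `open(path, "rb").read(16)`, and pass it to the function. Comparing the file size with the size listed for the download is another quick check.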

I wonder if you could point out what is going wrong.

Many Thanks!

opened by yangsusanyang 10

UnicodeDecodeError when loading the pre-trained model

Thank you for your great work!

When I tried to load the model you provided here, I encountered the following error:

Traceback (most recent call last):
  File "tools/run_net.py", line 152, in <module>
    main()
  File "tools/run_net.py", line 128, in main
    train(cfg=cfg)
  File "/home/ubuntu/projects/SlowFast/tools/train_net.py", line 217, in train
    convert_from_caffe2=cfg.TRAIN.CHECKPOINT_TYPE == "caffe2",
  File "/home/ubuntu/projects/SlowFast/slowfast/utils/checkpoint.py", line 212, in load_checkpoint
    checkpoint = torch.load(path_to_checkpoint, map_location="cpu")
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.7/site-packages/torch/serialization.py", line 426, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.7/site-packages/torch/serialization.py", line 603, in _load
    magic_number = pickle_module.load(f, **pickle_load_args)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd4 in position 1: invalid continuation byte

This problem probably occurred because the file is being decoded in a different way from how it was encoded. I would appreciate it if you could explain how this problem arises and how to solve it.
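The traceback shows convert_from_caffe2=cfg.TRAIN.CHECKPOINT_TYPE == "caffe2" followed by a call to torch.load, which suggests the checkpoint type was not set to caffe2 and the file was read with a default text decoding. The Caffe2 loading path quoted in another report on this page calls pickle.load(f, encoding="latin1") instead. The sketch below reproduces the general effect with a hand-written pickle of a Python 2 byte string. It is an illustration under the assumption that the model file is a Python 2 era pickle, and it does not use the actual checkpoint.

```python
import pickle

# Hand-written protocol-2 pickle of a Python 2 byte string holding bytes 0xd4 0xc3.
PY2_PICKLE = b"\x80\x02U\x02\xd4\xc3q\x00."

default_failed = False
try:
    pickle.loads(PY2_PICKLE)  # default decoding cannot handle byte 0xd4
except UnicodeDecodeError as err:
    default_failed = True
    print("default decoding failed:", err.reason)

# latin1 maps every byte value to a character, so decoding always succeeds.
value = pickle.loads(PY2_PICKLE, encoding="latin1")
print(repr(value))
```

If the model is indeed a Caffe2 checkpoint, setting TRAIN.CHECKPOINT_TYPE to caffe2 on the command line would route loading through the latin1 path. Whether that applies to this particular file is not confirmed by the report.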

Thank you for your consideration!

opened by Shumpei-Kikuta 10

X3D is slower than slowfast

I compared the speed of X3D_L and SLOWFAST_8x8_R50 on the Kinetics-400 dataset. Surprisingly, although X3D_L has fewer GFLOPs (24.8 vs. 65.7), it runs about 3x slower than SLOWFAST_8x8_R50.

Is there something important that needs to be done to speed up X3D_L inference?
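FLOP count and wall-clock time measure different things, and the figures in this report make the gap easy to quantify. The short calculation below uses only the numbers quoted above and treats "about 3x slower" as exactly 3, which is an approximation.

```python
x3d_gflops, slowfast_gflops = 24.8, 65.7  # figures from the report
slowdown = 3.0                            # "about 3x slower", taken as exactly 3

flop_ratio = x3d_gflops / slowfast_gflops
# FLOPs executed per unit of wall-clock time, relative to SlowFast.
relative_throughput = flop_ratio / slowdown

print(f"X3D_L needs {flop_ratio:.0%} of the FLOPs")
print(f"but sustains about {relative_throughput:.0%} of the FLOP/s")
```

X3D relies on channel-wise convolutions, which tend to be limited by memory bandwidth on GPUs and so reach a lower fraction of peak arithmetic throughput. That is a common explanation for this kind of gap, not something verified for this setup.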

opened by gooners1886 8
Unexpected bus error encountered in worker.

I tried to run SLOWFAST_8x8_R50 inference on a small amount of Kinetics-400 test data (only 5 videos) on a Google Cloud Compute Engine machine with 8 K80 GPUs (each with 12 GB of GPU memory), but it fails with an unexpected error: ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).

Can someone help with this error? I simply don't know what went wrong.
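Since the error message itself points at shared memory, a quick check of how much space the shared-memory mount has left can confirm or rule out that explanation. The sketch below assumes a Linux system where shared memory is mounted at /dev/shm; the helper name `shm_free_gib` is hypothetical.

```python
import os
import shutil


def shm_free_gib(path="/dev/shm"):
    """Return free space on the shared-memory mount in GiB, or None if it is absent."""
    if not os.path.isdir(path):
        return None
    return shutil.disk_usage(path).free / 2**30


free = shm_free_gib()
print("no /dev/shm on this system" if free is None else f"{free:.2f} GiB free")
```

If the value is small, two common remedies are lowering DATA_LOADER.NUM_WORKERS and, when running inside Docker, starting the container with a larger --shm-size. Neither has been verified against this specific machine.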
question

opened by dixonhsiao 8
Training Speed of AVA

Hi, thanks for sharing your implementation for AVA. Would it be possible to add a trained model for the "configs/AVA/SLOWFAST_32x2_R50_SHORT.yaml" settings?

I am trying to fine-tune on the AVA dataset with pre-trained Kinetics weights using SLOWFAST_32x2_R50_SHORT.yaml.

I observed that training one epoch takes around 3.5 hours on 4 2080 Ti GPUs (batch size 24). Training 20 epochs will take around 3 days.
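The estimate can be checked with a couple of lines of arithmetic, using only the figures in this report.

```python
hours_per_epoch = 3.5   # observed on 4 GPUs with batch size 24
epochs = 20
gpus, batch_size = 4, 24

total_hours = hours_per_epoch * epochs
per_gpu_batch = batch_size / gpus

print(f"{total_hours:.0f} h in total, about {total_hours / 24:.1f} days")
print(f"{per_gpu_batch:.0f} clips per GPU per iteration")
```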
enhancement

opened by hzhang57 8
Reproduction and comparison to video-nonlocal-net

Hi, Thank you for making this code available.

I am trying to reproduce the results from your code. I got 72.6% top-1 accuracy for I3D with Kinetics/I3D_8x8_R50.yaml (https://github.com/facebookresearch/SlowFast/blob/master/configs/Kinetics/I3D_8x8_R50.yaml) on 8 GPUs for 196 epochs, without modifying any other parameter.

I suppose this is because my version of the Kinetics dataset is missing about 600 validation and 9K training videos. It would be a great help if you could comment on this.

Another question concerns comparing the results of this repo with the video-nonlocal-net repo (https://github.com/facebookresearch/video-nonlocal-net). For instance, I3D_8x8_R50 has an accuracy of 73.5 here and 73.5 there, but the initialisation here is random, while video-nonlocal-net uses ImageNet initialisation. However, the number of epochs there is much lower (~100 epochs).

What makes random initialisation work as well as ImageNet initialisation, as in the video-nonlocal-net repo? Is it just the number of epochs or something else? How does one compare them?

Looking forward to your answers, Gurkirt

opened by gurkirt 8
Reproduce AVA results of MAE_ST

Hi there,

Could the authors share the configs used to produce the AVA v2.2 results in the Masked Autoencoders As Spatiotemporal Learners paper?

Throughout the repo I cannot find any related configs for ViT. The hyperparameters mentioned in the paper (https://arxiv.org/pdf/2205.09113.pdf, Appendix A, Table 6) seem unreasonable to me: with batch size 128, the learning rate is 7.2 for ViT-L with the SGD optimizer.

Thanks

opened by yuanliangzhe 0
RuntimeError: Distributed package doesn't have NCCL built in

I am getting this error while training the SlowFast model with Kinetics data on my local machine.

Please suggest a way to resolve the issue.

Thank you.
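The error says the installed PyTorch build was compiled without NCCL, which is typical of CPU-only builds and of platforms where NCCL is not shipped. The sketch below, offered as one way to check, asks PyTorch which backend is usable. `pick_backend` is a hypothetical helper; it returns None when PyTorch or its distributed package is unavailable, and it assumes that gloo is the fallback when NCCL is missing.

```python
def pick_backend():
    """Return "nccl" if this PyTorch build supports it, else "gloo", else None."""
    try:
        import torch.distributed as dist
    except ImportError:
        return None  # PyTorch is not installed in this environment
    if not dist.is_available():
        return None  # this build has no distributed support at all
    # Assumption: gloo is the usual fallback backend when NCCL is missing.
    return "nccl" if dist.is_nccl_available() else "gloo"


backend = pick_backend()
print("usable backend:", backend)
```

If the result is gloo, running with a single GPU or pointing the codebase at the gloo backend are the usual workarounds. The exact configuration option for the backend should be looked up in slowfast/config/defaults.py and is not confirmed here.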

opened by Bhavik-Ardeshna 0
INSTALL.md is not up-to-date. Here is an updated version in my SlowFast fork
Hello, the INSTALL.md file is not up to date and has the following issues:

torchvision with ffmpeg support (video_probe_from_memory errors)
PyTorchVideo: pip install lacks some necessary functions
sklearn is no longer supported; scikit-learn can be installed explicitly
Python version: Python 3.9 is not supported for building torchvision with ffmpeg
the order of installation steps is confusing, etc.

I included more details about the above problems and their solutions in the INSTALL.md file of my own SlowFast fork here. I hope these changes to the installation steps help you build and work with SlowFast more easily.
opened by alpargun 0
NumPy array not writable error in slowfast/datasets/decoder.py when using torchvision backend. "np.array" typecasting required
When using torchvision backend I receive the error:

/home/alp/Desktop/slowfast-with-torchvision/SlowFast/slowfast/datasets/decoder.py:273: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1666642991888/work/torch/csrc/utils/tensor_numpy.cpp:199.)

I am using

pytorch=1.13.0

torchvision=0.14.0

ffmpeg=4.2

In order to fix this problem, the line should be changed to: video_tensor = torch.from_numpy(np.frombuffer(np.array(video_handle), dtype=np.uint8))

When I made the above change, I no longer receive the warning.
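The warning comes from wrapping a read-only buffer. The standard-library sketch below illustrates the idea without NumPy or PyTorch: an immutable bytes object exposes a read-only buffer, while a bytearray copy of the same data is writable. It assumes video_handle holds the raw encoded bytes, which is not confirmed by the report, and the variable names are illustrative.

```python
raw = bytes([0, 1, 2, 3])  # stand-in for the encoded video bytes

read_only = memoryview(raw)
writable = memoryview(bytearray(raw))  # bytearray() copies into mutable storage

print(read_only.readonly)
print(writable.readonly)

writable[0] = 255  # allowed on the copy; the original bytes are untouched
print(raw[0], writable[0])
```

With NumPy the same effect can be had by copying before conversion, for example np.frombuffer(video_handle, dtype=np.uint8).copy(). That is an alternative to the change proposed above, not a statement about what the maintainers adopted.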
opened by alpargun 0
maskfeat mvit-s fine-tuning on k400 config

Hi, I am trying to reproduce MViT-S fine-tuning on K400 and found that the config file (configs/masked_ssl/k400_MVITv2_S_16x4_FT.yaml) differs from the config in the paper in the SOLVER part: the config file warms up for 20 epochs with a maximum of 100 epochs, whereas the paper warms up for 5 epochs and trains for 200 epochs. Which config should I refer to? Another question: MViT-L is fine-tuned at larger spatial sizes of 312 and 352. Are you using a higher-resolution version of the K400 dataset (larger than short-side 256)?

opened by cir7 0
PyAV decoding leads to a memory leak when 'duration' is None

Hi, I followed the changes in #541 to decode videos with the PyAV backend. When I use 'pyav' for training and the 'duration' parameter is None (the condition checked at decoder.py line 436), memory usage increases rapidly up to the machine limit and the training thread gets stuck.

I tried simply raising an error when duration is None and dropping these videos, and training then works well. But I don't know why this happens, and it would be better if I could use these videos for training. Can someone help? Thanks!

opened by wnzhyee 1

