Benchmarks of all public open-source convnet implementations

Overview

convnet-benchmarks

Easy benchmarking of all public open-source implementations of convnets. A summary is provided in the section below.

Machine: 6-core Intel Core i7-5930K CPU @ 3.50GHz + NVIDIA Titan X + Ubuntu 14.04 x86_64

Imagenet Winners Benchmarking

I pick some popular ImageNet models and clock the time for a full forward + backward pass, averaging my times over 10 runs. Dropout and softmax layers are ignored.
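
A minimal sketch of such a timing loop in Python (the function and argument names are illustrative, not the repository's actual scripts):

    import time

    def benchmark(step, n_warmup=1, n_runs=10):
        """Average wall-clock time of one forward + backward pass, in ms.

        `step` must run a single forward + backward pass and block until
        the GPU has finished (e.g. via the framework's synchronize call);
        otherwise only the kernel launches would be timed.
        """
        for _ in range(n_warmup):  # warm-up: allocator, autotuning, JIT
            step()
        start = time.time()
        for _ in range(n_runs):
            step()
        return (time.time() - start) / n_runs * 1000.0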

Notation

Input is described as {batch_size}x{num_channels}x{image_width}x{image_height}, where batch_size is the number of images in a minibatch, num_channels is the number of channels per image, and image_width and image_height are the width and height of the image. For example, 128x3x224x224 is a minibatch of 128 three-channel 224x224 images.
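
In code, a minibatch in this notation is just a 4-D array; a sketch with numpy, purely to illustrate the layout:

    import numpy as np

    # 128x3x224x224 = a minibatch of 128 three-channel 224x224 images
    batch = np.zeros((128, 3, 224, 224), dtype=np.float32)
    assert batch.shape == (128, 3, 224, 224)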

One small note:

The CuDNN benchmarks are done using the Torch bindings; one could also do the same via Caffe bindings or the bindings of any other library. This note is here to clarify that Caffe (native) and Torch (native) refer to the convolution kernels each framework ships as its default fallback. Some frameworks, such as TensorFlow and Chainer, are benchmarked with CuDNN without that being explicitly mentioned, so one might think those frameworks as a whole are faster than, for example, Caffe, which is not necessarily the case.

AlexNet (One Weird Trick paper) - Input 128x3x224x224

| Library | Class | Total (ms) | forward (ms) | backward (ms) |
| :-- | :-- | --: | --: | --: |
| CuDNN[R4]-fp16 (Torch) | cudnn.SpatialConvolution | 71 | 25 | 46 |
| Nervana-neon-fp16 | ConvLayer | 78 | 25 | 52 |
| CuDNN[R4]-fp32 (Torch) | cudnn.SpatialConvolution | 81 | 27 | 53 |
| TensorFlow | conv2d | 81 | 26 | 55 |
| Nervana-neon-fp32 | ConvLayer | 87 | 28 | 58 |
| fbfft (Torch) | fbnn.SpatialConvolution | 104 | 31 | 72 |
| Chainer | Convolution2D | 177 | 40 | 136 |
| cudaconvnet2* | ConvLayer | 177 | 42 | 135 |
| CuDNN[R2] * | cudnn.SpatialConvolution | 231 | 70 | 161 |
| Caffe (native) | ConvolutionLayer | 324 | 121 | 203 |
| Torch-7 (native) | SpatialConvolutionMM | 342 | 132 | 210 |
| CL-nn (Torch) | SpatialConvolutionMM | 963 | 388 | 574 |
| Caffe-CLGreenTea | ConvolutionLayer | 1442 | 210 | 1232 |

Overfeat [fast] - Input 128x3x231x231

| Library | Class | Total (ms) | forward (ms) | backward (ms) |
| :-- | :-- | --: | --: | --: |
| Nervana-neon-fp16 | ConvLayer | 176 | 58 | 118 |
| Nervana-neon-fp32 | ConvLayer | 211 | 69 | 141 |
| CuDNN[R4]-fp16 (Torch) | cudnn.SpatialConvolution | 242 | 86 | 156 |
| CuDNN[R4]-fp32 (Torch) | cudnn.SpatialConvolution | 268 | 94 | 174 |
| TensorFlow | conv2d | 279 | 90 | 189 |
| fbfft (Torch) | SpatialConvolutionCuFFT | 342 | 114 | 227 |
| Chainer | Convolution2D | 620 | 135 | 484 |
| cudaconvnet2* | ConvLayer | 723 | 176 | 547 |
| CuDNN[R2] * | cudnn.SpatialConvolution | 810 | 234 | 576 |
| Caffe | ConvolutionLayer | 823 | 355 | 468 |
| Torch-7 (native) | SpatialConvolutionMM | 878 | 379 | 499 |
| CL-nn (Torch) | SpatialConvolutionMM | 963 | 388 | 574 |
| Caffe-CLGreenTea | ConvolutionLayer | 2857 | 616 | 2240 |

OxfordNet [Model-A] - Input 64x3x224x224

| Library | Class | Total (ms) | forward (ms) | backward (ms) |
| :-- | :-- | --: | --: | --: |
| Nervana-neon-fp16 | ConvLayer | 254 | 82 | 171 |
| Nervana-neon-fp32 | ConvLayer | 320 | 103 | 217 |
| CuDNN[R4]-fp16 (Torch) | cudnn.SpatialConvolution | 471 | 140 | 331 |
| CuDNN[R4]-fp32 (Torch) | cudnn.SpatialConvolution | 529 | 162 | 366 |
| TensorFlow | conv2d | 540 | 158 | 382 |
| Chainer | Convolution2D | 885 | 251 | 632 |
| fbfft (Torch) | SpatialConvolutionCuFFT | 1092 | 355 | 737 |
| cudaconvnet2* | ConvLayer | 1229 | 408 | 821 |
| CuDNN[R2] * | cudnn.SpatialConvolution | 1099 | 342 | 757 |
| Caffe | ConvolutionLayer | 1068 | 323 | 745 |
| Torch-7 (native) | SpatialConvolutionMM | 1105 | 350 | 755 |
| CL-nn (Torch) | SpatialConvolutionMM | 3437 | 875 | 2562 |
| Caffe-CLGreenTea | ConvolutionLayer | 5620 | 988 | 4632 |

GoogleNet V1 - Input 128x3x224x224

| Library | Class | Total (ms) | forward (ms) | backward (ms) |
| :-- | :-- | --: | --: | --: |
| Nervana-neon-fp16 | ConvLayer | 230 | 72 | 157 |
| Nervana-neon-fp32 | ConvLayer | 270 | 84 | 186 |
| TensorFlow | conv2d | 445 | 135 | 310 |
| CuDNN[R4]-fp16 (Torch) | cudnn.SpatialConvolution | 462 | 112 | 349 |
| CuDNN[R4]-fp32 (Torch) | cudnn.SpatialConvolution | 470 | 130 | 340 |
| Chainer | Convolution2D | 687 | 189 | 497 |
| Caffe | ConvolutionLayer | 1935 | 786 | 1148 |
| CL-nn (Torch) | SpatialConvolutionMM | 7016 | 3027 | 3988 |
| Caffe-CLGreenTea | ConvolutionLayer | 9462 | 746 | 8716 |

Layer-wise Benchmarking (Last Updated April 2015)

Spatial Convolution layer (3D input 3D output, densely connected)

forward + backprop (wrt input and weights)
| Original Library | Class/Function Benchmarked | Total (ms) | forward (ms) | backward (ms) |
| :-- | :-- | --: | --: | --: |
| fbfft | SpatialConvolutionCuFFT | 256 | 101 | 155 |
| cuda-convnet2 * | ConvLayer | 977 | 201 | 776 |
| cuda-convnet** | pylearn2.cuda_convnet | 1077 | 312 | 765 |
| CuDNN R2 * | cudnn.SpatialConvolution | 1019 | 269 | 750 |
| Theano | CorrMM | 1225 | 407 | 818 |
| Caffe | ConvolutionLayer | 1231 | 396 | 835 |
| Torch-7 | SpatialConvolutionMM | 1265 | 418 | 877 |
| DeepCL | ConvolutionLayer | 6280 | 2648 | 3632 |
| cherry-picking**** | best per layer | 235 | 79 | 155 |

This table is NOT updated for the Titan X. The numbers below were measured on a Titan Black and are kept here only for informational and legacy purposes.

| Original Library | Class/Function Benchmarked | Total (ms) | forward (ms) | backward (ms) |
| :-- | :-- | --: | --: | --: |
| Theano (experimental)*** | conv2d_fft | 1178 | 304 | 874 |
| Torch-7 | nn.SpatialConvolutionBHWD | 1892 | 581 | 1311 |
| ccv | ccv_convnet_layer | 809+bw | 809 | n/a |
| Theano (legacy) | conv2d | 70774 | 3833 | 66941 |

  • * indicates that the library was tested with Torch bindings of the specific kernels.
  • ** indicates that the library was tested with Pylearn2 bindings.
  • *** This is an experimental module that uses FFT to calculate convolutions. It uses a lot of memory, according to @benanne.
  • **** The last row shows the results obtainable when choosing the best-performing library for each layer.
  • L1 - Input: 128x128, Batch-size 128, Feature maps: 3->96, Kernel size: 11x11, Stride: 1x1
  • L2 - Input: 64x64, Batch-size 128, Feature maps: 64->128, Kernel size: 9x9, Stride: 1x1
  • L3 - Input: 32x32, Batch-size 128, Feature maps: 128->128, Kernel size: 9x9, Stride: 1x1
  • L4 - Input: 16x16, Batch-size 128, Feature maps: 128->128, Kernel size: 7x7, Stride: 1x1
  • L5 - Input: 13x13, Batch-size 128, Feature maps: 384->384, Kernel size: 3x3, Stride: 1x1
  • The table is ranked by the total time of the forward+backward calls across layers (L1 + L2 + L3 + L4 + L5). The five layer configurations are also written out as code after this list.
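
For reference, here are the five layer configurations above written out as plain data that a benchmark driver could loop over (a sketch; the field names are illustrative):

    # (spatial input, batch size, input maps -> output maps, kernel, stride)
    LAYER_CONFIGS = [
        dict(name='L1', input=128, batch=128, fin=3,   fout=96,  kernel=11, stride=1),
        dict(name='L2', input=64,  batch=128, fin=64,  fout=128, kernel=9,  stride=1),
        dict(name='L3', input=32,  batch=128, fin=128, fout=128, kernel=9,  stride=1),
        dict(name='L4', input=16,  batch=128, fin=128, fout=128, kernel=7,  stride=1),
        dict(name='L5', input=13,  batch=128, fin=384, fout=384, kernel=3,  stride=1),
    ]
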
Breakdown
forward

Columns L1, L2, L3, L4, L5, Total are times in milliseconds

| Original Library | Class/Function Benchmarked | L1 | L2 | L3 | L4 | L5 | Total |
| :-- | :-- | --: | --: | --: | --: | --: | --: |
| fbfft | SpatialConvolutionCuFFT | 57 | 27 | 6 | 2 | 9 | 101 |
| cuda-convnet2 * | ConvLayer | 36 | 113 | 40 | 4 | 8 | 201 |
| cuda-convnet** | pylearn2.cuda_convnet | 38 | 183 | 68 | 7 | 16 | 312 |
| CuDNN R2 | cudnn.SpatialConvolution | 56 | 143 | 53 | 6 | 11 | 269 |
| Theano | CorrMM | 91 | 143 | 121 | 24 | 28 | 407 |
| Caffe | ConvolutionLayer | 93 | 136 | 116 | 24 | 27 | 396 |
| Torch-7 | nn.SpatialConvolutionMM | 94 | 149 | 123 | 24 | 28 | 418 |
| DeepCL | ConvolutionLayer | 738 | 1241 | 518 | 47 | 104 | 2648 |
| cherry-picking**** | best per layer | 36 | 27 | 6 | 2 | 8 | 79 |

backward (gradInput + gradWeight)

Columns L1, L2, L3, L4, L5, Total are times in milliseconds

| Original Library | Class/Function Benchmarked | L1 | L2 | L3 | L4 | L5 | Total |
| :-- | :-- | --: | --: | --: | --: | --: | --: |
| fbfft | SpatialConvolutionCuFFT | 76 | 45 | 12 | 4 | 18 | 155 |
| cuda-convnet2 * | ConvLayer | 103 | 467 | 162 | 15 | 29 | 776 |
| cuda-convnet** | pylearn2.cuda_convnet | 136 | 433 | 147 | 15 | 34 | 765 |
| CuDNN R2 | cudnn.SpatialConvolution | 139 | 401 | 159 | 19 | 32 | 750 |
| Theano | CorrMM | 179 | 405 | 174 | 29 | 31 | 818 |
| Caffe | ConvolutionLayer | 200 | 405 | 172 | 28 | 30 | 835 |
| Torch-7 | nn.SpatialConvolutionMM | 206 | 432 | 178 | 29 | 32 | 877 |
| DeepCL | ConvolutionLayer | 484 | 2144 | 747 | 59 | 198 | 3632 |
| cherry-picking**** | best per layer | 76 | 45 | 12 | 4 | 18 | 155 |

Comments
  • Benchmark TensorFlow

    Google's TensorFlow benchmarks are here!

    I've run the benchmarks on the ImageNet winners. When I saw issues with the numbers, memory, etc., I emailed @Yangqing to confirm that what I'm seeing is expected.

    With that disclaimer out of the way, here are some things that you should know about TensorFlow (as of the pip version that I installed today):

    • in-place ReLU seems non-existent in practice.
      • Yangqing says: "right now there are little in-place operations in TensorFlow and we pretty much rely on the scheduler and the memory pool to allocate and deallocate memory"
    • Supports CuDNN R2. No R3 support yet; Yangqing says the next version they are going to support is likely R4.

    Coming to the benchmarks:

    • Googlenet with batchsize 128 goes Out of Memory. The largest batch-size I could fit is 16 (tried 16, 32, 64, 128)
    • VGG with batchsize 64 goes Out of Memory (Edit: VGG memory issue was solved by using the BFC allocator updated by GOOG). ~~The largest batch-size I could fit is 32 (tried 32, 64).~~
    • I've also computed Torch7+CuDNN-R2 baselines for these batch-sizes.

    AlexNet (One Weird Trick paper) - Input 128x3x224x224

    | Library | Time (ms) | forward (ms) | backward (ms) |
    | :-: | --: | --: | --: |
    | CuDNN-R3 (Torch) | 96 | 32 | 64 |
    | Nervana (Neon) | 101 | 32 | 69 |
    | CuDNN-R2 (Torch) | 231 | 70 | 161 |
    | TensorFlow | 326 | 96 | 230 |

    Overfeat [fast] - Input 128x3x231x231

    | Library | Time (ms) | forward (ms) | backward (ms) |
    | :-: | --: | --: | --: |
    | CuDNN-R3 (Torch) | 326 | 113 | 213 |
    | fbfft (Torch) | 342 | 114 | 227 |
    | CuDNN-R2 (Torch) | 810 | 234 | 576 |
    | TensorFlow | 1084 | 316 | 768 |

    OxfordNet [Model-A] - Input 64x3x224x224

    | Library | Time (ms) | forward (ms) | backward (ms) |
    | :-: | --: | --: | --: |
    | Nervana | 590 | 180 | 410 |
    | CuDNN-R3 (Torch) | 615 | 196 | 418 |
    | CuDNN-R2 (Torch) | 1099 | 342 | 757 |
    | TensorFlow | 1840 | 545 | 1295 |

    GoogleNet V1 - Input 16x3x224x224

    | Library | Time (ms) | forward (ms) | backward (ms) |
    | :-: | --: | --: | --: |
    | CuDNN-R2 (Torch) | 564 | 174 | 390 |
    | TensorFlow | 590 | 54 | 536 |

    Note that at batch size of 16, googlenet with CuDNN-R2 + Torch likely runs into dispatching overhead, so it's an exotic comparison, but not practically very interesting or encouraging.

    There you go.

    I'm assuming that the first release of TensorFlow is still quite unpolished, and that they will improve it over time with various memory and time optimizations baked in.

    opened by soumith 112
  • [August 2015] Rejigging the marks...

    With CuDNN R3 coming in, improvements to Nervana, faster Facebook kernels, and a new kid on the block called Chainer, I will be doing a minor re-run of the benchmarks to see how things have improved.

    Target date: August 15th.

    I am still thinking quite a lot about how to take the benchmarks forward, beyond ConvNets, beyond images (into NLP, video and audio) and beyond single-GPU. If any domain experts have suggestions (especially for audio and NLP), please do write to me.

    The only thing that stopped me from multi-GPU benchmarks was the lack of enough frameworks to benchmark. This seems to have changed somewhat, and a decent number of frameworks now support multi-GPU, so I will plan on that.

    More fun to come soon.

    Checklist:

    • [x] CuDNN R3
      • fp 16
      • fp 32
    • [x] Nervana Neon
      • fp 16
      • fp 32
    • [ ] Chainer
    • [x] CL-Torch
    • [x] CL-Caffe (greentea)
    • [x] FB-CuNN
    opened by soumith 56
  • Theano fft experimental version

    This adds a benchmark of Theano's experimental FFT version. I also try to make it clearer that this is work in progress, and which conclusions can't be inferred from it.

    opened by nouiz 29
  • Theano benchmark: Use pylearn2 only for cuda-convnet wrapper

    This changes the Theano benchmark to use the Theano conv2d op directly instead of setting up an MLP in pylearn2. It fixes the problem of the experimental FFT convolution crashing for some of the backpropagation timings.

    opened by f0k 16
  • Update Theano benchmark for latest version

    Recently, the cuDNN- and gemm-based convolutions have been enabled by default in Theano. This PR updates the compile modes so that the correct versions are benchmarked again. (There is no need to update the timings; this PR just ensures that you get the correct numbers when running with the latest Theano version.)
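
    As a sketch of what pinning the benchmarked implementation can look like, assuming the relevant Theano optimizer tags are 'conv_dnn' and 'conv_gemm' (the variable name is illustrative):

        import theano

        # Exclude the cuDNN- and gemm-based convolutions so the legacy
        # implementation is the one that actually gets benchmarked.
        legacy_mode = theano.compile.get_default_mode().excluding(
            'conv_dnn', 'conv_gemm')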

    opened by f0k 12
  • Theano benchmark: Removed two lines resetting the compile mode

    The Theano benchmark code had two forgotten lines resetting the compile mode halfway through the benchmark, so everything run afterwards wouldn't use the GPU by default. Fixed by this PR.

    opened by f0k 11
  • cuda-convnet2

    Added the relevant config files and a basic README.

    Looks like I have to add some low-level clocking and synchronization code inside to do layer-wise benchmarking (right now the entire network :forward is asynchronous). I dropped an email to Alex, and he said that's the best approach as well.

    opened by soumith 11
  • Backpropagation benchmarks for Theano

    This adds benchmarks for the backward pass to the Theano benchmark suite. It pretends that whatever was in sharedY after the forward pass benchmark is already the gradient of the (not-actually-existing) cost wrt. the output, so it just does the backward step to compute the gradient wrt. the weights.
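
    A minimal sketch of that trick (assuming Theano's known_grads argument to theano.grad; the shapes are illustrative):

        import numpy as np
        import theano
        import theano.tensor as T

        X = T.tensor4('X')
        W = theano.shared(np.random.randn(96, 3, 11, 11).astype('float32'))
        out = T.nnet.conv2d(X, W)

        # Pretend grad_out is already d(cost)/d(out), so only the backward
        # step wrt. the weights is computed (and benchmarked).
        grad_out = T.tensor4('grad_out')
        gW = theano.grad(cost=None, wrt=W, known_grads={out: grad_out})
        backward_w = theano.function([X, grad_out], gW)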

    Thinking again, are you actually interested in computing the gradient wrt. the weights, or rather the gradient wrt. the input? Or both? Or should that be two separate benchmarks? Let me know and I'll update the pull request.

    By the way, I've omitted the GFLOP formula as I guess you already figured it out for some of the other libraries and can just copy it over. Otherwise, FilterActs, WeightActs and ImageActs in pylearn2 also have a flops() method that should do the correct thing.

    opened by f0k 10
  • add imagenet benchmarks for theano and lasagne

    Hello,

    I am interested in comparing the performance of Theano against TensorFlow and Torch. I have ported the TensorFlow implementation to Theano/Lasagne; mostly it is a line-by-line correspondence. I was slightly confused by the stat calculations of the time measurements, so I replaced them with numpy stats.

    opened by kshmelkov 8
  • Greentea / Caffe with OpenCL benchmarks

    This adds benchmarking for https://github.com/naibaf7/caffe (see also: https://github.com/BVLC/caffe/pull/2610) It is called project Greentea and contains a complete OpenCL backend for Caffe.

    Tested & found working on Fedora 22 & Ubuntu 14.04. The installation script is written so that it should work on Ubuntu 13.04 to 15.04. On Fedora 22, the packages need to be installed manually.

    The biggest pitfall here is a faulty OpenCL installation and/or incompatible libraries. Make sure to correctly install CUDA and the nVidia drivers (or FGLRX for AMD) first.

    Note that it will be quite a bit slower than Caffe with CUDA. The OpenCL backend is still a work-in-progress project. The standard compilation also uses ViennaCL-BLAS for simplicity, which is often slower than AMD's clBLAS.

    opened by naibaf7 8
  • CPU Convnet Benchmarks: Caffe vs. Torch Discrepancies (20x) on Jetson TX1 A57 CPU

    Caffe is 20x faster than Torch when benchmarking on the ARM Cortex-A57 CPU of the NVIDIA Jetson TX1. I performed the same test on an Intel Xeon E5-2637 CPU using Caffe + OpenBLAS (CPU) vs. Torch + OpenBLAS (CPU), and the differences are fairly small (< 30%).

    Does anyone have any tips/tricks to get the Torch CPU code to be on par with Caffe CPU code on the ARM A57?

    Lua Benchmark:

    1. Imports Caffe's bvlc_alexnet model to a nn specification in Lua using LoadCaffe (https://github.com/szagoruyko/loadcaffe).
    2. Torch is installed using the standard installation method shown here http://torch.ch/docs/getting-started.html. OpenBLAS is detected, and I verify that there are 4 threads by looking at the number of luajit threads that are spawned whenever I call the benchmark.
    3. 'th benchmark.lua' loads the AlexNet model and measures the time it takes to perform model:forward(inputs) for some random inputs.

    Test configuration is as follows: model: bvlc_alexnet, batch_size = 100, input size = 3x227x227, iter = 1, threads = 4. The resulting images per second (inference, forward pass only): 0.25 FPS (or 400,000 ms per batch of 100). My Lua benchmark code for the CPU can be downloaded here: http://homes.cs.washington.edu/~cdel/download/benchmark_A57.tgz

    Caffe Benchmark:

    1. I build Caffe with OpenBLAS, and I set OPENBLAS_NUM_THREADS = 4. GNU configure shows that OPENBLAS and Neon vector instructions are enabled on the ARM A57.
    2. I run build/tools/caffe time --model=models/bvlc_alexnet/deploy.prototxt --iterations=1

    Test configuration for Caffe on the ARM A57 CPU: bvlc_alexnet, batch_size = 100, input_size = 3x227x227, iter = 1, threads = 4. The resulting images per second (inference, forward pass only): 5.2 FPS (or 19036 ms per batch of 100).
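
    For reference, the arithmetic behind those throughput numbers (batch of 100 images):

        torch_s = 400_000 / 1000.0  # 400,000 ms per batch -> 400 s
        caffe_s = 19_036 / 1000.0   # 19,036 ms per batch  -> ~19 s
        print(100 / torch_s)        # 0.25 FPS (Torch)
        print(100 / caffe_s)        # ~5.25 FPS (Caffe)
        print(torch_s / caffe_s)    # ~21x, the ~20x gap in the title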

    opened by ghost 7
  • convnet-benchmark is not working with tensorflow 1.8 on AMD or Nvidia cards

    If you build and install TensorFlow 1.8 from https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/tree/r1.8-rocm and run the benchmark tests, you will get 0 values on both AMD and Nvidia cards with their latest public release drivers.

    2018-07-09 20:30:59.599248: Forward across 100 steps, 0.000 +/- 0.000 sec / batch

    Is this benchmark no longer applicable to the latest TensorFlow 1.8? Are the devs going to update it to work with TensorFlow 1.8 and later versions?

    opened by pramenku 2
  • cltorch googlenet.lua: attempt to index global 'cudnn' (a nil value)

    Operating System: Ubuntu 16.04.3 LTS, Linux kernel 4.13.0. GPU: AMD RX 580, ROCm backend.

    ~/convnet-benchmarks/cltorch$ th imagenet_winners/benchmark.lua 
    libthclnn_searchpath    /storage/home/yige/torch-cl/install/lib/lua/5.1/libTHCLNN.so
    Running on device: gfx803
    Using Advanced Micro Devices, Inc. , OpenCL platform: AMD Accelerated Parallel Processing
    Using OpenCL device: gfx803
    ModelType: OverFeat[fast]       Kernels: clnn   Input shape: 128x3x231x231
    clnn                                    :updateOutput():     673.86
    clnn                                 :updateGradInput():     344.01
    clnn                               :accGradParameters():     480.27
    clnn                                           :Forward:     673.86
    clnn                                          :Backward:     824.27
    clnn                                             :TOTAL:    1498.13
    ModelType: AlexNet      Kernels: clnn   Input shape: 128x3x224x224
    clnn                                    :updateOutput():     311.44
    clnn                                 :updateGradInput():     158.93
    clnn                               :accGradParameters():     623.55
    clnn                                           :Forward:     311.44
    clnn                                          :Backward:     782.48
    clnn                                             :TOTAL:    1093.92
    ModelType: VGG Model-A  Kernels: clnn   Input shape: 64x3x224x224
    clnn                                    :updateOutput():     671.76
    clnn                                 :updateGradInput():     508.20
    clnn                               :accGradParameters():    1174.33
    clnn                                           :Forward:     671.76
    clnn                                          :Backward:    1682.52
    clnn                                             :TOTAL:    2354.29
    /storage/home/yige/torch-cl/install/bin/luajit: ./imagenet_winners/googlenet.lua:33: attempt to index global 'cudnn' (a nil value)
    stack traceback:
            ./imagenet_winners/googlenet.lua:33: in function <./imagenet_winners/googlenet.lua:30>
            imagenet_winners/benchmark.lua:34: in main chunk
            [C]: in function 'dofile'
            ...e/torch-cl/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
            [C]: at 0x00405e90
    
    opened by yige-hu 0
  • worse chainer convnet-benchmarks performance on cupy-2.0.0 as compared to cupy-1.0.0.1

    Hello, could you please help explain this issue? Thanks in advance. We found that convnet-benchmarks performance on cupy-2.0.0 is worse than on cupy-1.0.0.1. We don't know whether it is a problem with cupy or with the convnet-benchmarks scripts. We reported this issue in https://github.com/cupy/cupy/issues/753 but have got no response yet.

    Details:

    Test environment: P100

    Test steps:

    1. Install chainer.
    2. Get the convnet-benchmarks code: git clone https://github.com/mitmul/convnet-benchmarks
    3. Run the test cases below.

    Case 3.1: pip install cupy==1.0.0.1

        (py2-chainer-gpu) [sys_dltest@mlt-gpu200 chainer]$ python train_imagenet.py alexnet
        ('Chainer version:', '2.0.0b1')
        ('CuPy version:', '1.0.0.1')
        ('CUDA:', True)
        ('CUDA Version:', u'V8.0.61')
        ('cuDNN:', True)
        ('cuDNN Version:', 5110)
        ('Input data shape:', (128, 3, 224, 224))
        ('Average Forward: ', 16.15312328338623, ' ms')
        ('Average Backward: ', 35.27830085754395, ' ms')
        ('Average Total: ', 51.431424140930176, ' ms')

    Case 3.2: pip install cupy==2.0.0

        (py2-chainer-gpu) [sys_dltest@mlt-gpu200 chainer]$ python train_imagenet.py alexnet
        ('Chainer version:', '2.0.0b1')
        ('CuPy version:', '2.0.0')
        ('CUDA:', True)
        ('cuDNN:', True)
        ('cuDNN Version:', 5110)
        ('Input data shape:', (128, 3, 224, 224))
        ('Average Forward: ', 35.381299591064455, ' ms')
        ('Average Backward: ', 63.26389694213867, ' ms')
        ('Average Total: ', 98.64519653320312, ' ms')

    Case 3.3: pip install cupy==2.0.0rc1

        (py2-chainer-gpu) [sys_dltest@mlt-gpu200 chainer]$ python train_imagenet.py alexnet
        ('Chainer version:', '2.0.0b1')
        ('CuPy version:', '2.0.0rc1')
        ('CUDA:', True)
        ('cuDNN:', True)
        ('cuDNN Version:', 5110)
        ('Input data shape:', (128, 3, 224, 224))
        ('Average Forward: ', 35.5438117980957, ' ms')
        ('Average Backward: ', 63.336796569824216, ' ms')
        ('Average Total: ', 98.88060836791992, ' ms')

    Notice: when running the cupy==2.0.0* cases, you need to comment out the following lines in train_imagenet.py:

        if chainer.cuda.available:
            cuda_v = cupy.cuda.compiler._get_nvcc_version().split()[-1].decode('utf-8')
            print('CUDA Version:', cuda_v)

    opened by mingxiaoh 2
  • Tensorflow benchmark files not updated after migration?

    After trying benchmark_googlenet.py, I ran into "TypeError: Expected int32, got list containing Tensors of type '_Message' instead." After searching, I found some links suggesting that newer TensorFlow no longer supports the old API, such as the tf.concat function.

    I think the files use the old functions from TensorFlow.
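
    For reference, this error is the typical symptom of the tf.concat argument-order change in TensorFlow 1.0; a sketch with illustrative tensors:

        import tensorflow as tf

        a = tf.zeros([1, 7, 7, 64])
        b = tf.zeros([1, 7, 7, 32])
        # Pre-1.0 API, as used by the old benchmark scripts:
        #   merged = tf.concat(3, [a, b])
        # TensorFlow >= 1.0 swapped the arguments:
        merged = tf.concat([a, b], axis=3)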

    opened by hit1001 0
Owner

Soumith Chintala
PyTorch implementation of spectral graph ConvNets, NIPS’16

Graph ConvNets in PyTorch October 15, 2017 Xavier Bresson http://www.ntu.edu.sg/home/xbresson https://github.com/xbresson https://twitter.com/xbresson

Xavier Bresson 287 Jan 4, 2023
Source code and notebooks to reproduce experiments and benchmarks on Bias Faces in the Wild (BFW).

Face Recognition: Too Bias, or Not Too Bias? Robinson, Joseph P., Gennady Livitz, Yann Henon, Can Qin, Yun Fu, and Samson Timoner. "Face recognition:

Joseph P. Robinson 41 Dec 12, 2022
PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source parts such as human-robot interaction, complex motion control, environment perception, SLAM positioning, and navigation.

Simplified Chinese | English PaddleRobotics PaddleRobotics is a collection of open-source robotics algorithm libraries based on Paddle, including open-source components for human-robot interaction, complex motion control, environment perception, SLAM positioning and navigation, etc. Human-robot interaction: the proactive multimodal interaction technique TFVT-HRI takes vision, speech, and touch-sensor inputs to the robot

null 185 Dec 26, 2022
Dcf-game-infrastructure-public - Contains all the components necessary to run a DC finals (attack-defense CTF) game from OOO

dcf-game-infrastructure All the components necessary to run a game of the OOO DC

Order of the Overflow 46 Sep 13, 2022
Code and model benchmarks for "SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology"

NeurIPS 2020 SEVIR Code for paper: SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology Requirement

USAF - MIT Artificial Intelligence Accelerator 46 Dec 15, 2022
"NAS-Bench-301 and the Case for Surrogate Benchmarks for Neural Architecture Search".

NAS-Bench-301 This repository containts code for the paper: "NAS-Bench-301 and the Case for Surrogate Benchmarks for Neural Architecture Search". The

AutoML-Freiburg-Hannover 57 Nov 30, 2022
Benchmarks for semi-supervised domain generalization.

Semi-Supervised Domain Generalization This code is the official implementation of the following paper: Semi-Supervised Domain Generalization with Stoc

Kaiyang 49 Dec 10, 2022
Sequence modeling benchmarks and temporal convolutional networks

Sequence Modeling Benchmarks and Temporal Convolutional Networks (TCN) This repository contains the experiments done in the work An Empirical Evaluati

CMU Locus Lab 3.5k Jan 1, 2023
NeurIPS 2021 Datasets and Benchmarks Track

AP-10K: A Benchmark for Animal Pose Estimation in the Wild Introduction | Updates | Overview | Download | Training Code | Key Questions | License Intr

AP-10K 82 Dec 11, 2022
Training code and evaluation benchmarks for the "Self-Supervised Policy Adaptation during Deployment" paper.

Self-Supervised Policy Adaptation during Deployment PyTorch implementation of PAD and evaluation benchmarks from Self-Supervised Policy Adaptation dur

Nicklas Hansen 101 Nov 1, 2022
Benchmarks for the Optimal Power Flow Problem

Power Grid Lib - Optimal Power Flow This benchmark library is curated and maintained by the IEEE PES Task Force on Benchmarks for Validation of Emergi

A Library of IEEE PES Power Grid Benchmarks 207 Dec 8, 2022
Benchmark spaces - Benchmarks of how well different two dimensional spaces work for clustering algorithms

benchmark_spaces Benchmarks of how well different two dimensional spaces work fo

Bram Cohen 6 May 7, 2022
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.

Machine Learning From Scratch About Python implementations of some of the fundamental Machine Learning models and algorithms from scratch. The purpose

Erik Linder-Norén 21.8k Jan 9, 2023
StudioGAN is a Pytorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation.

StudioGAN is a Pytorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation.

null 3k Jan 8, 2023
Scripts of Machine Learning Algorithms from Scratch. Implementations of machine learning models and algorithms using nothing but NumPy with a focus on accessibility. Aims to cover everything from basic to advance.

Algo-ScriptML Python implementations of some of the fundamental Machine Learning models and algorithms from scratch. The goal of this project is not t

Algo Phantoms 81 Nov 26, 2022
Pytorch Lightning 1.2k Jan 6, 2023
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

DLR-RM 4.7k Jan 1, 2023
PyTorch implementations of the paper: "Learning Independent Instance Maps for Crowd Localization"

IIM - Crowd Localization This repo is the official implementation of paper: Learning Independent Instance Maps for Crowd Localization. The code is dev

tao han 91 Nov 10, 2022
This repository contains the implementations related to the experiments of a set of publicly available datasets that are used in the time series forecasting research space.

TSForecasting This repository contains the implementations related to the experiments of a set of publicly available datasets that are used in the tim

Rakshitha Godahewa 80 Dec 30, 2022