Simple, efficient and flexible vision toolbox for mxnet framework.

Ligeng Zhu

Last update: Oct 19, 2019

Related tags

Overview

MXbox: Simple, efficient and flexible vision toolbox for mxnet framework.

MXbox is a toolbox aiming to provide a general and simple interface for vision tasks. This project is greatly inspired by PyTorch and torchvision. Detailed copyright files are on the way. Improvements and suggestions are welcome.

Installation

MXBox is now available on PyPi.

pip install mxbox

Features

Define preprocess as a flow

transform = transforms.Compose([
    transforms.RandomSizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.mx.ToNdArray(),
    transforms.mx.Normalize(mean = [ 0.485, 0.456, 0.406 ],
                            std  = [ 0.229, 0.224, 0.225 ]),
])

PS: By default, mxbox uses PIL to read and transform images. But it also supports other backends like accimage and skimage.

More usages can be found in documents and examples.

Build an multi-thread DataLoader in few lines

Common datasets such as cifar10, cifar100, SVHN, MNIST are out-of-the-box. You can simply load them from mxbox.datasets.

from mxbox import transforms, datasets, DataLoader
trans = transforms.Compose([
        transforms.mx.ToNdArray(), 
        transforms.mx.Normalize(mean = [ 0.485, 0.456, 0.406 ],
                                std  = [ 0.229, 0.224, 0.225 ]),
])
dataset = datasets.CIFAR10('~/.mxbox/cifar10', transform=trans, download=True)

batch_size = 32
feedin_shapes = {
    'batch_size': batch_size,
    'data': [mx.io.DataDesc(name='data', shape=(batch_size, 3, 32, 32), layout='NCHW')],
    'label': [mx.io.DataDesc(name='softmax_label', shape=(batch_size, ), layout='N')]
}
loader = DataLoader(dataset, feedin_shapes, threads=8, shuffle=True)

Or you can also easily create your own, which only requires to implement __getitem__ and __len__.

class TooYoungScape(mxbox.Dataset):
    def __init__(self, root, lst, transform=None):
        self.root = root
        with open(osp.join(root, lst), 'r') as fp:
            self.lst = [line.strip().split('\t') for line in fp.readlines()]
        self.transform = transform

    def __getitem__(self, index):
        img = self.pil_loader(osp.join(self.root, self.lst[index][0]))
        if self.transform is not None:
            img = self.transform(img)
        return {'data': img, 'softmax_label': img}

    def __len__(self):
        return len(self.lst)
        
dataset = TooYoungScape('~/.mxbox/TooYoungScape', "train.lst", transform=trans)
loader = DataLoader(dataset, feedin_shapes, threads=8, shuffle=True)

Load popular model with pretrained weights

Note: current under construction, many models lack of pretrained weights and some of their definition files are missing.

vgg = mxbox.models.vgg(num_classes=10, pretrained=True)
resnet = mxbox.models.resnet152(num_classes=10, pretrained=True)

TODO list

FLAG options?
Efficient prefetch.
Common Models preparation.
More friendly error logging.

UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.

Unified Multi-modal Transformers This repository maintains the official implementation of the paper UMT: Unified Multi-modal Transformers for Joint Vi

Applied Research Center (ARC), Tencent PCG

84 Jan 4, 2023

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

12.6k Jan 9, 2023

⚡ Fast • 🪶 Lightweight • 0️⃣ Dependency • 🔌 Pluggable • 😈 TLS interception • 🔒 DNS-over-HTTPS • 🔥 Poor Man's VPN • ⏪ Reverse & ⏩ Forward • 👮🏿 "Proxy Server" framework • 🌐 "Web Server" framework • ➵ ➶ ➷ ➠ "PubSub" framework • 👷 "Work" acceptor & executor framework

Table of Contents Features Install Using PIP Stable version Development version Using Docker Stable version Development version Using HomeBrew Stable

2.2k Jan 8, 2023

Bagua is a flexible and performant distributed training algorithm development framework.

786 Dec 17, 2022

Pocsploit is a lightweight, flexible and novel open source poc verification framework

208 Dec 24, 2022

A flexible framework of neural networks for deep learning

5.8k Jan 6, 2023

5.5k Feb 12, 2021

ObjectDetNet is an easy, flexible, open-source object detection framework

Getting started with the ObjectDetNet ObjectDetNet is an easy, flexible, open-source object detection framework which allows you to easily train, resu

5 Aug 25, 2020

PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)

Vision Transformer for Fast and Efficient Scene Text Recognition (ICDAR 2021) ViTSTR is a simple single-stage model that uses a pre-trained Vision Tra

198 Dec 27, 2022

Simple, efficient and flexible vision toolbox for mxnet framework.

Related tags

Overview

MXbox: Simple, efficient and flexible vision toolbox for mxnet framework.

Installation

Features

TODO list

You might also like...

UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Bagua is a flexible and performant distributed training algorithm development framework.

Pocsploit is a lightweight, flexible and novel open source poc verification framework

A flexible framework of neural networks for deep learning

A flexible framework of neural networks for deep learning

ObjectDetNet is an easy, flexible, open-source object detection framework

PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)

Owner

Ligeng Zhu

InsightFace: 2D and 3D Face Analysis Project on MXNet and PyTorch

Tensorboard for pytorch (and chainer, mxnet, numpy, ...)

Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation, available for both PyTorch and Tensorflow.

Modular Probabilistic Programming on MXNet

MXNet implementation for: Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution

Reproduce ResNet-v2(Identity Mappings in Deep Residual Networks) with MXNet

A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way :chestnut:

[CVPR 2021] 'Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator'

TACTO: A Fast, Flexible and Open-source Simulator for High-Resolution Vision-based Tactile Sensors

QTool: A Low-bit Quantization Toolbox for Deep Neural Networks in Computer Vision