HugsVision is an easy-to-use Hugging Face wrapper for state-of-the-art computer vision

Overview

License: MIT

HugsVision is an open-source, easy-to-use, all-in-one Hugging Face wrapper for computer vision.

The goal is to create a fast, flexible and user-friendly toolkit that can be used to easily develop state-of-the-art computer vision technologies, including systems for Image Classification, Semantic Segmentation, Object Detection, Image Generation, Denoising and much more.

⚠️ HugsVision is currently in beta. ⚠️

Quick installation

HugsVision is constantly evolving. New features, tutorials, and documentation will appear over time. HugsVision can be installed via PyPI to quickly use the standard library. Alternatively, a local installation suits users who want to run experiments and modify or customize the toolkit. HugsVision supports both CPU and GPU computation. For most recipes, however, a GPU is necessary during training. Please note that CUDA must be properly installed to use GPUs.
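Before starting a recipe, it can help to check whether a CUDA GPU is actually visible. The sketch below assumes a PyTorch backend; the `pick_device` helper is hypothetical and not part of HugsVision, and it falls back to the CPU when PyTorch is not installed.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Selected device:", pick_device())
```

If this prints `cpu` on a machine with a GPU, the CUDA installation is the first thing to check.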

Anaconda setup

conda create --name HugsVision python=3.6 -y
conda activate HugsVision

More information on managing environments with Anaconda can be found in the conda cheat sheet.

Install via PyPI

Once you have created your Python environment (Python 3.6+) you can simply type:

pip install hugsvision

Install with GitHub

Once you have created your Python environment (Python 3.6+) you can simply type:

git clone https://github.com/qanastek/HugsVision.git
cd HugsVision
pip install -r requirements.txt
pip install --editable .

Any modification made to the hugsvision package is picked up automatically, because we installed it with the --editable flag.
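One way to confirm that Python is importing the local clone is to look at where the package is loaded from. This is a minimal sketch using only the standard library; the `package_location` helper is hypothetical.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def package_location(name: str) -> Optional[Path]:
    """Return the directory a top-level package is imported from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).resolve().parent


if __name__ == "__main__":
    # With an editable install, this is expected to point inside the cloned
    # HugsVision directory rather than inside site-packages.
    print(package_location("hugsvision"))
```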

Example Usage

Let's train a binary classifier that can distinguish people with or without pneumothorax from their radiographs.

Steps:

  1. Move to the recipe directory: cd recipes/pneumothorax/binary_classification/
  2. Download the dataset (~779 MB).
  3. Transform the dataset into a directory-based one with the process.py script.
  4. Train the model: python train_example_vit.py --imgs="./pneumothorax_binary_classification_task_data/" --name="pneumo_model_vit" --epochs=1
  5. Rename <MODEL_PATH>/config.json to <MODEL_PATH>/preprocessor_config.json. In my case, the model is located in the output path, for example ./out/MYVITMODEL/1_2021-08-10-00-53-58/model/
  6. Make a prediction: python predict.py --img="42.png" --path="./out/MYVITMODEL/1_2021-08-10-00-53-58/model/"
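Step 5 can also be scripted. The following is a minimal sketch using only the standard library; the `rename_model_config` helper is hypothetical, and the model directory is whatever path your own training run produced.

```python
from pathlib import Path


def rename_model_config(model_dir: str) -> Path:
    """Rename config.json to preprocessor_config.json inside model_dir and return the new path."""
    model_path = Path(model_dir)
    source = model_path / "config.json"
    target = model_path / "preprocessor_config.json"
    if not source.is_file():
        raise FileNotFoundError(f"No config.json found in {model_path}")
    if target.exists():
        # Refuse to overwrite a file that is already there.
        raise FileExistsError(f"{target} already exists")
    source.rename(target)
    return target
```

For example, `rename_model_config("./out/MYVITMODEL/1_2021-08-10-00-53-58/model/")` would perform the rename for the run shown above.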

Model recipes

You can find all the currently available models or tasks under the recipes/ folder.

Training a Transformer Image Classifier to help radiologists detect Pneumothorax cases: A demonstration of how to train an Image Classifier Transformer model with HugsVision that can distinguish people with or without pneumothorax from their radiographs.
Training an End-To-End Object Detection with Transformers to detect blood cells: A demonstration of how to train an E2E Object Detection Transformer model with HugsVision that can detect and identify blood cells.
Training a Transformer Image Classifier to help endoscopists: A demonstration of how to train an Image Classifier Transformer model with HugsVision that can help endoscopists automate the detection of various anatomical landmarks, pathological findings or endoscopic procedures in the gastrointestinal tract.
Training and using a TorchVision Image Classifier in 5 min to identify skin cancer: A fast and easy tutorial on training a TorchVision Image Classifier that can help dermatologists identify melanoma cases, using HugsVision and the HAM10000 dataset.

HuggingFace Spaces

You can try some of the models and tasks on Hugging Face Spaces:

Model architectures

All the model checkpoints provided by 🤗 Transformers and compatible with our tasks can be seamlessly integrated from the huggingface.co model hub where they are uploaded directly by users and organizations.

Before you start implementing, please check whether your model has a PyTorch implementation by referring to this table.

🤗 Transformers currently provides the following architectures for Computer Vision:

  1. ViT (from Google Research, Brain Team) released with the paper An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby.
  2. DeiT (from Facebook AI and Sorbonne University) released with the paper Training data-efficient image transformers & distillation through attention by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.
  3. BEiT (from Microsoft Research) released with the paper BEIT: BERT Pre-Training of Image Transformers by Hangbo Bao, Li Dong and Furu Wei.
  4. DETR (from Facebook AI) released with the paper End-to-End Object Detection with Transformers by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov and Sergey Zagoruyko.

Build PyPI package

Build: python setup.py sdist bdist_wheel

Upload: twine upload dist/*

Citation

If you want to cite the tool you can use this:

@misc{HugsVision,
  title={HugsVision},
  author={Yanis Labrak},
  publisher={GitHub},
  journal={GitHub repository},
  howpublished={\url{https://github.com/qanastek/HugsVision}},
  year={2021}
}
Comments
  • tuple object has no attribute 'keys'

    I've been trying to fine-tune the vision transformer on a custom dataset. I followed the steps from one of the tutorial notebooks and ran into the following error: AttributeError: tuple object has no attribute 'keys'

    I thought I had done something wrong, so I decided to try the demo tutorial, but the same error shows up. It seems like a bug.

    opened by M-Melodious 15
  • issues and errors in first tryout in ubuntu 20

    sudo su
    apt update
    apt upgrade
    shutdown -r now
    
    sudo su
    cd
    apt install python3-pip
    pip install gradio
    git clone https://huggingface.co/spaces/HugsVision/Skin-Cancer
    cd Skin-Cancer/
    pip install -r requirements.txt
    nano app.py
    
    
    ##
    interface.launch(server_name="0.0.0.0")
    ##
    

    and run with

    python3 app.py
    
    	/usr/local/lib/python3.8/dist-packages/torchvision/transforms/transforms.py:287: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
    	  warnings.warn(
    	2022-01-10 23:04:23.268536: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
    	2022-01-10 23:04:23.268576: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
    	/usr/local/lib/python3.8/dist-packages/torchvision/transforms/transforms.py:287: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
    	  warnings.warn(
    	[['images/akiec.jpg'], ['images/bcc.jpg'], ['images/bkl.jpg'], ['images/df.jpg'], ['images/mel.jpg'], ['images/nv.jpg'], ['images/vasc.jpg']]
    	/usr/local/lib/python3.8/dist-packages/gradio/interface.py:188: UserWarning: The `capture_session` parameter in the `Interface` is deprecated and has no effect.
    	  warnings.warn("The `capture_session` parameter in the `Interface` is deprecated and has no effect.")
    	/usr/local/lib/python3.8/dist-packages/gradio/interface.py:205: UserWarning: 'darkhuggingface' theme name is deprecated, using dark-huggingface instead.
    	  warnings.warn(f"'{theme}' theme name is deprecated, using {DEPRECATED_THEME_MAP[theme]} instead.")
    	/usr/local/lib/python3.8/dist-packages/gradio/interface.py:248: UserWarning: The `allow_flagging` parameter in `Interface` nowtakes a string value ('auto', 'manual', or 'never'), not a boolean. Setting parameter to: 'never'.
    	  warnings.warn("The `allow_flagging` parameter in `Interface` now"
    	/usr/local/lib/python3.8/dist-packages/gradio/interface.py:271: UserWarning: The `show_tips` parameter in the `Interface` is deprecated. Please use the `show_tips` parameter in `launch()` instead
    	  warnings.warn("The `show_tips` parameter in the `Interface` is deprecated. Please use the `show_tips` parameter in `launch()` instead")
    	/usr/local/lib/python3.8/dist-packages/gradio/interface.py:293: UserWarning: The `encrypt` parameter in the `Interface` classwill be deprecated. Please provide this parameterin `launch()` instead
    	  warnings.warn(
    	Running on local URL:  http://localhost:7860/
    
    	To create a public link, set `share=True` in `launch()`.
    

    [2022-01-10 23:04:49,518] ERROR in app: Exception on /api/predict/ [POST]
    Traceback (most recent call last):
      File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 2073, in wsgi_app
    	response = self.full_dispatch_request()
      File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1518, in full_dispatch_request
    	rv = self.handle_user_exception(e)
      File "/usr/local/lib/python3.8/dist-packages/flask_cors/extension.py", line 165, in wrapped_function
    	return cors_after_request(app.make_response(f(*args, **kwargs)))
      File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1516, in full_dispatch_request
    	rv = self.dispatch_request()
      File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1502, in dispatch_request
    	return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
      File "/usr/local/lib/python3.8/dist-packages/gradio/networking.py", line 93, in wrapper
    	return func(*args, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/gradio/networking.py", line 232, in predict
    	prediction, durations = process_example(app.interface, example_id)
      File "/usr/local/lib/python3.8/dist-packages/gradio/process_examples.py", line 12, in process_example
    	prediction, durations = interface.process(raw_input)
      File "/usr/local/lib/python3.8/dist-packages/gradio/interface.py", line 408, in process
    	processed_input = [input_component.preprocess(raw_input[i])
      File "/usr/local/lib/python3.8/dist-packages/gradio/interface.py", line 408, in <listcomp>
    	processed_input = [input_component.preprocess(raw_input[i])
    IndexError: list index out of range
    

    How can I solve this?

    bug 
    opened by johnfelipe 3
  • ImportError: cannot import name 'DetrFeatureExtractor' from 'transformers' (unknown location)

    HugsVision/hugsvision/inference/SemanticSegmentationInference.py in
          4
          5 import torch
    ----> 6 from transformers import DetrFeatureExtractor, DetrForObjectDetection, pipeline
          7
          8 import numpy as np

    ImportError: cannot import name 'DetrFeatureExtractor' from 'transformers' (unknown location)

    bug model 
    opened by Breeze-Zero 2
  • Change CPU / GPU device identifier for training and inference

    Note: Will be merged in version 0.72

    #27

    | Device   | Status |
    |:--------:|:------:|
    | GPU      | ✔️ |
    | CPU      | ✔️ |
    | TPU      | ⌛ |
    | IPU      | ⌛ |
    | ARM / M1 | ⌛ |
    | APU      | ⌛ |

    bug model 
    opened by qanastek 0
  • Fix documentation for kvasir

    from hugsvision.nnet.VisionClassifierTrainer import VisionClassifierTrainer
    from transformers import ViTFeatureExtractor, ViTForImageClassification
    
    # `train`, `test`, `huggingface_model`, `label2id` and `id2label` are
    # assumed to be defined beforehand.
    trainer = VisionClassifierTrainer(
        model_name="MyKvasirV2Model",
        train=train,
        test=test,
        output_dir="./out/",
        max_epochs=20,
        batch_size=32,  # On RTX 2080 Ti
        lr=2e-5,
        fp16=True,
        model=ViTForImageClassification.from_pretrained(
            huggingface_model,
            num_labels=len(label2id),
            label2id=label2id,
            id2label=id2label,
        ),
        feature_extractor=ViTFeatureExtractor.from_pretrained(
            huggingface_model,
        ),
    )
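
The snippet above relies on `label2id` and `id2label`, which it does not define. As a rough illustration only, such mappings could be built from a list of class names as sketched below; the `build_label_maps` helper and the class names are hypothetical, and the exact key and value types HugsVision expects are not shown in the source.

```python
from typing import Dict, List, Tuple


def build_label_maps(class_names: List[str]) -> Tuple[Dict[str, int], Dict[int, str]]:
    """Build label-to-id and id-to-label dictionaries from class names.

    Names are de-duplicated and sorted so the ids are stable across runs.
    """
    ordered = sorted(set(class_names))
    label2id = {name: idx for idx, name in enumerate(ordered)}
    id2label = {idx: name for name, idx in label2id.items()}
    return label2id, id2label


if __name__ == "__main__":
    # Hypothetical class names, for illustration only.
    label2id, id2label = build_label_maps(["polyps", "cecum", "pylorus"])
    print(label2id)
    print(id2label)
```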
    
    documentation 
    opened by qanastek 0
  • freeze in webpage using ubuntu 20 and python 3.8

    (env) root@felipe:~/Skin-Cancer# python3 app.py
    /root/Skin-Cancer/env/lib/python3.8/site-packages/torchvision/transforms/transforms.py:332: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
      warnings.warn(
    /root/Skin-Cancer/env/lib/python3.8/site-packages/torchvision/transforms/transforms.py:332: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
      warnings.warn(
    /root/Skin-Cancer/env/lib/python3.8/site-packages/gradio/deprecation.py:40: UserWarning: `optional` parameter is deprecated, and it has no effect
      warnings.warn(value)
    [['images/akiec.jpg'], ['images/bcc.jpg'], ['images/bkl.jpg'], ['images/df.jpg'], ['images/mel.jpg'], ['images/nv.jpg'], ['images/vasc.jpg']]
    /root/Skin-Cancer/env/lib/python3.8/site-packages/gradio/deprecation.py:40: UserWarning: The 'type' parameter has been deprecated. Use the Number component instead.
      warnings.warn(value)
    /root/Skin-Cancer/env/lib/python3.8/site-packages/gradio/deprecation.py:40: UserWarning: `allow_screenshot` parameter is deprecated, and it has no effect
      warnings.warn(value)
    /root/Skin-Cancer/env/lib/python3.8/site-packages/gradio/deprecation.py:40: UserWarning: `capture_session` parameter is deprecated, and it has no effect
      warnings.warn(value)
    /root/Skin-Cancer/env/lib/python3.8/site-packages/gradio/deprecation.py:40: UserWarning: `show_tips` is deprecated in `Interface()`, please use it within `launch()` instead.
      warnings.warn(value)
    /root/Skin-Cancer/env/lib/python3.8/site-packages/gradio/deprecation.py:40: UserWarning: `encrypt` is deprecated in `Interface()`, please use it within `launch()` instead.
      warnings.warn(value)
    /root/Skin-Cancer/env/lib/python3.8/site-packages/gradio/interface.py:289: UserWarning: Currently, only the 'default' theme is supported.
      warnings.warn("Currently, only the 'default' theme is supported.")
    /root/Skin-Cancer/env/lib/python3.8/site-packages/gradio/interface.py:362: UserWarning: The `allow_flagging` parameter in `Interface` nowtakes a string value ('auto', 'manual', or 'never'), not a boolean. Setting parameter to: 'never'.
      warnings.warn(
    Running on local URL:  http://localhost:7860/
    
    To create a public link, set `share=True` in `launch()`.
    Traceback (most recent call last):
      File "/root/Skin-Cancer/env/lib/python3.8/site-packages/gradio/routes.py", line 275, in predict
    	output = await app.blocks.process_api(body, username, session_state)
      File "/root/Skin-Cancer/env/lib/python3.8/site-packages/gradio/blocks.py", line 274, in process_api
    	predictions = await run_in_threadpool(block_fn.fn, *processed_input)
      File "/root/Skin-Cancer/env/lib/python3.8/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    	return await anyio.to_thread.run_sync(func, *args)
      File "/root/Skin-Cancer/env/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
    	return await get_asynclib().run_sync_in_worker_thread(
      File "/root/Skin-Cancer/env/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    	return await future
      File "/root/Skin-Cancer/env/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    	result = context.run(func, *args)
      File "/root/Skin-Cancer/env/lib/python3.8/site-packages/gradio/interface.py", line 500, in <lambda>
    	lambda *args: self.run_prediction(args)[0]
      File "/root/Skin-Cancer/env/lib/python3.8/site-packages/gradio/interface.py", line 682, in run_prediction
    	prediction = predict_fn(*processed_input)
      File "app.py", line 35, in predict_image
    	model = TorchVisionClassifierInference(
      File "/root/Skin-Cancer/env/lib/python3.8/site-packages/hugsvision/inference/TorchVisionClassifierInference.py", line 29, in __init__
    	self.model = torch.load(self.model_path + "best_model.pth", map_location=self.device)
      File "/root/Skin-Cancer/env/lib/python3.8/site-packages/torch/serialization.py", line 713, in load
    	return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
      File "/root/Skin-Cancer/env/lib/python3.8/site-packages/torch/serialization.py", line 920, in _legacy_load
    	magic_number = pickle_module.load(f, **pickle_load_args)
    _pickle.UnpicklingError: invalid load key, 'v'.
    

    The page doesn't show anything even after 500 seconds.

    How can I solve this?

    opened by johnfelipe 0
  • more memory efficient handling of augmentation

    I have a dataset of ~200k images, so the augmentation step in VisionDataset.fromImageFolder easily uses up all the RAM when creating new_ds.

    For now I just set augmentation=False for this big dataset. Do you have plans to implement this feature in a more memory-efficient way?

    opened by ohjho 0
Owner
Labrak Yanis
👨🏻‍🎓 Student in Master of Science in Computer Science, Avignon University 🇫🇷 🏛 Research Scientist - Machine Learning in Healthcare