Pre-trained Deep Learning models and demos (high quality and extremely fast)

Overview

OpenVINO™ Toolkit - Open Model Zoo repository

This repository includes optimized deep learning models and a set of demos to expedite development of high-performance deep learning inference applications. Use these free pre-trained models instead of training your own models to speed up the development and production deployment process.

Intel is committed to the respect of human rights and avoiding complicity in human rights abuses, a policy reflected in the Intel Global Human Rights Principles. Accordingly, by accessing the Intel material on this platform you agree that you will not use the material in a product or application that causes or contributes to a violation of an internationally recognized human right.

Repository Components:

License

Open Model Zoo is licensed under Apache License Version 2.0.

Online Documentation

Other Usage Examples

How to Contribute

We welcome community contributions to the Open Model Zoo repository. If you have an idea for how to improve the product, please share it with us by doing the following steps:

You can find additional information about model contribution here.

We will review your contribution and, if any additional fixes or modifications are needed, we may give you feedback to guide you. When accepted, your pull request will be merged into the GitHub* repositories.

Open Model Zoo is licensed under Apache License, Version 2.0. By contributing to the project, you agree to the license and copyright terms therein and release your contribution under these terms.

Support

Please report questions, issues, and suggestions using:


* Other names and brands may be claimed as the property of others.

Comments
  • Add online ASR mode to Speech Recognition Demo

    This commit doesn't change quality metrics (they're exactly the same) or speed, but adds a new mode to the demo (and a new interface to the underlying ctcdecode_numpy package). "Online mode" is a mode with lower speech recognition latency: in "offline mode", the recognition result was available only after processing the whole file.
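
    For readers new to the distinction, here is a minimal sketch of the online idea: logits arrive chunk by chunk, and a running transcript is available after every chunk. The decoder below is illustrative only (a plain greedy CTC decode), not the actual ctcdecode_numpy interface.

    import numpy as np

    ALPHABET = "_abcdefghijklmnopqrstuvwxyz "  # index 0 is the CTC blank

    def greedy_ctc_decode(logits):
        """Standard greedy CTC: collapse repeats, then drop blanks."""
        ids = logits.argmax(axis=1)
        out, prev = [], -1
        for i in ids:
            if i != prev and i != 0:
                out.append(ALPHABET[i])
            prev = i
        return "".join(out)

    class OnlineDecoder:
        """Feed logits chunk by chunk; a transcript is available after
        every chunk instead of only after the whole file."""
        def __init__(self):
            self.chunks = []

        def feed(self, chunk_logits):
            self.chunks.append(chunk_logits)
            return greedy_ctc_decode(np.concatenate(self.chunks))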

    opened by AlexeyKruglov 36
  • Object Detection Demo Jupyter Notebook

    Jupyter Notebook for Async Object Detection. I took the Python demo and made it into a notebook that allows a user to select a model and a video (or upload a video of their own), and to set max_num_requests, num_threads and num_streams.

    The notebook downloads the model, optionally converts it, does inference, and shows the results. It shows "live" results of a selected model on one video, and results of multiple selected models on three frames (to quickly compare results). By default, only Intel models are currently used, because they do not require the model converter, so this notebook can be used with the pip version of OpenVINO. I can update this when the Model Optimizer becomes part of the pip distribution.
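
    As a rough illustration of the flow the notebook automates, here is a minimal sketch assuming the omz_downloader and omz_converter entry points from the openvino-dev pip package; the model name and directories are placeholders.

    import subprocess
    from pathlib import Path

    model = "ssdlite_mobilenet_v2"                    # placeholder OMZ model name
    base_dir = Path.home() / "open_model_zoo_models"  # placeholder directory

    # Download the model files from the Open Model Zoo.
    subprocess.run(["omz_downloader", "--name", model,
                    "--output_dir", str(base_dir)], check=True)

    # Public (non-Intel) models additionally need conversion to OpenVINO IR.
    subprocess.run(["omz_converter", "--name", model,
                    "--download_dir", str(base_dir),
                    "--output_dir", str(base_dir)], check=True)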

    The target audience for this notebook includes less experienced developers. The goal of this specific notebook is to demonstrate Object Detection with the Async API. I will also make more introductory notebooks on using OpenVINO/OMZ in general.

    Try without installing

    You can try it here: https://mybinder.org/v2/gh/helena-intel/open_model_zoo/object-detection-demo-notebook-binder?filepath=%2Fdemos%2Fobject_detection_demo%2Fjupyter-python%2Fobject_detection_demo.ipynb

    There is also a version where the code is hidden: https://mybinder.org/v2/gh/helena-intel/open_model_zoo/detection-async-notebook?urlpath=voila%2Frender%2Fobject_detection_demo.ipynb

    Note that performance may be slow and it may randomly crash.

    README/Usage notes

    The README instructions need to be updated, but I wanted to wait for the updated pip install method to do that. If you already have OpenVINO installed, you can run the notebook from a terminal/cmd.exe where you have set PATH/LD_LIBRARY_PATH or run setupvars. Then install the requirements in the demos/object_detection_demo/jupyter-python folder and run the notebook with jupyter notebook or jupyter lab.

    Note: the notebook calls subprocess to run the Model Optimizer. If you use a virtualenv, this only works if your environment variables are set globally; if you set them in the shell before running the notebook, subprocess doesn't see them. With the new pip release this should no longer be an issue (otherwise I can change the subprocess call).
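
    If the subprocess call does need changing, one possible workaround is to pass an explicit environment to it; this is only a sketch, and the OpenVINO path below is hypothetical.

    import os
    import subprocess

    env = os.environ.copy()
    # Hypothetical OpenVINO location; in practice this would be whatever
    # setupvars configured in the shell.
    env["PATH"] = "/opt/intel/openvino/bin" + os.pathsep + env.get("PATH", "")

    subprocess.run(["mo", "--input_model", "model.pb"], env=env, check=True)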

    Tested on Windows and Linux (Ubuntu 18).

    Notes and questions

    • I am relatively new to OpenVINO and OMZ and am not sure if I follow best practices. All feedback is welcome!
    • I would like to include a bit more information on the options for num_threads, num_streams and max_num_requests, but I don't know enough about this to write it.
    • I got the intro text from the Python demo, but the target audience for the notebook demos is a bit different and the intro text could be a bit catchier. Suggestions are welcome.
    • I put some code in a detection_utils file and some directly in the notebook. I tried to find a balance, so that the important code for this demo is still in the notebook, but that it is not too much code. I am not sure if I found the right balance.
    • I used ipywidgets so that there are dropdowns to select a model/video and buttons to start inferencing (see the sketch after this list). I find it useful for this demo because there are so many models to choose from, and it allows for really quick testing/comparing of detection models. It also makes it possible to make an "app version" (using voila, to quickly show results to others, for example). But it does add a bit of new complexity.
    • I made a models-subset.lst with a mostly random selection of models. I found that useful for testing, for example on Binder, and for slow computers/connections. If we want to keep that, I would welcome a selection of models that is less random.
    • I download the models by default to $HOME/open_model_zoo_models and set the cache to $HOME/open_model_zoo_cache. I keep the converted models in the same directory as the downloaded models. I am happy to change these directories to something else; I do not know what is best practice. I would like to have standard directories for this that always work, so that I can use them for further OMZ notebooks.
    • New Python packages in requirements.txt are voila, ipywidgets and jupyterlab. They are all BSD licensed.
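
    For reference, here is a minimal sketch of the widget-driven selection mentioned in the list above, assuming only that ipywidgets is installed; the model names are placeholders.

    import ipywidgets as widgets
    from IPython.display import display

    model_chooser = widgets.Dropdown(
        options=["ssd_mobilenet_v2", "yolo-v3-tf"],  # placeholder model list
        description="Model:",
    )
    run_button = widgets.Button(description="Run inference")

    def on_run(_):
        # Inference on the selected model would be kicked off here.
        print(f"Running {model_chooser.value} ...")

    run_button.on_click(on_run)
    display(model_chooser, run_button)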

    Screenshots of the completed notebook and app-version

    omz_object_detection_demo_notebook omz_object_detection_demo_notebook_voila

    opened by helena-intel 36
  • G-API Interactive Face Detection demo

    This is a version of the existing Interactive Face Detection demo, implemented using the G-API framework. To compile and run the demo, you need a feature that is not yet in the OpenCV master branch. For now, you can take this feature from https://github.com/aDanPin/opencv/tree/dp/add_dinamic_graph_feature

    opened by aDanPin 35
  • Is there any tutorial or example to show how to use Inference Engine models in OpenCV

    I want to know if there is a tutorial or usage example that shows how to use Inference Engine pre-trained models in OpenCV to detect objects like faces, humans, cars, etc.

    I have already downloaded and installed the Intel® OpenVINO™ toolkit, following this wiki.

    Basically I have two questions. Question 1: I tried to build OpenCV from source with Inference Engine, but CMake was unable to locate InferenceEngine_DIR. It would be better to also have a tutorial that shows how to build it; the wiki above is not very clear. So I was not able to load the Inference Engine pre-trained model in the OpenCV build I made; it threw an exception saying the Inference Engine is not enabled. OK, so I used the OpenCV that came with the OpenVINO toolkit.

    Question 2: When I loaded the face-detection-adas-0001 xml and bin using cv::dnn::Net::readxxxxx(xml, bin), it worked and did not throw any exception, but in the next step I don't know how to pass the frame (cv::Mat) to the network and get the result. I am looking for an example that shows how to use the pre-trained models in OpenCV.
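
    For context, the missing step typically looks like the sketch below, written with OpenCV's Python bindings (the C++ cv::dnn calls mirror them one-to-one); the 672x384 input size is the one documented for face-detection-adas-0001, and the image path is a placeholder.

    import cv2 as cv

    net = cv.dnn.readNet("face-detection-adas-0001.xml",
                         "face-detection-adas-0001.bin")
    net.setPreferableBackend(cv.dnn.DNN_BACKEND_INFERENCE_ENGINE)

    frame = cv.imread("people.jpg")                  # placeholder image
    h, w = frame.shape[:2]
    blob = cv.dnn.blobFromImage(frame, size=(672, 384))
    net.setInput(blob)
    out = net.forward()                              # shape [1, 1, N, 7]

    # Each row: [image_id, label, confidence, x_min, y_min, x_max, y_max]
    for det in out[0, 0]:
        if float(det[2]) > 0.5:
            p1 = (int(det[3] * w), int(det[4] * h))
            p2 = (int(det[5] * w), int(det[6] * h))
            cv.rectangle(frame, p1, p2, (0, 255, 0), 2)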

    Thanks!!!!

    opened by Bahramudin 35
  • Initial media support for G-API background subtraction demo

    Overview

    • Switched the demo from GMat to GFrame for the input image. (The background image is still GMat.)
    • Added basic oneVPL support. (See --use_onevpl, --vpl_params.)

    The next steps will be:

    • Fully support oneVPL functionality
    • Use GFrame for the background image.
    opened by TolyaTalamanov 28
  • Sb/time series checker

    • Add annotation converter for electricity dataset (data source https://archive.ics.uci.edu/ml/machine-learning-databases/00321/LD2011_2014.txt.zip)
    • Add Annotation representation for Time Series
    • Add Adapter for temporal fusion transformer
    • Add normalized_quantile_loss metric
    • Add postprocessor for time series electricity
    accuracy checker 
    opened by bes-dev 27
  • Update multi camera multi person tracker

    Here is an update for the multi camera multi person tracker. The following features were added:

    • support of instance segmentation network
    • support of orientation classification network
    • metrics evaluation
    • history visualization
    • visual analyzer
    • bug fixes

    @snosov1, @sovrasov, @AlexanderDokuchaev, please take a look.

    opened by DmitriySidnev 21
  • "malloc(): unsorted double linked list corrupted" problem on raspberry pi4 when I run the C++ demo smart_classroom_demo

    When I run the C++ demo smart_classroom_demo on Ubuntu, it works well, but when I run it on a Raspberry Pi 4, I get the error message "malloc(): unsorted double linked list corrupted". I want to know how to solve this problem. Thanks.

    pi@raspberrypi:~/open_model_zoo/demos/build/armv7l/Release $ ./smart_classroom_demo -m_fd ~/model/face-detection-adas-0001.xmlGi -m_act ~/model/person-detection-action-recognition-0005.xml -i ~/ncs2_test/May.mp4 -d_act MYRIAD -d_fd MYRIAD -d_lm MYRIAD -d_reid MYRIAD
    [ INFO ] InferenceEngine: 	API version ......... 2.1
    	Build ........... 2021.1.0-1237-bece22ac675-releases/2021/1
    [ INFO ] Parsing input parameters
    [ INFO ] Loading Inference Engine
    [ INFO ] Device info: 
    	MYRIAD
    	myriadPlugin version ......... 2.1
    	Build ........... 2021.1.0-1237-bece22ac675-releases/2021/1
    
    malloc(): unsorted double linked list corrupted
    Terminated
    

    By the way, on Ubuntu it works well:

    dvlab@dvlab-Z370M-D3H:~/inference_engine_demos_build/intel64/Release$ ./smart_classroom_demo -m_act ~/models/person-detection-action-recognition-0005.xml -m_fd ~/models/face-detection-adas-0001.xml -i /home/dvlab/下載/sit.mp4 -d_act MYRIAD -d_fd MYRIAD -d_lm MYRIAD -d_reid MYRIAD
    [ INFO ] InferenceEngine: 	API version ......... 2.1
    	Build ........... 2021.1.0-1237-bece22ac675-releases/2021/1
    [ INFO ] Parsing input parameters
    [ INFO ] Loading Inference Engine
    [ INFO ] Device info: 
    	MYRIAD
    	myriadPlugin version ......... 2.1
    	Build ........... 2021.1.0-1237-bece22ac675-releases/2021/1
    
    [ WARNING ] Face recognition models are disabled!
    To close the application, press 'CTRL+C' here or switch to the output window and press ESC key
    
    [ INFO ] Mean FPS: 3.67467
    [ INFO ] Frames processed: 15
    
    [ INFO ] Execution successful
    
    ARM RaspberryPi 
    opened by ghost 20
  • Specified models for pedestrian_tracker_demo

    Specified the list of supported models. The following models are not supported by this demo: person-detection-retail-0002, person-reidentification-retail-0076, person-reidentification-retail-0079.

    opened by dkustikx 20
  • Use openvino to implement openpose

    Implements the models that OpenPose supports, including body pose, face, and hand. You can easily change the model type to be used by commenting/uncommenting the defines in "human_pose_estimator.cpp", "peak.cpp", and "render_human_pose.cpp" and recompiling, like:

    #define COCO
    //#define MPI
    //#define BODY_25
    //#define FACE
    //#define HAND

    The GitHub repository of OpenPose is https://github.com/CMU-Perceptual-Computing-Lab/openpose. You can download the OpenPose models by following the link. It is worth noting that you have to remove the last concat layer before running the Model Optimizer if the model is COCO or MPI.

    community contribution 
    opened by Chen-MingChang 19
  • Interactive Open Model Zoo as OpenCV module

    import cv2 as cv
    
    from cv2.open_model_zoo.topologies import mobilenet_ssd
    from cv2.open_model_zoo import DnnDetectionModel
    
    frame = cv.imread('example.jpg')
    
    ssd = mobilenet_ssd()
    net = DnnDetectionModel(ssd)
    
    classIds, confidences, boxes = net.detect(frame, confThreshold=0.5)
    for box in boxes:
        cv.rectangle(frame, box, (0, 255, 0))
    
    cv.imshow('out', frame)
    cv.waitKey()
    

    :snake: :wolf: :lion: :hear_no_evil: :elephant: :penguin: :octopus: :shark: :mouse: :camel: :frog: :ant: :leopard: :fox_face: :fish: :smirk_cat: :beetle: :whale: :koala: :spider: :dragon: :dove: :owl: :rabbit: :chicken: :crab: :bee:

    WIP

    • [x] Add inference. There are multiple options: ~~1. Each topology is derived from OpenCV's dnn::Model. Using downloaded models, IENetwork can be initialized separately.~~
      1. Each topology is just a meta class which can be cast to dnn::Model or IENetwork
    • [x] Hash verification: do not download if the hash is correct, and verify the hash after downloading (show a warning in case of mismatch; see the sketch after this list)
    • [x] Cross platform cache dir (using OPENCV_OPEN_MODEL_ZOO_CACHE_DIR environment variable to change default caching location)
    • [x] Archives management: tar.gz
    • [x] Model Optimizer conversion
    • [x] Added aliases for OpenVINO models (by highest version)
    • [x] Archives management: zip
    • [x] Downloading info: size and progress
    • [x] Downloading and inference tests.
    • [x] OpenCV DNN: input size
    • [x] Documentation at help(topology): description, license etc.
    • [ ] Replace pyyaml with our own YAML implementation, since it is not a default Python package (it is required only for the build)
    • [x] Errors handling (Downloading, Model Optimizer)
    • [x] Remove archives after extract
    • [x] Let user add and override Model Optimizer arguments
    • [ ] Keep layout similar to the downloader
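
    As promised next to the hash-verification item above, here is a small sketch of that check; the expected hash would come from the topology metadata, and everything here is illustrative rather than the module's actual code.

    import hashlib
    from pathlib import Path

    def sha256_of(path):
        """Hash a file in 1 MiB blocks to keep memory use flat."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(1 << 20), b""):
                h.update(block)
        return h.hexdigest()

    def needs_download(path, expected_hash):
        """Skip the download when a cached file already matches."""
        return not (Path(path).exists() and sha256_of(path) == expected_hash)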

    related: https://github.com/opencv/opencv/issues/14730

    Pipelines

    example with text recognition demo API:

    import cv2 as cv
    import numpy as np
    
    from cv2.open_model_zoo import TextRecognitionPipeline
    
    frame = cv.imread('text.jpg')
    
    p = TextRecognitionPipeline()
    
    rects, texts, confs = p.process(frame)
    
    for rect, text in zip(rects, texts):
        vertices = cv.boxPoints(rect)
    
        for j in range(4):
            p1 = (vertices[j][0], vertices[j][1])
            p2 = (vertices[(j + 1) % 4][0], vertices[(j + 1) % 4][1])
            cv.line(frame, p1, p2, (0, 255, 0), 1);
    
        x = np.min(vertices[:,0])
        y = np.min(vertices[:,1])
        cv.putText(frame, text, (x, y), cv.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), thickness=2)
    
    cv.imshow('frame', frame)
    cv.waitKey()
    

    res

    import cv2 as cv
    from cv2.open_model_zoo import HumanPoseEstimation
    
    p = HumanPoseEstimation()
    
    frame = cv.imread('example.png')
    poses = p.process(frame)
    
    p.render(frame, poses)
    
    cv.imshow('res', frame)
    cv.waitKey()
    

    out

    opened by dkurt 19
  • Yolo-v3-tiny-tf output layers are swapped

    Hello! I tried to download and convert the model yolo-v3-tiny-tf. After running omz_converter, the output layers are swapped.

    In the model description:

    Converted model
        The array of detection summary info, name - conv2d_9/BiasAdd/YoloRegion, shape - 1, 13, 13, 255. The anchor values are 81,82, 135,169, 344,319.
        The array of detection summary info, name - conv2d_12/BiasAdd/YoloRegion, shape - 1, 26, 26, 255. The anchor values are 23,27, 37,58, 81,82.
    

    After omz_converter in the file "yolo-v3-tiny-tf.xml":

    		<layer id="107" name="conv2d_12/Conv2D/YoloRegion/sink_port_0" type="Result" version="opset1">
    			<rt_info>
    				<attribute name="fused_names" version="0" value="conv2d_12/Conv2D/YoloRegion/sink_port_0"/>
    			</rt_info>
    			<input>
    				<port id="0" precision="FP32">
    					<dim>1</dim>
    					<dim>255</dim>
    					<dim>26</dim>
    					<dim>26</dim>
    				</port>
    			</input>
    		</layer>
    		<layer id="66" name="conv2d_9/Conv2D/YoloRegion/sink_port_0" type="Result" version="opset1">
    			<rt_info>
    				<attribute name="fused_names" version="0" value="conv2d_9/Conv2D/YoloRegion/sink_port_0"/>
    			</rt_info>
    			<input>
    				<port id="0" precision="FP32">
    					<dim>1</dim>
    					<dim>255</dim>
    					<dim>13</dim>
    					<dim>13</dim>
    				</port>
    			</input>
    		</layer>
    

    Is this a bug?

    Windows 10
    Python 3.9.0
    openvino 2022.2.0 (PyPI)
    openvino-dev 2022.2.0 (PyPI)
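
    Whether or not the order is intended, one way to stay robust to it is to look outputs up by tensor name or shape instead of by index. A minimal sketch with the OpenVINO 2022.x Python API:

    from openvino.runtime import Core

    core = Core()
    model = core.read_model("yolo-v3-tiny-tf.xml")
    compiled = core.compile_model(model, "CPU")

    # Match each output by its name and shape instead of relying on order.
    for output in compiled.outputs:
        print(output.get_any_name(), output.shape)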

    opened by FenixFly 1
  • multi_camera_multi_target_tracking_demo problem

    I ran multi_camera_multi_target_tracking_demo and got the following error.

    $ python3 multi_camera_multi_target_tracking_demo.py -i ~/images/4p-c0.avi ~/images/4p-c1.avi --m_segmentation ./intel/instance-segmentation-security-0228/FP32/instance-segmentation-security-0228.xml --m_reid ./intel/person-reidentification-retail-0277/FP32/person-reidentification-retail-0288.xml --config configs/person.py --output_video output.mp4
    /home/********/.local/lib/python3.8/site-packages/pkg_resources/__init__.py:116: PkgResourcesDeprecationWarning: 0.1.36ubuntu1 is an invalid version and will not be supported in a future release
      warnings.warn(
    /home/********/.local/lib/python3.8/site-packages/pkg_resources/__init__.py:116: PkgResourcesDeprecationWarning: 0.23ubuntu1 is an invalid version and will not be supported in a future release
      warnings.warn(
    [ DEBUG ] Reading config from configs/person.py
    [ INFO ] OpenVINO Runtime
    [ INFO ] 	build: 2022.2.0-7713-af16ea1d79a-releases/2022/2
    Traceback (most recent call last):
      File "multi_camera_multi_target_tracking_demo.py", line 283, in <module>
        sys.exit(main() or 0)
      File "multi_camera_multi_target_tracking_demo.py", line 262, in main
        object_detector = MaskRCNN(core, args.m_segmentation,
      File "/home/********/open_model_zoo/demos/multi_camera_multi_target_tracking_demo/python/utils/network_wrappers.py", line 136, in __init__
        super().__init__(core, model_path, device, 'Instance Segmentation', self.max_reqs)
    TypeError: object.__init__() takes exactly one argument (the instance to initialize)
    

    Please fix it, or let me know how to work around it.

    opened by Ishihara-Masabumi 0
  • Missing training pipeline for instance-segmentation-person-0007?

    opened by martin0258 1
Codes to pre-train T5 (Text-to-Text Transfer Transformer) models pre-trained on Japanese web texts

t5-japanese Codes to pre-train T5 (Text-to-Text Transfer Transformer) models pre-trained on Japanese web texts. The following is a list of models that

Kimio Kuramitsu 1 Dec 13, 2021
WarpDrive: Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU

WarpDrive is a flexible, lightweight, and easy-to-use open-source reinforcement learning (RL) framework that implements end-to-end multi-agent RL on a single GPU (Graphics Processing Unit).

Salesforce 334 Jan 6, 2023
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis

Bilateral Denoising Diffusion Models (BDDMs) This is the official PyTorch implementation of the following paper: BDDM: BILATERAL DENOISING DIFFUSION M

null 172 Dec 23, 2022
Pre-trained model, code, and materials from the paper "Impact of Adversarial Examples on Deep Learning Models for Biomedical Image Segmentation" (MICCAI 2019).

Adaptive Segmentation Mask Attack This repository contains the implementation of the Adaptive Segmentation Mask Attack (ASMA), a targeted adversarial

Utku Ozbulak 53 Jul 4, 2022
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

About This repository provides data and code for the paper: Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development (subm

Appen Repos 86 Dec 7, 2022
High level network definitions with pre-trained weights in TensorFlow

TensorNets High level network definitions with pre-trained weights in TensorFlow (tested with 2.1.0 >= TF >= 1.4.0). Guiding principles Applicability.

Taehoon Lee 1k Dec 13, 2022
E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation

E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation E2EC: An End-to-End Contour-based Method for High-Quality H

zhangtao 146 Dec 29, 2022
Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning".

ERICA Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive L

THUNLP 75 Nov 2, 2022
QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.

QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.

null 152 Jan 2, 2023
Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek"

Ancient Greek BERT The first and only available Ancient Greek sub-word BERT model! State-of-the-art post fine-tuning on Part-of-Speech Tagging and Mor

Pranaydeep Singh 22 Dec 8, 2022
The Hailo Model Zoo includes pre-trained models and a full building and evaluation environment

Hailo Model Zoo The Hailo Model Zoo provides pre-trained models for high-performance deep learning applications. Using the Hailo Model Zoo you can mea

Hailo 50 Dec 7, 2022
Code, Data and Demo for Paper: Controllable Generation from Pre-trained Language Models via Inverse Prompting

InversePrompting Paper: Controllable Generation from Pre-trained Language Models via Inverse Prompting Code: The code is provided in the "chinese_ip"

THUDM 101 Dec 16, 2022
This repo contains the official code and pre-trained models for the Dynamic Vision Transformer (DVT).

Dynamic-Vision-Transformer (Pytorch) This repo contains the official code and pre-trained models for the Dynamic Vision Transformer (DVT). Not All Ima

null 210 Dec 18, 2022
TorchGeo is a PyTorch domain library, similar to torchvision, that provides datasets, transforms, samplers, and pre-trained models specific to geospatial data.

TorchGeo is a PyTorch domain library, similar to torchvision, that provides datasets, transforms, samplers, and pre-trained models specific to geospatial data.

Microsoft 1.3k Dec 30, 2022
KoRean based ELECTRA pre-trained models (KR-ELECTRA) for Tensorflow and PyTorch

KoRean based ELECTRA (KR-ELECTRA) This is a release of a Korean-specific ELECTRA model with comparable or better performances developed by the Computa

null 12 Jun 3, 2022
Code and pre-trained models for MultiMAE: Multi-modal Multi-task Masked Autoencoders

MultiMAE: Multi-modal Multi-task Masked Autoencoders Roman Bachmann*, David Mizrahi*, Andrei Atanov, Amir Zamir Website | arXiv | BibTeX Official PyTo

Visual Intelligence & Learning Lab, Swiss Federal Institute of Technology (EPFL) 385 Jan 6, 2023
Monocular Depth Estimation - Weighted-average prediction from multiple pre-trained depth estimation models

merged_depth runs (1) AdaBins, (2) DiverseDepth, (3) MiDaS, (4) SGDepth, and (5) Monodepth2, and calculates a weighted-average per-pixel absolute dept

Pranav 39 Nov 21, 2022
《K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters》(2020)

K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters This repository is the implementation of the paper "K-Adapter: Infusing Knowledge

Microsoft 118 Dec 13, 2022