This is a vision-based 3D model manipulation and control UI.

Overview

Manipulation of 3D Models Using Hand Gestures

This program allows users to manipulate 3D models (.obj format) with their hands. The project supports both the OAK-D and the OAK-D-Lite.

[Demo: 3d-manipulation]
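
As a rough, hypothetical illustration of how gestures can drive the model (this is not taken from main.py), the tracker's 21 hand landmarks can be mapped to transforms, for example using the thumb-index pinch distance to control scale:

import math

# Hypothetical helper. `landmarks` is a list of 21 (x, y, z) points in the
# MediaPipe hand convention (index 4 = thumb tip, index 8 = index fingertip).
def pinch_scale(landmarks, base_distance=0.15):
    thumb_tip = landmarks[4]
    index_tip = landmarks[8]
    # Distance between the two fingertips in the image plane.
    d = math.dist(thumb_tip[:2], index_tip[:2])
    # Spreading the fingers apart enlarges the model; pinching shrinks it.
    return d / base_distance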

Install dependencies

On an Intel-based macOS or Linux machine, run the following command in the terminal:

git clone https://github.com/cortictechnology/vision_ui.git
cd vision_ui
python3 -m pip install -r requirements.txt

For Linux only, make sure your OAK-D device is not plugged in and then run the following:

echo 'SUBSYSTEM=="usb", ATTRS{idVendor}=="03e7", MODE="0666"' | sudo tee /etc/udev/rules.d/80-movidius.rules
sudo udevadm control --reload-rules && sudo udevadm trigger
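
After reloading the rules, plug the device back in and confirm that it is visible to DepthAI (a quick check, assuming the depthai package from requirements.txt is installed):

import depthai as dai

# Lists OAK devices reachable over USB. An empty list usually means the
# udev rule has not taken effect yet or the cable/port is at fault.
for dev in dai.Device.getAllAvailableDevices():
    print(dev.getMxId(), dev.state)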

To run

  1. Make sure the OAK-D/OAK-D-Lite device is plugged into the computer.
  2. In the terminal, run
python3 main.py

AI Model description

The ai_models folder includes two Intel Myriad X optimized models:

  1. palm_detection_sh4.blob: the palm detection model, which locates hands in the camera frame.
  2. hand_landmark_sh4.blob: the hand landmark model, which detects hand keypoints inside the regions proposed by the palm detection model (a minimal pipeline sketch follows below).
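
For orientation only, here is a minimal sketch of how these two blobs could be wired into a DepthAI pipeline. It is an assumption-based illustration, not the project's actual pipeline: the input size, node wiring, cropping, and output decoding in main.py differ.

import depthai as dai

pipeline = dai.Pipeline()

# Color camera feeding a small preview into the palm detector
# (128x128 is the usual MediaPipe palm-detection input; an assumption here).
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(128, 128)
cam.setInterleaved(False)

# Stage 1: palm detection proposes hand regions.
palm_nn = pipeline.create(dai.node.NeuralNetwork)
palm_nn.setBlobPath("ai_models/palm_detection_sh4.blob")
cam.preview.link(palm_nn.input)

palm_out = pipeline.create(dai.node.XLinkOut)
palm_out.setStreamName("palm")
palm_nn.out.link(palm_out.input)

# Stage 2: the landmark model runs on crops of the detected hand regions,
# fed back from the host in this sketch.
crop_in = pipeline.create(dai.node.XLinkIn)
crop_in.setStreamName("hand_crop")
lm_nn = pipeline.create(dai.node.NeuralNetwork)
lm_nn.setBlobPath("ai_models/hand_landmark_sh4.blob")
crop_in.out.link(lm_nn.input)

lm_out = pipeline.create(dai.node.XLinkOut)
lm_out.setStreamName("landmarks")
lm_nn.out.link(lm_out.input)

with dai.Device(pipeline) as device:
    palm_q = device.getOutputQueue("palm", maxSize=4, blocking=False)
    raw = palm_q.get()  # raw NN output; decoding into boxes is model specific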

Credits

Comments
  • Aborted (core dumped)

    Aborted (core dumped)

    (depthai-test) bharath@nitro:~/Desktop/PROJECTS/depthai_projects/vision_ui$ python3 main.py 
    pygame 2.0.1 (SDL 2.0.14, Python 3.9.7)
    Hello from the pygame community. https://www.pygame.org/contribute.html
    Palm detection blob : /home/bharath/Desktop/PROJECTS/depthai_projects/vision_ui/ai_models/palm_detection_sh4.blob
    Landmark blob       : /home/bharath/Desktop/PROJECTS/depthai_projects/vision_ui/ai_models/hand_landmark_sh4.blob
    [14442C10917AC8D200] [199.826] [system] [info] PRINT:LeonCss: BootloaderConfig.options1 checksum doesn't match. Is: 0x102C4414 should be: 0xAED890DB
    BootloaderConfig.options2 checksum doesn't match. Is: 0x00000000 should be: 0x2380F205
    GPIO boot mode 0x16, interface USBD
    Setting aons(0..4) back to boot from flash (offset = 0)
    ====ENABLE WATCHDOG====1
    --> brdInit ...
    initial keepalive, countdown: 10
    PLL0: 700000 AUX_IO0: 24000 AUX_IO1: 24000 MCFG: 24000 MECFG: 24000
    Board init ret 3
    brdInitAuxDevices: Error: SC = 27: io_initialize expander_cam_gpios_1 [OK]
    
    spi_N25Q_init: Flash JEDEC ID: ff ff ff
    eeprom_read_status 0!
    EEPROM read status 0
    EEPROM data version: 5
    Closing EEPROm!
    Is booted from flash by bootloader: 0
    Networking not available...
    Called by: LOS, controller: LOS
    Enumerating on socket: Cam_A / RGB / Center
    IMX378 patch for VCM type
      >> Registered camera A12N02A (imx378) as /dev/Camera_0
    Enumerating on socket: Cam_B / Left
      >> Registered camera TG161B (ov9282) as /dev/Camera_1
    Enumerating on socket: Cam_C / Right
      >> Registered camera TG161B (ov9282) as /dev/Camera_2
    Enumerating on socket: CAM_D
    Initializing XLink...
    UsbPumpVscAppI_Event: 5 VSC2_EVENT_ATTACH
    UsbPumpVscAppI_Event: 2 VSC2_EVENT_SUSPEND
    UsbPumpVscAppI_Event: 3 VSC2_EVENT_RESUME
    UsbPumpVscAppI_Event: 4 VSC2_EVENT_RESET
    initial keepalive, countdown: 9
    UsbPumpVscAppI_Event: 4 VSC2_EVENT_RESET
    UsbPumpVscAppI_Event: 4 VSC2_EVENT_RESET
    UsbPumpVscAppI_Event: 4 VSC2_EVENT_RESET
    UsbPumpVscAppI_Event: 2 VSC2_EVENT_SUSPEND
    UsbPumpVscAppI_Event: 3 VSC2_EVENT_RESUME
    UsbPumpVscAppI_Event: 0 VSC2_EVENT_INTERFACE_UP
    UsbPumpVscAppI_Event: 2 VSC2_EVENT_SUSPEND
    UsbPumpVscAppI_Event: 3 VSC2_EVENT_RESUME
    Done!
    Usb connection speed: High - USB 2.0
    I: [Timesync] [   1019244] [main] startSync:116     Timesync | Callback not set
    Temperature: Driver registered.
    Temperature: Initialized driver.
    Temperature: Sensor opened: CSS.
    Temperature: Sensor opened: MSS.
    Temperature: Sensor opened: UPA.
    Temperature: Sensor opened: DSS.
    [14442C10917AC8D200] [199.826] [system] [info] PRINT:LeonMss: Called by: LRT, controller: LOS
    Internal camera FPS set to: 23
    Sensor resolution: (1920, 1080)
    Internal camera image size: 1152 x 648 - crop_w:0 pad_h: 252
    896 anchors have been created
    Creating pipeline...
    Creating Color Camera...
    Creating Palm Detection Neural Network...
    Creating Hand Landmark Neural Network...
    Pipeline created.
    terminate called without an active exception
    Stack trace (most recent call last):
    #29   Object "[0xffffffffffffffff]", at 0xffffffffffffffff, in 
    #28   Object "python3", at 0x555f061d1bc2, in 
    #27   Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7fadc0bd80b2, in __libc_start_main
    #26   Object "python3", at 0x555f0624cd78, in Py_BytesMain
    #25   Object "python3", at 0x555f0624cc7e, in Py_RunMain
    #24   Object "python3", at 0x555f0624c49e, in PyRun_SimpleFileExFlags
    #23   Object "python3", at 0x555f060ec60c, in 
    #22   Object "python3", at 0x555f06247704, in 
    #21   Object "python3", at 0x555f06213c5a, in 
    #20   Object "python3", at 0x555f061629ea, in PyEval_EvalCode
    #19   Object "python3", at 0x555f06213bab, in PyEval_EvalCodeEx
    #18   Object "python3", at 0x555f061618e1, in 
    #17   Object "python3", at 0x555f061aa852, in _PyEval_EvalFrameDefault
    #16   Object "python3", at 0x555f061239ba, in _PyObject_MakeTpCall
    #15   Object "python3", at 0x555f0615d3d8, in 
    #14   Object "python3", at 0x555f0615384d, in _PyObject_FastCallDictTstate
    #13   Object "python3", at 0x555f06162526, in _PyFunction_Vectorcall
    #12   Object "python3", at 0x555f061618e1, in 
    #11   Object "python3", at 0x555f060d8fdb, in 
    #10   Object "python3", at 0x555f060b4e18, in 
    #9    Object "python3", at 0x555f061239ee, in _PyObject_MakeTpCall
    #8    Object "python3", at 0x555f06153713, in 
    #7    Object "/home/bharath/anaconda3/envs/depthai-test/lib/python3.9/site-packages/depthai.cpython-39-x86_64-linux-gnu.so", at 0x7fad98ac3a6e, in 
    #6    Object "/home/bharath/anaconda3/envs/depthai-test/lib/python3.9/site-packages/depthai.cpython-39-x86_64-linux-gnu.so", at 0x7fad98adef92, in 
    #5    Object "/home/bharath/anaconda3/envs/depthai-test/lib/python3.9/site-packages/depthai.cpython-39-x86_64-linux-gnu.so", at 0x7fad98c24cc3, in dai::DeviceBase::startPipelineImpl(dai::Pipeline const&)
    #4    Object "/home/bharath/anaconda3/envs/depthai-test/bin/../lib/libstdc++.so.6", at 0x7fadb7944fb0, in std::terminate()
    #3    Object "/home/bharath/anaconda3/envs/depthai-test/bin/../lib/libstdc++.so.6", at 0x7fadb7944f6e, in 
    #2    Object "/home/bharath/anaconda3/envs/depthai-test/bin/../lib/libstdc++.so.6", at 0x7fadb7946871, in __gnu_cxx::__verbose_terminate_handler()
    #1    Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7fadc0bd6858, in abort
    #0    Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7fadc0bf718b, in gsignal
    Aborted (Signal sent by tkill() 4125 1000)
    Aborted (core dumped)
    

    unable to run please help...:((

    my env specs: Python 3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0] :: Anaconda, Inc. on linux

    opened by bharath5673 1
Owner
Cortic Technology Corp.