1887 Python Scene-text-recognition Libraries

[3DV 2021] A Dataset-Dispersion Perspective on Reconstruction Versus Recognition in Single-View 3D Reconstruction Networks

dispersion-score Official implementation of 3DV 2021 Paper A Dataset-dispersion Perspective on Reconstruction versus Recognition in Single-view 3D Rec

7 May 28, 2022

Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting

QAConv Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting This PyTorch code is proposed in

166 Dec 28, 2022

Info and sample codes for "NTU RGB+D Action Recognition Dataset"

"NTU RGB+D" Action Recognition Dataset "NTU RGB+D 120" Action Recognition Dataset "NTU RGB+D" is a large-scale dataset for human action recognition. I

578 Dec 30, 2022

Face Recognition Attendance Project

Face-Recognition-Attendance-Project In This Project You will learn how to mark attendance using face recognition, Hello Guys This is Gautam Kumar, Thi

1 Dec 3, 2022

Various Algorithms for Short Text Mining

Short Text Mining in Python Introduction This package shorttext is a Python package that facilitates supervised and unsupervised learning for short te

466 Dec 6, 2022

CCPD: a diverse and well-annotated dataset for license plate detection and recognition

CCPD (Chinese City Parking Dataset, ECCV) UPdate on 10/03/2019. CCPD Dataset is now updated. We are confident that images in subsets of CCPD is much m

1.8k Dec 30, 2022

Tightness-aware Evaluation Protocol for Scene Text Detection

TIoU-metric Release on 27/03/2019. This repository is built on the ICDAR 2015 evaluation code. If you propose a better metric and require further eval

206 Nov 18, 2022

Code for "ATISS: Autoregressive Transformers for Indoor Scene Synthesis", NeurIPS 2021

ATISS: Autoregressive Transformers for Indoor Scene Synthesis This repository contains the code that accompanies our paper ATISS: Autoregressive Trans

138 Dec 22, 2022

Official implementation of VQ-Diffusion: Vector Quantized Diffusion Model for Text-to-Image Synthesis

66 Dec 1, 2021

Pytorch Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic

Pytorch Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic [Paper] [Colab is coming soon] Approach Example Usage To r

6 Dec 1, 2021

A leetcode scraper to compile all questions in leetcode free tier to text file. pdf also available.

A leetcode scraper to compile all questions in leetcode free tier to text file, pdf also available. if new questions get added, run again to get new questions.

3 Dec 7, 2021

Pytorch Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic

Pytorch Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic [Paper] [Colab is coming soon] Approach Example Usage To r

170 Jan 3, 2023

Semi-supervised Implicit Scene Completion from Sparse LiDAR

Semi-supervised Implicit Scene Completion from Sparse LiDAR Paper Created by Pengfei Li, Yongliang Shi, Tianyu Liu, Hao Zhao, Guyue Zhou and YA-QIN ZH

61 Nov 30, 2021

Official implementation for: Blended Diffusion for Text-driven Editing of Natural Images.

Blended Diffusion for Text-driven Editing of Natural Images Blended Diffusion for Text-driven Editing of Natural Images Omri Avrahami, Dani Lischinski

328 Dec 30, 2022

Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.

SynthText Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Ved

1.8k Dec 28, 2022

Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2021).

Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER. @inproceedings{tedes

40 Dec 11, 2022

WSDM‘2022: Knowledge Enhanced Sports Game Summarization

Knowledge Enhanced Sports Game Summarization Cooming Soon! :) Data will be released after approval process. Code will be published once the author of

14 Jul 13, 2022

This repository contains all code and data for the Inside Out Visual Place Recognition task

Inside Out Visual Place Recognition This repository contains code and instructions to reproduce the results for the Inside Out Visual Place Recognitio

15 May 21, 2022

Traditional Chinese Text Recognition Dataset: Synthetic Dataset and Labeled Data

Traditional Chinese Text Recognition Dataset: Synthetic Dataset and Labeled Data Authors: Yi-Chang Chen, Yu-Chuan Chang, Yen-Cheng Chang and Yi-Ren Ye

5 Dec 15, 2022

An open-source Kazakh named entity recognition dataset (KazNERD), annotation guidelines, and baseline NER models.

Kazakh Named Entity Recognition This repository contains an open-source Kazakh named entity recognition dataset (KazNERD), named entity annotation gui

9 Dec 23, 2022

PyTorch implementation of CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition

PyTorch implementation of CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition The unofficial code of CDistNet. Now, we ha

25 Jul 20, 2022

Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting

Pytorch Pedestrian Attribute Recognition: A strong PyTorch baseline of pedestrian attribute recognition and multi-label classification.

79 Dec 18, 2022

DirectVoxGO reconstructs a scene representation from a set of calibrated images capturing the scene.

DirectVoxGO reconstructs a scene representation from a set of calibrated images capturing the scene. We achieve NeRF-comparable novel-view synthesis quality with super-fast convergence.

148 Nov 29, 2021

A procedural Blender pipeline for photorealistic training image generation

BlenderProc2 A procedural Blender pipeline for photorealistic rendering. Documentation | Tutorials | Examples | ArXiv paper | Workshop paper Features

1.8k Jan 2, 2023

Text-Based zombie apocalyptic decision-making game in Python

Inspiration We shared university first year game coursework.[to gauge previous experience and start brainstorming] Adapted a particular nuclear fallou

2 Feb 17, 2022

A Machine Teaching Framework for Scalable Recognition

MEMORABLE This repository contains the source code accompanying our ICCV 2021 paper. A Machine Teaching Framework for Scalable Recognition Pei Wang, N

2 Dec 8, 2021

Combine Tacotron2 and Hifi GAN to generate speech from text

EndToEndTextToSpeech Combine Tacotron2 and Hifi GAN to generate speech from text Download weights Hifi GAN - hifi_gan/checkpoint/ : pretrain 2.5M ste

1 Dec 18, 2021

A unofficial pytorch implementation of PAN(PSENet2): Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network Requirements pytorch 1.1+ torchvision 0.3+ pyclipper opencv3 gcc

400 Dec 26, 2022

A simple telegram bot to recognize lengthy voice files to text and vice versa with multiple language support.

Voicebot A simple Telegram bot to convert lengthy voice clips to text and vice versa with supporting languages. Mandatory Variables API_HASH - Yo

12 Oct 21, 2022

Translate U is capable of translating the text present in an image from one language to the other.

Translate U is capable of translating the text present in an image from one language to the other. The app uses OCR and Google translate to identify and translate across 80+ languages.

1 Dec 22, 2021

Conversational text Analysis using various NLP techniques

PyConverse Let me try first Installation pip install pyconverse Usage Please try this notebook that demos the core functionalities: basic usage noteb

158 Dec 25, 2022

Simple virtual assistant using pyttsx3 and speech recognition optionally with pywhatkit and pther libraries.

VirtualAssistant Simple virtual assistant using pyttsx3 and speech recognition optionally with pywhatkit and pther libraries. Third Party Libraries us

1 Nov 27, 2021

Text-to-Speech for Belarusian language

title emoji colorFrom colorTo sdk app_file pinned Belarusian TTS 🐸 green green gradio app.py false Belarusian TTS 📢 🤖 Belarusian TTS (text-to-speec

1 Nov 27, 2021

Generate text captions for images from their CLIP embeddings. Includes PyTorch model code and example training script.

clip-text-decoder Generate text captions for images from their CLIP embeddings. Includes PyTorch model code and example training script. Example Predi

36 Dec 21, 2022

Package to compute Mauve, a similarity score between neural text and human text. Install with `pip install mauve-text`.

MAUVE MAUVE is a library built on PyTorch and HuggingFace Transformers to measure the gap between neural text and human text with the eponymous MAUVE

182 Jan 2, 2023

labelpix is a graphical image labeling interface for drawing bounding boxes

Welcome to labelpix 👋 labelpix is a graphical image labeling interface for drawing bounding boxes. 🏠 Homepage Install pip install -r requirements.tx

26 May 24, 2022

This tool will help you convert your text to handwriting xD

So your teacher asked you to upload written assignments? Hate writing assigments? This tool will help you convert your text to handwriting xD

4.2k Jan 7, 2023

Fine-grained Control of Image Caption Generation with Abstract Scene Graphs

Faster R-CNN pretrained on VisualGenome This repository modifies maskrcnn-benchmark for object detection and attribute prediction on VisualGenome data

7 Apr 20, 2021

Text Extraction Formulation + Feedback Loop for state-of-the-art WSD (EMNLP 2021)

ConSeC is a novel approach to Word Sense Disambiguation (WSD), accepted at EMNLP 2021. It frames WSD as a text extraction task and features a feedback loop strategy that allows the disambiguation of a target word to be conditioned not only on its context but also on the explicit senses assigned to nearby words.

36 Dec 13, 2022

Simple python program to auto credit your code, text, book, whatever!

Credit Simple python program to auto credit your code, text, book, whatever! Setup First change credit_text to whatever text you would like to credit

1 Jan 29, 2022

A neat little program to read the text from the "All Ten Fingers" program, and write them back.

ATFTyper A neat little program to read the text from the "All Ten Fingers" program, and write them back. How does it work? This program uses the Pillo

1 Nov 26, 2021

Some example code for using a raspberry pi to draw text (including emojis) and twitch emotes to a HUB75 RGB matrix via an HTTP post endpoint.

7 Nov 5, 2022

Convert Video Files To Text And Audio

Video-To-Text Convert Video Files To Text And Audio Convert To Audio 1: open dvtt folder in cmd 2: run this command in cmd = main.py Audio Convert To

2 Dec 5, 2021

Attendance Monitoring with Face Recognition using Python

Attendance Monitoring with Face Recognition using Python A python GUI integrated attendance system using face recognition to take attendance. In this

2 Jun 21, 2022

Omnidirectional Scene Text Detection with Sequential-free Box Discretization (IJCAI 2019). Including competition model, online demo, etc.

Box_Discretization_Network This repository is built on the pytorch [maskrcnn_benchmark]. The method is the foundation of our ReCTs-competition method

266 Nov 24, 2022

A large-scale face dataset for face parsing, recognition, generation and editing.

CelebAMask-HQ [Paper] [Demo] CelebAMask-HQ is a large-scale face image dataset that has 30,000 high-resolution face images selected from the CelebA da

1.7k Dec 26, 2022

Semi-supervised Implicit Scene Completion from Sparse LiDAR

Semi-supervised Implicit Scene Completion from Sparse LiDAR Paper Created by Pengfei Li, Yongliang Shi, Tianyu Liu, Hao Zhao, Guyue Zhou and YA-QIN ZH

114 Nov 30, 2022

A hilarious program that translates text to Google's

Bork, bork, bork! A hilarious program that translates text in a file to Google's "bork bork bork' language. How to use Download the main.py file. Chan

1 Dec 17, 2021

Code for text augmentation method leveraging large-scale language models

HyperMix Code for our paper GPT3Mix and conducting classification experiments using GPT-3 prompt-based data augmentation. Getting Started Installing P

47 Dec 20, 2022

Automated question generation and question answering from Turkish texts using text-to-text transformers

Turkish Question Generation Offical source code for "Automated question generation & question answering from Turkish texts using text-to-text transfor

29 Dec 14, 2022

A python gui program to generate reddit text to speech videos from the id of any post.

Reddit text to speech generator A python gui program to generate reddit text to speech videos from the id of any post. Current functionality Generate

17 Dec 19, 2022

Romanian Automatic Speech Recognition from the ROBIN project

RobinASR This repository contains Robin's Automatic Speech Recognition (RobinASR) for the Romanian language based on the DeepSpeech2 architecture, tog

10 Jan 1, 2023

code for EMNLP 2019 paper Text Summarization with Pretrained Encoders

PreSumm This code is for EMNLP 2019 paper Text Summarization with Pretrained Encoders Updates Jan 22 2020: Now you can Summarize Raw Text Input!. Swit

1.2k Dec 28, 2022

Using Language Model to Bootstrap Human Activity Recognition Ambient Sensors Based in Smart Homes

Using Language Model to Bootstrap Human Activity Recognition Ambient Sensors Based in Smart Homes This repository is the official implementation of Us

0 Oct 18, 2021

Nested Named Entity Recognition

Nested Named Entity Recognition Training Dataset: CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark url: https://tianchi.aliyun.

8 Dec 25, 2022

OpenMMLab Text Detection, Recognition and Understanding Toolbox

Introduction English | 简体中文 MMOCR is an open-source toolbox based on PyTorch and mmdetection for text detection, text recognition, and the correspondi

3k Jan 7, 2023

Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition in CVPR19

2s-AGCN Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition in CVPR19 Note PyTorch version should be 0.3! For PyTor

547 Dec 26, 2022

Official pytorch implementation of "DSPoint: Dual-scale Point Cloud Recognition with High-frequency Fusion"

DSPoint Official implementation of "DSPoint: Dual-scale Point Cloud Recognition with High-frequency Fusion". Paper link: https://arxiv.org/abs/2111.10

10 Nov 24, 2021

An easy to use, user-friendly and efficient code for extracting OpenAI CLIP (Global/Grid) features from image and text respectively.

Extracting OpenAI CLIP (Global/Grid) Features from Image and Text This repo aims at providing an easy to use and efficient code for extracting image &

13 Jan 6, 2023

The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to provide participants with baseline systems for speech recognition and speaker diarization in conference scenario.

M2MeT challenge baseline -- AliMeeting This project provides the baseline system recipes for the ICASSP 2020 Multi-channel Multi-party Meeting Transcr

81 Dec 8, 2022

R interface to fast.ai

R interface to fastai The fastai package provides R wrappers to fastai. The fastai library simplifies training fast and accurate neural nets using mod

113 Dec 20, 2022

Creating multimodal multitask models

Fusion Brain Challenge The English version of the document can be found here. Обновления 01.11 Мы выкладываем пример данных, аналогичных private test

43 Nov 28, 2022

DirectVoxGO reconstructs a scene representation from a set of calibrated images capturing the scene.

DirectVoxGO reconstructs a scene representation from a set of calibrated images capturing the scene. We achieve NeRF-comparable novel-view synthesis quality with super-fast convergence.

709 Dec 31, 2022

OCR Streamlit App is used to extract text from images using python's easyocr, pytorch and streamlit packages

OCR-Streamlit-App OCR Streamlit App is used to extract text from images using python's easyocr, pytorch and streamlit packages OCR app gets an image a

5 Apr 5, 2022

Efficient Speech Processing Tookit for Automatic Speaker Recognition

Sugar Efficient Speech Processing Tookit for Automatic Speaker Recognition | HuggingFace | What's New EfficientTDNN: Efficient Architecture Search for

14 Sep 14, 2022

Face Recognition and Emotion Detector Device

Face Recognition and Emotion Detector Device Orange PI 1 Python 3.10.0 + Django 3.2.9 Project's file explanation Django manage.py Django commands hand

2 Dec 21, 2021

AUDD IS MUSIC RECOGNITION API

1 Dec 15, 2021

PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition, CVPR 2018

PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place

294 Dec 12, 2022

Development kit for MIT Scene Parsing Benchmark

Development Kit for MIT Scene Parsing Benchmark [NEW!] Our PyTorch implementation is released in the following repository: https://github.com/hangzhao

424 Dec 1, 2022

State of the art faster Natural Language Processing in Tensorflow 2.0 .

tf-transformers: faster and easier state-of-the-art NLP in TensorFlow 2.0 ****************************************************************************

74 Dec 5, 2022

Create beautiful diagrams just by typing mathematical notation in plain text.

Penrose Penrose is an early-stage system that is still in development. Our system is not ready for contributions or public use yet, but hopefully will

5.6k Jan 8, 2023

Extract countries, regions and cities from a URL or text

This project is no longer being maintained and has been archived. Please check the Forks list for newer versions. Forks We are aware of two 3rd party

216 Nov 18, 2022

Raven is a tool written in Python3 allowing you to generate an unique image with some text.

🐦 Raven is a tool written in Python3 allowing you to generate an unique image with some text. It does it by searching the text on Google, do

39 Dec 20, 2022

Code of the lileonardo team for the 2021 Emotion and Theme Recognition in Music task of MediaEval 2021

Emotion and Theme Recognition in Music The repository contains code for the submission of the lileonardo team to the 2021 Emotion and Theme Recognitio

8 Aug 2, 2022

Pytorch implementation of "Geometrically Adaptive Dictionary Attack on Face Recognition" (WACV 2022)

Geometrically Adaptive Dictionary Attack on Face Recognition This is the Pytorch code of our paper "Geometrically Adaptive Dictionary Attack on Face R

6 Nov 21, 2022

German Text-To-Speech Engine using Tacotron and Griffin-Lim

jotts JoTTS is a German text-to-speech engine using tacotron and griffin-lim. The synthesizer model has been trained on my voice using Tacotron1. Due

6 Aug 28, 2022

Detectron2 for Document Layout Analysis

Detectron2 trained on PubLayNet dataset This repo contains the training configurations, code and trained models trained on PubLayNet dataset using Det

163 Nov 21, 2022

Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.

Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models. Solve a variety of tasks with pre-trained models or finetune them in

227 Dec 10, 2022

SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data (AAAI 2021)

SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data (AAAI 2021) PyTorch implementation of SnapMix | paper Method Overview Cite

126 Dec 30, 2022

[AAAI2021] The source code for our paper 《Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion》.

DSM The source code for paper Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion Project Website; Datasets li

114 Oct 16, 2022

Python Scene-text-recognition Resources

Python scene-text-recognition Libraries

[3DV 2021] A Dataset-Dispersion Perspective on Reconstruction Versus Recognition in Single-View 3D Reconstruction Networks

Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting

Info and sample codes for "NTU RGB+D Action Recognition Dataset"

Face Recognition Attendance Project

Various Algorithms for Short Text Mining

CCPD: a diverse and well-annotated dataset for license plate detection and recognition

Tightness-aware Evaluation Protocol for Scene Text Detection

Code for "ATISS: Autoregressive Transformers for Indoor Scene Synthesis", NeurIPS 2021

Official implementation of VQ-Diffusion: Vector Quantized Diffusion Model for Text-to-Image Synthesis

Pytorch Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic

A leetcode scraper to compile all questions in leetcode free tier to text file. pdf also available.

Pytorch Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic

Semi-supervised Implicit Scene Completion from Sparse LiDAR

Official implementation for: Blended Diffusion for Text-driven Editing of Natural Images.

Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.

Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2021).

WSDM‘2022: Knowledge Enhanced Sports Game Summarization

This repository contains all code and data for the Inside Out Visual Place Recognition task

Traditional Chinese Text Recognition Dataset: Synthetic Dataset and Labeled Data

An open-source Kazakh named entity recognition dataset (KazNERD), annotation guidelines, and baseline NER models.

PyTorch implementation of CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition

Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting

DirectVoxGO reconstructs a scene representation from a set of calibrated images capturing the scene.

A procedural Blender pipeline for photorealistic training image generation

Text-Based zombie apocalyptic decision-making game in Python

A Machine Teaching Framework for Scalable Recognition

Combine Tacotron2 and Hifi GAN to generate speech from text

A unofficial pytorch implementation of PAN(PSENet2): Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

A simple telegram bot to recognize lengthy voice files to text and vice versa with multiple language support.

Translate U is capable of translating the text present in an image from one language to the other.

Conversational text Analysis using various NLP techniques

Simple virtual assistant using pyttsx3 and speech recognition optionally with pywhatkit and pther libraries.

Text-to-Speech for Belarusian language

Generate text captions for images from their CLIP embeddings. Includes PyTorch model code and example training script.

Package to compute Mauve, a similarity score between neural text and human text. Install with `pip install mauve-text`.

labelpix is a graphical image labeling interface for drawing bounding boxes

This tool will help you convert your text to handwriting xD

Fine-grained Control of Image Caption Generation with Abstract Scene Graphs

Text Extraction Formulation + Feedback Loop for state-of-the-art WSD (EMNLP 2021)

Simple python program to auto credit your code, text, book, whatever!

A neat little program to read the text from the "All Ten Fingers" program, and write them back.

Some example code for using a raspberry pi to draw text (including emojis) and twitch emotes to a HUB75 RGB matrix via an HTTP post endpoint.

Convert Video Files To Text And Audio

Attendance Monitoring with Face Recognition using Python

Omnidirectional Scene Text Detection with Sequential-free Box Discretization (IJCAI 2019). Including competition model, online demo, etc.

A large-scale face dataset for face parsing, recognition, generation and editing.

Semi-supervised Implicit Scene Completion from Sparse LiDAR

A hilarious program that translates text to Google's

Code for text augmentation method leveraging large-scale language models

Automated question generation and question answering from Turkish texts using text-to-text transformers

A python gui program to generate reddit text to speech videos from the id of any post.

Romanian Automatic Speech Recognition from the ROBIN project

code for EMNLP 2019 paper Text Summarization with Pretrained Encoders

Using Language Model to Bootstrap Human Activity Recognition Ambient Sensors Based in Smart Homes

Nested Named Entity Recognition

OpenMMLab Text Detection, Recognition and Understanding Toolbox

Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition in CVPR19

Official pytorch implementation of "DSPoint: Dual-scale Point Cloud Recognition with High-frequency Fusion"

An easy to use, user-friendly and efficient code for extracting OpenAI CLIP (Global/Grid) features from image and text respectively.

The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to provide participants with baseline systems for speech recognition and speaker diarization in conference scenario.

R interface to fast.ai

Creating multimodal multitask models

DirectVoxGO reconstructs a scene representation from a set of calibrated images capturing the scene.

OCR Streamlit App is used to extract text from images using python's easyocr, pytorch and streamlit packages

Efficient Speech Processing Tookit for Automatic Speaker Recognition

Face Recognition and Emotion Detector Device

AUDD IS MUSIC RECOGNITION API

PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition, CVPR 2018

Development kit for MIT Scene Parsing Benchmark

State of the art faster Natural Language Processing in Tensorflow 2.0 .

Create beautiful diagrams just by typing mathematical notation in plain text.

Extract countries, regions and cities from a URL or text

Raven is a tool written in Python3 allowing you to generate an unique image with some text.

Code of the lileonardo team for the 2021 Emotion and Theme Recognition in Music task of MediaEval 2021

Pytorch implementation of "Geometrically Adaptive Dictionary Attack on Face Recognition" (WACV 2022)

German Text-To-Speech Engine using Tacotron and Griffin-Lim

Detectron2 for Document Layout Analysis

Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.