2930 Python Image-to-text-software Libraries

Software that extracts spreadsheets from various .pdf files to .csv

Extração de planilhas de diversos arquivos .pdf para .csv O código inteiro foi desenvolvido em Python. Foi utilizado o pacote "tabula" e a biblioteca

2 Jan 9, 2022

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

Parallel WaveGAN implementation with Pytorch This repository provides UNOFFICIAL pytorch implementations of the following models: Parallel WaveGAN Mel

1.2k Dec 23, 2022

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

13.6k Jan 5, 2023

Codes for “A Deeply Supervised Attention Metric-Based Network and an Open Aerial Image Dataset for Remote Sensing Change Detection”

DSAMNet The pytorch implementation for "A Deeply-supervised Attention Metric-based Network and an Open Aerial Image Dataset for Remote Sensing Change

41 Dec 14, 2022

Code release for the paper “Worldsheet Wrapping the World in a 3D Sheet for View Synthesis from a Single Image”, ICCV 2021.

Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image This repository contains the code for the following paper: R. Hu,

37 Jan 4, 2023

A pythonic interface to high-throughput virtual screening software

pyscreener A pythonic interface to high-throughput virtual screening software Overview This repository contains the source of pyscreener, both a libra

56 Dec 15, 2022

This repository contains the code used in the paper "Prompt-Based Multi-Modal Image Segmentation".

Prompt-Based Multi-Modal Image Segmentation This repository contains the code used in the paper "Prompt-Based Multi-Modal Image Segmentation". The sys

305 Dec 30, 2022

StyleSwin: Transformer-based GAN for High-resolution Image Generation

StyleSwin This repo is the official implementation of "StyleSwin: Transformer-based GAN for High-resolution Image Generation". By Bowen Zhang, Shuyang

349 Dec 28, 2022

A pipeline for making highlighted text stand-alone.

title emoji colorFrom colorTo sdk app_file pinned decontextualizer 📤 green gray streamlit main.py false Decontextualizer As a second step in improvin

26 Dec 17, 2022

Semantically Contrastive Learning for Low-light Image Enhancement

Semantically Contrastive Learning for Low-light Image Enhancement Here, we propose an effective semantically contrastive learning paradigm for Low-lig

9 Dec 21, 2021

Official Pytorch implementation of the paper: "Locally Shifted Attention With Early Global Integration"

Locally-Shifted-Attention-With-Early-Global-Integration Pretrained models You can download all the models from here. Training Imagenet python -m torch

14 Apr 15, 2022

LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021

LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021 We propose a cross encoder model (LTR_CrossEncoder) for information retrieval, re-retrie

7 Jan 12, 2022

Dumps the payload.bin image found in Android update images.

payload dumper Dumps the payload.bin image found in Android update images. Has significant performance gains over other tools due to using multiproces

7 Nov 17, 2022

A unet implementation for Image semantic segmentation

Unet-pytorch a unet implementation for Image semantic segmentation 参考网上的Unet做分割的代码，做了一个针对kaggle地盐识别的，请去以下地址获取数据集: https://www.kaggle.com/c/tgs-salt-id

3 Jun 29, 2022

Image-Adaptive YOLO for Object Detection in Adverse Weather Conditions

Image-Adaptive YOLO for Object Detection in Adverse Weather Conditions Accepted by AAAI 2022 [arxiv] Wenyu Liu, Gaofeng Ren, Runsheng Yu, Shi Guo, Jia

245 Dec 16, 2022

This repository contains the code, data, and models of the paper titled "CrossSum: Beyond English-Centric Cross-Lingual Abstractive Text Summarization for 1500+ Language Pairs".

CrossSum This repository contains the code, data, and models of the paper titled "CrossSum: Beyond English-Centric Cross-Lingual Abstractive Text Summ

29 Nov 19, 2022

No-reference Image Quality Assessment(NIQA) Algorithms (BRISQUE, NIQE, PIQE, RankIQA, MetaIQA)

No-Reference Image Quality Assessment Algorithms No-reference Image Quality Assessment(NIQA) is a task of evaluating an image without a reference imag

26 Jan 4, 2023

Aircache is an open-source caching and security solution that can be integrated with most decoupled apps that use REST APIs for communicating.

AirCache Aircache is an open-source caching and security solution that can be integrated with most decoupled apps that use REST APIs for communicating

2 Dec 22, 2021

Implementation for paper "STAR: A Structure-aware Lightweight Transformer for Real-time Image Enhancement" (ICCV 2021).

STAR-pytorch Implementation for paper "STAR: A Structure-aware Lightweight Transformer for Real-time Image Enhancement" (ICCV 2021). CVF (pdf) STAR-DC

43 Dec 21, 2022

An A-SOUL Text Generator Based on CPM-Distill.

ASOUL-Generator-Backend 本项目为 https://asoul.infedg.xyz/ 的后端。模型为基于 CPM-Distill 的 transformers 转化版本 CPM-Generate-distill 训练而成。

46 Dec 11, 2022

Official codebase for running the small, filtered-data GLIDE model from GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models.

GLIDE This is the official codebase for running the small, filtered-data GLIDE model from GLIDE: Towards Photorealistic Image Generation and Editing w

2.9k Jan 4, 2023

Depth image based mouse cursor visual haptic

Depth image based mouse cursor visual haptic How to run it. Install pyqt5. Install python modules pip install Pillow pip install numpy For illustrati

17 Dec 20, 2022

High-Resolution Image Synthesis with Latent Diffusion Models

Latent Diffusion Models Requirements A suitable conda environment named ldm can be created and activated with: conda env create -f environment.yaml co

5.6k Jan 4, 2023

This is a really simple text-to-speech app made with python and tkinter.

Tkinter Text-to-Speech App by Souvik Roy This is a really simple tkinter app which converts the text you have entered into a speech. It is created wit

1 Dec 21, 2021

Real-time text-editor using python tcp socket

Real-time text-editor using python tcp socket This project does not need any external libraries so you don't need to use virtual environments. All you

3 Aug 5, 2022

VIsually-Pivoted Audio and(N) Text

VIP-ANT: VIsually-Pivoted Audio and(N) Text Code for the paper Connecting the Dots between Audio and Text without Parallel Data through Visual Knowled

16 Nov 4, 2022

A basic duplicate image detection service using perceptual image hash functions and nearest neighbor search, implemented using faiss, fastapi, and imagehash

Duplicate Image Detection Getting Started Install dependencies pip install -r requirements.txt Run service python main.py Testing Test with pytest How

21 Nov 11, 2022

Software Platform for solving and manipulating multiparametric programs in Python

PPOPT Python Parametric OPtimization Toolbox (PPOPT) is a software platform for solving and manipulating multiparametric programs in Python. This pack

10 Sep 13, 2022

You can draw the corresponding bounding box into the image and save it according to the result file (txt format) run by the tracker.

42 Dec 6, 2022

Convert text with ANSI color codes to HTML or to LaTeX.

326 Dec 28, 2022

MISSFormer: An Effective Medical Image Segmentation Transformer

MISSFormer Code for paper "MISSFormer: An Effective Medical Image Segmentation Transformer". Please read our preprint at the following link: paper_add

22 Dec 24, 2022

Streaming over lightweight data transformations

Description Data augmentation libarary for Deep Learning, which supports images, segmentation masks, labels and keypoints. Furthermore, SOLT is fast a

Research Unit of Medical Imaging, Physics and Technology

256 Jan 8, 2023

computer vision, image processing and machine learning on the web browser or node.

Image processing and Machine learning labs computer vision, image processing and machine learning on the web browser or node note Fast Fourier Trans

487 Nov 11, 2022

A Telegram bot to extracting text from images. All languages supported.

OCR Bot A Telegram bot to extracting text from images. All languages supported. Deploy to Heroku Local Deploying Clone the repo git clone https://gith

6 Oct 21, 2022

Build and Push docker image in Python (luigi + docker-py)

Docker build images workflow in Python Since docker hub stopped building images for free accounts, I've been looking for another way to do it. I could

2 Dec 15, 2022

A warping based image translation model focusing on upper body synthesis.

Pose2Img Upper body image synthesis from skeleton(Keypoints). Sub module in the ICCV-2021 paper "Speech Drives Templates: Co-Speech Gesture Synthesis

15 Nov 10, 2022

Auto-Lama combines object detection and image inpainting to automate object removals

Auto-Lama Auto-Lama combines object detection and image inpainting to automate object removals. It is build on top of DE:TR from Facebook Research and

44 Dec 9, 2022

An end-to-end image translation model with weight-map for color constancy

CCUnet An end-to-end image translation model with weight-map for color constancy 1. Download the dataset (take Colorchecker_recommended dataset as an

1 Dec 21, 2021

A Python module made to simplify the usage of Text To Speech and Speech Recognition.

Nav Module The solution for voice related stuff in Python Nav is a Python module which simplifies voice related stuff in Python. Just import the Modul

1 Dec 20, 2021

Text modding tools for FF7R (Final Fantasy VII Remake)

FF7R_text_mod_tools Subtitle modding tools for FF7R (Final Fantasy VII Remake) There are 3 tools I made. make_dualsub_mod.exe: Merges (or swaps) subti

10 Dec 19, 2022

Pytorch implementation of the paper Improving Text-to-Image Synthesis Using Contrastive Learning

T2I_CL This is the official Pytorch implementation of the paper Improving Text-to-Image Synthesis Using Contrastive Learning Requirements Linux Python

42 Dec 31, 2022

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism This repository is the official PyTorch implementation of our AAAI-2022 paper, in

803 Dec 28, 2022

A Telegram bot to transcribe audio, video and image into text.

Transcriber Bot A Telegram bot to transcribe audio, video and image into text. Deploy to Heroku Local Deploying Install the FFmpeg. Make sure you have

10 Dec 19, 2022

BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese

Table of contents Introduction Using BARTpho with fairseq Using BARTpho with transformers Notes BARTpho: Pre-trained Sequence-to-Sequence Models for V

58 Dec 23, 2022

J.A.R.V.I.S is an AI virtual assistant made in python.

J.A.R.V.I.S is an AI virtual assistant made in python. Running JARVIS Without Python To run JARVIS without python: 1. Head over to our installation pa

16 Dec 29, 2022

Single Image Random Dot Stereogram for Tensorflow

TensorFlow-SIRDS Single Image Random Dot Stereogram for Tensorflow SIRDS is a means to present 3D data in a 2D image. It allows for scientific data di

5 Aug 10, 2022

TensorFlow Implementation of Unsupervised Cross-Domain Image Generation

Domain Transfer Network (DTN) TensorFlow implementation of Unsupervised Cross-Domain Image Generation. Requirements Python 2.7 TensorFlow 0.12 Pickle

864 Dec 30, 2022

TensorFlow Implementation of "Show, Attend and Tell"

Show, Attend and Tell Update (December 2, 2016) TensorFlow implementation of Show, Attend and Tell: Neural Image Caption Generation with Visual Attent

902 Nov 29, 2022

Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network

Super Resolution Examples We run this script under TensorFlow 2.0 and the TensorLayer2.0+. For TensorLayer 1.4 version, please check release. 🚀 🚀 🚀

2.9k Jan 8, 2023

Arabic speech recognition, classification and text-to-speech.

klaam Arabic speech recognition, classification and text-to-speech using many advanced models like wave2vec and fastspeech2. This repository allows tr

177 Dec 27, 2022

Useful PDF-related productivity tool.

Luftmensch 1.4.7 (Español) | 1.4.3 (English) Version 1.4.7 (Español) released in October 2021. Version 1.4.3 (English) released in September 2021. 🏮

8 Dec 29, 2022

Show Rubygems description and annotate your code right from Sublime Text.

Gem Description for Sublime Text Show Rubygems description and annotate your code. Just mouse over your Gemfile's gem definitions to show the popup. s

2 Dec 19, 2022

Django models and endpoints for working with large images -- tile serving

Django Large Image Models and endpoints for working with large images in Django -- specifically geared towards geospatial tile serving. DISCLAIMER: th

42 Dec 17, 2022

Turning images into '9-pan' palettes using KMeans clustering from sklearn.

img2palette Turning images into '9-pan' palettes using KMeans clustering from sklearn. Requirements We require: Pillow, for opening and processing ima

2 Jan 1, 2022

Text-Adventure-Game [Open Source] A group project by the Python TASK Force

2 Sep 17, 2021

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism This repository is the official PyTorch implementation of our AAAI-2022 paper, in

829 Jan 7, 2023

Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF From a Single Image

Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF From a Single Image (Project page) Zhengqin Li, Mohammad Sha

209 Jan 5, 2023

InverseRenderNet: Learning single image inverse rendering, CVPR 2019.

InverseRenderNet: Learning single image inverse rendering !! Check out our new work InverseRenderNet++ paper and code, which improves the inverse rend

141 Dec 20, 2022

[CVPR 2020] Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation

Contents Local and Global GAN Cross-View Image Translation Semantic Image Synthesis Acknowledgments Related Projects Citation Contributions Collaborat

131 Dec 7, 2022

[CVPR 2019 Oral] Multi-Channel Attention Selection GAN with Cascaded Semantic Guidance for Cross-View Image Translation

SelectionGAN for Guided Image-to-Image Translation CVPR Paper | Extended Paper | Guided-I2I-Translation-Papers Citation If you use this code for your

424 Dec 2, 2022

Synthetic Scene Text from 3D Engines

Introduction UnrealText is a project that synthesizes scene text images using 3D graphics engine. This repository accompanies our paper: UnrealText: S

215 Dec 29, 2022

Project repo for Learning Category-Specific Mesh Reconstruction from Image Collections

Learning Category-Specific Mesh Reconstruction from Image Collections Angjoo Kanazawa*, Shubham Tulsiani*, Alexei A. Efros, Jitendra Malik University

438 Dec 22, 2022

Learning 3D Part Assembly from a Single Image

Learning 3D Part Assembly from a Single Image This repository contains a PyTorch implementation of the paper: Learning 3D Part Assembly from A Single

18 Dec 21, 2022

High-Resolution 3D Human Digitization from A Single Image.

PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization (CVPR 2020) News: [2020/06/15] Demo with Google Colab (i

8.4k Dec 29, 2022

A simple software for capturing human body movements using the Kinect camera.

KinectMotionCapture A simple software for capturing human body movements using the Kinect camera. The software can seamlessly save joints and bones po

5 Aug 13, 2022

Simple Python Library to display text with color in Python Terminal

pyTextColor v1.0 Introduction pyTextColor is a simple Python Library to display colorful outputs in Terminal, etc. Note: Your Terminal or any software

1 Jan 23, 2022

A Sublime Text plugin to select a default syntax dialect

Default Syntax Chooser This Sublime Text 4 plugin provides the set_default_syntax_dialect command. This command manipulates a syntax file (e.g.: SQL.s

3 Jan 14, 2022

Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health.

Welcome to Healthsea ✨ Create better access to health with spaCy. Healthsea is a pipeline for analyzing user reviews to supplement products by extract

75 Dec 19, 2022

WORD: Revisiting Organs Segmentation in the Whole Abdominal Region

WORD: Revisiting Organs Segmentation in the Whole Abdominal Region (Paper and DataSet). [New] Note that all the emails about the download permission o

71 Dec 22, 2022

Almost State-of-the-art Text Generation library

Ps: we are adding transformer model soon Text Gen 🐐 Almost State-of-the-art Text Generation library Text gen is a python library that allow you build

63 Jun 24, 2022

Software associated to AAAI paper "Planning with Biological Neurons and Synapses"

jBrain Software associated with the AAAI 2022 paper Francesco D'Amore, Daniel Mitropolsky, Pierluigi Crescenzi, Emanuele Natale, Christos H. Papadimit

1 Apr 10, 2022

Deep Learning segmentation suite designed for 2D microscopy image segmentation

Deep Learning segmentation suite dessigned for 2D microscopy image segmentation This repository provides researchers with a code to try different enco

7 Nov 3, 2022

Mixed Transformer UNet for Medical Image Segmentation

MT-UNet Update 2021/11/19 Thank you for your interest in our work. We have uploaded the code of our MTUNet to help peers conduct further research on i

92 Dec 25, 2022

NCVX (NonConVeX): A User-Friendly and Scalable Package for Nonconvex Optimization in Machine Learning.

The source code is temporariy removed, as we are solving potential copyright and license issues with GRANSO (http://www.timmitchell.com/software/GRANS

0 Dec 16, 2021

Ensembling Off-the-shelf Models for GAN Training

Vision-aided GAN video (3m) | website | paper Can the collective knowledge from a large bank of pretrained vision models be leveraged to improve GAN t

345 Dec 28, 2022

Code for Deep Single-image Portrait Image Relighting

Deep Single-Image Portrait Relighting [Project Page] Hao Zhou, Sunil Hadap, Kalyan Sunkavalli, David W. Jacobs. In ICCV, 2019 Overview Test script for

438 Jan 5, 2023

Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation

Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation [Arxiv] [Video] Evaluation code for Unrestricted Facial Geometry Reconstr

242 Dec 30, 2022

Code for TIP 2017 paper --- Illumination Decomposition for Photograph with Multiple Light Sources.

Illumination_Decomposition Code for TIP 2017 paper --- Illumination Decomposition for Photograph with Multiple Light Sources. This code implements the

7 Nov 15, 2020

This repository contains the source code for the paper First Order Motion Model for Image Animation

!!! Check out our new paper and framework improved for articulated objects First Order Motion Model for Image Animation This repository contains the s

13k Jan 9, 2023

Code for the paper Progressive Pose Attention for Person Image Generation in CVPR19 (Oral).

Pose-Transfer Code for the paper Progressive Pose Attention for Person Image Generation in CVPR19(Oral). The paper is available here. Video generation

679 Jan 4, 2023

AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.

Adelaide Intelligent Machines (AIM) Group

3k Jan 2, 2023

A CLI client for sending text emails. (Currently only gmail supported)

emailCLI A CLI client for sending text emails. (Currently only gmail supported)

3 Dec 17, 2021

LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021

LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021 We propose a cross encoder model (LTR_CrossEncoder) for information retrieval, re-retrie

7 Jan 12, 2022

Industrial Image Anomaly Localization Based on Gaussian Clustering of Pre-trained Feature

Industrial Image Anomaly Localization Based on Gaussian Clustering of Pre-trained Feature Q. Wan, L. Gao, X. Li and L. Wen, "Industrial Image Anomaly

6 Dec 25, 2022

Tool to create a Phunk image with a custom background

Create Phunk image Tool to create a Phunk image with a custom background Installation Clone the repo git clone https://github.com/albanow/etherscan_sa

6 Mar 31, 2022

A very terrible python-based programming language that uses folders instead of text files

PYFolders by Lewis L. Foster PYFolders is a very terrible python-based programming language that uses folders instead of regular text files. In this r

5 Jan 8, 2022

A python programusing Tkinter graphics library to randomize questions and answers contained in text files

RaffleOfQuestions Um programa simples em python, utilizando a biblioteca gráfica Tkinter para randomizar perguntas e respostas contidas em arquivos de

1 Dec 16, 2021

Ensembling Off-the-shelf Models for GAN Training

Data-Efficient GANs with DiffAugment project | paper | datasets | video | slides Generated using only 100 images of Obama, grumpy cats, pandas, the Br

1.2k Dec 26, 2022

Download YouTube videos/music and images in MP4, JPG with this tool.

ABOUT THE TOOL Download YouTube videos, music and images in MP4, JPG with this tool, with an easy to understand interface. This tool works with both,

5 Jan 2, 2023

Full Spectrum Bioinformatics - a free online text designed to introduce key topics in Bioinformatics using the Python

Full Spectrum Bioinformatics is a free online text designed to introduce key topics in Bioinformatics using the Python programming language. The text is written in interactive Jupyter Notebooks, which allow you to try out and modify example code and analyses.

33 Dec 28, 2022

Label data using HuggingFace's transformers and automatically get a prediction service

Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downstream tasks like translation and summarisation.

PART 2: CHAIN LINKING AUDIO-TO-TEXT NLP TASKS 2A: TRANSCRIBE-TRANSLATE-SENTIMENT-ANALYSIS In notebook3.0, I demo a simple workflow to: transcribe a lo

30 Jul 13, 2022

Python Image-to-text-software Resources

Python image-to-text-software Libraries

Software that extracts spreadsheets from various .pdf files to .csv

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

Codes for “A Deeply Supervised Attention Metric-Based Network and an Open Aerial Image Dataset for Remote Sensing Change Detection”

Code release for the paper “Worldsheet Wrapping the World in a 3D Sheet for View Synthesis from a Single Image”, ICCV 2021.

A pythonic interface to high-throughput virtual screening software

This repository contains the code used in the paper "Prompt-Based Multi-Modal Image Segmentation".

StyleSwin: Transformer-based GAN for High-resolution Image Generation

A pipeline for making highlighted text stand-alone.

Semantically Contrastive Learning for Low-light Image Enhancement

Official Pytorch implementation of the paper: "Locally Shifted Attention With Early Global Integration"

LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021

Dumps the payload.bin image found in Android update images.

A unet implementation for Image semantic segmentation

Image-Adaptive YOLO for Object Detection in Adverse Weather Conditions

This repository contains the code, data, and models of the paper titled "CrossSum: Beyond English-Centric Cross-Lingual Abstractive Text Summarization for 1500+ Language Pairs".

No-reference Image Quality Assessment(NIQA) Algorithms (BRISQUE, NIQE, PIQE, RankIQA, MetaIQA)

Aircache is an open-source caching and security solution that can be integrated with most decoupled apps that use REST APIs for communicating.

Implementation for paper "STAR: A Structure-aware Lightweight Transformer for Real-time Image Enhancement" (ICCV 2021).

An A-SOUL Text Generator Based on CPM-Distill.

Official codebase for running the small, filtered-data GLIDE model from GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models.

Depth image based mouse cursor visual haptic

High-Resolution Image Synthesis with Latent Diffusion Models

This is a really simple text-to-speech app made with python and tkinter.

Real-time text-editor using python tcp socket

VIsually-Pivoted Audio and(N) Text

A basic duplicate image detection service using perceptual image hash functions and nearest neighbor search, implemented using faiss, fastapi, and imagehash

Software Platform for solving and manipulating multiparametric programs in Python

You can draw the corresponding bounding box into the image and save it according to the result file (txt format) run by the tracker.

Convert text with ANSI color codes to HTML or to LaTeX.

MISSFormer: An Effective Medical Image Segmentation Transformer

Streaming over lightweight data transformations

computer vision, image processing and machine learning on the web browser or node.

A Telegram bot to extracting text from images. All languages supported.

Build and Push docker image in Python (luigi + docker-py)

A warping based image translation model focusing on upper body synthesis.

Auto-Lama combines object detection and image inpainting to automate object removals

An end-to-end image translation model with weight-map for color constancy

A Python module made to simplify the usage of Text To Speech and Speech Recognition.

Text modding tools for FF7R (Final Fantasy VII Remake)

Pytorch implementation of the paper Improving Text-to-Image Synthesis Using Contrastive Learning

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

A Telegram bot to transcribe audio, video and image into text.

BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese

J.A.R.V.I.S is an AI virtual assistant made in python.

Single Image Random Dot Stereogram for Tensorflow

TensorFlow Implementation of Unsupervised Cross-Domain Image Generation

TensorFlow Implementation of "Show, Attend and Tell"

Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network

Arabic speech recognition, classification and text-to-speech.

Useful PDF-related productivity tool.

Show Rubygems description and annotate your code right from Sublime Text.

Django models and endpoints for working with large images -- tile serving

Turning images into '9-pan' palettes using KMeans clustering from sklearn.

Text-Adventure-Game [Open Source] A group project by the Python TASK Force

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022

Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF From a Single Image

InverseRenderNet: Learning single image inverse rendering, CVPR 2019.

[CVPR 2020] Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation

[CVPR 2019 Oral] Multi-Channel Attention Selection GAN with Cascaded Semantic Guidance for Cross-View Image Translation

Synthetic Scene Text from 3D Engines

Project repo for Learning Category-Specific Mesh Reconstruction from Image Collections

Learning 3D Part Assembly from a Single Image

High-Resolution 3D Human Digitization from A Single Image.

A simple software for capturing human body movements using the Kinect camera.

Simple Python Library to display text with color in Python Terminal

A Sublime Text plugin to select a default syntax dialect

Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health.

WORD: Revisiting Organs Segmentation in the Whole Abdominal Region

Almost State-of-the-art Text Generation library

Software associated to AAAI paper "Planning with Biological Neurons and Synapses"

Deep Learning segmentation suite designed for 2D microscopy image segmentation

Mixed Transformer UNet for Medical Image Segmentation

NCVX (NonConVeX): A User-Friendly and Scalable Package for Nonconvex Optimization in Machine Learning.

Ensembling Off-the-shelf Models for GAN Training

Code for Deep Single-image Portrait Image Relighting

Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation

Code for TIP 2017 paper --- Illumination Decomposition for Photograph with Multiple Light Sources.