2815 Python Image-text-retrieval Libraries

UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.

Unified Multi-modal Transformers This repository maintains the official implementation of the paper UMT: Unified Multi-modal Transformers for Joint Vi

Applied Research Center (ARC), Tencent PCG

84 Jan 4, 2023

[x]it! support for working with todo and check list files in Sublime Text

[x]it! for Sublime Text This Sublime Package provides syntax-highlighting, shortcuts, and auto-completions for [x]it! files. Features Syntax highlight

18 Sep 19, 2022

An example to implement a new backbone with OpenMMLab framework.

Backbone example on OpenMMLab framework English | 简体中文 Introduction This is an template repo about how to use OpenMMLab framework to develop a new bac

22 Dec 29, 2022

PyTorch code for the paper "Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval".

Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval (M2HSE) PyTorch code fo

6 Dec 23, 2022

Library for converting from RGB / GrayScale image to base64 and back.

Library for converting RGB / Grayscale numpy images from to base64 and back. Installation pip install -U image_to_base_64 Conversion RGB to base 64 b

16 Aug 28, 2022

Lowest memory consumption and second shortest runtime in NTIRE 2022 challenge on Efficient Super-Resolution

FMEN Lowest memory consumption and second shortest runtime in NTIRE 2022 on Efficient Super-Resolution. Our paper: Fast and Memory-Efficient Network T

33 Dec 1, 2022

SIGIR'22 paper: Axiomatically Regularized Pre-training for Ad hoc Search

Introduction This codebase contains source-code of the Python-based implementation (ARES) of our SIGIR 2022 paper. Chen, Jia, et al. "Axiomatically Re

17 Nov 9, 2022

Direct application of DALLE-2 to video synthesis, using factored space-time Unet and Transformers

DALLE2 Video (wip) ** only to be built after DALLE2 image is done and replicated, and the importance of the prior network is validated ** Direct appli

105 May 15, 2022

A Human-in-the-Loop workflow for creating HD images from text

A Human-in-the-Loop? workflow for creating HD images from text DALL·E Flow is an interactive workflow for generating high-definition images from text

2.5k Jan 2, 2023

CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP

CLIP-GEN [简体中文][English] 本项目在萤火二号集群上用 PyTorch 实现了论文《CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP》。 CLIP-GEN 是一个 Language-F

75 Dec 29, 2022

Weakly Supervised Text-to-SQL Parsing through Question Decomposition

Weakly Supervised Text-to-SQL Parsing through Question Decomposition The official repository for the paper "Weakly Supervised Text-to-SQL Parsing thro

14 Dec 19, 2022

Read Japanese manga inside browser with selectable text.

mokuro Read Japanese manga with selectable text inside a browser. See demo: https://kha-white.github.io/manga-demo mokuro_demo.mp4 Demo contains excer

170 Dec 27, 2022

A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset.

A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset. This repo contains scripts to train RL agents to navigate the closed world and collect video data.

11 Oct 22, 2022

[CVPRW 2022] Attentions Help CNNs See Better: Attention-based Hybrid Image Quality Assessment Network

Attention Helps CNN See Better: Hybrid Image Quality Assessment Network [CVPRW 2022] Code for Hybrid Image Quality Assessment Network [paper] [code] T

49 Dec 11, 2022

Sound-guided Semantic Image Manipulation - Official Pytorch Code (CVPR 2022)

🔉 Sound-guided Semantic Image Manipulation (CVPR2022) Official Pytorch Implementation Sound-guided Semantic Image Manipulation IEEE/CVF Conference on

58 Dec 28, 2022

Optimizes image files by converting them to webp while also updating all references.

About Optimizes images by (re-)saving them as webp. For every file it replaced it automatically updates all references. Works on single files as well

18 Dec 23, 2022

Helping data scientists better understand their datasets and models in text classification. With love from ServiceNow.

Azimuth, an open-source dataset and error analysis tool for text classification, with love from ServiceNow. Overview Azimuth is an open source applica

145 Dec 23, 2022

Converts an image into funny, smaller amongus characters

SussyImage Converts an image into funny, smaller amongus characters Demo Mona Lisa | Lona Misa (Made up of AmongUs characters) API I've also added an

14 Aug 18, 2022

The repo for the paper "I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection".

33 Jan 5, 2023

[arXiv22] Disentangled Representation Learning for Text-Video Retrieval

Disentangled Representation Learning for Text-Video Retrieval This is a PyTorch implementation of the paper Disentangled Representation Learning for T

49 Dec 18, 2022

A python-image-classification web application project, written in Python and served through the Flask Microframework

A python-image-classification web application project, written in Python and served through the Flask Microframework. This Project implements the VGG16 covolutional neural network, through Keras and Tensorflow wrappers, to make predictions on uploaded images.

19 Dec 12, 2022

"Video Moment Retrieval from Text Queries via Single Frame Annotation" in SIGIR 2022.

ViGA: Video moment retrieval via Glance Annotation This is the official repository of the paper "Video Moment Retrieval from Text Queries via Single F

38 Dec 31, 2022

Implementation of the Hybrid Perception Block and Dual-Pruned Self-Attention block from the ITTR paper for Image to Image Translation using Transformers

ITTR - Pytorch Implementation of the Hybrid Perception Block (HPB) and Dual-Pruned Self-Attention (DPSA) block from the ITTR paper for Image to Image

17 Dec 23, 2022

Source code of our TTH paper: Targeted Trojan-Horse Attacks on Language-based Image Retrieval.

Targeted Trojan-Horse Attacks on Language-based Image Retrieval Source code of our TTH paper: Targeted Trojan-Horse Attacks on Language-based Image Re

7 Aug 23, 2022

This repository contains the data and code for the paper "Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Process Priors" (SPNLP@ACL2022)

GP-VAE This repository provides datasets and code for preprocessing, training and testing models for the paper: Diverse Text Generation via Variationa

18 Dec 29, 2022

Text classification is one of the popular tasks in NLP that allows a program to classify free-text documents based on pre-defined classes.

Deep-Learning-for-Text-Document-Classification Text classification is one of the popular tasks in NLP that allows a program to classify free-text docu

2 Mar 17, 2022

Author: Wenhao Yu (wyu1@nd.edu). ACL 2022. Commonsense Reasoning on Knowledge Graph for Text Generation

Author: Wenhao Yu ([email protected]). ACL 2022. Commonsense Reasoning on Knowledge Graph for Text Generation

Diversifying Commonsense Reasoning Generation on Knowledge Graph Introduction -- This is the pytorch implementation of our ACL 2022 paper "Diversifyin

61 Dec 30, 2022

This is a Text Data Analysis Project Involving (YouTube Case Study).

Text_Data_Analysis This is a Text Data Analysis Project Involving (YouTube Case Study). Problem Statement = Sentiment Analysis. Package1: There are m

1 Mar 5, 2022

A variational Bayesian method for similarity learning in non-rigid image registration (CVPR 2022)

A variational Bayesian method for similarity learning in non-rigid image registration We provide the source code and the trained models used in the re

14 Nov 21, 2022

Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".

STEMM: Self-learning with Speech-Text Manifold Mixup for Speech Translation This is a PyTorch implementation for the ACL 2022 main conference paper ST

29 Oct 16, 2022

This project aims to conduct a text information retrieval and text mining on medical research publication regarding Covid19 - treatments and vaccinations.

Project: Text Analysis - This project aims to conduct a text information retrieval and text mining on medical research publication regarding Covid19 -

1 Mar 14, 2022

Official implementation of Unfolded Deep Kernel Estimation for Blind Image Super-resolution.

Unfolded Deep Kernel Estimation for Blind Image Super-resolution Hongyi Zheng, Hongwei Yong, Lei Zhang, "Unfolded Deep Kernel Estimation for Blind Ima

15 Dec 26, 2022

This is an official implementation of the CVPR2022 paper "Blind2Unblind: Self-Supervised Image Denoising with Visible Blind Spots".

Blind2Unblind: Self-Supervised Image Denoising with Visible Blind Spots Blind2Unblind Citing Blind2Unblind @inproceedings{wang2022blind2unblind, tit

58 Dec 6, 2022

Code base for "On-the-Fly Test-time Adaptation for Medical Image Segmentation"

On-the-Fly Adaptation Official Pytorch Code base for On-the-Fly Test-time Adaptation for Medical Image Segmentation Paper Introduction One major probl

17 Nov 10, 2022

code for the ICLR'22 paper: On Robust Prefix-Tuning for Text Classification

On Robust Prefix-Tuning for Text Classification Prefix-tuning has drawed much attention as it is a parameter-efficient and modular alternative to adap

12 Nov 30, 2022

Group project for MFIN7036. Our goal is to predict firm profitability with text-based competition measures.

NLP_0-project Group project for MFIN7036. Our goal is to predict firm profitability with text-based competition measures1. We are a "democratic" and c

3 Mar 16, 2022

Input english text, then translate it between languages n times using the Deep Translator Python Library.

mass-translator About Input english text, then translate it between languages n times using the Deep Translator Python Library. How to Use Install dep

2 Mar 4, 2022

Exploring the Dual-task Correlation for Pose Guided Person Image Generation

Dual-task Pose Transformer Network The source code for our paper "Exploring Dual-task Correlation for Pose Guided Person Image Generation“ (CVPR2022)

63 Dec 15, 2022

Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval

BiDR Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval. Requirements torch==

11 Oct 20, 2022

Image Lowpoly based on Centroid Voronoi Diagram via python-opencv and taichi

CVTLowpoly: Image Lowpoly via Centroid Voronoi Diagram Image Sharp Feature Extraction using Guide Filter's Local Linear Theory via opencv-python. The

4 Jul 29, 2022

a simple, efficient, and intuitive text editor

Oxygen beta a simple, efficient, and intuitive text editor Overview oxygen is a simple, efficient, and intuitive text editor designed as more featured

1 Feb 23, 2022

Pdraw - Generate Deterministic, Procedural Artwork from Arbitrary Text

pdraw.py: Generate Deterministic, Procedural Artwork from Arbitrary Text pdraw a

2 Sep 12, 2022

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

DiffGAN-TTS - PyTorch Implementation PyTorch implementation of DiffGAN-TTS: High

157 Jan 1, 2023

Django-Text-to-HTML-converter - The simple Text to HTML Converter using Django framework

Django-Text-to-HTML-converter This is the simple Text to HTML Converter using Dj

6 Oct 9, 2022

ZeroGen: Efficient Zero-shot Learning via Dataset Generation

ZEROGEN This repository contains the code for our paper “ZeroGen: Efficient Zero

31 Dec 30, 2022

TwitterBot-ImageCollector - Twitter bot that collects images from likes saves the image

TwitterBot-ImageCollector Bot de Twitter que recolecta imagenes a partir de los

4 Jun 1, 2022

SimCTG - A Contrastive Framework for Neural Text Generation

A Contrastive Framework for Neural Text Generation Authors: Yixuan Su, Tian Lan,

345 Jan 3, 2023

PaintPrint - This module can colorize any text in your terminal

PaintPrint This module can colorize any text in your terminal Author: tankalxat3

2 Feb 17, 2022

Digitalizing-Prescription-Image - PIRDS - Prescription Image Recognition and Digitalizing System is a OCR make with Tensorflow

Digitalizing-Prescription-Image PIRDS - Prescription Image Recognition and Digit

2 May 11, 2022

Breaching - Breaching privacy in federated learning scenarios for vision and text

Breaching - A Framework for Attacks against Privacy in Federated Learning This P

139 Jan 3, 2023

Python-Strongest-Encrypter - Transform your text into encrypted symbols using their dictionary

How does the encrypter works? Transform your text into encrypted symbols using t

1 Jul 10, 2022

Faune proche - Retrieval of Faune-France data near a google maps location

faune_proche Récupération des données de Faune-France près d'un lieu google maps

4 Feb 15, 2022

Text-Summarization-using-NLP - Text Summarization using NLP to fetch BBC News Article and summarize its text and also it includes custom article Summarization

Text-Summarization-using-NLP Text Summarization using NLP to fetch BBC News Arti

21 Aug 6, 2022

Official code release for: EditGAN: High-Precision Semantic Image Editing

565 Jan 5, 2023

Image generation API.

Image Generator API This is an api im working on Currently its just a test project Im trying to make custom readme images with your discord account pr

2 Feb 19, 2022

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation In this repo you can find the code of the Supervised Hybrid Audio Segmentatio

21 Dec 20, 2022

Transfer style api - An API to use with Tranfer Style App, where you can use two image and transfer the style

Transfer Style API It's an API to use with Tranfer Style App, where you can use

1 Feb 13, 2022

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers Authors: Jaemin Cho, Abhay Zala, and Mohit Bansal (

36 Feb 13, 2022

HCQ: Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval

HCQ: Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval [toc] 1. Introduction This repository provides the code for our paper at

13 Dec 8, 2022

Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

1.4k Jan 8, 2023

Code for our WACV 2022 paper "Hyper-Convolution Networks for Biomedical Image Segmentation"

Hyper-Convolution Networks for Biomedical Image Segmentation Code for our WACV 2022 paper "Hyper-Convolution Networks for Biomedical Image Segmentatio

17 Nov 2, 2022

PyTorch implementation of NATSpeech: A Non-Autoregressive Text-to-Speech Framework

A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)

760 Jan 3, 2023

Automatic game data translator for RPGMaker-MV

RPGMaker-MV Translator 🕹️ 🎮 Use AI to translate all the dialogs and texts of your RPGMaker automatically. 👊 You worked hard to make your game, now

11 Dec 26, 2022

GVT is a generic translation tool for parts of text on the PC screen with Text to Speak functionality.

GVT is a generic translation tool for parts of text on the PC screen with Text to Speech functionality. I wanted to create it because the existing tools that I experimented with did not satisfy me in ease-to-use experience and configuration. Personally I used it with Lost Ark (example included generated by 2k monitor) to translate simple dialogues of quests in Italian.

1 Aug 21, 2022

A working (ish) python script to convert text to a gradient.

verticle-horiontal-gradient-script A working (ish) python script to convert text to a gradient. This script is poorly made with the well known python

1 Feb 20, 2022

Speech Recognition is an important feature in several applications used such as home automation, artificial intelligence

Speech Recognition is an important feature in several applications used such as home automation, artificial intelligence, etc. This article aims to provide an introduction on how to make use of the SpeechRecognition and pyttsx3 library of Python.

1 Feb 13, 2022

Using Opencv ,based on Augmental Reality(AR) and will show the feature matching of image and then by finding its matching

Using Opencv ,this project is based on Augmental Reality(AR) and will show the feature matching of image and then by finding its matching ,it will just mask that image . This project ,if used in cctv then it will detect black listed people if mentioned properly with their images.

1 Feb 13, 2022

Image2scan - a python program that can be applied on an image in order to get a scan of it back

image2scan Purpose image2scan is a python program that can be applied on an image in order to get a scan of it back. For this purpose, it searches for

2 Feb 13, 2022

Redlines produces a Markdown text showing the differences between two strings/text

Redlines Redlines produces a Markdown text showing the differences between two strings/text. The changes are represented with strike-throughs and unde

2 Apr 8, 2022

A python script that can send notifications to your phone via SMS text

Discord SMS Notification A python script that help you send text message to your phone one of your desire discord channel have a new message. The proj

2 Apr 25, 2022

Different steganography methods with examples and my own small image database

literally-the-most-useless-project [Different steganography methods with examples and my own small image database] This project currently contains thr

1 Dec 9, 2022

Copy only text-like files from the folder

copy-only-text-like-files-from-folder-python copy only text-like files from the folder This project is for those who want to copy only source code or

1 May 17, 2022

OceanScript is an Esoteric language used to encode and decode text into a formulation of characters

OceanScript is an Esoteric language used to encode and decode text into a formulation of characters - where the final result looks like waves in the ocean.

2 Sep 9, 2022

Image-to-image regression with uncertainty quantification in PyTorch

Image-to-image regression with uncertainty quantification in PyTorch. Take any dataset and train a model to regress images to images with rigorous, distribution-free uncertainty quantification.

25 Dec 26, 2022

BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents

BROS (BERT Relying On Spatiality) is a pre-trained language model focusing on text and layout for better key information extraction from documents. Given the OCR results of the document image, which are text and bounding box pairs, it can perform various key information extraction tasks, such as extracting an ordered item list from receipts

94 Dec 30, 2022

Reading list for research topics in Masked Image Modeling

awesome-MIM Reading list for research topics in Masked Image Modeling(MIM). We list the most popular methods for MIM, if I missed something, please su

231 Dec 7, 2022

A framework for GPU based high-performance medical image processing and visualization

FAST is an open-source cross-platform framework with the main goal of making it easier to do high-performance processing and visualization of medical images on heterogeneous systems utilizing both multi-core CPUs and GPUs. To achieve this, FAST use modern C++, OpenCL and OpenGL.

315 Dec 30, 2022

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers Authors: Jaemin Cho, Abhay Zala, and Mohit Bansal (

98 Dec 15, 2022

hashily is a Python module that provides a variety of text decoding and encoding operations.

hashily is a python module that performs a variety of text decoding and encoding functions. It also various functions for encrypting and decrypting text using various ciphers.

5 Jul 17, 2022

A python scripts that uses 3 different feature extraction methods such as SIFT, SURF and ORB to find a book in a video clip and project trailer of a movie based on that book, on to it.

3 Feb 10, 2022

Official repository for GCR rerank, a GCN-based reranking method for both image and video re-ID

53 Nov 22, 2022

An async Python library to automate solving ReCAPTCHA v2 by audio using Playwright.

Playwright nonoCAPTCHA An async Python library to automate solving ReCAPTCHA v2 by audio using Playwright. Disclaimer This project is for educational

69 Dec 28, 2022

End-to-end image captioning with EfficientNet-b3 + LSTM with Attention

Image captioning End-to-end image captioning with EfficientNet-b3 + LSTM with Attention Model is seq2seq model. In the encoder pretrained EfficientNet

2 Feb 10, 2022

This project uses ViT to perform image classification tasks on DATA set CIFAR10.

Vision-Transformer-Multiprocess-DistributedDataParallel-Apex Introduction This project uses ViT to perform image classification tasks on DATA set CIFA

3 Jun 3, 2022

NeuroGen: activation optimized image synthesis for discovery neuroscience

NeuroGen: activation optimized image synthesis for discovery neuroscience NeuroGen is a framework for synthesizing images that control brain activatio

3 Aug 17, 2022

Multi-atlas segmentation (MAS) is a promising framework for medical image segmentation

Multi-atlas segmentation (MAS) is a promising framework for medical image segmentation. Generally, MAS methods register multiple atlases, i.e., medical images with corresponding labels, to a target image;

13 Oct 9, 2022

SCI-AIDE : High-fidelity Few-shot Histopathology Image Synthesis for Rare Cancer Diagnosis

SCI-AIDE : High-fidelity Few-shot Histopathology Image Synthesis for Rare Cancer Diagnosis Pretrained Models In this work, we created synthetic tissue

1 Feb 7, 2022

Generate waves art for an image

waves-art Generate waves art for an image. Requirements: OpenCV Numpy Example Usage python waves_art.py --image_path tests/test1.jpg --patch_size 15 T

18 Apr 4, 2022

Galois is an auto code completer for code editors (or any text editor) based on OpenAI GPT-2.

Galois is an auto code completer for code editors (or any text editor) based on OpenAI GPT-2. It is trained (finetuned) on a curated list of approximately 45K Python (~470MB) files gathered from the Github. Currently, it just works properly on Python but not bad at other languages (thanks to GPT-2's power).

91 Sep 23, 2022

The source code for Generating Training Data with Language Models: Towards Zero-Shot Language Understanding.

SuperGen The source code for Generating Training Data with Language Models: Towards Zero-Shot Language Understanding. Requirements Before running, you

38 Dec 12, 2022

Pytorch implementation of MaskGIT: Masked Generative Image Transformer

247 Dec 16, 2022

Simple Python package to convert an image into a quantized image using a customizable palette

Simple Python package to convert an image into a quantized image using a customizable palette. Resulting image can be displayed by ePaper displays such as Waveshare displays.

3 Apr 13, 2022

L3Cube-MahaCorpus a Marathi monolingual data set scraped from different internet sources.

L3Cube-MahaCorpus L3Cube-MahaCorpus a Marathi monolingual data set scraped from different internet sources. We expand the existing Marathi monolingual

21 Dec 17, 2022

TONet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music

TONet Introduction The official implementation of "TONet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music", in ICASSP 2022 We

29 Dec 1, 2022

DCSAU-Net: A Deeper and More Compact Split-Attention U-Net for Medical Image Segmentation

DCSAU-Net: A Deeper and More Compact Split-Attention U-Net for Medical Image Segmentation By Qing Xu, Wenting Duan and Na He Requirements pytorch==1.1

20 Dec 9, 2022

Code Repository for "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"

Hierarchical Token Semantic Audio Transformer Introduction The Code Repository for "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound

13 Feb 8, 2022

Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models Benchmark and Efficient Evaluation

Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models Benchmark and Efficient Evaluation This reposi

First Person Vision @ Image Processing Laboratory - University of Catania

1 Aug 21, 2022

Image Data Augmentation in Keras

Image data augmentation is a technique that can be used to artificially expand the size of a training dataset by creating modified versions of images in the dataset.

3 Feb 15, 2022

To classify the News into Real/Fake using Features from the Text Content of the article

Hoax-Detector Authenticity of news has now become a major problem. The Idea is to classify the News into Real/Fake using Features from the Text Conten

1 Feb 9, 2022

Transformers provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio.

English | 简体中文 | 繁體中文 | 한국어 State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained models

50 Nov 12, 2022

Python Image-text-retrieval Resources

Python image-text-retrieval Libraries

UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.

[x]it! support for working with todo and check list files in Sublime Text

An example to implement a new backbone with OpenMMLab framework.

PyTorch code for the paper "Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval".

Library for converting from RGB / GrayScale image to base64 and back.

Lowest memory consumption and second shortest runtime in NTIRE 2022 challenge on Efficient Super-Resolution

SIGIR'22 paper: Axiomatically Regularized Pre-training for Ad hoc Search

Direct application of DALLE-2 to video synthesis, using factored space-time Unet and Transformers

A Human-in-the-Loop workflow for creating HD images from text

CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP

Weakly Supervised Text-to-SQL Parsing through Question Decomposition

Read Japanese manga inside browser with selectable text.

A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset.

[CVPRW 2022] Attentions Help CNNs See Better: Attention-based Hybrid Image Quality Assessment Network

Sound-guided Semantic Image Manipulation - Official Pytorch Code (CVPR 2022)

Optimizes image files by converting them to webp while also updating all references.

Helping data scientists better understand their datasets and models in text classification. With love from ServiceNow.

Converts an image into funny, smaller amongus characters

The repo for the paper "I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection".

[arXiv22] Disentangled Representation Learning for Text-Video Retrieval

A python-image-classification web application project, written in Python and served through the Flask Microframework

"Video Moment Retrieval from Text Queries via Single Frame Annotation" in SIGIR 2022.

Implementation of the Hybrid Perception Block and Dual-Pruned Self-Attention block from the ITTR paper for Image to Image Translation using Transformers

Source code of our TTH paper: Targeted Trojan-Horse Attacks on Language-based Image Retrieval.

This repository contains the data and code for the paper "Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Process Priors" (SPNLP@ACL2022)

Text classification is one of the popular tasks in NLP that allows a program to classify free-text documents based on pre-defined classes.

Author: Wenhao Yu ([email protected]). ACL 2022. Commonsense Reasoning on Knowledge Graph for Text Generation

This is a Text Data Analysis Project Involving (YouTube Case Study).

A variational Bayesian method for similarity learning in non-rigid image registration (CVPR 2022)

Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".

This project aims to conduct a text information retrieval and text mining on medical research publication regarding Covid19 - treatments and vaccinations.

Official implementation of Unfolded Deep Kernel Estimation for Blind Image Super-resolution.

This is an official implementation of the CVPR2022 paper "Blind2Unblind: Self-Supervised Image Denoising with Visible Blind Spots".

Code base for "On-the-Fly Test-time Adaptation for Medical Image Segmentation"

code for the ICLR'22 paper: On Robust Prefix-Tuning for Text Classification

Group project for MFIN7036. Our goal is to predict firm profitability with text-based competition measures.

Input english text, then translate it between languages n times using the Deep Translator Python Library.

Exploring the Dual-task Correlation for Pose Guided Person Image Generation

Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval

Image Lowpoly based on Centroid Voronoi Diagram via python-opencv and taichi

a simple, efficient, and intuitive text editor

Pdraw - Generate Deterministic, Procedural Artwork from Arbitrary Text

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

Django-Text-to-HTML-converter - The simple Text to HTML Converter using Django framework

ZeroGen: Efficient Zero-shot Learning via Dataset Generation

TwitterBot-ImageCollector - Twitter bot that collects images from likes saves the image

SimCTG - A Contrastive Framework for Neural Text Generation

PaintPrint - This module can colorize any text in your terminal

Digitalizing-Prescription-Image - PIRDS - Prescription Image Recognition and Digitalizing System is a OCR make with Tensorflow

Breaching - Breaching privacy in federated learning scenarios for vision and text

Python-Strongest-Encrypter - Transform your text into encrypted symbols using their dictionary

Faune proche - Retrieval of Faune-France data near a google maps location

Text-Summarization-using-NLP - Text Summarization using NLP to fetch BBC News Article and summarize its text and also it includes custom article Summarization

Official code release for: EditGAN: High-Precision Semantic Image Editing

Image generation API.

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation

Transfer style api - An API to use with Tranfer Style App, where you can use two image and transfer the style

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers

HCQ: Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval

Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Code for our WACV 2022 paper "Hyper-Convolution Networks for Biomedical Image Segmentation"

PyTorch implementation of NATSpeech: A Non-Autoregressive Text-to-Speech Framework

Automatic game data translator for RPGMaker-MV

GVT is a generic translation tool for parts of text on the PC screen with Text to Speak functionality.

A working (ish) python script to convert text to a gradient.

Speech Recognition is an important feature in several applications used such as home automation, artificial intelligence

Using Opencv ,based on Augmental Reality(AR) and will show the feature matching of image and then by finding its matching

Image2scan - a python program that can be applied on an image in order to get a scan of it back

Redlines produces a Markdown text showing the differences between two strings/text

A python script that can send notifications to your phone via SMS text

Different steganography methods with examples and my own small image database

Copy only text-like files from the folder

OceanScript is an Esoteric language used to encode and decode text into a formulation of characters

Image-to-image regression with uncertainty quantification in PyTorch

BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents

Reading list for research topics in Masked Image Modeling

A framework for GPU based high-performance medical image processing and visualization

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers