1887 Repositories
Python scene-text-recognition Libraries
CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs
CLIP [Blog] [Paper] [Model Card] [Colab] CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pair
STonKGs is a Sophisticated Transformer that can be jointly trained on biomedical text and knowledge graphs
STonKGs STonKGs is a Sophisticated Transformer that can be jointly trained on biomedical text and knowledge graphs. This multimodal Transformer combin
Simple and understandable swin-transformer OCR project
swin-transformer-ocr ocr with swin-transformer Overview Simple and understandable swin-transformer OCR project. The model in this repository heavily r
Improving Factual Completeness and Consistency of Image-to-text Radiology Report Generation
Improving Factual Completeness and Consistency of Image-to-text Radiology Report Generation The reference code of Improving Factual Completeness and C
Automated Melanoma Recognition in Dermoscopy Images via Very Deep Residual Networks
Introduction This repository contains the modified caffe library and network architectures for our paper "Automated Melanoma Recognition in Dermoscopy
Wider or Deeper: Revisiting the ResNet Model for Visual Recognition
ademxapp Visual applications by the University of Adelaide In designing our Model A, we did not over-optimize its structure for efficiency unless it w
This project generates news headlines using a Long Short-Term Memory (LSTM) neural network.
News Headlines Generator bunnysaini/Generate-Headlines Goal This project aims to generate news headlines using a Long Short-Term Memory (LSTM) neural
Some utils for auto speech recognition
About Some utils for auto speech recognition. Utils Util Description Script Reset audio Reset sample rate, sample width, etc of audios.
Exadel CompreFace is a free and open-source face recognition GitHub project
Exadel CompreFace is a leading free and open-source face recognition system Exadel CompreFace is a free and open-source face recognition service that
Stanford CoreNLP provides a set of natural language analysis tools written in Java
Stanford CoreNLP Stanford CoreNLP provides a set of natural language analysis tools written in Java. It can take raw human language text input and giv
MasterDuel Image Recognition Translation Command Line Tool
MasterDuelTranslate (uses a YGO card database, Win32 window screenshots, and image matching)
Simple Text-To-Speech Bot For Discord
Simple Text-To-Speech Bot For Discord This is a very simple TTS bot for discord made with python. For this bot you need FFMPEG, see installation to se
Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features
Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features | paper | Official PyTorch implementation for Mul
Klexikon: A German Dataset for Joint Summarization and Simplification
Klexikon: A German Dataset for Joint Summarization and Simplification Dennis Aumiller and Michael Gertz Heidelberg University Under submission at LREC
A Python module to parse text files that contain secret variables.
AIDynamicTextReader - A simple dynamic text reader based on Artificial intelligence
AI Dynamic Text Reader: This is a simple dynamic text reader based on Artificial
Blackstone is a spaCy model and library for processing long-form, unstructured legal text
Blackstone Blackstone is a spaCy model and library for processing long-form, unstructured legal text. Blackstone is an experimental research project f
GNES enables large-scale index and semantic search for text-to-text, image-to-image, video-to-video and any-to-any content form
GNES is Generic Neural Elastic Search, a cloud-native semantic search system based on deep neural network.
Grover is a model for Neural Fake News -- both generation and detection
Grover is a model for Neural Fake News -- both generation and detection. However, it probably can also be used for other generation tasks.
Applying CLIP to Point Cloud Recognition.
PointCLIP: Point Cloud Understanding by CLIP This repository is an official implementation of the paper 'PointCLIP: Point Cloud Understanding by CLIP'
HairCLIP: Design Your Hair by Text and Reference Image
Overview This repository hosts the official PyTorch implementation of the paper: "HairCLIP: Design Your Hair by Text and Reference Image". Our single
TransVTSpotter: End-to-end Video Text Spotter with Transformer
TransVTSpotter: End-to-end Video Text Spotter with Transformer Introduction A Multilingual, Open World Video Text Dataset and End-to-end Video Text Sp
Local-Global Stratified Transformer for Efficient Video Recognition
DualFormer This repo is the implementation of our manuscript entitled "Local-Global Stratified Transformer for Efficient Video Recognition". Our model
A library for creating text-based graphs in the terminal
tplot is a Python package for creating text-based graphs. Useful for visualizing data to the terminal or log files.
ECLARE: Extreme Classification with Label Graph Correlations
ECLARE ECLARE: Extreme Classification with Label Graph Correlations @InProceedings{Mittal21b, author = "Mittal, A. and Sachdeva, N. and Agrawal
Meta Self-learning for Multi-Source Domain Adaptation: A Benchmark
Meta Self-Learning for Multi-Source Domain Adaptation: A Benchmark Project | Arxiv | YouTube | | Abstract In recent years, deep learning-based methods
A real-time speech emotion recognition application using Scikit-learn and gradio
Speech-Emotion-Recognition-App A real-time speech emotion recognition application using Scikit-learn and gradio. Requirements librosa==0.6.3 numpy sou
A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk.
Simple-Vosk A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk. Check out the official Vosk G
TweebankNLP - Pre-trained Tweet NLP Pipeline (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Models + Tweebank-NER
TweebankNLP This repo contains the new Tweebank-NER dataset and Twitter-Stanza p
Finetune the base 64 px GLIDE-text2im model from OpenAI on your own image-text dataset
The Face Mask recognition system uses AI technology to detect the person with or without a mask.
Face Mask Detection Face Mask Detection system built with OpenCV, Keras/TensorFlow using Deep Learning and Computer Vision concepts in order to detect
Implementation of "With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition, BMVC, 2021" in PyTorch
Multimodal Temporal Context Network (MTCN) This repository implements the model proposed in the paper: Evangelos Kazakos, Jaesung Huh, Arsha Nagrani,
CAMoE + Dual SoftMax Loss (DSL): Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss
CAMoE + Dual SoftMax Loss (DSL): Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss This is the official implementation of "
Python Computer Vision from Scratch
This repository explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos.
An image processing project that uses the Viola-Jones technique to detect faces and then the SIFT algorithm for recognition.
Attendance_System An image processing project that uses the Viola-Jones technique to detect faces and then the LPB algorithm for recognition. Face Detection Us
A CNN model to detect hand gestures.
Software Used python - programming language used, tested on v3.8 miniconda - for managing virtual environment Libraries Used opencv - pip install open
Th2En & Th2Zh: The large-scale datasets for Thai text cross-lingual summarization
Th2En & Th2Zh: The large-scale datasets for Thai text cross-lingual summarization 📥 Download Datasets 📥 Download Trained Models INTRODUCTION TH2ZH (
Custom function scheduler TUI (text-based user interface) in the console
Free & simple way to encipher text
VenSipher VenSipher is a free medium through which text can be enciphered. It can convert any text into an unrecognizable secret text that can only be
The Scary Story - A Text Adventure
This is a text adventure which I made in python 3. This is one of my first big projects so any feedback would be greatly appreciated.
📝An easy-to-use package to restore punctuation of the text.
✏️ rpunct - Restore Punctuation This repo contains code for Punctuation restoration. This package is intended for direct use as a punctuation restorat
Cross-modal Retrieval using Transformer Encoder Reasoning Networks (TERN). With use of Metric Learning and FAISS for fast similarity search on GPU
Cross-modal Retrieval using Transformer Encoder Reasoning Networks This project reimplements the idea from "Transformer Reasoning Network for Image-Te
Code repository for our paper "Learning to Generate Scene Graph from Natural Language Supervision" in ICCV 2021
Scene Graph Generation from Natural Language Supervision This repository includes the Pytorch code for our paper "Learning to Generate Scene Graph fro
Official PyTorch Implementation of paper EAN: Event Adaptive Network for Efficient Action Recognition
Official implementation of Sparse Transformer-based Action Recognition
STAR Official implementation of Sparse Transformer-based Action Recognition Dataset download NTU RGB+D 60 action recognition of 2D/3D skeleton fro
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP The implementation of paper CLIP2Video: Mastering Video-Text Retrieval via Image CLIP. CLIP2
Pyramid Pooling Transformer for Scene Understanding
Pyramid Pooling Transformer for Scene Understanding Requirements: torch 1.6+ torchvision 0.7.0 timm==0.3.2 Validated on torch 1.6.0, torchvision 0.7.0
Pretty-doc - Composable text objects with python
pretty-doc from __future__ import annotations from dataclasses import dataclass
An official PyTorch Implementation of Boundary-aware Self-supervised Learning for Video Scene Segmentation (BaSSL)
Deasciify-highlighted - A Python script for deasciifying text to Turkish and copying clipboard
deasciify-highlighted is a Python script for deasciifying text to Turkish and copying clipboard.
BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition 2022)
BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition
A Unified Framework and Analysis for Structured Knowledge Grounding
UnifiedSKG 📚 : Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models Code for paper UnifiedSKG: Unifying and Mu
Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection
Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection This material is supplementary code for paper accepted in ICDAR 2021 We h
[AI6122] Text Data Management & Processing
[AI6122] Text Data Management & Processing is an elective course of MSAI, SCSE, NTU, Singapore. The repository corresponds to the AI6122 of Semester 1, AY2021-2022, starting from 08/2021. The instructor of this course is Prof. Sun Aixin.
Human pose estimation from video plays a critical role in various applications such as quantifying physical exercises, sign language recognition, and full-body gesture control.
Pose Detection Project Description: Human pose estimation from video plays a critical role in various applications such as quantifying physical exerci
Step by step on how to create a vision recognition model using LOBE.ai, export the model and run the model in an Azure Function
As a part of the HAKE project, includes the reproduced SOTA models and the corresponding HAKE-enhanced versions (CVPR2020).
HAKE-Action HAKE-Action (TensorFlow) is a project to open the SOTA action understanding studies based on our Human Activity Knowledge Engine. It inclu
Desktop utility to download images/videos/music/text from various websites, and more
A Python library for generating new text from existing samples.
ReMarkov is a Python library for generating text from existing samples using Markov chains. You can use it to customize all sorts of writing from birt
Action Recognition for Self-Driving Cars
Action Recognition for Self-Driving Cars This repo contains the codes for the 2021 Fall semester project "Action Recognition for Self-Driving Cars" at
Text editor built with Python tkinter to convert English text to other languages with the help of polyglot.
Transliterator Text Editor This is a simple transliteration program which is used to convert english word to phonetically matching word in another lan
A python notification tool used for sending you text messages when certain conditions are met in the game, Neptune's Pride.
The text-based version of my App Blocker that I am planning on converting to GUI soon.
App-Blocker The text-based version of my App Blocker that I am planning on converting to GUI soon. Currently I am just uploading the appblocker.py file,
Unsupervised text tokenizer focused on computational efficiency
YouTokenToMe YouTokenToMe is an unsupervised text tokenizer focused on computational efficiency. It currently implements fast Byte Pair Encoding (BPE)
Bot that embeds a random hysterical meme from Reddit into your text channel as an embedded message, using an API call.
Discord_Meme_Bot 🤣 Bot that embeds a random hysterical meme from Reddit into your text channel as an embedded message, using an API call. Add the bot
A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population
DeepKE is a knowledge extraction toolkit supporting low-resource and document-level scenarios for entity, relation and attribute extraction. We provide comprehensive documents, Google Colab tutorials, and online demo for beginners.
Similarity-based Gray-box Adversarial Attack Against Deep Face Recognition
Similarity-based Gray-box Adversarial Attack Against Deep Face Recognition Introduction Run attack: SGADV.py Objective function: foolbox/attacks/gradi
Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples
Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples (WACV 2022) and Beyond Simple Meta-Learning: Multi-Purpose Models for Multi-Domain, Active and Continual Few-Shot Learning (TPAMI 2022 - in submission)
Weakly Supervised Scene Text Detection using Deep Reinforcement Learning
Weakly Supervised Scene Text Detection using Deep Reinforcement Learning This repository contains the setup for all experiments performed in our Paper
Long text token classification using LongFormer
How to detect objects in real time using Jupyter Notebook and neural networks, with YOLOv3
Real-Time Object Recognition from Your Desktop Screen. In this post, I will explain how to build a simple program to detect objects from your desktop
AI for recognising traffic signs using Keras [TensorFlow]
Traffic Signs Recognition ⚠️ 🚦 Fundamentals of Intelligent Systems Introduction 📄 Development of a neural network capable of recognizing nine differ
A machine learning project which can detect and predict the skin disease through image recognition.
ML-Project-2021 A machine learning project which can detect and predict the skin disease through image recognition. The dataset used for this is the H
A collection of useful functions for writers to analyze text/stories.
AuthorTools AuthorTools provides a multitude of functions for easily analyzing (your?) writing. AuthorTools is made especially for creative writers wi
Face Recognition & AI Based Smart Attendance Monitoring System.
In today’s generation, authentication is one of the biggest problems in our society. So, one of the most known techniques used for authentication is h
Frappe tinymce - Frappe app to replace default text editor with tinymce
Frappe tinyMCE tinyMCE Text Editor for frappe apps Replace frappe's Quill Text E
Arabic-Phonetic-Output - You can input the phonetic version of any Arabic text here. This software will show you output in Arabic (with vowels)
Arabic-Phonetic-Output You can input the phonetic version of any Arabic text her
Food recognition model using convolutional neural network & computer vision
Food recognition model using convolutional neural network & computer vision. The goal is to match or beat the DeepFood Research Paper
Predicting the price of books from a provided dataset of various book features, using various proposed methods and models.
Object recognition using Azure Custom Vision AI and Azure Functions
Step by Step on how to create an object recognition model using Custom Vision, export the model and run the model in an Azure Function
2021 AI CUP Competition on Traditional Chinese Scene Text Recognition - Intermediate Contest
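Traditional Chinese Scene Text Recognition: code documentation. Team: 這就是我. Members: 蔣明憲, 唐碩謙, 黃玥菱, 林冠霆, 蕭靖騰. Contents: environment and packages; installation; folder layout; preprocessing: creating the detection training annotation files; preprocessing: creating the classification training samples. part.py: crops classification training samples out of the JSON. Class.py: sorts the cropped samples into folders by character.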
Image Recognition Model Generator
Takes a user-inputted query and generates a machine learning image recognition model that determines if an inputted image is or isn't their query
Text mining project; Using distilBERT to predict authors in the classification task authorship attribution.
DistilBERT-Text-mining-authorship-attribution Dataset used: https://www.kaggle.com/azimulh/tweets-data-for-authorship-attribution-modelling/version/2
Image-generation-baseline - MUGE Text To Image Generation Baseline
MUGE Text To Image Generation Baseline Requirements and Installation More detail
BERN2: an advanced neural biomedical named entity recognition and normalization tool
BERN2 We present BERN2 (Advanced Biomedical Entity Recognition and Normalization), a tool that improves the previous neural network-based NER tool by
Code for ICCV 2021 paper Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes using Scene Graphs
Graph-to-3D This is the official implementation of the paper Graph-to-3d: End-to-End Generation and Manipulation of 3D Scenes Using Scene Graphs | arx
Vector Quantized Diffusion Model for Text-to-Image Synthesis
Vector Quantized Diffusion Model for Text-to-Image Synthesis Due to company policy, I have to set microsoft/VQ-Diffusion to private for now, so I prov
Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with predefined topics from an unlabeled document corpus.
Lbl2Vec Lbl2Vec is an algorithm for unsupervised document classification and unsupervised document retrieval. It automatically generates jointly embed
A menu for pygame. Simple, and easy to use
pygame-menu Source repo on GitHub, and run it on Repl.it Introduction Pygame-menu is a python-pygame library for creating menus and GUIs. It supports
👄 The most accurate natural language detection library for Python, suitable for long and short text alike
1. What does this library do? Its task is simple: It tells you which language some provided textual data is written in. This is very useful as a prepr
VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition
VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition Usage First, install PyTorch 1.7.1+, torchvision 0.8.2
Blender Add-on to Add Metal Materials to Your Scene
Blender QMM (Quick Metal Materials) Blender Addon to Add Metal Materials to Your Scene Installation Download the latest ZIP from Releases. Usage This
A Telegram bot written in python.
telegram_bot This bot is currently a beta project. Features A telegram bot which can: Send current COVID-19 cases/stats of Germany Send current worth
The official TensorFlow implementation of the paper Action Transformer: A Self-Attention Model for Short-Time Pose-Based Human Action Recognition
Action Transformer A Self-Attention Model for Short-Time Human Action Recognition This repository contains the official TensorFlow implementation of t
Multi-Stage Episodic Control for Strategic Exploration in Text Games
XTX: eXploit - Then - eXplore Requirements First clone this repo using git clone https://github.com/princeton-nlp/XTX.git Please create two conda envi
HuSpaCy: industrial-strength Hungarian natural language processing
HuSpaCy: Industrial-strength Hungarian NLP HuSpaCy is a spaCy model and a library providing industrial-strength Hungarian language processing faciliti
This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".
Introduction This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents". If
Towards Boosting the Accuracy of Non-Latin Scene Text Recognition
Convolutional Recurrent Neural Network + CTCLoss | STAR-Net Code for paper "Towards Boosting the Accuracy of Non-Latin Scene Text Recognition" Depende
Creating a python chatbot that Starbucks users can text to place an order + help cut wait time of a normal coffee.