1592 Repositories
Python ocr-open-dataset Libraries
This is a tensorflow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.My blog:
PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network Introduction This is a tensorflow re-implementation of PSENet: Shape Robu
Scene text detection and recognition based on Extremal Region(ER)
Scene text recognition A real-time scene text recognition algorithm. Our system is able to recognize text in unconstrain background. This algorithm is
Code for the paper STN-OCR: A single Neural Network for Text Detection and Text Recognition
STN-OCR: A single Neural Network for Text Detection and Text Recognition This repository contains the code for the paper: STN-OCR: A single Neural Net
This is a c++ project deploying a deep scene text reading pipeline with tensorflow. It reads text from natural scene images. It uses frozen tensorflow graphs. The detector detect scene text locations. The recognizer reads word from each detected bounding box.
DeepSceneTextReader This is a c++ project deploying a deep scene text reading pipeline. It reads text from natural scene images. Prerequsites The proj
caffe re-implementation of R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection
R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection Abstract This is a caffe re-implementation of R2CNN: Rotational Region CNN fo
Python library to extract tabular data from images and scanned PDFs
Overview ExtractTable - API to extract tabular data from images and scanned PDFs The motivation is to make it easy for developers to extract tabular d
Extract tables from scanned image PDFs using Optical Character Recognition.
ocr-table This project aims to extract tables from scanned image PDFs using Optical Character Recognition. Install Requirements Tesseract OCR sudo apt
TableBank: A Benchmark Dataset for Table Detection and Recognition
TableBank TableBank is a new image-based table detection and recognition dataset built with novel weak supervision from Word and Latex documents on th
Turn images of tables into CSV data. Detect tables from images and run OCR on the cells.
Table of Contents Overview Requirements Demo Modules Overview This python package contains modules to help with finding and extracting tabular data fr
Detect textlines in document images
Textline Detection Detect textlines in document images Introduction This tool performs border, region and textline detection from document image data
Use Convolutional Recurrent Neural Network to recognize the Handwritten line text image without pre segmentation into words or characters. Use CTC loss Function to train.
Handwritten Line Text Recognition using Deep Learning with Tensorflow Description Use Convolutional Recurrent Neural Network to recognize the Handwrit
Handwritten_Text_Recognition
Deep Learning framework for Line-level Handwritten Text Recognition Short presentation of our project Introduction Installation 2.a Install conda envi
OCR software for recognition of handwritten text
Handwriting OCR The project tries to create software for recognition of a handwritten text from photos (also for Czech language). It uses computer vis
Handwritten Text Recognition (HTR) system implemented with TensorFlow.
Handwritten Text Recognition with TensorFlow Update 2021: more robust model, faster dataloader, word beam search decoder also available for Windows Up
This repository lets you train neural networks models for performing end-to-end full-page handwriting recognition using the Apache MXNet deep learning frameworks on the IAM Dataset.
Handwritten Text Recognition (OCR) with MXNet Gluon These notebooks have been created by Jonathan Chung, as part of his internship as Applied Scientis
Let's explore how we can extract text from forms
Form Segmentation Let's explore how we can extract text from any forms / scanned pages. Objectives The goal is to find an algorithm that can extract t
Document Layout Analysis
Eynollah Document Layout Analysis Introduction This tool performs document layout analysis (segmentation) from image data and returns the results as P
OCR-D-compliant page segmentation
ocrd_segment This repository aims to provide a number of OCR-D-compliant processors for layout analysis and evaluation. Installation In your virtual e
A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.
LAREX LAREX is a semi-automatic open-source tool for layout analysis on early printed books. It uses a rule based connected components approach which
Detect handwritten words in a text-line (classic image processing method).
Word segmentation Implementation of scale space technique for word segmentation as proposed by R. Manmatha and N. Srimal. Even though the paper is fro
Detect textlines in document images
Textline Detection Detect textlines in document images Introduction This tool performs border, region and textline detection from document image data
An application of high resolution GANs to dewarp images of perturbed documents
Docuwarp This project is focused on dewarping document images through the usage of pix2pixHD, a GAN that is useful for general image to image translat
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
Dataset Cartography Code for the paper Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics at EMNLP 2020. This repository cont
Code and model benchmarks for "SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology"
NeurIPS 2020 SEVIR Code for paper: SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology Requirement
A python script to lookup Passport Index Dataset
visa-cli A python script to lookup Passport Index Dataset Installation pip install visa-cli Usage usage: visa-cli [-h] [-d DESTINATION_COUNTRY] [-f]
[CIKM 2019] Code and dataset for "Fi-GNN: Modeling Feature Interactions via Graph Neural Networks for CTR Prediction"
FiGNN for CTR prediction The code and data for our paper in CIKM2019: Fi-GNN: Modeling Feature Interactions via Graph Neural Networks for CTR Predicti
Open source code for Paper "A Co-Interactive Transformer for Joint Slot Filling and Intent Detection"
A Co-Interactive Transformer for Joint Slot Filling and Intent Detection This repository contains the PyTorch implementation of the paper: A Co-Intera
Nimbus - Open Source Cloud Computing Software - 100% Apache2 licensed
⚠️ The Nimbus infrastructure project is no longer under development. ⚠️ For more information, please read the news announcement. If you are interested
Open source home automation that puts local control and privacy first
Home Assistant Open source home automation that puts local control and privacy first. Powered by a worldwide community of tinkerers and DIY enthusiast
🦔 PostHog is developer-friendly, open-source product analytics.
PostHog provides open-source product analytics, built for developers. Automate the collection of every event on your website or app, with no need to send data to 3rd parties. With just 1 click you can deploy on your own infrastructure, having full API/SQL access to the underlying data.
SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.
The goal is to create a single, flexible, and user-friendly toolkit that can be used to easily develop state-of-the-art speech technologies, including systems for speech recognition, speaker recognition, speech enhancement, multi-microphone signal processing and many others.
The first open-source PyTgCalls-based project.
SU Music Player — The first open-source PyTgCalls based Pyrogram bot to play music in voice chats Requirements FFmpeg NodeJS 15+ Python 3.7+ Deploymen
Open Source Intelligence gathering tool aimed at reducing the time spent harvesting information from open sources.
The Recon-ng Framework Recon-ng content now available on Pluralsight! Recon-ng is a full-featured reconnaissance framework designed with the goal of p
LinOTP - the open source solution for two factor authentication
LinOTP LinOTP - the Open Source solution for multi-factor authentication Copyright © 2010-2019 KeyIdentity GmbH Coypright © 2019- arxes-tolina GmbH In
Hubble is a modular, open-source security compliance framework. The project provides on-demand profile-based auditing, real-time security event notifications, alerting, and reporting. HubbleStack is a free and open source project made possible by Adobe. https://github.com/adobe
Welcome to HubbleStack!! You can find the docs here You can file an issue here Follow us on Twitter! Development Below are sample instructions to setu
An open-source post-exploitation framework for students, researchers and developers.
Questions? Join the Discord support server Disclaimer: This project should be used for authorized testing or educational purposes only. BYOB is an ope
Spinnaker is an open source, multi-cloud continuous delivery platform for releasing software changes with high velocity and confidence.
Welcome to the Spinnaker Project Spinnaker is an open-source continuous delivery platform for releasing software changes with high velocity and confid
Ganeti is a virtual machine cluster management tool built on top of existing virtualization technologies such as Xen or KVM and other open source software.
Ganeti 3.0 =========== For installation instructions, read the INSTALL and the doc/install.rst files. For a brief introduction, read the ganeti(7) m
An open source multi-tool for exploring and publishing data
Datasette An open source multi-tool for exploring and publishing data Datasette is a tool for exploring and publishing data. It helps people take data
Patchwork is a web-based patch tracking system designed to facilitate the contribution and management of contributions to an open-source project.
Patchwork Patchwork is a patch tracking system for community-based projects. It is intended to make the patch management process easier for both the p
Conan - The open-source C/C++ package manager
Conan Decentralized, open-source (MIT), C/C++ package manager. Homepage: https://conan.io/ Github: https://github.com/conan-io/conan Docs: https://doc
某学校选课系统GIF验证码数据集 + Baseline模型 + 上下游相关工具
elective-dataset-2021spring 某学校2021春季选课系统GIF验证码数据集(29338张) + 准确率98.4%的Baseline模型 + 上下游相关工具。 数据集采用 知识共享署名-非商业性使用 4.0 国际许可协议 进行许可。 Baseline模型和上下游相关工具采用
Dogs classification with Deep Metric Learning using some popular losses
Tsinghua Dogs classification with Deep Metric Learning 1. Introduction Tsinghua Dogs dataset Tsinghua Dogs is a fine-grained classification dataset fo
Contract Understanding Atticus Dataset
Contract Understanding Atticus Dataset This repository contains code for the Contract Understanding Atticus Dataset (CUAD), a dataset for legal contra
Open world survival environment for reinforcement learning
Crafter Open world survival environment for reinforcement learning. Highlights Crafter is a procedurally generated 2D world, where the agent finds foo
DeFMO: Deblurring and Shape Recovery of Fast Moving Objects (CVPR 2021)
Evaluation, Training, Demo, and Inference of DeFMO DeFMO: Deblurring and Shape Recovery of Fast Moving Objects (CVPR 2021) Denys Rozumnyi, Martin R. O
Open Sound Strip, Sequence or Record in Audacity
Audacity Tools For Blender Sound editing in Blender Video Sequence Editor with Audacity integrated. Send/receive the full edited sequence or single st
Object Depth via Motion and Detection Dataset
ODMD Dataset ODMD is the first dataset for learning Object Depth via Motion and Detection. ODMD training data are configurable and extensible, with ea
The MATH Dataset
Measuring Mathematical Problem Solving With the MATH Dataset This is the repository for Measuring Mathematical Problem Solving With the MATH Dataset b
Open standard for machine learning interoperability
Open Neural Network Exchange (ONNX) is an open ecosystem that empowers AI developers to choose the right tools as their project evolves. ONNX provides
Open Source Computer Vision Library
OpenCV: Open Source Computer Vision Library Resources Homepage: https://opencv.org Courses: https://opencv.org/courses Docs: https://docs.opencv.org/m
A Free and Open Source Python Library for Multiobjective Optimization
Platypus What is Platypus? Platypus is a framework for evolutionary computing in Python with a focus on multiobjective evolutionary algorithms (MOEAs)
open-source feature selection repository in python
scikit-feature Feature selection repository scikit-feature in Python. scikit-feature is an open-source feature selection repository in Python develope
An open source python library for automated feature engineering
"One of the holy grails of machine learning is to automate more and more of the feature engineering process." ― Pedro Domingos, A Few Useful Things to
Data loaders and abstractions for text and NLP
torchtext This repository consists of: torchtext.datasets: The raw text iterators for common NLP datasets torchtext.data: Some basic NLP building bloc
Open source time series library for Python
PyFlux PyFlux is an open source time series library for Python. The library has a good array of modern time series models, as well as a flexible array
Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)
Karate Club is an unsupervised machine learning extension library for NetworkX. Please look at the Documentation, relevant Paper, Promo Video, and Ext
SU Music Player — The first open-source PyTgCalls based Pyrogram bot to play music in voice chats
SU Music Player — The first open-source PyTgCalls based Pyrogram bot to play music in voice chats Note Neither this, or PyTgCalls are fully
An open source image editor which can manipulate an image in many ways!
Image Editor - An open source image editor which can manipulate an image in many ways! If you need any more modes in repo or I
Upbit(업비트) Cryptocurrency Exchange OPEN API Client for Python
Base Repository Python Upbit Client Repository Upbit OPEN API Client @Author: uJhin @GitHub: https://github.com/uJhin/upbit-client/ @Officia
[CVPR 2021] Released code for Counterfactual Zero-Shot and Open-Set Visual Recognition
Counterfactual Zero-Shot and Open-Set Visual Recognition This project provides implementations for our CVPR 2021 paper Counterfactual Zero-S
OPEM (Open Source PEM Fuel Cell Simulation Tool)
Table of contents What is PEM? Overview Installation Usage Executable Library Telegram Bot Try OPEM in Your Browser! MATLAB Issues & Bug Reports Contr
Open Delmic Microscope Software
Odemis Odemis (Open Delmic Microscope Software) is the open-source microscopy software of Delmic B.V. Odemis is used for controlling microscopes of De
CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers catalog.data.gov, open.canada.ca/data, data.humdata.org among many other sites.
CKAN: The Open Source Data Portal Software CKAN is the world’s leading open-source data portal platform. CKAN makes it easy to publish, share and work
An open-source application for biological image analysis
CellProfiler is a free open-source software designed to enable biologists without training in computer vision or programming to quantitatively measure
The docker-based Open edX distribution designed for peace of mind
Tutor: the docker-based Open edX distribution designed for peace of mind Tutor is a docker-based Open edX distribution, both for production and local
The Open edX platform, the software that powers edX!
This is the core repository of the Open edX software. It includes the LMS (student-facing, delivering courseware), and Studio (course authoring) compo
Zulip server and webapp - powerful open source team chat
Zulip overview Zulip is a powerful, open source group chat application that combines the immediacy of real-time chat with the productivity benefits of
Securely and anonymously share files, host websites, and chat with friends using the Tor network
OnionShare OnionShare is an open source tool that lets you securely and anonymously share files, host websites, and chat with friends using the Tor ne
A free & open modern, fast email client with user-friendly encryption and privacy features
Welcome to Mailpile! Introduction Mailpile (https://www.mailpile.is/) is a modern, fast web-mail client with user-friendly encryption and privacy feat
GlobaLeaks is free, open source software enabling anyone to easily set up and maintain a secure whistleblowing platform.
GlobaLeaks is free, open souce software enabling anyone to easily set up and maintain a secure whistleblowing platform. Continous Integration and Test
The open-source core of Pinry, a tiling image board system for people who want to save, tag, and share images, videos and webpages in an easy to skim through format.
The open-source core of Pinry, a tiling image board system for people who want to save, tag, and share images, videos and webpages in an easy to skim
Scan, index, and archive all of your paper documents
[ en | de | el ] Important news about the future of this project It's been more than 5 years since I started this project on a whim as an effort to tr
One webpage for every book ever published!
Open Library Open Library is an open, editable library catalog, building towards a web page for every book ever published. Are you looking to get star
Open source platform for the machine learning lifecycle
MLflow: A Machine Learning Lifecycle Platform MLflow is a platform to streamline machine learning development, including tracking experiments, packagi
Free and open-source digital preservation system designed to maintain standards-based, long-term access to collections of digital objects.
Archivematica By Artefactual Archivematica is a web- and standards-based, open-source application which allows your institution to preserve long-term
🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
ArchiveBox Open-source self-hosted web archiving. ▶️ Quickstart | Demo | Github | Documentation | Info & Motivation | Community | Roadmap "Your own pe
:mag: Ambar: Document Search Engine
🔍 Ambar: Document Search Engine Ambar is an open-source document search engine with automated crawling, OCR, tagging and instant full-text search. Am
A comprehensive, feature-rich, open source, and portable, collection of Solitaire games.
PySol Fan Club edition This is an open source and portable (Windows, Linux and Mac OS X) collection of Card Solitaire/Patience games written in Python
OpenShot Video Editor is an award-winning free and open-source video editor for Linux, Mac, and Windows, and is dedicated to delivering high quality video editing and animation solutions to the world.
OpenShot Video Editor is an award-winning free and open-source video editor for Linux, Mac, and Windows, and is dedicated to delivering high quality v
GNU Radio – the Free and Open Software Radio Ecosystem
GNU Radio is a free & open-source software development toolkit that provides signal processing blocks to implement software radios. It can be used wit
changedetection.io - The best and simplest self-hosted website change detection monitoring service
changedetection.io - The best and simplest self-hosted website change detection monitoring service. An alternative to Visualping, Watchtower etc. Designed for simplicity - the main goal is to simply monitor which websites had a text change. Open source web page change detection.
thumbor is an open-source photo thumbnail service by globo.com
Survey If you use thumbor, please take 1 minute and answer this survey? It's only 2 questions and one is multiple choice!!! thumbor is a smart imaging
Python-based tools for document analysis and OCR
ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so
Transfer SemanticKITTI labeles into other dataset/sensor formats.
LiDAR-Transfer Transfer SemanticKITTI labeles into other dataset/sensor formats. Content Convert datasets (NUSCENES, FORD, NCLT) to KITTI format Minim
A demo titiler for Sentinel 2 Digital Twin dataset
This is a DEMO custom api built on top of TiTiler to create Web Map Tiles from the Digital Twin Sentinel-2 COG created by Sinergise
FastOCR is a desktop application for OCR API.
FastOCR FastOCR is a desktop application for OCR API. Installation Arch Linux fastocr-git @ AUR Build from AUR or install with your favorite AUR helpe
Free and Open Source Machine Translation API. 100% self-hosted, no limits, no ties to proprietary services. Built on top of Argos Translate.
LibreTranslate Try it online! | API Docs Free and Open Source Machine Translation API, entirely self-hosted. Unlike other APIs, it doesn't rely on pro
An Open-Source Package for Neural Relation Extraction (NRE)
OpenNRE We have a DEMO website (http://opennre.thunlp.ai/). Try it out! OpenNRE is an open-source and extensible toolkit that provides a unified frame
Basic Utilities for PyTorch Natural Language Processing (NLP)
Basic Utilities for PyTorch Natural Language Processing (NLP) PyTorch-NLP, or torchnlp for short, is a library of basic utilities for PyTorch NLP. tor
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
textgenrnn Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code, or quickly tr
An open source library for deep learning end-to-end dialog systems and chatbots.
DeepPavlov is an open-source conversational AI library built on TensorFlow, Keras and PyTorch. DeepPavlov is designed for development of production re
Open Source Neural Machine Translation in PyTorch
OpenNMT-py: Open-Source Neural Machine Translation OpenNMT-py is the PyTorch version of the OpenNMT project, an open-source (MIT) neural machine trans
An open-source NLP research library, built on PyTorch.
An Apache 2.0 NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks. Quic
Data loaders and abstractions for text and NLP
torchtext This repository consists of: torchtext.data: Generic data loaders, abstractions, and iterators for text (including vocabulary and word vecto
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
Rasa Open Source Rasa is an open source machine learning framework to automate text-and voice-based conversations. With Rasa, you can build contextual
The open-source tool for building high-quality datasets and computer vision models
The open-source tool for building high-quality datasets and computer vision models. Website • Docs • Try it Now • Tutorials • Examples • Blog • Commun
Automatically Visualize any dataset, any size with a single line of code. Created by Ram Seshadri. Collaborators Welcome. Permission Granted upon Request.
AutoViz Automatically Visualize any dataset, any size with a single line of code. AutoViz performs automatic visualization of any dataset with one lin
An open-source plotting library for statistical data.
Lets-Plot Lets-Plot is an open-source plotting library for statistical data. It is implemented using the Kotlin programming language. The design of Le
3D plotting and mesh analysis through a streamlined interface for the Visualization Toolkit (VTK)
PyVista Deployment Build Status Metrics Citation License Community 3D plotting and mesh analysis through a streamlined interface for the Visualization