454 Repositories
Python document-management Libraries
organize - The file management automation tool
organize - The file management automation tool
Command line program to download documents from web portals.
command line document download made easy Highlights list available documents in json format or download them filter documents using string matching re
Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network.
Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network
Code and dataset for ACL2018 paper "Exploiting Document Knowledge for Aspect-level Sentiment Classification"
Aspect-level Sentiment Classification Code and dataset for ACL2018 [paper] ‘‘Exploiting Document Knowledge for Aspect-level Sentiment Classification’’
Monitor your Binance portfolio
Binance Report Bot The intent of this bot is to take a snapshot of your binance wallet, e.g. the current balances and store it for further plotting. I
Datargsing is a data management and manipulation Python library
Datargsing What is It? Datargsing is a data management and manipulation Python library which is currently in deving Why this library is good? This Pyt
SDL: Synthetic Document Layout dataset
SDL is the project that synthesizes document images. It facilitates multiple-level labeling on document images and can generate in multiple languages.
document organizer with tags and full-text-search, in a simple and clean sqlite3 schema
document organizer with tags and full-text-search, in a simple and clean sqlite3 schema
gym-anm is a framework for designing reinforcement learning (RL) environments that model Active Network Management (ANM) tasks in electricity distribution networks.
gym-anm is a framework for designing reinforcement learning (RL) environments that model Active Network Management (ANM) tasks in electricity distribution networks. It is built on top of the OpenAI Gym toolkit.
Implementation for our AAAI2021 paper (Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction).
SSAN Introduction This is the pytorch implementation of the SSAN model (see our AAAI2021 paper: Entity Structure Within and Throughout: Modeling Menti
Incident Response Process and Playbooks | Goal: Playbooks to be Mapped to MITRE Attack Techniques
PURPOSE OF PROJECT That this project will be created by the SOC/Incident Response Community Develop a Catalog of Incident Response Playbook for every
Make nixos usable for non-technical users through a settings / package management GUI.
Nix-Gui Make nixos usable for non-technical users through a settings / package management GUI. Motives The declarative nature of ni
A collection of modules I have created to programmatically search for/download imagery from live cam feeds across the state of California.
A collection of modules that I have created to programmatically search for/download imagery from all publicly available live cam feeds across the state of California. In no way am I affiliated with any of these organizations and these modules/methods of gathering imagery are completely unofficial.
DBMS Mini-project: Recruitment Management System
# Hire-ME DBMS Mini-project: Recruitment Management System. 💫 ✨ Features Python + MYSQL using mysql.connector library Recruiter and Client Panel Beau
Cross-Document Coreference Resolution
Cross-Document Coreference Resolution This repository contains code and models for end-to-end cross-document coreference resolution, as decribed in ou
A embed able annotation tool for end to end cross document co-reference
CoRefi CoRefi is an emebedable web component and stand alone suite for exaughstive Within Document and Cross Document Coreference Anntoation. For a de
nvitop, an interactive NVIDIA-GPU process viewer, the one-stop solution for GPU process management
An interactive NVIDIA-GPU process viewer, the one-stop solution for GPU process management.
❤️ DaisyX 2.0 ❤️ A Powerful, Smart And Simple Group Manager ... Written with AioGram , Pyrogram and Telethon...
❤️ DaisyX 2.0 ❤️ A Powerful, Smart And Simple Group Manager ... Written with AioGram , Pyrogram and Telethon... ⭐️ Thanks to everyone who starred Dais
Source code for paper "Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling", AAAI 2021
ATLOP Code for AAAI 2021 paper Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling. If you make use of this co
A declarative Kubeflow Management Tool inspired by Terraform
🍭 KRSH is Alpha version, so many bugs can be reported. If you find a bug, please write an Issue and grow the project together! A declarative Kubeflow
Code for NAACL 2021 full paper "Efficient Attentions for Long Document Summarization"
LongDocSum Code for NAACL 2021 paper "Efficient Attentions for Long Document Summarization" This repository contains data and models needed to reprodu
GitHub Actions Version Updater Updates All GitHub Action Versions in a Repository and Creates a Pull Request with the Changes.
GitHub Actions Version Updater GitHub Actions Version Updater is GitHub Action that is used to update other GitHub Actions in a Repository and create
Code for paper "Document-Level Argument Extraction by Conditional Generation". NAACL 21'
Argument Extraction by Generation Code for paper "Document-Level Argument Extraction by Conditional Generation". NAACL 21' Dependencies pytorch=1.6 tr
Simple HTML and PDF document generator for Python - with built-in support for popular data analysis and plotting libraries.
Esparto is a simple HTML and PDF document generator for Python. Its primary use is for generating shareable single page reports with content from popular analytics and data science libraries.
Simple but maybe too simple config management through python data classes. We use it for machine learning.
👩✈️ Coqpit Simple, light-weight and no dependency config handling through python data classes with to/from JSON serialization/deserialization. Curre
Beanie - is an Asynchronous Python object-document mapper (ODM) for MongoDB
Beanie - is an Asynchronous Python object-document mapper (ODM) for MongoDB, based on Motor and Pydantic.
API spec validator and OpenAPI document generator for Python web frameworks.
API spec validator and OpenAPI document generator for Python web frameworks.
Management commands to help backup and restore your project database and media files
Django Database Backup This Django application provides management commands to help backup and restore your project database and media files with vari
Write Django management command using the click CLI library
Django Click Project information: Automated code metrics: django-click is a library to easily write Django management commands using the click command
This is a repository for collecting global custom management extensions for the Django Framework.
Django Extensions Django Extensions is a collection of custom extensions for the Django Framework. Getting Started The easiest way to figure out what
Universal Command Line Interface for Amazon Web Services
This package provides a unified command line interface to Amazon Web Services.
Simple, light-weight config handling through python data classes with to/from JSON serialization/deserialization.
Simple but maybe too simple config management through python data classes. We use it for machine learning.
AHA is an incident management & communication framework to provide real-time alert customers when there are active AWS event(s). For customers with AWS Organizations, customers can get aggregated active account level events of all the accounts in the Organization. Customers not using AWS Organizations still benefit alerting at the account level.
Table of Contents Introduction Architecture Configuring an Endpoint Creating a Amazon Chime Webhook URL Creating a Slack Webhook URL Creating a Micros
PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem"(https://arxiv.org/pdf/1706.10059.pdf).
This is the original implementation of our paper, A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem (arXiv:1706.1
Reinforcement Learning for Portfolio Management
qtrader Reinforcement Learning for Portfolio Management Why Reinforcement Learning? Learns the optimal action, rather than models the market. Adaptive
Layout Parser is a deep learning based tool for document image layout analysis tasks.
A Python Library for Document Layout Understanding
xeuledoc - Fetch information about a public Google document.
xeuledoc - Fetch information about a public Google document.
Glauth management ui created with python/flask
glauth-ui Glauth-UI is a small flask web app i created to manage the minimal glauth ldap server. I created this as i wanted to use glauth for authenti
A django integration for huey task queue that supports multi queue management
django-huey This package is an extension of huey contrib djhuey package that allows users to manage multiple queues. Installation Using pip package ma
Top2Vec is an algorithm for topic modeling and semantic search.
Top2Vec is an algorithm for topic modeling and semantic search. It automatically detects topics present in text and generates jointly embedded topic, document and word vectors.
Web UI for your scripts with execution management
Script-server is a Web UI for scripts. As an administrator, you add your existing scripts into Script server and other users would be ab
jupyter/ipython experiment containers for GPU and general RAM re-use
ipyexperiments jupyter/ipython experiment containers and utils for profiling and reclaiming GPU and general RAM, and detecting memory leaks. About Thi
Python 3 Bindings for the NVIDIA Management Library
====== pyNVML ====== *** Patched to support Python 3 (and Python 2) *** ------------------------------------------------ Python bindings to the NVID
Fetch information about a public Google document.
xeuledoc Fetch information about any public Google document. It's working on : Google Docs Google Spreadsheets Google Slides Google Drawning Google My
"Very simple but works well" Computer Vision based ID verification solution provided by LibraX.
ID Verification by LibraX.ai This is the first free Identity verification in the market. LibraX.ai is an identity verification platform for developers
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Open Semantic Search https://opensemanticsearch.org Integrated search server, ETL framework for document processing (crawling, text extraction, text a
document image degradation
ocrodeg The ocrodeg package is a small Python library implementing document image degradation for data augmentation for handwriting recognition and OC
Binarize document images
Binarization Binarization for document images Examples Introduction This tool performs document image binarization (i.e. transform colour/grayscale to
A selectional auto-encoder approach for document image binarization
The code of this repository was used for the following publication. If you find this code useful please cite our paper: @article{Gallego2019, title =
PAGE XML format collection for document image page content and more
PAGE-XML PAGE XML format collection for document image page content and more For an introduction, please see the following publication: http://www.pri
Python-based tools for document analysis and OCR
ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so
Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector
CRAFT: Character-Region Awareness For Text detection Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector | Paper |
This repository provides train&test code, dataset, det.&rec. annotation, evaluation script, annotation tool, and ranking.
SCUT-CTW1500 Datasets We have updated annotations for both train and test set. Train: 1000 images [images][annos] Additional point annotation for each
Unofficial implementation of "TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images"
TableNet Unofficial implementation of ICDAR 2019 paper : TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from
Detect textlines in document images
Textline Detection Detect textlines in document images Introduction This tool performs border, region and textline detection from document image data
Document Layout Analysis
Eynollah Document Layout Analysis Introduction This tool performs document layout analysis (segmentation) from image data and returns the results as P
Page to PAGE Layout Analysis Tool
P2PaLA Page to PAGE Layout Analysis (P2PaLA) is a toolkit for Document Layout Analysis based on Neural Networks. 💥 Try our new DEMO for online baseli
A simple document layout analysis using Python-OpenCV
Run the application: python main.py *Note: For first time running the application, create a folder named "output". The application is a simple documen
Document Layout Analysis Projects
Layout_Analysis Introduction This is an implementation of RLSA and X-Y Cut with OpenCV Dependencies OpenCV 3.0+ How to use Compile with g++ : g++ -std
Generic framework for historical document processing
dhSegment dhSegment is a tool for Historical Document Processing. Its generic approach allows to segment regions and extract content from different ty
Detect textlines in document images
Textline Detection Detect textlines in document images Introduction This tool performs border, region and textline detection from document image data
Code for the paper "DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks" (ICCV '19)
DewarpNet This repository contains the codes for DewarpNet training. Recent Updates [May, 2020] Added evaluation images and an important note about Ma
Document Image Dewarping
Document image dewarping using text-lines and line Segments Abstract Conventional text-line based document dewarping methods have problems when handli
Library used to deskew a scanned document
Deskew //Note: Skew is measured in degrees. Deskewing is a process whereby skew is removed by rotating an image by the same amount as its skew but in
An application of high resolution GANs to dewarp images of perturbed documents
Docuwarp This project is focused on dewarping document images through the usage of pix2pixHD, a GAN that is useful for general image to image translat
Ralph is the CMDB / Asset Management system for data center and back office hardware.
Ralph Ralph is full-featured Asset Management, DCIM and CMDB system for data centers and back offices. Features: keep track of assets purchases and th
IP address management (IPAM) and data center infrastructure management (DCIM) tool.
NetBox is an IP address management (IPAM) and data center infrastructure management (DCIM) tool. Initially conceived by the network engineering team a
Ganeti is a virtual machine cluster management tool built on top of existing virtualization technologies such as Xen or KVM and other open source software.
Ganeti 3.0 =========== For installation instructions, read the INSTALL and the doc/install.rst files. For a brief introduction, read the ganeti(7) m
Daemon to ban hosts that cause multiple authentication errors
__ _ _ ___ _ / _|__ _(_) |_ ) |__ __ _ _ _ | _/ _` | | |/ /| '_ \/ _` | ' \
A generic JSON document store with sharing and synchronisation capabilities.
Kinto Kinto is a minimalist JSON storage service with synchronisation and sharing abilities. Online documentation Tutorial Issue tracker Contributing
Patchwork is a web-based patch tracking system designed to facilitate the contribution and management of contributions to an open-source project.
Patchwork Patchwork is a patch tracking system for community-based projects. It is intended to make the patch management process easier for both the p
Parse Robinhood 1099 Tax Document from PDF into CSV
Robinhood 1099 Parser This project converts Robinhood Securities 1099 tax document from PDF to CSV file. This tool will be helpful for those who need
“ Hey there 👋 I'm Daisy „ AI based Advanced Group Management Bot Suit For All Your Needs ❤️.. Source Code of @Daisyxbot
Project still under heavy development Everything will be changed in the release “ Hey there 👋 I'm Daisy „ AI based Advanced telegram Group Management
A shopping list and kitchen inventory management app.
Flask React Project This is the backend for the Flask React project. Getting started Clone this repository (only this branch) git clone https://github
CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers catalog.data.gov, open.canada.ca/data, data.humdata.org among many other sites.
CKAN: The Open Source Data Portal Software CKAN is the world’s leading open-source data portal platform. CKAN makes it easy to publish, share and work
RELATE is an Environment for Learning And TEaching
RELATE Relate is an Environment for Learning And TEaching RELATE is a web-based courseware package. It is set apart by the following features: Focus o
Conference planning tool: CfP, scheduling, speaker management
pretalx is a conference planning tool focused on providing the best experience for organisers, speakers, reviewers, and attendees alike. It handles th
Indico - A feature-rich event management system, made @ CERN, the place where the Web was born.
Indico Indico is: 🗓 a general-purpose event management tool; 🌍 fully web-based; 🧩 feature-rich but also extensible through the use of plugins; ⚖️ O
Agile project management platform. Built on top of Django and AngularJS
Taiga Backend Documentation Currently, we have authored three main documentation hubs: API: Our API documentation and reference for developing from Ta
Conference planning tool: CfP, scheduling, speaker management
pretalx is a conference planning tool focused on providing the best experience for organisers, speakers, reviewers, and attendees alike. It handles th
Open source platform for the machine learning lifecycle
MLflow: A Machine Learning Lifecycle Platform MLflow is a platform to streamline machine learning development, including tracking experiments, packagi
Indico - A feature-rich event management system, made @ CERN, the place where the Web was born.
Indico Indico is: 🗓 a general-purpose event management tool; 🌍 fully web-based; 🧩 feature-rich but also extensible through the use of plugins; ⚖️ O
:mag: Ambar: Document Search Engine
🔍 Ambar: Document Search Engine Ambar is an open-source document search engine with automated crawling, OCR, tagging and instant full-text search. Am
Admin Panel for GinoORM - ready to up & run (just add your models)
Gino-Admin Docs (state: in process): Gino-Admin docs Play with Demo (current master 0.2.3) Gino-Admin demo (login: admin, pass: 1234) Admin
Ready-to-use and customizable users management for FastAPI
FastAPI Users Ready-to-use and customizable users management for FastAPI Documentation: https://frankie567.github.io/fastapi-users/ Source Code: https
Dynamic image server for web and print
Quru Image Server - dynamic imaging for web and print QIS is a high performance web server for creating and delivering dynamic images. It is ideal for
This is a new web-based photo management application. Run it on your home server and it will let you find the right photo from your collection on any device. Smart filtering is made possible by object recognition, location awareness, color analysis and other ML algorithms.
Photonix Photo Manager This is a photo management application based on web technologies. Run it on your home server and it will let you find what you
Python-based tools for document analysis and OCR
ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so
Knowledge Management for Humans using Machine Learning & Tags
HyperTag HyperTag helps humans intuitively express how they think about their files using tags and machine learning.
Dockerized iCloud drive
iCloud-drive-docker is a simple iCloud drive client in Docker environment. It uses pyiCloud python library to interact with iCloud
Knowledge Management for Humans using Machine Learning & Tags
HyperTag helps humans intuitively express how they think about their files using tags and machine learning. Represent how you think using tags. Find what you look for using semantic search for your text documents (yes, even PDF's) and images.
Generate modern Python clients from OpenAPI
openapi-python-client Generate modern Python clients from OpenAPI 3.x documents. This generator does not support OpenAPI 2.x FKA Swagger. If you need
Topic Modelling for Humans
gensim – Topic Modelling in Python Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Targ
NLP library designed for reproducible experimentation management
Welcome to the Transfer NLP library, a framework built on top of PyTorch to promote reproducible experimentation and Transfer Learning in NLP You can
Beautiful visualizations of how language differs among document types.
Scattertext 0.1.0.0 A tool for finding distinguishing terms in corpora and displaying them in an interactive HTML scatter plot. Points corresponding t
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
Rasa Open Source Rasa is an open source machine learning framework to automate text-and voice-based conversations. With Rasa, you can build contextual
Customizable User Authorization & User Management: Register, Confirm, Login, Change username/password, Forgot password and more.
Flask-User v1.0 Attention: Flask-User v1.0 is a Production/Stable version. The previous version is Flask-User v0.6. User Authentication and Management
Flask user session management.
Flask-Login Flask-Login provides user session management for Flask. It handles the common tasks of logging in, logging out, and remembering your users
:rocket: CLI tool for FastAPI. Generating new FastAPI projects & boilerplates made easy.
Project generator and manager for FastAPI. Source Code: View it on Github Features 🚀 Creates customizable project boilerplate. Creates customizable a