10447 Repositories
Python python-hacking Libraries
~1000 book pages + OpenCV + python = page regions identified as paragraphs, lines, images, captions, etc.
cosc428-structor I had an open-ended Computer Vision assignment to complete, and an out-of-copyright book that I wanted to turn into an ebook. Convent
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Open Semantic Search https://opensemanticsearch.org Integrated search server, ETL framework for document processing (crawling, text extraction, text a
Detect text blocks and OCR poorly scanned PDFs in bulk. Python module available via pip.
doc2text doc2text extracts higher quality text by fixing common scan errors Developing text corpora can be a massive pain in the butt. Much of the tex
A Tensorflow model for text recognition (CNN + seq2seq with visual attention) available as a Python package and compatible with Google Cloud ML Engine.
Attention-based OCR Visual attention-based OCR model for image recognition with additional tools for creating TFRecords datasets and exporting the tra
Python-based tools for document analysis and OCR
ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so
第一届西安交通大学人工智能实践大赛(2018AI实践大赛--图片文字识别)第一名;仅采用densenet识别图中文字
OCR 第一届西安交通大学人工智能实践大赛(2018AI实践大赛--图片文字识别)冠军 模型结果 该比赛计算每一个条目的f1score,取所有条目的平均,具体计算方式在这里。这里的计算方式不对一句话里的相同文字重复计算,故f1score比提交的最终结果低: - train val f1score 0
python ocr using tesseract/ with EAST opencv detector
pytextractor python ocr using tesseract/ with EAST opencv text detector Uses the EAST opencv detector defined here with pytesseract to extract text(de
Textboxes : Image Text Detection Model : python package (tensorflow)
shinTB Abstract A python package for use Textboxes : Image Text Detection Model implemented by tensorflow, cv2 Textboxes Paper Review in Korean (My Bl
Textboxes implementation with Tensorflow (python)
tb_tensorflow A python implementation of TextBoxes Dependencies TensorFlow r1.0 OpenCV2 Code from Chaoyue Wang 03/09/2017 Update: 1.Debugging optimize
Textboxes_plusplus implementation with Tensorflow (python)
TextBoxes++-TensorFlow TextBoxes++ re-implementation using tensorflow. This project is greatly inspired by slim project And many functions are modifie
Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector
CRAFT: Character-Region Awareness For Text detection Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector | Paper |
AdvancedEAST is an algorithm used for Scene image text detect, which is primarily based on EAST, and the significant improvement was also made, which make long text predictions more accurate.https://github.com/huoyijie/raspberrypi-car
AdvancedEAST AdvancedEAST is an algorithm used for Scene image text detect, which is primarily based on EAST:An Efficient and Accurate Scene Text Dete
This is a tensorflow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.My blog:
PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network Introduction This is a tensorflow re-implementation of PSENet: Shape Robu
Lightning Fast Language Prediction 🚀
whatthelang Lightning Fast Language Prediction 🚀 Dependencies The dependencies can be installed using the requirements.txt file: $ pip install -r req
Python library to extract tabular data from images and scanned PDFs
Overview ExtractTable - API to extract tabular data from images and scanned PDFs The motivation is to make it easy for developers to extract tabular d
Extract tables from scanned image PDFs using Optical Character Recognition.
ocr-table This project aims to extract tables from scanned image PDFs using Optical Character Recognition. Install Requirements Tesseract OCR sudo apt
OCR software for recognition of handwritten text
Handwriting OCR The project tries to create software for recognition of a handwritten text from photos (also for Czech language). It uses computer vis
A simple document layout analysis using Python-OpenCV
Run the application: python main.py *Note: For first time running the application, create a folder named "output". The application is a simple documen
AntroPy: entropy and complexity of (EEG) time-series in Python
AntroPy is a Python 3 package providing several time-efficient algorithms for computing the complexity of time-series. It can be used for example to e
Python generation script for BitBirds
BitBirds generation script Intro This is published under MIT license, which means you can do whatever you want with it - entirely at your own risk. Pl
A python script to lookup Passport Index Dataset
visa-cli A python script to lookup Passport Index Dataset Installation pip install visa-cli Usage usage: visa-cli [-h] [-d DESTINATION_COUNTRY] [-f]
A Happy and lightweight Python Package that searches Google News RSS Feed and returns a usable JSON response and scrap complete article - No need to write scrappers for articles fetching anymore
GNews 🚩 A Happy and lightweight Python Package that searches Google News RSS Feed and returns a usable JSON response 🚩 As well as you can fetch full
Cyberbrain: Python debugging, redefined.
Cyberbrain1(电子脑) aims to free programmers from debugging.
DECAF: Deep Extreme Classification with Label Features
DECAF DECAF: Deep Extreme Classification with Label Features @InProceedings{Mittal21, author = "Mittal, A. and Dahiya, K. and Agrawal, S. and Sain
A forensic collection tool written in Python.
CHIRP A forensic collection tool written in Python. Watch the video overview 📝 Table of Contents 📝 Table of Contents 🧐 About 🏁 Getting Started Pre
Telegram RAT written in Python
teleRAT Python based RAT that uses Telegram for sending commands and receiving data to and from a victim computer. Setup.py Insert your API key into t
Implementation of Kalman Filter in Python
Kalman Filter in Python This is a basic example of how Kalman filter works in Python. I do plan on refactoring and expanding this repo in the future.
About A python based Apple Quicktime protocol,you can record audio and video from real iOS devices
介绍 本应用程序使用 python 实现,可以通过 USB 连接 iOS 设备进行屏幕共享 高帧率(30〜60fps) 高画质 低延迟(200ms) 非侵入性 支持多设备并行 Mac OSX 安装 python =3.7 brew install libusb pkg-config 如需使用 g
Use Ghidra Structs in Python
Strudra Welcome to Strudra, a way to craft Ghidra structs in python, using ghidra_bridge. Example First, init Strudra - you can pass in a custom Ghidr
track IP Address
ipX Table of Contents ipX Welcome Features Uses Author 📝 License Welcome find the location of an IP address. Specifically, you can get the following
Render Jupyter notebook in the terminal
jut - JUpyter notebook Terminal viewer. The command line tool view the IPython/Jupyter notebook in the terminal. Install pip install jut Usage $jut --
A Python pickling decompiler and static analyzer
Fickling Fickling is a decompiler, static analyzer, and bytecode rewriter for Python pickle object serializations. Pickled Python objects are in fact
A full-featured, hackable tiling window manager written and configured in Python
A full-featured, hackable tiling window manager written and configured in Python Features Simple, small and extensible. It's easy to write your own la
Nicotine+: A graphical client for the SoulSeek peer-to-peer system
Nicotine+ Nicotine+ is a graphical client for the Soulseek peer-to-peer file sharing network. Nicotine+ aims to be a pleasant, Free and Open Source (F
Viewer for NFO files
NFO Viewer NFO Viewer is a simple viewer for NFO files, which are "ASCII" art in the CP437 codepage. The advantages of using NFO Viewer instead of a t
A community-driven python bot that aims to be as simple as possible to serve humans with their everyday tasks
JARVIS on Messenger Just A Rather Very Intelligent System, now on Messenger! Messenger is now used by 1.2 billion people every month. With the launch
Open source home automation that puts local control and privacy first
Home Assistant Open source home automation that puts local control and privacy first. Powered by a worldwide community of tinkerers and DIY enthusiast
The Pants Build System
Pants Build System Pants is a scalable build system for monorepos: codebases containing multiple projects, often using multiple programming languages
A PyPI mirror client according to PEP 381 http://www.python.org/dev/peps/pep-0381/
This is a PyPI mirror client according to PEP 381 + PEP 503 http://www.python.org/dev/peps/pep-0381/. bandersnatch =4.0 supports Linux, MacOSX + Wind
Python IDE for beginners
Thonny Thonny is a Python IDE meant for learning programming. End users See https://thonny.org and wiki for more info. Contributors Contributions are
{Ninja-IDE Is Not Just Another IDE}
Ninja-IDE Is Not Just Another IDE Ninja-IDE is a cross-platform integrated development environment (IDE) that allows developers to create applications
A small, simple editor for beginner Python programmers. Written in Python and Qt5.
Mu - A Simple Python Code Editor Mu is a simple code editor for beginner programmers based on extensive feedback from teachers and learners. Having sa
Leo is an Outliner, Editor, IDE and PIM written in 100% Python.
Leo 6.3, http://leoeditor.com, is now available on GitHub. Leo is an IDE, outliner and PIM. The highlights of Leo 6.3 leoAst.py: The unification of Py
Komodo Edit is a fast and free multi-language code editor. Written in JS, Python, C++ and based on the Mozilla platform.
Komodo Edit This readme explains how to get started building, using and developing with the Komodo Edit source base. Whilst the main Komodo Edit sourc
The uncompromising Python code formatter
The Uncompromising Code Formatter “Any color you like.” Black is the uncompromising Python code formatter. By using it, you agree to cede control over
The source code that powers readthedocs.org
Welcome to Read the Docs Purpose Read the Docs hosts documentation for the open source community. It supports Sphinx docs written with reStructuredTex
The project that powers MDN.
Kuma Kuma is the platform that powers MDN (developer.mozilla.org) Development Code: https://github.com/mdn/kuma Issues: P1 Bugs (to be fixed ASAP) P2
Gaphor is the simple modeling tool
Gaphor Gaphor is a UML and SysML modeling application written in Python. It is designed to be easy to use, while still being powerful. Gaphor implemen
Create docsets for Dash.app-compatible API browser.
doc2dash: Create Docsets for Dash.app and Clones doc2dash is an MIT-licensed extensible Documentation Set generator intended to be used with the Dash.
Legacy python processor for AsciiDoc
AsciiDoc.py This branch is tracking the alpha, in-progress 10.x release. For the stable 9.x code, please go to the 9.x branch! AsciiDoc is a text docu
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
🦔 PostHog is developer-friendly, open-source product analytics.
PostHog provides open-source product analytics, built for developers. Automate the collection of every event on your website or app, with no need to send data to 3rd parties. With just 1 click you can deploy on your own infrastructure, having full API/SQL access to the underlying data.
Oppia a free, online learning platform to make quality education accessible for all.
Oppia is an online learning tool that enables anyone to easily create and share interactive activities (called 'explorations'). These activities simulate a one-on-one conversation with a tutor, making it possible for students to learn by doing while getting feedback.
This program is an automated trading bot that uses TDAmeritrades Thinkorswim trading platform's scanners and alerts system.
Python Trading Bot w/ Thinkorswim Description This program is an automated trading bot that uses TDAmeritrades Thinkorswim trading platform's scanners
Send text to girlfriend in the morning
Girlfriend Text Send text to girlfriend (or really anyone with a phone number) in the morning 1. Configure your settings in utils.py. phone_number = "
📔️ Generate a text-based journal from a template file.
JGen 📔️ Generate a text-based journal from a template file. Contents Getting Started Example Overview Usage Details Reserved Keywords Gotchas Getting
PubMed Mapper: A Python library that map PubMed XML to Python object
pubmed-mapper: A Python Library that map PubMed XML to Python object 中文文档 1. Philosophy view UML Programmatically access PubMed article is a common ta
First Party data integration solution built for marketing teams to enable audience and conversion onboarding into Google Marketing products (Google Ads, Campaign Manager, Google Analytics).
Megalista Sample integration code for onboarding offline/CRM data from BigQuery as custom audiences or offline conversions in Google Ads, Google Analy
CTC segmentation python package
CTC segmentation CTC segmentation can be used to find utterances alignments within large audio files. This repository contains the ctc-segmentation py
TorchMetrics is a collection of 25+ PyTorch metrics implementations and an easy-to-use API to create custom metrics.
Machine learning metrics for distributed, scalable PyTorch applications.
cysimdjson - Very fast Python JSON parsing library
Fast JSON parsing library for Python, 7-12 times faster than standard Python JSON parser.
Universal Radio Hacker: Investigate Wireless Protocols Like A Boss
The Universal Radio Hacker (URH) is a complete suite for wireless protocol investigation with native support for many common Software Defined Radios.
Python low-interaction honeyclient
Thug The number of client-side attacks has grown significantly in the past few years shifting focus on poorly protected vulnerable clients. Just as th
SpiderFoot automates OSINT collection so that you can focus on analysis.
SpiderFoot is an open source intelligence (OSINT) automation tool. It integrates with just about every data source available and utilises a range of m
Privacy-respecting metasearch engine
Privacy-respecting, hackable metasearch engine / pronunciation səːks. If you are looking for running instances, ready to use, then visit searx.space.
Remote Desktop Protocol in Twisted Python
RDPY Remote Desktop Protocol in twisted python. RDPY is a pure Python implementation of the Microsoft RDP (Remote Desktop Protocol) protocol (client a
Pupy is an opensource, cross-platform (Windows, Linux, OSX, Android) remote administration and post-exploitation tool mainly written in python
Pupy Installation Installation instructions are on the wiki, in addition to all other documentation. For maximum compatibility, it is recommended to u
:closed_lock_with_key: multi factor authentication system (2FA, MFA, OTP Server)
privacyIDEA privacyIDEA is an open solution for strong two-factor authentication like OTP tokens, SMS, smartphones or SSH keys. Using privacyIDEA you
MozDef: Mozilla Enterprise Defense Platform
MozDef: Documentation: https://mozdef.readthedocs.org/en/latest/ Give MozDef a Try in AWS: The following button will launch the Mozilla Enterprise Def
An interactive TLS-capable intercepting HTTP proxy for penetration testers and software developers.
mitmproxy mitmproxy is an interactive, SSL/TLS-capable intercepting proxy with a console interface for HTTP/1, HTTP/2, and WebSockets. mitmdump is the
Phishing Campaign Toolkit
King Phisher Phishing Campaign Toolkit Installation For instructions on how to install, please see the INSTALL.md file. After installing, for instruct
Consolidating and extending hosts files from several well-curated sources. You can optionally pick extensions to block pornography, social media, and other categories.
Take Note! With the exception of issues and PRs regarding changes to hosts/data/StevenBlack/hosts, all other issues regarding the content of the produ
StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, security responses, troubleshooting, deployments, and more. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html. Questions? https://forum.stackstorm.com/.
StackStorm is a platform for integration and automation across services and tools, taking actions in response to events. Learn more at www.stackstorm.
Remote Desktop Protocol in Twisted Python
RDPY Remote Desktop Protocol in twisted python. RDPY is a pure Python implementation of the Microsoft RDP (Remote Desktop Protocol) protocol (client a
Dynamic DNS service
About nsupdate.info https://nsupdate.info is a free dynamic DNS service. nsupdate.info is also the name of the software used to implement it. If you l
A colony of interacting processes
NColony Infrastructure for running "colonies" of processes. Hacking $ tox Should DTRT -- if it passes, it means unit tests are passing, and 100% cover
Nagios status monitor for your desktop.
Nagstamon Nagstamon is a status monitor for the desktop. It connects to multiple Nagios, Icinga, Opsview, Centreon, Op5 Monitor/Ninja, Checkmk Multisi
A cron monitoring tool written in Python & Django
Healthchecks Healthchecks is a cron job monitoring service. It listens for HTTP requests and email messages ("pings") from your cron jobs and schedule
gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.
Gunicorn Gunicorn 'Green Unicorn' is a Python WSGI HTTP Server for UNIX. It's a pre-fork worker model ported from Ruby's Unicorn project. The Gunicorn
Glances an Eye on your system. A top/htop alternative for GNU/Linux, BSD, Mac OS and Windows operating systems.
Glances - An eye on your system Summary Glances is a cross-platform monitoring tool which aims to present a large amount of monitoring information thr
Ganeti is a virtual machine cluster management tool built on top of existing virtualization technologies such as Xen or KVM and other open source software.
Ganeti 3.0 =========== For installation instructions, read the INSTALL and the doc/install.rst files. For a brief introduction, read the ganeti(7) m
Daemon to ban hosts that cause multiple authentication errors
__ _ _ ___ _ / _|__ _(_) |_ ) |__ __ _ _ _ | _/ _` | | |/ /| '_ \/ _` | ' \
DC/OS - The Datacenter Operating System
DC/OS - The Datacenter Operating System The easiest way to run microservices, big data, and containers in production. What is DC/OS? Like traditional
Cobbler is a versatile Linux deployment server
Cobbler Cobbler is a Linux installation server that allows for rapid setup of network installation environments. It glues together and automates many
Ajenti Core and stock plugins
Ajenti is a Linux & BSD modular server admin panel. Ajenti 2 provides a new interface and a better architecture, developed with Python3 and AngularJS.
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Apache Airflow Apache Airflow (or simply Airflow) is a platform to programmatically author, schedule, and monitor workflows. When workflows are define
ZFS, in Python, without reading the original C.
ZFSp What? ZFS, in Python, without reading the original C. What?! That's right. How? Many hours spent staring at hexdumps, and asking friends to searc
Continuous Archiving for Postgres
WAL-E Continuous archiving for Postgres WAL-E is a program designed to perform continuous archiving of PostgreSQL WAL files and base backups. To corre
Automatic SQL injection and database takeover tool
sqlmap sqlmap is an open source penetration testing tool that automates the process of detecting and exploiting SQL injection flaws and taking over of
The web end of seafile server.
Introduction Seahub is the web frontend for Seafile. Preparation Build and deploy Seafile server from source. See http://manual.seafile.com/build_seaf
Postgres CLI with autocompletion and syntax highlighting
A REPL for Postgres This is a postgres client that does auto-completion and syntax highlighting. Home Page: http://pgcli.com MySQL Equivalent: http://
A Terminal Client for MySQL with AutoCompletion and Syntax Highlighting.
mycli A command line client for MySQL that can do auto-completion and syntax highlighting. HomePage: http://mycli.net Documentation: http://mycli.net/
An open source multi-tool for exploring and publishing data
Datasette An open source multi-tool for exploring and publishing data Datasette is a tool for exploring and publishing data. It helps people take data
An extensible and friendly code review tool for projects and companies of all sizes.
Review Board Review Board is an open source, web-based code and document review tool built to help companies, open source projects, and other organiza
Trac is an enhanced wiki and issue tracking system for software development projects (mirror)
About Trac Trac is a minimalistic web-based software project management and bug/issue tracking system. It provides an interface to the Git and Subvers
docker run klaus / pip install klaus — the first Git web viewer that Just Works™.
klaus: a simple, easy-to-set-up Git web viewer that Just Works™. (If it doesn't Just Work for you, please file a bug.) Super easy to set up -- no conf
git-cola: The highly caffeinated Git GUI
git-cola: The highly caffeinated Git GUI git-cola is a powerful Git GUI with a slick and intuitive user interface. Copyright (C) 2007-2020, David Agu
🦉Data Version Control | Git for Data & Models
Website • Docs • Blog • Twitter • Chat (Community & Support) • Tutorial • Mailing List Data Version Control or DVC is an open-source tool for data sci
Mirror of Apache Allura
Apache Allura Allura is an open source implementation of a software "forge", a web site that manages source code repositories, bug reports, discussion
A flexible package manager that supports multiple versions, configurations, platforms, and compilers.
Spack Spack is a multi-platform package manager that builds and installs multiple versions and configurations of software. It works on Linux, macOS, a