44 Repositories
Python existing-pdfs Libraries
A Python library for generating new text from existing samples.
ReMarkov is a Python library for generating text from existing samples using Markov chains. You can use it to customize all sorts of writing from birt
Convert scans of handwritten notes to beautiful, compact PDFs
Convert scans of handwritten notes to beautiful, compact PDFs
Awesome-AI-books - Some awesome AI related books and pdfs for learning and downloading
Awesome AI books Some awesome AI related books and pdfs for downloading and learning. Preface This repo only used for learning, do not use in business
A bulk pdf generator. This application can generate PDFs in bulk by using just one click.
A bulk html pdf generator. This application can generate PDFs in bulk by using just one click. Screenshots Requirements 🧱 Your system must have the f
Clarity mode is a single-notebook interface built with existing JupyterLab components.
JupyterLab Clarity Mode Clarity mode is a single-notebook interface built with existing JupyterLab components. To install: Clone this repository Ensur
Generate bitcoin public and private keys and check if they match a filelist of existing addresses that have a nonzero balance
btc-heist Running Install deps, i.e., python3 -m pip install -r requirements.txt Download the CSV dump of all bitcoin addresses with a balance and cut
Scans pdfs for links written in plaintext and checks if they are active or returns an error code.
Scans pdfs for links written in plaintext and checks if they are active or returns an error code. It then generates a report of its findings. Extract references (pdf, url, doi, arxiv) and metadata from a PDF.
A collection of existing KGQA datasets in the form of the huggingface datasets library, aiming to provide an easy-to-use access to them.
KGQA Datasets Brief Introduction This repository is a collection of existing KGQA datasets in the form of the huggingface datasets library, aiming to
Excalibur: A web interface to extract tabular data from PDFs
Excalibur: A web interface to extract tabular data from PDFs Excalibur is a web interface to extract tabular data from PDFs, written in Python 3! It i
Existing Literature about Machine Unlearning
Machine Unlearning Papers 2021 Brophy and Lowd. Machine Unlearning for Random Forests. In ICML 2021. Bourtoule et al. Machine Unlearning. In IEEE Symp
A spider for Universal Online Judge(UOJ) system, converting problem pages to PDFs.
Universal Online Judge Spider Introduction This is a spider for Universal Online Judge (UOJ) system (https://uoj.ac/). It also works for all other Onl
A new play-and-plug method of controlling an existing generative model with conditioning attributes and their compositions.
Controllable and Compositional Generation with Latent-Space Energy-Based Models Official PyTorch implementation of the NeurIPS 2021 paper: Controllabl
Pymongo based CLI client, to run operation on existing databases and collections
Mongodb-Operations-Console Pymongo based CLI client, to run operation on existing databases and collections Program developed by Gustavo Wydler Azuaga
pdf_sprinkles: sprinkles text in your PDFs
pdf_sprinkles: sprinkles text in your PDFs pdf_sprinkles remotely OCRs a PDF with Google Cloud Document AI, and returns the result as a PDF with searc
Reads Data from given Excel File and exports Single PDFs and a complete PDF grouped by Gateway
E-Shelter Excel2QR Reads Data from given Excel File and exports Single PDFs and a complete PDF grouped by Gateway Features Reads Excel 2021 Export Sin
pikepdf is a Python library for reading and writing PDF files.
A Python library for reading and writing PDF, powered by qpdf
Fully automated download and parsing for Texas A&M University's Registrar's grade distribution PDFs for years 2014+.
Fully automated download and parsing for Texas A&M University's Registrar's grade distribution PDFs for years 2014+. Adds the parsing results to a mySQL database.
This is a GUI for scrapping PDFs with the help of optical character recognition making easier than ever to scrape PDFs.
pdf-scraper-with-ocr With this tool I am aiming to facilitate the work of those who need to scrape PDFs either by hand or using tools that doesn't imp
Auto Convert PDFs to png files in python
This python tool, which is an application of PyMuPDF module, could auto convert PDFs to png files
This is the fuzzer I made to fuzz Preview on macOS and iOS like 8years back when I just started fuzzing things.
Fuzzing PDFs like its 1990s This is the fuzzer I made to fuzz Preview on macOS and iOS like 8years back when I just started fuzzing things. Some discl
This is a GUI for scrapping PDFs with the help of optical character recognition making easier than ever to scrape PDFs.
pdf-scraper-with-ocr With this tool I am aiming to facilitate the work of those who need to scrape PDFs either by hand or using tools that doesn't imp
Pdfencrypt is a tool to encrypt/lock PDFs
Pdfencrypt Pdfencrypt is a tool to encrypt/lock PDFs Installation $ apt update $ apt upgrade $ apt install git $ apt install python $ git clone https:
HTML2Image is a lightweight Python package that acts as a wrapper around the headless mode of existing web browsers to generate images from URLs and from HTML+CSS strings or files.
A package acting as a wrapper around the headless mode of existing web browsers to generate images from URLs and from HTML+CSS strings or files.
Continual reinforcement learning baselines: experiment specifications, implementation of existing methods, and common metrics. Easily extensible to new methods.
Continual Reinforcement Learning This repository provides a simple way to run continual reinforcement learning experiments in PyTorch, including evalu
In this program, I generate the strong password from existing password without class using python. The special thing in this program is that is generate very strong password from our existing password. It replaces some elements from password and put another elements in it.
Strong-password-Generator-from-existing-password-using-python In this program, I generate the strong password from existing password without class usi
Find existing email addresses by nickname using API/SMTP checking methods without user notification. Please, don't hesitate to improve cat's job! 🐱🔎 📬
mailcat The only cat who can find existing email addresses by nickname. Usage First install requirements: pip3 install -r requirements.txt Then just
DirBruter is a Python based CLI tool. It looks for hidden or existing directories/files using brute force method. It basically works by launching a dictionary based attack against a webserver and analyse its response.
DirBruter DirBruter is a Python based CLI tool. It looks for hidden or existing directories/files using brute force method. It basically works by laun
Mysterium the first tool which permits you to retrieve the most part of a Python code even the .py or .pyc was extracted from an executable file, even it is encrypted with every existing encryptage. Mysterium don't make any difference between encrypted and non encrypted files, it can retrieve code from Pyarmor or .pyc files.
Mysterium the first tool which permits you to retrieve the most part of a Python code even the .py or .pyc was extracted from an executable file, even it is encrypted with every existing encryptage. Mysterium don't make any difference between encrypted and non encrypted files, it can retrieve code from Pyarmor or .pyc files.
aws-lambda-scheduler lets you call any existing AWS Lambda Function you have in a future time.
aws-lambda-scheduler aws-lambda-scheduler lets you call any existing AWS Lambda Function you have in the future. This functionality is achieved by dyn
A toolkit to automatically crawl the paper list and download paper pdfs of ACL Ahthology.
ACL-Anthology-Crawler A toolkit to automatically crawl the paper list and download paper pdfs of ACL Anthology
Camelot is a Python library that can help you extract tables from PDFs!
A Python library to extract tabular data from PDFs
A python library for extracting text from PDFs without losing the formatting of the PDF content.
Multilingual PDF to Text Install Package from Pypi Install it using pip. pip install multilingual-pdf2text The library uses Tesseract which can be ins
Markup is an online annotation tool that can be used to transform unstructured documents into structured formats for NLP and ML tasks, such as named-entity recognition. Markup learns as you annotate in order to predict and suggest complex annotations. Markup also provides integrated access to existing and custom ontologies, enabling the prediction and suggestion of ontology mappings based on the text you're annotating.
Markup is an online annotation tool that can be used to transform unstructured documents into structured formats for NLP and ML tasks, such as named-entity recognition. Markup learns as you annotate in order to predict and suggest complex annotations. Markup also provides integrated access to existing and custom ontologies, enabling the prediction and suggestion of ontology mappings based on the text you're annotating.
BuddyPress is an open source WordPress plugin to build a community site. In releases of BuddyPress from 5.0.0 before 7.2.1 it's possible for a non-privileged, regular user to obtain administrator rights by exploiting an issue in the REST API members endpoint. The vulnerability has been fixed in BuddyPress 7.2.1. Existing installations of the plugin should be updated to this version to mitigate the issue.
CVE-2021-21389 BuddyPress 7.2.1 - REST API Privilege Escalation to RCE PoC (Full) Affected version: 5.0.0 to 7.2.0 User requirement: Subscriber user
existing and custom freqtrade strategies supporting the new hyperstrategy format.
freqtrade-strategies Description Existing and self-developed strategies, rewritten to support the new HyperStrategy format from the freqtrade-develop
Detect text blocks and OCR poorly scanned PDFs in bulk. Python module available via pip.
doc2text doc2text extracts higher quality text by fixing common scan errors Developing text corpora can be a massive pain in the butt. Much of the tex
Python library to extract tabular data from images and scanned PDFs
Overview ExtractTable - API to extract tabular data from images and scanned PDFs The motivation is to make it easy for developers to extract tabular d
Extract tables from scanned image PDFs using Optical Character Recognition.
ocr-table This project aims to extract tables from scanned image PDFs using Optical Character Recognition. Install Requirements Tesseract OCR sudo apt
Ganeti is a virtual machine cluster management tool built on top of existing virtualization technologies such as Xen or KVM and other open source software.
Ganeti 3.0 =========== For installation instructions, read the INSTALL and the doc/install.rst files. For a brief introduction, read the ganeti(7) m
🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
ArchiveBox Open-source self-hosted web archiving. ▶️ Quickstart | Demo | Github | Documentation | Info & Motivation | Community | Roadmap "Your own pe
Predictive AI layer for existing databases.
MindsDB is an open-source AI layer for existing databases that allows you to effortlessly develop, train and deploy state-of-the-art machine learning
MkDocs Plugin allowing your visitors to *File Print Save as PDF* the entire site.
mkdocs-print-site-plugin MkDocs plugin that adds a page to your site combining all pages, allowing your site visitors to File Print Save as PDF th
A library for converting HTML into PDFs using ReportLab
XHTML2PDF The current release of xhtml2pdf is xhtml2pdf 0.2.5. Release Notes can be found here: Release Notes As with all open-source software, its us
Predictive AI layer for existing databases.
MindsDB is an open-source AI layer for existing databases that allows you to effortlessly develop, train and deploy state-of-the-art machine learning