348 Repositories
Python pdf-table-extract Libraries
This Lambda pulls propagated routes from the TGW and updates the VPC route table
AWS-Transitgateway-Route-Propagation This Lambda pulls propagated routes from the TGW and updates the VPC route table. Tested on Python 3.8 Lambda AWS INST
A Python module for extracting domains
Generate a custom, detailed survey paper with topic-clustered sections and proper citations from a single query in just under 30 minutes.
Auto-Research A no-code utility to generate a detailed, well-cited survey with topic-clustered sections (draft paper format) and other interesting arti
This Python module extracts data from the RAW file format produced by devices from Thermo Fisher Scientific.
fisher_py This Python module allows access to Thermo Orbitrap raw mass spectrometer files. Using this library makes it possible to automate the analys
Automatic table extraction from PDF documents
PDF Table Extractor Automatic table extraction from PDF documents 📌 Name: PDF Table Extractor 📌 Authors: Minku Koo, Jiyong Park 📌 Deve
Multtable is a collection of multiplication table generators in various languages.
Multtable Multtable is a collection of multiplication table generators in various languages. This project was created as a joke based on one of my bro
Build, deploy and extract satellite public constellations with one command line.
SatExtractor Build, deploy and extract satellite public constellations with one command line. Table of Contents About The Project Getting Started Stru
A Python data table implementation with styles and colors
Table A Python data table implementation with styles and colors. Example Table adapts to the lack of data Lambda color features Full power of l
A string extractor module for Python
Convert lecture videos to slides in one line. Takes an input of a directory containing your lecture videos and outputs a directory containing .PDF files containing the slides of each lecture.
A simple Burp Suite extension to extract data from source code
DataExtractor A simple Burp Suite extension to extract data from source code. Features in scope parsing file extensions to ignore files exclusion bas
A bot that extracts text from images using the Tesseract OCR.
Text from image (OCR) @ocr_text_bot A simple bot to extract text from images. Usage What do I need? An AWS key configured locally, see here. NodeJS. I
A PyTorch reimplementation of a 3D CNN tracker that extracts coronary artery centerlines with state-of-the-art (SOTA) performance. (Paper: 'Coronary artery centerline extraction in cardiac CT angiography using a CNN-based orientation classifier')
navigation_commander is a ROS package to command the robot to navigate autonomously to each table for food delivery inside a hotel.
navigation_commander navigation_commander is a ROS package to command the robot to navigate autonomously to each table for food delivery inside a hote
A timetable app to notify the user about their class timings
kivyTimeTable A timetable app to notify the user about their class timings Features This project incorporates some features I wanted to see in a time
This tool can be used to extract information from any website
WEB-INFO- This tool can be used to extract information from any website Install Termux and run the command --- $ apt-get update $ apt-get upgrade $ pk
A simple Telegram bot, written in Python, that compresses PDF files.
Pdf Compressor Telegram Bot About: A simple Telegram bot, written in Python, that compresses PDF files. Mostly useful for digital documentation. Deploy t
Arxiv2Kindle is a simple script written in python that converts LaTeX source downloaded from Arxiv and recompiles it to better fit a Kindle or other similar reading devices.
Arxiv2Kindle is a simple script written in python that converts LaTeX source downloaded from Arxiv and recompiles it to better fit a read
Convert a .vcf file to 'aa_table.tsv', including depth & alt frequency info
Produce an 'amino acid table' file from a vcf, including depth and alt frequency info.
Convert Lecture Videos to PDF
Convert Lecture Videos to PDF Description Want to go through lecture videos faster without missing any information? Wish you could read the lecture vide
pyETT: Python library for Eleven VR Table Tennis data
pyETT: Python library for Eleven VR Table Tennis data Documentation Documentation for pyETT is located at https://pyett.readthedocs.io/. Installation
A program that enables OCR (Optical Character Reading) of a PDF.
This program is intended to be a PDF file modifier. PDF files can be of 3 types: true PDFs, in which one can select the ti
A simple Python tool for downloading PDFs.
PDFdownloader Usage Open PDF in full-screen mode Run scan.exe Enter how many pages you want to scan Focus PDF After scanning is done, run merge.exe En
This book will take you on an exploratory journey through the PDF format, and the borb Python library.
Docbarcodes extracts 1D and 2D barcodes from scanned PDF documents or images. It can be used to automate extraction and processing of all kinds of documents.
Intro Barcodes are being used in many documents or forms to enable machine reading capabilities and reduce manual processing effort. Simple 1D barcode
This app will let you continuously scrape certain parts of LeasePlan and extract data of cars becoming available for lease.
LeasePlan - Scraper This app will let you continuously scrape certain parts of LeasePlan and extract data of cars becoming available for lease. It has
An easy-to-use Python module that helps you to extract the BERT embeddings for a large text dataset (Bengali/English) efficiently.
Programmatically access the physical and chemical properties of elements in the modern periodic table.
API to fetch elements of the periodic table in JSON format. Uses Pandas for dumping .csv data to .json and Flask for API Integration. Deployed on "pyt
Python library to remotely extract credentials on a set of hosts.
A tool to extract the IdP cert from vCenter backups and log in as Administrator
vCenter SAML Login Tool A tool to extract the Identity Provider (IdP) cert from vCenter backups and log in as Administrator Background Commonly, durin
Simple Telegram Bot to extract various types of archives from a telegram file or a direct link
Unzipper Bot A Telegram Bot to Extract Various Types Of Archives Features Extract various types of archives like rar, zip, tar, 7z, tar.xz etc. Passwo
Extract data from ThousandEyes REST API and visualize it on your customized Grafana Dashboard.
ThousandEyes Grafana Dashboard Extract data from the ThousandEyes REST API and visualize it on your customized Grafana Dashboard. Deploy Grafana, Infl
Website developed in Django for managing and uploading files (.pdf).
File Management Website Features This is a full-stack web application built to develop skills with the Django framework. O
distfit - Probability density fitting
Python package for probability density function fitting of univariate distributions of non-censored data
deep-table implements various state-of-the-art deep learning and self-supervised learning algorithms for tabular data using PyTorch.
An application that enables users to perform simple yet intriguing PDF operations
AstutePDF A repository containing the GUI for an application that enables users to perform simple yet intriguing PDF operations. These include, M
A tool for annotating tables
table_annotate_tool A tool for annotating tables. Motivated by wiki2bio, we created a tool to annotate all types of tables. This tool can annotate a table w
I can help you convert your images to pdf file.
IMAGE TO PDF CONVERTER BOT Configs TOKEN - Get bot token from @BotFather API_ID - From my.telegram.org API_HASH - From my.telegram.org Deploy to Herok
Parser manager for parsing DOC, DOCX, PDF or HTML files
Parser manager Description Parser gets PDF, DOC, DOCX or HTML file via API and saves parsed data to the database. Implemented in Ruby 3.0.1 using Acti
Convert emails without attachments to pdf and send as email
Email to PDF to email This script will check an imap folder for unread emails. Any unread email that does not have an attachment will be converted to
Easy to use Python module to extract Exif metadata from digital image files.
Generates a PDF right after you answer a simple questionnaire and sends it to the email address you provide.
PDF generator that sends the result to your email Creator: Francisco Robson de O. Dutra Filho Repository created on 18/09/2021 Instagram: @robsondutra_ Sob
This project investigates methods to extract human-marked data from document forms such as surveys and tests.
This project investigates methods to extract human-marked data from document forms such as surveys and tests. These methods can read questions and multiple-choice exam papers and grade them.
x-ray is a Python library for finding bad redactions in PDF documents.
A tool to detect whether a PDF has a bad redaction
Extract XML from the OS X dictionaries.
Paintable GitHub contribution table
githeart Paintable GitHub contribution table. How to use: Functions key color select 1,2,3,4,5 clear c drawing mode mode on turn off e print paint matrix
Fraud Multiplication Table Detection in Python
Fraud-Multiplication-Table-Detection-in-python This program detects fraudulent multiplication tables using Python, without classes. Here, I have co
Extract and visualize information from Gurobi log files
GRBlogtools Extract information from Gurobi log files and generate pandas DataFrames or Excel worksheets for further processing. Also includes a wrapp
Telegram bot to extract text from images
OCR Bot @Image_To_Text_OCR_Bot A star ⭐ from you means a lot to us! Telegram bot to extract text from image Usage Deploy to Heroku Tap on above button
A Python library to extract YouTube video tags without the YouTube API
YoutubeTags A Python library to extract YouTube video tags without the YouTube API Installation pip install YoutubeTags Example import YoutubeTags from Yout
Python library for simple PDF text extraction
Code and checkpoints for training the transformer-based Table QA models introduced in the paper TAPAS: Weakly Supervised Table Parsing via Pre-training.
End-to-end neural table-text understanding models.
WeasyPrint is a smart solution helping web developers to create PDF documents.
WeasyPrint is a smart solution helping web developers to create PDF documents. It turns simple HTML pages into gorgeous statistical reports, invoices, tickets…
This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild" on table detection and table structure recognition.
WTW-Dataset This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild" at ICCV 2021. Here, you can download the
Find all the URLs of a site that match a specific regex
href This program finds all the links on a site that match a specific regex pattern. What it will do: in any site there are a lot of URLs that you may ne
borb is a library for reading, creating and manipulating PDF files in Python.
Python-based tool to extract forensic info from EventTranscript.db (Windows Diagnostic Data)
EventTranscriptParser EventTranscriptParser is a Python-based tool to extract forensically useful details from EventTranscript.db (Windows Diagnostic Da
Fetch McDonald invoices from mailbox and merge them to one PDF file.
concatenate Fetch McDonald invoices from mailbox and merge them to one PDF file. Description This script will fetch all McDonald invoice pdfs from a p
A Python script that splits PDF files.
Automatic PDF Splitter This script can create new single-page PDF files from multi-page PDFs. Requirements Python 3.0+ # Debian distros sudo apt-get
TAPEX: Table Pre-training via Learning a Neural SQL Executor
TAPEX: Table Pre-training via Learning a Neural SQL Executor The official repository which contains the code and pre-trained models for our paper TAPE
Merge multiple PDF files into one.
PDF Merger Merge multiple PDF files into one. Usage % python pdf_merger.py -h usage: pdf_merger.py [-h] [-o OUTPUT] [-f [FILES ...]] optional argumen
Images to PDF Telegram Bot
ilovepdf Convert Images to PDF Bot This bot helps you create PDFs from your images [without leaving Telegram] 😉 By default: your pdf fil
Generate a bunch of malicious pdf files with phone-home functionality. Can be used with Burp Collaborator
Malicious PDF Generator ☠️ Generate ten different malicious pdf files with phone-home functionality. Can be used with Burp Collaborator. Used for pene
Graph-based community clustering approach to extract protein domains from a predicted aligned error matrix
Using a predicted aligned error matrix corresponding to an AlphaFold2 model, returns a series of lists of residue indices, where each list corresponds to a set of residues clustering together into a pseudo-rigid domain.
Library for extracting information from Thai personal identity cards.
ThaiPersonalCardExtract Library for extracting information from Thai personal identity cards. Implemented with easyocr and tesseract. New Feature v1.3.2 🎁 In
Low code JSON to extract data in one line
JSON Inline Low code JSON to extract data in one line ENG RU Installation pip install json-inline Usage Rules Modificator Description ?key:value Searc
Extract knowledge from raw text
Extract knowledge from raw text This repository is nearly a copy-paste of "From Text to Knowledge: The Information Extraction Pipeline" with some cosm
Py_extract is a simple, lightweight Python library to handle some extraction tasks using fewer lines of code
py_extract Py_extract is a simple, lightweight Python library to handle some extraction tasks using fewer lines of code. Still in Development Stage! I
Practical Single-Image Super-Resolution Using Look-Up Table
Practical Single-Image Super-Resolution Using Look-Up Table [Paper] Dependency Python 3.6 PyTorch glob numpy pillow tqdm tensorboardx 1. Training deep
Full-text multi-table search application for Django. Easy to install and use, with good performance.
django-watson django-watson is a fast multi-model full-text search plugin for Django. It is easy to install and use, and provides high quality search
Python function to extract all the rows from a SQLite database file while iterating over its bytes, such as while downloading it
This app converts a PDF file into an audio file.
PDF-to-Audio This app takes a PDF as input and converts it into audio; the text-to-speech library starts speaking the preferred page given in t
Camelot is a Python library that can help you extract tables from PDFs!
A Python library to extract tabular data from PDFs
Extract Thailand COVID-19 Cluster data from daily briefing pdf.
Thailand COVID-19 Cluster Data Extraction About Extract Clusters from Thailand Daily COVID-19 briefing PDF Download latest data Here. Data will be upd
A handy tool to check the availability of onion sites and to extract the titles of submitted onion links.
This tool helps to quickly investigate a large set of onion sites by checking their availability, which helps filter out inactive sites, and by collecting site titles, which may help categorize the sites being handled.
Performing operations on PDF files using Python.
Python PDF Handling Tutorial Python is a highly versatile language with a huge set of libraries. It is a high level language with simple syntax. Pytho
A Python tool to generate a static HTML file that represents the internal structure of a PDF file
PDFSyntax A Python tool to generate a static HTML file that represents the internal structure of a PDF file At some point the low-level functions deve
pystitcher stitches your PDF files together, generating nice customizable bookmarks for you using a declarative markdown file as input
pystitcher pystitcher stitches your PDF files together, generating nice customizable bookmarks for you using a declarative input in the form of a mark
A python library for extracting text from PDFs without losing the formatting of the PDF content.
Multilingual PDF to Text Install Package from PyPI Install it using pip. pip install multilingual-pdf2text The library uses Tesseract which can be ins
a small simple library for generating documentation from docstrings
inkpot a small simple library for generating documentation from docstrings inkpot is available on pip. Please give it a star if you like it! To know m
Code and data for "TURL: Table Understanding through Representation Learning"
TURL This Repo contains code and data for "TURL: Table Understanding through Representation Learning". Environment and Setup Data Pretraining Finetuni
Utility to extract Fantasy Grounds Unity line-of-sight and lighting files from a Universal VTT file exported from Dungeondraft
uvtt2fgu Utility to extract Fantasy Grounds Unity line-of-sight and lighting files from a Universal VTT file exported from Dungeondraft This program wo
A way to export your saved reddit posts to a Notion table.
reddit-saved-to-notion A way to export your saved Reddit posts and comments to a Notion table. Uses notion-sdk-py and praw for interacting with Notion
FeTaQA: Free-form Table Question Answering
FeTaQA: Free-form Table Question Answering FeTaQA is a Free-form Table Question Answering dataset with 10K Wikipedia-based {table, question, free-form
Extract the download URL from OneDrive or SharePoint share link and push it to aria2
OneDriveShareLinkPushAria2 Extract the download URL from OneDrive or SharePoint share link and push it to aria2
PyMuPDF is a Python binding with support for MuPDF
PyMuPDF is a Python binding with support for MuPDF (current version 1.18.*), a lightweight PDF, XPS, and E-book viewer, renderer, and toolkit, which is maintained and developed by Artifex Software, Inc.
Extract MNIST handwritten digits dataset binary file into bmp images
MNIST-dataset-extractor Extract MNIST handwritten digits dataset binary file into bmp images More info at http://yann.lecun.com/exdb/mnist/ Dependenci
Deep learning (neural network) based remote photoplethysmography: how to extract pulse signal from video using deep learning tools
Deep-rPPG: Camera-based pulse estimation using deep learning tools Deep learning (neural network) based remote photoplethysmography: how to extract pu
Geocode rows in a SQLite database table
Python PDF Parser (Not actively maintained). Check out pdfminer.six.
PDFMiner PDFMiner is a text extraction tool for PDF documents. Warning: As of 2020, PDFMiner is not actively maintained. The code still works, but thi
PyPDF2 is a pure-python PDF library capable of splitting, merging together, cropping, and transforming the pages of PDF files.
PyPDF2 is a pure-python PDF library capable of splitting, merging together, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files. It can retrieve text and metadata from PDFs as well as merge entire files together.
A program for collecting data from the Tinkoff API and building an Excel table.
tinkproject The program for portfolio analysis via Tinkoff API Hello! This is my first project, please, don't judge me. This project was developed for
Simple HTML and PDF document generator for Python - with built-in support for popular data analysis and plotting libraries.
Esparto is a simple HTML and PDF document generator for Python. Its primary use is for generating shareable single page reports with content from popular analytics and data science libraries.
Aggrokatz is an aggressor plugin extension for Cobalt Strike which enables pypykatz to interface with the beacons remotely and allows it to parse LSASS dump files and registry hive files to extract credentials and other secrets stored without downloading the file and without uploading any suspicious code to the beacon.
aggrokatz What is this aggrokatz is an Aggressor plugin extension for CobaltStrike which enables pypykatz to interface with the beacons remotely. The
SpikeX - SpaCy Pipes for Knowledge Extraction
SpikeX is a collection of pipes ready to be plugged in a spaCy pipeline. It aims to help in building knowledge extraction tools with almost-zero effort.
Educational project on how to build an ETL (Extract, Transform, Load) data pipeline, orchestrated with Airflow.
ETL Pipeline with Airflow, Spark, S3, MongoDB and Amazon Redshift
PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" (https://arxiv.org/pdf/1706.10059.pdf).
This is the original implementation of our paper, A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem (arXiv:1706.1
Get Landsat surface reflectance time-series from Google Earth Engine
geextract Google Earth Engine data extraction tool. Quickly obtain Landsat multispectral time-series for exploratory analysis and algorithm testing On
Source Code for DialogBERT: Discourse-Aware Response Generation via Learning to Recover and Rank Utterances (https://arxiv.org/pdf/2012.01775.pdf)
DialogBERT This is a PyTorch implementation of the DialogBERT model described in DialogBERT: Neural Response Generation via Hierarchical BERT with Dis
PyTorch implementation of the TabNet paper: https://arxiv.org/pdf/1908.07442.pdf
README TabNet: Attentive Interpretable Tabular Learning This is a PyTorch implementation of TabNet (Arik, S. O., & Pfister, T. (2019). TabNet: Attent