290 Repositories
Python Html-to-pdf-pdfkit-wkhtml- Libraries
Images to PDF Telegram Bot
ilovepdf Convert Images to PDF Bot This bot helps you create PDFs from your images [without leaving Telegram]. By Default: your pdf fil
Generate a bunch of malicious pdf files with phone-home functionality. Can be used with Burp Collaborator
Malicious PDF Generator Generate ten different malicious pdf files with phone-home functionality. Can be used with Burp Collaborator. Used for pene
A management system designed for the employees of MIRAS (Art Gallery). It is used to sell/cancel tickets, book/cancel events, and keep track of all upcoming events.
Art-Galleria-Management-System It's a management system designed for the employees of MIRAS (Art Gallery). Backend : Python Frontend : Django Database
HtmlWebShot - A Python 3 package that can create images from a URL, HTML-CSS, SVG, and any readable file or text, with many setup features.
A Python 3 package that can create images from a URL, HTML-CSS, SVG, and any readable file or text, with many setup features
This app converts a PDF file into an audio file.
PDF-to-Audio This app takes a PDF as input and converts it into audio; the text-to-speech library then reads aloud the preferred page given in t
Extract Thailand COVID-19 Cluster data from daily briefing pdf.
Thailand COVID-19 Cluster Data Extraction About Extract Clusters from Thailand Daily COVID-19 briefing PDF Download latest data Here. Data will be upd
Evidently helps analyze machine learning models during validation or production monitoring
Evidently helps analyze machine learning models during validation or production monitoring. The tool generates interactive visual reports and JSON profiles from pandas DataFrame or csv files. Currently 6 reports are available.
Simple reuse of partial HTML page templates in the Jinja template language for Python web frameworks.
Jinja Partials Simple reuse of partial HTML page templates in the Jinja template language for Python web frameworks. (There is also a Pyramid/Chameleo
Performing the following operations using python on PDF.
Python PDF Handling Tutorial Python is a highly versatile language with a huge set of libraries. It is a high level language with simple syntax. Pytho
A Python tool to generate a static HTML file that represents the internal structure of a PDF file
PDFSyntax A Python tool to generate a static HTML file that represents the internal structure of a PDF file At some point the low-level functions deve
pystitcher stitches your PDF files together, generating nice customizable bookmarks for you using a declarative markdown file as input
pystitcher pystitcher stitches your PDF files together, generating nice customizable bookmarks for you using a declarative input in the form of a mark
A python library for extracting text from PDFs without losing the formatting of the PDF content.
Multilingual PDF to Text Install Package from Pypi Install it using pip. pip install multilingual-pdf2text The library uses Tesseract which can be ins
mlscraper: Scrape data from HTML pages automatically with Machine Learning
Scrape data from HTML websites automatically with Machine Learning
Sierra is a lightweight Python framework for building and integrating web applications
A lightweight Python framework for building and Integrating Web Applications. Sierra is a Python3 library for building and integrating web applications with HTML and CSS using simple enough syntax. You can develop your web applications with Python, taking advantage of its functionalities and integrating them to the fullest.
HTML template engine for Python. Supports XSS prevention and much more!
Htmotor HTML Template Engine for Python! Installation: Open your terminal and type pip install htmotor.
ddgr is a cmdline utility to search DuckDuckGo (html version) from the terminal
ddgr is a cmdline utility to search DuckDuckGo (html version) from the terminal. While googler is extremely popular among cmdline users, in many forums the need of a similar utility for privacy-aware DuckDuckGo came up. DuckDuckGo Bangs are super-cool too! So here's ddgr for you!
A pre-attack hacker tool that aims to find sensitive comments in HTML comment tags and to help with the reconnaissance process
Find Out in Comment Find sensitive comments in HTML About This is a pre-attack hacker tool that searches for sensitive words in HTML comments ta
PyMuPDF is a Python binding with support for MuPDF
PyMuPDF is a Python binding with support for MuPDF (current version 1.18.*), a lightweight PDF, XPS, and E-book viewer, renderer, and toolkit, which is maintained and developed by Artifex Software, Inc.
Python PDF Parser (Not actively maintained). Check out pdfminer.six.
PDFMiner PDFMiner is a text extraction tool for PDF documents. Warning: As of 2020, PDFMiner is not actively maintained. The code still works, but thi
PyPDF2 is a pure-python PDF library capable of splitting, merging together, cropping, and transforming the pages of PDF files.
PyPDF2 is a pure-python PDF library capable of splitting, merging together, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files. It can retrieve text and metadata from PDFs as well as merge entire files together.
Simple HTML and PDF document generator for Python - with built-in support for popular data analysis and plotting libraries.
Esparto is a simple HTML and PDF document generator for Python. Its primary use is for generating shareable single page reports with content from popular analytics and data science libraries.
Tweak the form field rendering in templates, not in python-level form definitions. CSS classes and HTML attributes can be altered.
django-widget-tweaks Tweak the form field rendering in templates, not in python-level form definitions. Altering CSS classes and HTML attributes is su
The best way to have DRY Django forms. The app provides a tag and filter that lets you quickly render forms in a div format while providing an enormous amount of capability to configure and control the rendered HTML.
django-crispy-forms The best way to have Django DRY forms. Build programmatic reusable layouts out of components, having full control of the rendered
PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem"(https://arxiv.org/pdf/1706.10059.pdf).
This is the original implementation of our paper, A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem (arXiv:1706.1
Source Code for DialogBERT: Discourse-Aware Response Generation via Learning to Recover and Rank Utterances (https://arxiv.org/pdf/2012.01775.pdf)
DialogBERT This is a PyTorch implementation of the DialogBERT model described in DialogBERT: Neural Response Generation via Hierarchical BERT with Dis
PyTorch implementation of TabNet paper : https://arxiv.org/pdf/1908.07442.pdf
README TabNet : Attentive Interpretable Tabular Learning This is a pyTorch implementation of Tabnet (Arik, S. O., & Pfister, T. (2019). TabNet: Attent
A ranked list of awesome python libraries for web development. Updated weekly.
Best-of Web Development with Python A ranked list of awesome python libraries for web development. Updated weekly. This curated list contains 540 a
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted. ocrmypdf # it's a scriptable c
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Open Semantic Search https://opensemanticsearch.org Integrated search server, ETL framework for document processing (crawling, text extraction, text a
Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.
hocr-tools About About the code Installation System-wide with pip System-wide from source virtualenv Available Programs hocr-check -- check the hOCR f
A machine learning software for extracting information from scholarly documents
GROBID GROBID documentation Visit the GROBID documentation for more detailed information. Summary GROBID (or Grobid, but not GroBid nor GroBiD) means
Python library to extract tabular data from images and scanned PDFs
Overview ExtractTable - API to extract tabular data from images and scanned PDFs The motivation is to make it easy for developers to extract tabular d
Camelot: PDF Table Extraction for Humans
Camelot: PDF Table Extraction for Humans Camelot is a Python library that makes it easy for anyone to extract tables from PDF files! Note: You can als
The project that powers MDN.
Kuma Kuma is the platform that powers MDN (developer.mozilla.org) Development Code: https://github.com/mdn/kuma Issues: P1 Bugs (to be fixed ASAP) P2
Visualize Data From Stray Scanner https://keke.dev/blog/2021/03/10/Stray-Scanner.html
StrayVisualizer A set of scripts to work with data collected using Stray Scanner. Usage Installing Dependencies Install dependencies with pip -r requi
StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, security responses, troubleshooting, deployments, and more. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html. Questions? https://forum.stackstorm.com/.
StackStorm is a platform for integration and automation across services and tools, taking actions in response to events. Learn more at www.stackstorm.
Parse Robinhood 1099 Tax Document from PDF into CSV
Robinhood 1099 Parser This project converts Robinhood Securities 1099 tax document from PDF to CSV file. This tool will be helpful for those who need
This repo provides the official code for TransBTS: Multimodal Brain Tumor Segmentation Using Transformer (https://arxiv.org/pdf/2103.04430.pdf).
TransBTS: Multimodal Brain Tumor Segmentation Using Transformer This repo is the official implementation for TransBTS: Multimodal Brain Tumor Segmenta
Render reMarkable documents to PDF
rmrl: reMarkable Rendering Library rmrl is a Python library for rendering reMarkable documents to PDF files. It takes the original PDF document and th
Create HTML profiling reports from pandas DataFrame objects
Pandas Profiling Documentation | Slack | Stack Overflow Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great
Ticket shop application for conferences, festivals, concerts, tech events, shows, exhibitions, workshops, barcamps, etc.
pretix Reinventing ticket presales, one ticket at a time. Project status & release cycle While there is always a lot to do and improve on, pretix by n
Easy HTML form without PHP or JavaScript
This repository is no longer active. If you're looking for a simple and powerful hosted form API, please check out https://formspree.io. If you are in
Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
ArchiveBox Open-source self-hosted web archiving. Quickstart | Demo | Github | Documentation | Info & Motivation | Community | Roadmap "Your own pe
Ambar: Document Search Engine
Ambar: Document Search Engine Ambar is an open-source document search engine with automated crawling, OCR, tagging and instant full-text search. Am
Learn to build a Python Desktop GUI app using pywebview, Python, JavaScript, HTML, & CSS.
Python Desktop App Learn how to make a desktop GUI application using Python, JavaScript, HTML, & CSS all thanks to pywebview. pywebview is essentially
Knowledge Management for Humans using Machine Learning & Tags
HyperTag helps humans intuitively express how they think about their files using tags and machine learning. Represent how you think using tags. Find what you look for using semantic search for your text documents (yes, even PDF's) and images.
Module for automatic summarization of text documents and HTML pages.
Automatic text summarizer Simple library and command line utility for extracting summary from HTML pages or plain texts. The package also contains sim
Plugin for generating HTML reports for pytest results
pytest-html pytest-html is a plugin for pytest that generates a HTML report for test results. Resources Documentation Release Notes Issue Tracker Code
MkDocs Plugin allowing your visitors to *File > Print > Save as PDF* the entire site.
mkdocs-print-site-plugin MkDocs plugin that adds a page to your site combining all pages, allowing your site visitors to File > Print > Save as PDF th
Generate a single PDF file from MkDocs repository.
PDF Generate Plugin for MkDocs This plugin will generate a single PDF file from your MkDocs repository. This plugin is inspired by MkDocs PDF Export P
An MkDocs plugin to export content pages as PDF files
MkDocs PDF Export Plugin An MkDocs plugin to export content pages as PDF files The pdf-export plugin will export all markdown pages in your MkDocs rep
Your Project with Great Documentation.
Read Latest Documentation - Browse GitHub Code Repository The only thing worse than documentation never written, is documentation written but never di
Colored terminal output for Python's logging module
coloredlogs: Colored terminal output for Python's logging module The coloredlogs package enables colored terminal output for Python's logging module.
django-tables2 - An app for creating HTML tables
django-tables2 - An app for creating HTML tables django-tables2 simplifies the task of turning sets of data into HTML tables. It has native support fo
A pure-python HTML screen-scraping library
Scrapely Scrapely is a library for extracting structured data from HTML pages. Given some example web pages and the data to be extracted, scrapely con
HTML content / article extractor, web scraping library in Python
Python-Goose - Article Extractor Intro Goose was originally an article extractor written in Java that has most recently (Aug2011) been converted to a
Extract embedded metadata from HTML markup
extruct extruct is a library for extracting embedded metadata from HTML markup. Currently, extruct supports: W3C's HTML Microdata embedded JSON-LD Mic
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Parsel Parsel is a BSD-licensed Python library to extract and remove data from HTML and XML using XPath and CSS selectors, optionally combined with re
A modern CSS selector implementation for BeautifulSoup
Soup Sieve Overview Soup Sieve is a CSS selector library designed to be used with Beautiful Soup 4. It aims to provide selecting, matching, and filter
Preview GitHub README.md files locally before committing them.
Grip -- GitHub Readme Instant Preview Render local readme files before sending off to GitHub. Grip is a command-line server application written in Pyt
Convert HTML to Markdown-formatted text.
html2text html2text is a Python script that converts a page of HTML into clean, easy-to-read plain ASCII text. Better yet, that ASCII also happens to
A Python implementation of John Gruber's Markdown with Extension support.
Python-Markdown This is a Python implementation of John Gruber's Markdown. It is almost completely compliant with the reference implementation, though
The awesome document factory
The Awesome Document Factory WeasyPrint is a smart solution helping web developers to create PDF documents. It turns simple HTML pages into gorgeous s
A library for converting HTML into PDFs using ReportLab
XHTML2PDF The current release of xhtml2pdf is xhtml2pdf 0.2.5. Release Notes can be found here: Release Notes As with all open-source software, its us
Pythonic HTML Parsing for Humans™
Requests-HTML: HTML Parsing for Humans™ This library intends to make parsing HTML (e.g. scraping the web) as simple and intuitive as possible. When us
Converts text into a PDF of handwritten notes
Text To Handwritten Notes Converts text into a PDF of handwritten notes Explore the docs · Report Bug · Request Feature · Steps: $ git clone https:/
The official implementation of NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation [ICLR-2021]. https://arxiv.org/pdf/2101.12378.pdf
NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation [ICLR-2021] Release Notes The official PyTorch implementation of NeMo, p
An HTML canvas-based screencasting server with occasional ground-truth updates via screenshots and very fast input drawing
rm2canvas An HTML canvas-based screencasting server for the reMarkable 1/2 digital paper systems. It draws live on the canvas from the remarkables touc
Every web site provides APIs.
Toapi Overview Toapi gives you the ability to make every website provide APIs. Version v2.0.0, completely rewritten. More elegant. More pythonic v1.0.0
Safely add untrusted strings to HTML/XML markup.
MarkupSafe MarkupSafe implements a text object that escapes characters so it is safe to use in HTML and XML. Characters that have special meanings are
Standards-compliant library for parsing and serializing HTML documents and fragments in Python
html5lib html5lib is a pure-python library for parsing HTML. It is designed to conform to the WHATWG HTML specification, as is implemented by all majo
Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes
Bleach Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes. Bleach can also linkify text safely, appl
Build GUI for your Python program with JavaScript, HTML, and CSS
https://pywebview.flowrl.com pywebview is a lightweight cross-platform wrapper around a webview component that allows you to display HTML content in its o
Launched in 2018. Actively developed and supported. Supports tkinter, Qt, WxPython, Remi (in browser). Create custom layout GUIs simply. Python 2.7 & 3 Support. 200+ Demo programs & Cookbook for rapid start. Extensive documentation. Examples using Machine Learning (GUI, OpenCV Integration, Chatterbot), Floating Desktop Widgets, Matplotlib + Pyplot integration, add GUI to command line scripts, PDF & Image Viewer. For both beginning and advanced programmers.
Python GUIs for Humans Transforms the tkinter, Qt, WxPython, and Remi (browser-based) GUI frameworks into a simpler interface. The window definition i
A little Python library for making simple Electron-like HTML/JS GUI apps
Eel Eel is a little Python library for making simple Electron-like offline HTML/JS GUI apps, with full access to Python capabilities and libraries. Ee
A flexible forms validation and rendering library for Python.
WTForms WTForms is a flexible forms validation and rendering library for Python web development. It can work with whatever web framework and template
A Python HTML form library.
Deform Introduction Use cases Installation Example Status Projects using Deform Community and links Introduction Deform is a Python form library for g