466 Repositories
Python pdf-scraper-with-ocr Libraries
Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition:
Multi-Type-TD-TSR Check it out on Source Code of our Paper: Multi-Type-TD-TSR Extracting Tables from Document Images using a Multi-stage Pipeline for
Automatic number plate recognition using tech: YOLO, OCR, scene text detection, scene text recognition, Flask, Torch
Automatic Number Plate Recognition Automatic Number Plate Recognition (ANPR) is the process of reading the characters on the plate with various optica
Incomplete easy-to-use math solver and PDF generator.
Math Expert Let me do your work Preview preview.mp4 Introduction Math Expert is our (@salastro, @younis-tarek, @marawn-mogeb) math high school graduat
Optical Character Recognition + Instance Segmentation for Russian and English languages
Handwritten text recognition in school notebooks. A competition held as part of the NTO Olympiad, developed by Sber. ODS platform. Results
Optical character recognition for Japanese text, with the main focus being Japanese manga
Manga OCR Optical character recognition for Japanese text, with the main focus being Japanese manga. It uses a custom end-to-end model built with Tran
Python library to receive live stream events like comments and gifts in realtime from TikTok LIVE.
TikTokLive A python library to connect to and read events from TikTok's LIVE service A python library to receive and decode livestream events such as
An introduction to free, automated web scraping with GitHub’s powerful new Actions framework.
An introduction to free, automated web scraping with GitHub’s powerful new Actions framework Published at palewi.re/docs/first-github-scraper/ Contrib
Let's create a tool to convert Thailand budget from PDF to CSV.
thailand-budget-pdf2csv Let's create a tool to convert Thailand Government Budgeting from PDF to CSV! Devs join forces to convert the budget from PDF to machine-readable
This is a pytorch implementation for the BST model from Alibaba https://arxiv.org/pdf/1905.06874.pdf
Behavior-Sequence-Transformer-Pytorch This is a pytorch implementation for the BST model from Alibaba https://arxiv.org/pdf/1905.06874.pdf This model
Dude is a very simple framework for writing web scrapers using Python decorators
Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, was to easily build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax.
⚡TIKTOK BOT - FAST OPTIMIZED ZEFOY SCRIPT
⚡ ZEFOY [ TikTok Zefoy Bot ] Get the script in: discord.gg/onlp !! Official shop: onlp.sellix.io Newest version v.9.0.0 Requirements pip install p
script to scrape direct download links (ddls) from google drive index.
bhadoo Google Personal/Shared Drive Index scraper. A small script to scrape direct download links (ddls) of downloadable files from bhadoo google driv
Read Japanese manga inside browser with selectable text.
mokuro Read Japanese manga with selectable text inside a browser. See demo: https://kha-white.github.io/manga-demo mokuro_demo.mp4 Demo contains excer
Fast TikTok NO Watermark Video Downloader (username or url)
💎 TD [ TikDown v4 ] Star ⭐ if you want more Discord Server * discord.gg/onlp | Waxor#9999 Why not open source anymore ? * BECAUSE PEOPLE SKID, STEA
A telegram bot written in Python to fetch random SFW & NSFW anime images
Tsuzumi A telegram bot written in python to fetch both random SFW & NSFW Anime images using nekos.life & waifu.pics API Commands SFW Commands : /
A modern pure-Python library for reading PDF files
pdf A modern pure-Python library for reading PDF files. The goal is to have a modern interface to handle PDF files which is consistent with itself and
For those who don't want to pay $10/month for high school game footage with ads
nfhs-scraper Disclaimer: I am in no way responsible for what you choose to do with this script and guide. I do not endorse avoiding paywalls or any il
Comparison-of-OCR (KerasOCR, PyTesseract,EasyOCR)
Optical Character Recognition OCR (Optical Character Recognition) is a technology that enables the conversion of document types such as scanned paper
Pydf: A modular Telegram Bot which provides Pdf Tools using PyPdf2
pyDF-Bot 🌍 Pydf - Pyrogram Document File Bot, a modular Telegram Bot which prov
Pgn2tex - Scripts to convert pgn files to latex document. Useful to build books or pdf from pgn studies
Pgn2Latex (WIP) A simple script to make pdf from pgn files and studies. It's sti
Wats2PDF - Convert a WhatsApp exported chat (without media) into a readable PDF format
Wats2PDF converts a WhatsApp exported chat into a readable PDF format. convert with
Subscrape - A Python scraper for substrate chains
subscrape A Python scraper for substrate chains that uses Subscan. Usage copy co
Digitalizing-Prescription-Image - PIRDS (Prescription Image Recognition and Digitalizing System) is an OCR system made with TensorFlow
Digitalizing-Prescription-Image PIRDS - Prescription Image Recognition and Digit
Multi Account Generator Minecraft/NordVPN/Hulu/Origin And ...
Python based Web Scraper which can discover javascript files and parse them for juicy information (API keys, IP's, Hidden Paths etc)
Convert PDF to AudioBook and Audio Speech to PDF
In this Python project, we will build a GUI-based PDF to Audio and Audio to PDF converter using the Tkinter, OS, path, pyttsx3, SpeechRecognition, PyPDF4, and Pydub libraries and the messagebox module of the Tkinter library.
Select a range, and OCR is activated every time the screen changes.
ASOCR (Auto Screen OCR) Select a range, and OCR is activated every time you press the Space key or the screen changes. usage1: simple OC
VG-Scraper is a Python program that uses the BeautifulSoup module to let anyone scrape content from a website. You enter a number at an input prompt, where a value of 1 stands for one news article.
VG-Scraper VG-Scraper is a convenient program where you can find all the news articles instead of finding one yourself. Installing [Linux] Open a term
Quick Project made to help scrape Lexile and Atos(AR) levels from ISBN
Lexile-Atos-Scraper Quick Project made to help scrape Lexile and Atos(AR) levels from ISBN You will need to install the chrome webdriver if you have n
Split given PDF document into 4 page groups and convert them to booklet format
PUTO: PDF to Booklet converter Split given PDF document into 4 page groups and convert them to booklet format. It creates a PDF like shown below: Fir
Convert MD files to PDF automatically (with CSS) 📄🚀
MD2PDF Action Convert MD files to PDF automatically (with CSS)! Converts a pattern described set of markdown files and converts them to pdf whilst app
DietPDF aims at reducing PDF file size while not degrading quality nor losing metadata
Scraping and visualising India's real-time COVID-19 data from the MOHFW dataset.
COVID19-WEB-SCRAPER Open Source Tech Lab - Project [SEMESTER IV] OSTL Assignments OSTL Assignments - 1 OSTL Assignments - 2 Project COVID19 India Data
Web scraper built using Python.
Web Scraper This project is made in Python. It takes some information from a list of websites and adds it to a data.json file. The dependencies used are: reques
A bot to view Dilbert comics directly from Discord and get updates of the comics automatically.
Raspi-scraper is a configurable python webscraper that checks raspberry pi stocks from verified sellers
OCR, Object Detection, Number Plate, Real Time
README.md Prepared anaconda env requirements.txt clova AI → deep text recognition → trained weights (ex, .pth) wpod-net weights (ex, .h5 , .json) ht
JoplinPdf2Images - Converts a PDF to images in Joplin and adds it to the specified note as a printout
joplinPdf2Images Converts a PDF to images in Joplin and adds it to the specified
UsernameScraperTool - Username Scraper Tool With Python
UsernameScraperTool Username Scraper for 40+ Social sites. How To use git clone
Svg2pdfgen - Svg To PDF gen with python
Compare-pdf - A Flask driven restful API for comparing two PDF files
COMPARE-PDF A Flask driven restful API for comparing two PDF files. Description
Python tool that takes the OCR.space JSON output as input and draws a text overlay on top of the image.
OCR.space OCR Result Checker = Draw OCR overlay on top of image Python tool that takes the OCR.space JSON output as input, and draws an overlay on to
Eureka is a Rest-API framework scraper based on FastAPI for cleaning and organizing data, designed for the Eureka by Turing project of the National University of Colombia
OSTA web scraper, for checking the status of school buses in Ottawa
OSTA-La-Vista OSTA web scraper, for checking the status of school buses in Ottawa. Getting Started Using a Raspberry Pi, download Python 3, and option
This Python script converts a PDF to an image, then uses Tesseract as the OCR engine to convert the image to text
Script_Convertir_PDF_IMG_TXT This Python script converts a PDF to an image, then uses Tesseract as the OCR engine to convert the image to text. p
Simple and understandable swin-transformer OCR project
swin-transformer-ocr ocr with swin-transformer Overview Simple and understandable swin-transformer OCR project. The model in this repository heavily r
Scrap-mtg-top-8 - A top 8 mtg scraper using python
Basic-html-scraper - A complete how to of web scraping with Python for beginners
basic-html-scraper Code from YT Video This video includes a complete how to of w
OCR-ID-Card VietNamese (new id-card)
OCR-ID-Card VietNamese (new id-card) run project: download 2 file weights and pu
Python script that reads AliExpress offer URLs from an Excel file (.csv) and posts them in a Telegram channel using a bot
Aliexpress to telegram post Python script that reads AliExpress offer URLs from an Excel file (.csv) and posts them in a Telegram channel using a b
Shopee Scraper - A web scraper in Python that extracts sales, price, available stock, location and more for a given seller in Brazil
Shopee Scraper A web scraper in Python that extracts sales, price, available stock, location and more for a given seller in Brazil. The project was crea
OCR-D wrapper for detectron2 based segmentation models
ocrd_detectron2 OCR-D wrapper for detectron2 based segmentation models Introduction Installation Usage OCR-D processor interface ocrd-detectron2-segm
SEMID - OSINT module with lots of discord functions
SEMID Framework About Semid is a framework with different Discord functions and
Meta Self-learning for Multi-Source Domain Adaptation: A Benchmark
Meta Self-Learning for Multi-Source Domain Adaptation: A Benchmark Project | Arxiv | YouTube | | Abstract In recent years, deep learning-based methods
Scraper for Tesla internship offers and grades on Oasis (Polytech Paris-Saclay), in the form of a Discord bot
Binance Smart Chain Contract Scraper + Contract Evaluator
Pulls the Binance Smart Chain feed of newly verified contracts every 30 seconds, then checks their contract code for links to socials. Returns only those with socials information included, and then submits the contract address to TokenSniffer to evaluate contract legitimacy.
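The "checks their contract code for links to socials" step can be sketched as a simple URL scan. This is only an illustration under stated assumptions: the domain list, the function name `find_social_links`, and the sample source text are hypothetical and are not taken from the repository.

```python
import re
from typing import List

# Hypothetical list of social domains; the actual project may check others.
SOCIAL_DOMAINS = ("t.me", "twitter.com", "discord.gg", "discord.com", "medium.com")

# Matches http(s) URLs up to the next whitespace, quote, bracket, or parenthesis.
_URL_RE = re.compile(r"https?://[^\s\"'<>)]+", re.IGNORECASE)


def find_social_links(source_code: str) -> List[str]:
    """Return the social-media URLs found in a contract's source text."""
    links: List[str] = []
    for url in _URL_RE.findall(source_code):
        # Strip the scheme and an optional "www." so the host can be compared.
        host = re.sub(r"^https?://(www\.)?", "", url, flags=re.IGNORECASE)
        host = host.split("/")[0].lower()
        if host in SOCIAL_DOMAINS and url not in links:
            links.append(url)
    return links


# Made-up contract source used only to exercise the function.
sample = """
// Telegram: https://t.me/exampletoken
// Website: https://example.org
// Twitter: https://twitter.com/exampletoken
contract ExampleToken {}
"""
print(find_social_links(sample))
```

A contract would pass this filter only when the returned list is non-empty; fetching the feed and submitting addresses for evaluation are left out of the sketch.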
A backend for mdbook in Python for generating PDF based on Chrome DevTools Protocol.
mdbook-pdf A backend for mdbook written in Python for generating PDF based on Chrome DevTools Protocol. Python library dependency Usage Put mdbook-pdf
Discord group chat spammer concept.
GC Spammer [Concept] GC-Spammer for https://discord.com/ Warning: This is purely a concept. In the past the script worked, however, Discord ratelimite
A web scraper which checks price of a product regularly and sends price alerts by email if price reduces.
Amazon-Web-Scarper Created a web scraper using simple functions to check price of a product on amazon (can be duplicated to check price at other marke
FilmMikirAPI - A simple REST API for scraping the Kincir website, built with Python and the Flask package
CLI tool that checks who does and who does not follow you back on Instagram
CLI tool that checks who does and who does not follow you back on Instagram. It also checks who you don't follow back on Instagram.
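The comparison such a tool performs comes down to set differences. A minimal sketch, assuming the follower and following lists have already been obtained as plain usernames (the function name `compare_follows` and the sample usernames are hypothetical, and retrieving the lists from Instagram is not shown):

```python
def compare_follows(followers, following):
    """Classify accounts by comparing who follows you with who you follow."""
    followers, following = set(followers), set(following)
    return {
        # Accounts that follow you and that you follow.
        "mutual": sorted(followers & following),
        # Accounts you follow that do not follow you back.
        "not_following_back": sorted(following - followers),
        # Accounts that follow you but that you do not follow back.
        "you_dont_follow_back": sorted(followers - following),
    }


# Made-up usernames for illustration.
result = compare_follows(
    followers=["ana", "bo", "cy"],
    following=["bo", "cy", "dee"],
)
print(result)
```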
Generate daily updated visualizations of user and repository statistics from the GitHub API using GitHub Actions
Generate daily updated visualizations of user and repository statistics from the GitHub API using GitHub Actions for any combination of private and public repositories - dark mode supported
Fully Automated YouTube Channel ▶️with Added Extra Features.
Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages.
Video Games Web Scraper Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages. This
A Very simple free proxy list scraper.
Scrappp A very simple free proxy list scraper, made in Python. The tool scrapes proxies from different sites and APIs. Screenshots About the script !!! RE
🐞 Douban Movie / Douban Book Scarpy
Python3-based Douban Movie/Douban Book Scarpy crawler for cover downloading + data crawling + review entry.
Simple proxy scraper made by using ProxyScrape's api.
What is Moon? Moon is a lightweight and fast proxy scraper made by using ProxyScrape's api. What can i do with this? You can use proxies for varietys
Telegram group scraper tool
Telegram Group Scrapper
Automatically scrapes all menu items from the Taco Bell website
Automatically scrapes all menu items from the Taco Bell website. Returns as PANDAS dataframe.
This is a web scraper, using Python framework Scrapy, built to extract data from the Deals of the Day section on Mercado Livre website.
Deals of the Day This is a web scraper, using the Python framework Scrapy, built to extract data such as price and product name from the Deals of the
A bot for doing many things with PDFs.
Telegram PDF Bot A Telegram bot that can: Compress, crop, decrypt, encrypt, merge, preview, rename, rotate, scale and split PDF files Compare text dif
Generate a repository with mirror links for DriveDroid app
DriveDroid Repository Generator Generate a repository for the app that allows booting a PC using ISO files stored on your Android phone Check also an offi
Telegram bot/scraper to get the latest NUS vacancy reports.
Telegram bot/scraper to get the latest NUS vacancy reports. Stay ahead of the curve and don't get modrekt.
PDFSanitizer - Renders possibly unsafe PDF files and outputs harmless PDF files
PDFSanitizer Renders possibly malicious PDF files and outputs harmless PDF files
A Telegram crawler to search groups and channels automatically and collect any type of data from them.
Introduction This is a crawler I wrote in Python using the APIs of Telethon months ago. This tool was not intended to be publicly available for a numb
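Ddddocr - General-purpose CAPTCHA recognition OCR, PyPI version
DdddOcr general-purpose CAPTCHA recognition SDK, free open-source edition. ddddocr has been updated again today! The current version is 1.3.1. Many newcomers to CAPTCHA work surely get a headache when they run into click-selection images; building samples is time-consuming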
Python utility library for compositing PDF documents with reportlab.
pdfdoc-py Python utility library for compositing PDF documents with reportlab. Installation The pdfdoc-py package can be installed directly from the s
A bot that plays TFT using OCR. Keeps track of bench, board, items, and plays the user defined team comp.
NOTES: To ensure best results, make sure you are running this on a computer that has decent specs. 1920x1080 fullscreen is required in League, game mu
Fine tuning keras-ocr python package with custom synthetic dataset from scratch
OCR-Pipeline-with-Keras The keras-ocr package generally consists of two parts: a Detector and a Recognizer: Detector is responsible for creating bound
Htmdf - html to pdf with support for variables using fastApi.
htmdf Converts html to pdf with support for variables using fastApi. Installation Clone this repository. git clone https://github.com/ShreehariVaasish
Image Compression GUI APP Python: PyQt5
Image Compression GUI APP Python: PyQt5 Use: press F5, debug, or simply run it in your IDE (VS Code, PyCharm, Anaconda, etc.) socia
Poolbooru gelscraper - a simple python script for scraping images off gelbooru pools.
poolbooru_gelscraper a simple python script for scraping images off gelbooru pools. modules required:requests_html, and os by default saves files with
A python script to extract answers to any question on Quora (Quora+ included)
quora-plus-bypass A python script to extract answers to any question on Quora (Quora+ included) Requirements Python 3.x
Awesome-AI-books - Some awesome AI related books and pdfs for learning and downloading
Awesome AI books Some awesome AI-related books and PDFs for downloading and learning. Preface This repo is only used for learning, do not use in business
A supercharged version of paperless: scan, index and archive all your physical documents
Paperless-ng Paperless (click me) is an application by Daniel Quinn and contributors that indexes your scanned documents and allows you to easily sear
Omdena-abuja-anpd - Automatic Number Plate Detection for the security of lives and properties using Computer Vision.
Newsscraper - A simple Python 3 module to get crypto or news articles and their content from various RSS feeds.
NewsScraper A simple Python 3 module to get crypto or news articles and their content from various RSS feeds. 🔧 Installation Clone the repo locally.
Mipdfcompressor - 💕A simple pdf size compressing telegram robot
Pdf Compressor Telegram Bot A simple pdf size compressing telegram robot. Useful for digital documentation. Mandatory Variables API_HASH - Your A
IDCARD-VERIFYING-SYSTEM - The "IDCARD VERIFYING SYSTEM" uses Google's latest version of Tesseract OCR (Optical Character Recognition)
IDCARD VERIFYING SYSTEM The "IDCARD VERIFYING SYSTEM" uses Google's latest v
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted. ocrmypdf
Highlight Translator can help you translate the words quickly and accurately.
Highlight Translator can help you translate words quickly and accurately. Just highlight, copy, or screenshot the content you want to translate anywhere on your computer (e.g. PDF, PPT, Word), and the translated results are automatically displayed in front of you.
A bulk pdf generator. This application can generate PDFs in bulk by using just one click.
A bulk html pdf generator. This application can generate PDFs in bulk by using just one click. Screenshots Requirements 🧱 Your system must have the f
Animoo - Python scraper made with BeautifulSoup4 that scrapes images from /c/.
Animoo - Python scraper made with BeautifulSoup4 that scrapes images from /c/. Features Scrapes 10 pages Scrapes each thread Downloads all the images
This wrapper now has async support; it's basically the same except it uses asyncio
This is a Python wrapper for my API api_url = "https://api.dhravya.me/" This wrapper now has async support; it's basically the same except it uses asyn
SearchifyX, predecessor to Searchify, is a fast Quizlet, Quizizz, and Brainly webscraper with various stealth features.
SearchifyX SearchifyX, predecessor to Searchify, is a fast Quizlet, Quizizz, and Brainly webscraper with various stealth features. SearchifyX lets you
A tool combining EasyOCR and LaMa to automatically detect text and replace it with an inpainted background.
EasyLaMa (WIP) This is a tool combining EasyOCR and LaMa to automatically detect text and replace it with an inpainted background. Installation For GP
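Lightweight formula OCR tool: recognize all kinds of formula images in one click and convert them to LaTeX format
QC-Formula | Qingchen Formula OCR Introduction A lightweight open-source formula OCR tool: recognize formula images in one click and convert them to LaTeX format. Supports importing formula images from the local computer (later versions will support importing images directly from web pages); formula images may be .png / .jpg / .bmp, up to 4 MB in size; supports both printed and handwritten formulas,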
Divar.ir Ads scrapper
Divar.ir Ads Scrapper Introduction This project first asynchronously grab Divar.ir Ads and then save to .csv and .xlsx files named data.csv and data.x