135 Repositories
Python scraping Libraries
A command-line utility for taking automated screenshots of websites
shot-scraper A command-line utility for taking automated screenshots of websites For background on this project see shot-scraper: automated screenshots
Face recognition system using MTCNN, FACENET, SVM and FAST API to track participants of Big Brother Brasil in real time.
BBB Face Recognizer Face recognition system using MTCNN, FACENET, SVM and FAST API to track participants of Big Brother Brasil in real time. Instalati
An introduction to free, automated web scraping with GitHub’s powerful new Actions framework.
An introduction to free, automated web scraping with GitHub’s powerful new Actions framework Published at palewi.re/docs/first-github-scraper/ Contrib
Materials to reproduce our findings in our stories, "Amazon Puts Its Own 'Brands' First Above Better-Rated Products" and "When Amazon Takes the Buy Box, it Doesn’t Give it up"
Amazon Brands and Exclusives This repository contains code to reproduce the findings featured in our story "Amazon Puts Its Own 'Brands' First Above B
Dude is a very simple framework for writing web scrapers using Python decorators
Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, was to easily build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax.
Fast TikTok NO Watermark Video Downloader (username or url)
💎 TD [ TikDown v4 ] Star ⭐ if you want more Discord Server * discord.gg/onlp | Waxor#9999 Why not open source anymore ? * BECAUSE PEOPLE SKID, STEA
Code for the Open Data Day 2022 publicbodies.org Nepal data scraping activities.
Open Data Day Publicbodies.org Nepal We've gathered on Saturday, 5th March 2022 with Open Knowledge Nepal in order to try and automate the collection
SkyScrapers: A collection of various scraping apps
SkyScrapers A collection of various web scraping apps The web-scrapers involved
Visyerres sgdf woob - Woob modules for the intranet and other sites of Scouts et Guides de France
Vis'Yerres SGDF - Woob Modules Do you have the feeling that the Scouts' intranet
Scrapy-soccer-games - Scraping information about soccer games from a few websites
scrapy-soccer-games The purpose of this project is to collect information from the table of
A training task for web scraping using python multithreading and a real-time-updated list of available proxy servers.
Parallel web scraping The project is a training task for web scraping using python multithreading and a real-time-updated list of available proxy serv
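The multithreading-plus-proxy-list idea in this entry can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `scrape_in_parallel`, `fake_fetch`, the URLs and the proxy addresses are all hypothetical, and the fetch function is a stand-in that makes no network calls.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle
from typing import Callable, Dict, Iterable, List


def scrape_in_parallel(
    urls: Iterable[str],
    proxies: List[str],
    fetch: Callable[[str, str], str],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Fetch every URL on a thread pool, rotating through the proxy list."""
    urls = list(urls)
    # Pair each URL with the next proxy in round-robin order.
    jobs = list(zip(urls, cycle(proxies)))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with urls.
        bodies = pool.map(lambda job: fetch(*job), jobs)
        return dict(zip(urls, bodies))


def fake_fetch(url: str, proxy: str) -> str:
    # Hypothetical stand-in for a real HTTP call; it only records the proxy used.
    return f"{url} via {proxy}"


results = scrape_in_parallel(
    ["http://example.com/a", "http://example.com/b", "http://example.com/c"],
    ["proxy-1:8080", "proxy-2:8080"],
    fake_fetch,
)
```

In a real scraper the `fetch` argument would perform the HTTP request through the given proxy, and the proxy list would be refreshed periodically rather than fixed.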
Scraping and visualising India's real-time COVID-19 data from the MOHFW dataset.
COVID19-WEB-SCRAPER Open Source Tech Lab - Project [SEMESTER IV] OSTL Assignments OSTL Assignments - 1 OSTL Assignments - 2 Project COVID19 India Data
Simple library for exploring/scraping the web or testing a website you’re developing
Robox is a simple library with a clean interface for exploring/scraping the web or testing a website you’re developing. Robox can fetch a page, click on links and buttons, and fill out and submit forms.
A simple app to scrape data from Twitter.
Twitter-Scraping-App A simple app to scrape data from Twitter. Available Features Search query. Select the number of records you want to fetch from Twitter. C
An Unofficial API for 1337x, Piratebay, Nyaasi, Torlock, Torrent Galaxy, Zooqle, Kickass, Bitsearch, and MagnetDL
Fetch fund data from avanza.se using Python and some web scraping with bs4
Py(A)vanza Fetch fund data from avanza.se using Python and some web scraping with bs4. The default way is to display the data in the terminal, apply -
Amazon web scraping using Scrapy Framework
Amazon-web-scraping-using-Scrapy-Framework Scrapy Scrapy is an application framework for crawling web sites and extracting structured data which can b
Basic-html-scraper - A complete how to of web scraping with Python for beginners
basic-html-scraper Code from YT Video This video includes a complete how to of w
This was supposed to be a web scraping project, but somehow I've turned it into a spamming project
Introduction This was supposed to be a web scraping project, but somehow I've turned it into a spamming project.
Linkedin webscraping - Linkedin web scraping with python
linkedin_webscraping This is the first step of a full project called "LinkedIn J
Explore scraping with BeautifulSoup!
beautifulsoup-scrape Explore scraping with BeautifulSoup! Part One: Start from Shakespeare As my professor is a poet (yes, and he teaches me data and
Scraper for Tesla internship offers and for grades on Oasis (Polytech Paris-Saclay), in the form of a Discord bot
This is a Web scraping project using BeautifulSoup and Python to scrape basic information of all the Test matches played till Jan 2022.
Scraping-test-matches-data This is a Web scraping project using BeautifulSoup and Python to scrape basic information of all the Test matches played ti
An application that crawls the web page at a given URL, collects all of its words, then sorts and counts them.
Web-Scrapping-1 An application that crawls the web page at a given URL, collects all of its words, then sorts and counts them. Installation Using the package manage
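The collect, count and sort steps described in this entry could look roughly like the sketch below, which uses only the standard library. It is an assumption about the general approach, not the repository's code; `TextExtractor`, `count_words` and the sample HTML are hypothetical names and data.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from typing import List, Tuple


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: List[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def count_words(html: str) -> List[Tuple[str, int]]:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z']+", " ".join(parser.chunks).lower())
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(Counter(words).items(), key=lambda kv: (-kv[1], kv[0]))


sample = (
    "<html><body><h1>Spam spam</h1><p>eggs and spam</p>"
    "<script>var x;</script></body></html>"
)
word_counts = count_words(sample)
```

For a live page, the HTML string would first be downloaded, for example with `urllib.request.urlopen(url).read().decode()`, before being passed to `count_words`.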
Web Scraping Instagram photos with Selenium by only using a hashtag.
Web-Scraping-Instagram This project is used to automatically obtain images by web scraping Instagram with Selenium in Python. The required input will
Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages.
Video Games Web Scraper Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages. This
A very simple free proxy list scraper.
Scrappp A very simple free proxy list scraper, made in Python The tool scrapes proxies from different sites and APIs. Screenshots About the script !!! RE
Scraping Top Repositories for Topics on GitHub
0.-Webscrapping-using-python Scraping Top Repositories for Topics on GitHub. Web scraping is the process of extracting and parsing data from websites
General Assembly's 2015 Data Science course in Washington, DC
DAT8 Course Repository Course materials for General Assembly's Data Science course in Washington, DC (8/18/15 - 10/29/15). Instructor: Kevin Markham (
Web-scraping - Program that scrapes a website for a collection of quotes, picks one at random and displays it
web-scraping Program that scrapes a website for a collection of quotes, picks on
Web Scraping COVID 19 Meta Portal with Python
Web-Scraping-COVID-19-Meta-Portal-with-Python - Requests API and Beautiful Soup to scrape real-time COVID statistics from worldometer website and perform data cleaning and visual analysis in Jupyter notebook.
Web-scraping - A bot using Python with BeautifulSoup that scrapes the IRS website by form number and returns the results as JSON
Web-scraping - A bot using Python with BeautifulSoup that scrapes the IRS website (prior form publication) by form number and returns the results as JSON. It provides the option to download PDFs over a range of years.
Dictionary - Application focused on word search through web scraping
Dictionary - Application focused on word search through web scraping, in addition to other functions such as dictation, spell and conjugation of syllables.
Poolbooru gelscraper - a simple python script for scraping images off gelbooru pools.
poolbooru_gelscraper a simple python script for scraping images off gelbooru pools. modules required:requests_html, and os by default saves files with
A python script to extract answers to any question on Quora (Quora+ included)
quora-plus-bypass A python script to extract answers to any question on Quora (Quora+ included) Requirements Python 3.x
Bigdata - This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster
Scrapy Cluster This Scrapy project uses Redis and Kafka to create a distributed
This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster
Web Scraping with Python - Scraping Job Openings for Programmers
Web Scraping with Python - Scraping Job Openings for Programmers About the Project Web
Web Scraping images using Selenium and Python
Web Scraping images using Selenium and Python About this document This is a markdown document about Web scraping images and videos using Selenium
Pythonic Crawling / Scraping Framework based on Non Blocking I/O operations.
Pythonic Crawling / Scraping Framework Built on Eventlet Features High Speed WebCrawler built on Eventlet. Supports relational database engines like
Screen scraping and web crawling framework
Pomp Pomp is a screen scraping and web crawling framework. Pomp is inspired by and similar to Scrapy, but has a simpler implementation that lacks the
Fast and robust date extraction from web pages, with Python or on the command-line
Find original and updated publication dates of any web page. From the command-line or within Python, all the steps needed from web page download to HTML parsing, scraping, and text analysis are included.
Python framework to scrape Pastebin pastes and analyze them
pastepwn - Paste-Scraping Python Framework Pastebin is a very helpful tool to store or rather share ascii encoded data online. In the world of OSINT,
An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
TWINT - Twitter Intelligence Tool No authentication. No API. No limits. Twint is an advanced Twitter scraping tool written in Python that allows for s
Useful PDF-related productivity tool.
Luftmensch 1.4.7 (Español) | 1.4.3 (English) Version 1.4.7 (Español) released in October 2021. Version 1.4.3 (English) released in September 2021. 🏮
A web scraping project using Selenium WebDriver
Savee - Images Downloader Project using Selenium WebDriver to download images from someone's profile on the https://www.savee.it website. Usage The project
DaProfiler allows you to get emails, social media accounts, addresses, workplaces and more on your target using web scraping and Google dorking techniques
DaProfiler allows you to get emails, social media accounts, addresses, workplaces and more on your target using web scraping and Google dorking techniques, for targets based in France only. The distinguishing feature of this program is its ability to find your target's e-mail addresses.
Google Developer Profile Badge Scraper
Google Developer Profile Badge Scraper It is a Google Developer Profile Web Scraper which scrapes for specific badges in a user's Google Developer Pro
Scrapes Every Email Address of Every Society in Every University
society-email-scrape Site Live at https://kcsoc.github.io/society-email-scrape/ How to automatically generate new data Go to unis.yml Add your uni Cre
A simple django-rest-framework api using web scraping
Apicell You can use this api to search in google, bing, pypi and subscene and get results Method : POST Parameter : query Example import request url =
A powerful annex BUBT, BUBT Soft, and BUBT website scraping script.
Annex Bubt Scraping Script I think this is the first public repository that provides free annex-BUBT, BUBT-Soft, and BUBT website scraping API script
This is a module that I had created along with my friend. It's a basic web scraping module
QuickInfo PYPI link : https://pypi.org/project/quickinfo/ This is the library that you've all been searching for, it's built for developers and allows
Google Scholar Web Scraping
Google Scholar Web Scraping This is a python script that asks for a user to input the url for a google scholar profile, and then it writes publication
Demonstration on how to use async python to control multiple playwright browsers for web-scraping
Playwright Browser Pool This example illustrates how it's possible to use a pool of browsers to retrieve page urls in a single asynchronous process. i
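The browser-pool pattern this entry describes can be sketched with plain `asyncio`, without Playwright itself. Everything here is hypothetical: `FakeBrowser` stands in for a real browser handle, and `fetch_all` is an illustrative name rather than anything from the repository.

```python
import asyncio
from typing import List


class FakeBrowser:
    """Hypothetical stand-in for a real browser handle."""

    def __init__(self, name: str) -> None:
        self.name = name

    async def get_title(self, url: str) -> str:
        await asyncio.sleep(0)  # yield control, as a real page load would
        return f"{self.name}:{url}"


async def fetch_all(urls: List[str], pool_size: int = 2) -> List[str]:
    # The queue holds idle browsers; a task waits here when all are busy.
    pool: asyncio.Queue = asyncio.Queue()
    for i in range(pool_size):
        pool.put_nowait(FakeBrowser(f"browser-{i}"))

    async def worker(url: str) -> str:
        browser = await pool.get()
        try:
            return await browser.get_title(url)
        finally:
            pool.put_nowait(browser)  # hand the browser back to the pool

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(worker(u) for u in urls))


titles = asyncio.run(fetch_all(["u1", "u2", "u3"]))
```

Using a queue as the pool keeps the number of simultaneously active browsers bounded by `pool_size` while everything runs in a single asynchronous process.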
Creating Scrapy scrapers via the Django admin interface
django-dynamic-scraper Django Dynamic Scraper (DDS) is an app for Django which builds on top of the scraping framework Scrapy and lets you create and
Haphazard scripts for scraping bitcoin/bitcoin data from GitHub
This is a quick-and-dirty tool used to scrape bitcoin/bitcoin pull request and commentary data. Each output/pr number folder contains comments.json:
A web scraping API for MDL (MyDramaList) for Python.
PyMDL An API for MyDramaList (MDL) based on web scraping for Python. Description An API for MDL to make your life easier in retrieving and working on dat
Web Scraping OLX with Python and Bsoup.
webScrap WebScraping first step. Authors: Paulo, Claudio M. First steps in Web Scraping. Project carried out for training in Web Scraping. The export
Scraping followers of an instagram account
ScrapInsta A script for scraping data from Instagram Install First of all you can run: pip install scrapinsta After that you need to install these requ
The first public repository that provides free BUBT website scraping API script on Github.
BUBT WEBSITE SCRAPPING SCRIPT I think this is the first public repository that provides free BUBT website scraping API script on github. When I was do
A package that provides you Latest Cyber/Hacker News from website using Web-Scraping.
cybernews A package that provides you Latest Cyber/Hacker News from website using Web-Scraping. Latest Cyber/Hacker News Using Webscraping Developed b
Scraping script for stats on covid19 pandemic status in Chiba prefecture, Japan
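About A script that converts Chiba Prefecture's detailed per-region infection statistics (an Excel file) to CSV and also outputs per-region infection totals by date. Requirement POSIX-compatible shell, e.g. GNU Bash (1) curl (1) python = 3.8 pandas = 1.1.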
Simple and easy to use python API for the COVID registration booking system of the math department @ unipd (torre archimede)
Simple and easy to use python API for the COVID registration booking system of the math department @ unipd (torre archimede). This API creates an interface with the official browser, with more useful functionalities.
Converts between Spotify's new lyrics (and their proprietary format) to an LRC file for local playback.
spotify-lyrics-to-lrc Converts between Spotify's new lyrics (and their proprietary format) to an LRC file for local playback. How to use: Open Spotify
A simple, configurable and expandable combined shop scraper to minimize the costs of ordering several items
combined-shop-scraper A simple, configurable and expandable combined shop scraper to minimize the costs of ordering several items. Features Define an
CPF and CNPJ lookup at the Receita Federal using web scraping
Repository containing Python scripts that look up CPF and CNPJ numbers directly on the Receita Federal website.
A Python module to bypass Cloudflare's anti-bot page.
cloudflare-scrape A simple Python module to bypass Cloudflare's anti-bot page (also known as "I'm Under Attack Mode", or IUAM), implemented with Reque
Scraping Bot for the Covid19 vaccination website of the Canton of Zurich, Switzerland.
Hi 👋 , I'm David A passionate developer from France. 🌱 I’m currently learning Kotlin, ReactJS and Kubernetes 👨💻 All of my projects are available
Source code for web scraping with the Python library Selenium.
Implementation of the bachelor's thesis "Real-time stock predictions with deep learning and news scraping".
Real-time stock predictions with deep learning and news scraping This repository contains a partial implementation of my bachelor's thesis "Real-time
Scraping comments from the political section of popular Nigerian blog (Nairaland), and saving in a CSV file.
Scraping_Nairaland This project scraped comments from the political section of popular Nigerian blog www.nairaland.com using the Python BeautifulSoup
4CAT: Capture and Analysis Toolkit
4CAT: Capture and Analysis Toolkit 4CAT is a research tool that can be used to analyse and process data from online social platforms. Its goal is to m
Scraping weather data using Python to receive umbrella reminders
A Python package which scrapes weather data from google and sends umbrella reminders to specified email at specified time daily.
Web Scraping Practice with Python
Web-Scraping-Practica Members: Guillem Vidal Pallarols. Lídia Bandrés Solé Files: This document is the first one we find. Next we find a
Facebook Group Scraping Using Beautiful Soup & Selenium
Extract Facebook group posts that are related to a specific topic and write them to a .json file.
A project that automatically sends you a Medium article on a topic of your choosing to your email address daily.
Daily Article from Medium ✏️ About A project that automatically sends you a Medium article on a topic of your choosing to your email address daily. No
PyJPBoatRace: Python-based Japanese boatrace tools 🚤
pyjpboatrace :speedboat: provides you with useful tools for data analysis and auto-betting for boatrace.
Scraping web pages to get data
Scraping Data Get public data and save it in a database This project uses Python How to run the project 1 - Clone the repository 2 - Install beautifulsoup4
Web-Scraping using Selenium Master
Web-Scraping using Selenium What is the need for Selenium? Some websites don't like to be scraped and in that case you need to disguise your webscrapi
Example of scraping a paginated API endpoint and dumping the data into a DB
Provider API Scraper Example Example of scraping a paginated API endpoint and dumping the data into a DB. Prerequisites Python = 3.9 Pipenv Setup # i
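Walking a paginated endpoint and writing each page to a database might look like the following sketch. It assumes a response shape of `{"items": [...], "next": <page or None>}` and a `providers` table, neither of which comes from the repository; `fake_api` is a hypothetical in-memory stand-in for the real endpoint.

```python
import sqlite3
from typing import Callable, Dict


def fake_api(page: int) -> Dict[str, object]:
    # Hypothetical in-memory stand-in for a paginated HTTP endpoint.
    data = {
        1: [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}],
        2: [{"id": 3, "name": "gamma"}],
    }
    next_page = page + 1 if page + 1 in data else None
    return {"items": data.get(page, []), "next": next_page}


def scrape_to_db(
    get_page: Callable[[int], Dict[str, object]],
    conn: sqlite3.Connection,
) -> int:
    """Follow the 'next' pointer until it is None, storing every item."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS providers (id INTEGER PRIMARY KEY, name TEXT)"
    )
    page = 1
    total = 0
    while page is not None:
        payload = get_page(page)
        rows = [(item["id"], item["name"]) for item in payload["items"]]
        # INSERT OR REPLACE keeps re-runs idempotent on the primary key.
        conn.executemany(
            "INSERT OR REPLACE INTO providers (id, name) VALUES (?, ?)", rows
        )
        total += len(rows)
        page = payload["next"]
    conn.commit()
    return total


conn = sqlite3.connect(":memory:")
inserted = scrape_to_db(fake_api, conn)
names = [row[0] for row in conn.execute("SELECT name FROM providers ORDER BY id")]
```

Passing the page getter in as a function keeps the loop independent of any particular HTTP client, so the same code can be exercised against a stub like `fake_api`.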
PyMultiDictionary is a Dictionary Module for Python 3+ to get meanings, translations, synonyms and antonyms of words in 20 different languages
PyMultiDictionary PyMultiDictionary is a Dictionary Module for Python 3+ to get meanings, translations, synonyms and antonyms of words in 20 different
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
An automation script written in Python for subdomain enumeration, web scraping, and finding usernames
Better GitHub statistics images for your profile, with stats from private and public repos
Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
Scraping news from Ucsal portal with Scrapy.
NewsScraping This is a project for scraping the latest news, from 2021, from the Ucsal university portal http://noosfero.ucsal.br/institucional Tecno
A Python package that scrapes Google News article data while remaining undetected by Google.
A Python package that scrapes Google News article data while remaining undetected by Google. Our scraper can scrape page data up until the last page and never trigger a CAPTCHA (download stats: https://pepy.tech/project/GoogleNewsScraper)
Using Selenium with Python to scrape popular YouTube tech channels.
Web Scraping Popular YouTube Tech Channels with Selenium Data Mining, Data Wrangling, and Exploratory Data Analysis About the Data Web scrapi
The core packages of security analyzer web crawler
Security Analyzer 🐍 A large-scale web crawler (also usable as a vulnerability scanner) for getting an overview of the security of Moroccan sites Cu
The open-source web scrapers that feed the Los Angeles Times California coronavirus tracker.
The open-source web scrapers that feed the Los Angeles Times' California coronavirus tracker. Processed data ready for analysis is available at datade
Current Antarctic large iceberg positions derived from ASCAT and OSCAT-2
Iceberg Locations Antarctic large iceberg positions derived from ASCAT and OSCAT-2. All data collected here are from the NASA SCP website Overview Thi
A tool for scraping and organizing data from NewsBank API searches
nbscraper Overview This simple tool automates the process of copying, pasting, and organizing data from NewsBank API searches. Currently, nbscrape onl
A web scraping pipeline project that retrieves TV and movie data from two sources, then transforms and stores data in a MySQL database.
New to Streaming Scraper An in-progress web scraping project built with Python, R, and SQL. The scraped data are movie and TV show information. The go
Universal Reddit Scraper - A comprehensive Reddit scraping command-line tool written in Python.
PyQuery-based scraping micro-framework.
demiurge PyQuery-based scraping micro-framework. Supports Python 2.x and 3.x. Documentation: http://demiurge.readthedocs.org Installing demiurge $ pip
NASA APOD Discord Bot - Fetches information from NASA APOD site.
Python script to scrape members from a selected Telegram group.
A Python script to scrape all the members in a Telegram group and save them in a CSV file. REGISTERING Go to this link https://core.telegram.org/api/obtain
A Web Scraping Program.
Web Scraping AUTHOR: Saurabh G. MTech Information Security, IIT Jammu. If you find this repository useful, I would appreciate it if you Star it and Fork
Automated network configuration backups using GitHub Actions and git-scraping
Network Config Scraper This repository demonstrates the use of GitHub Actions and git-scraping to build an automated backup solution for network confi
Django API that scrapes and serves the latest news from the city of Carlos Casares in a semantic format (RDF).
"Casares News" API An API that scrapes and serves the latest news from the city of Carlos Casares in a semantic format (RDF). Usage Consume the articles