127 Repositories
Python reddit-crawler Libraries
Simple Python script to download images and videos from public subreddits without using Reddit's API 😎
Subreddit Media Downloader Download images and videos from any public subreddit without using Reddit's API Made with ❤ by Nico 💬 About: This script a
Dude is a very simple framework for writing web scrapers using Python decorators
Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, was to easily build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax.
A anti-repostbot script for reddit, runs u/ThisIsARepostBotBot
ThisIsARepostBotBot So you found a repost bot, now what? This is a bot to reply to all posts of a repost bot with a message urging users to report and
🕷 Phone Crawler with multi-thread functionality
Phone Crawler: Phone Crawler with multi-thread functionality Disclaimer: I'm not responsible for any illegal/misuse actions, this program was made for
This repo has the source code for the crawler and data crawled from auto-data.net
This repo contains the source code for crawler and crawled data of cars specifications from autodata. The data has roughly 45k cars
Python project that aims to discover CDP neighbors and map their Layer-2 topology within a shareable medium like Visio or Draw.io.
Python project that aims to discover CDP neighbors and map their Layer-2 topology within a shareable medium like Visio or Draw.io.
Python script for crawling ResearchGate.net papers✨⭐️📎
ResearchGate Crawler Python script for crawling ResearchGate.net papers About the script This code start crawling process by urls in start.txt and giv
Meme-videos - Scrapes memes and turn them into a video compilations
Meme Videos Scrapes memes from reddit using praw and request and then converts t
Amazon web scraping using Scrapy Framework
Amazon-web-scraping-using-Scrapy-Framework Scrapy Scrapy is an application framework for crawling web sites and extracting structured data which can b
A simple discord bot written in python which can surf subreddits, send a random meme, jokes and also weather of a given place
A simple Discord Bot A simple discord bot written in python which can surf subreddits, send a random meme, jokes and also weather of a given place. We
Python script that reads Aliexpress offers urls from a Excel filename (.csv) and post then in a Telegram channel using a bot
Aliexpress to telegram post Python script that reads Aliexpress offers urls from a Excel filename (.csv) and post then in a Telegram channel using a b
Just a python library to make reddit post caching easier
Reddist Just a python library to make reddit post caching easier. Caching Options In Memory Caching Redis Caching Pickle Caching Usage Installation: D
Bot that embeds a random hysterical meme from Reddit into your text channel as an embedded message, using an API call.
Discord_Meme_Bot 🤣 Bot that embeds a random hysterical meme from Reddit into your text channel as an embedded message, using an API call. Add the bot
Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages.
Video Games Web Scraper Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages. This
This is a simple website crawler which asks for a website link from the user to crawl and find specific data from the given website address.
This is a simple website crawler which asks for a website link from the user to crawl and find specific data from the given website address.
A Python package that can be used to download post and comment data from Reddit.
Reddit Data Collector Reddit Data Collector is a Python package that allows a user to collect post and comment data from Reddit. It is built on top of
Check bookings for TUM libraries.
TUM Library Checker Only for educational purposes This repository contains a crawler to save bookings for TUM libraries in a CSV file. Sample data fro
A web crawler for recording posts in "sina weibo"
Web Crawler for "sina weibo" A web crawler for recording posts in "sina weibo" Introduction This script helps collect attributes of posts in "sina wei
A Telegram crawler to search groups and channels automatically and collect any type of data from them.
Introduction This is a crawler I wrote in Python using the APIs of Telethon months ago. This tool was not intended to be publicly available for a numb
High available distributed ip proxy pool, powerd by Scrapy and Redis
高可用IP代理池 README | 中文文档 本项目所采集的IP资源都来自互联网,愿景是为大型爬虫项目提供一个高可用低延迟的高匿IP代理池。 项目亮点 代理来源丰富 代理抓取提取精准 代理校验严格合理 监控完备,鲁棒性强 架构灵活,便于扩展 各个组件分布式部署 快速开始 注意,代码请在release
A distributed crawler for weibo, building with celery and requests.
A distributed crawler for weibo, building with celery and requests.
A Python bot that uses the Reddit API to send users inspiring messages.
AnonBot By Edric Antoine A Python bot that uses the Reddit API to send users inspiring messages. When a message includes 'What would Anon do?', the bo
Build a small, 3 domain internet using Github pages and Wikipedia and construct a crawler to crawl, render, and index.
TechSEO Crawler Build a small, 3 domain internet using Github pages and Wikipedia and construct a crawler to crawl, render, and index. Play with the r
NLP-Project - Used an API to scrape 2000 reddit posts, then used NLP analysis and created a classification model to mixed succcess
Project 3: Web APIs & NLP Problem Statement How do r/Libertarian and r/Neoliberal differ on Biden post-inaguration? The goal of the project is to see
Upvotes and karma for Discord: Heart 💗 or Crush 💔 a comment to give points to an user, or Star ⭐ it to add it to the Best Of!
🤖 Reto Reto is a community-oriented Discord bot, featuring a karma system, a way to reward the best comments, leaderboards, and so much more! React t
Amazon scraper using scrapy, a python framework for crawling websites.
#Amazon-web-scraper This is a python program, which use scrapy python framework to crawl all pages of the product and scrap products data. This progra
Create crawler get some new products with maximum discount in banimode website
crawler-banimode create crawler and get some new products with maximum discount in banimode website. این پروژه کوچک جهت یادگیری و کار با ابزار سلنیوم
Screen scraping and web crawling framework
Pomp Pomp is a screen scraping and web crawling framework. Pomp is inspired by and similar to Scrapy, but has a simpler implementation that lacks the
Python Testing Crawler 🐍 🩺 🕷️ A crawler for automated functional testing of a web application
Python Testing Crawler 🐍 🩺 🕷️ A crawler for automated functional testing of a web application Crawling a server-side-rendered web application is a
A bot framework for Reddit to manage threads, wiki pages, widgets, menus and more.
Sub Manager Sub Manager is a bot framework for Reddit to automate a variety of tasks on one or more subreddits, and can be configured and run without
coURLan: Clean, filter, normalize, and sample URLs
coURLan: Clean, filter, normalize, and sample URLs Why coURLan? “Given that the bandwidth for conducting crawls is neither infinite nor free, it is be
Crawl BookCorpus
These are scripts to reproduce BookCorpus by yourself.
A reddit.com bot that will return reference links from official python documentation site for the standard library.
Python Docs Bot A reddit.com bot that will return documentation links for the library and language reference sections of the python docs website. The
This is a web crawler that works on employ email data by gmane.org and visualizes it in different ways.
crawler_to_visual_gmane Analyzing an EMAIL Archive from gmane and vizualizing the data using the D3 JavaScript library. This is a set of tools that al
PaperRobot: a paper crawler that can quickly download numerous papers, facilitating paper studying and management
PaperRobot PaperRobot 是一个论文抓取工具,可以快速批量下载大量论文,方便后期进行持续的论文管理与学习。 PaperRobot通过多个接口抓取论文,目前抓取成功率维持在90%以上。通过配置Config文件,可以抓取任意计算机领域相关会议的论文。 Installation Down
An Amazon Product Scraper built using scapy module of python
Amazon Product Scraper This is an Amazon Product Scraper built using scapy module of python Features it scrape various things Product Title Product Im
ProxyBroker is an open source tool that asynchronously finds public proxies from multiple sources and concurrently checks them
ProxyBroker is an open source tool that asynchronously finds public proxies from multiple sources and concurrently checks them. Features F
A dead simple crawler to get books information from Douban.
Introduction A dead simple crawler to get books information from Douban. Pre-requesites Python 3 Install dependencies from requirements.txt (Optional)
A dead simple crawler to get books information from Douban.
Introduction A dead simple crawler to get books information from Douban. Pre-requesites Python 3 Install dependencies from requirements.txt (Optional)
A reddit bot that imitates the popular reddit bot "u/repostsleuthbot" to trick people into clicking on a rickroll
Reddit-Rickroll-Bot A reddit bot that imitates the popular reddit bot "u/repostsleuthbot" to trick people into clicking on a rickroll Made with The Py
Fully automated YouTube Channel. Using Reddit and YouTube API.
Fully Automated YouTube Shorts Channel This code will show you how to setup and fully autmated YouTube Channel. Content is gathered from Reddit using
Stream comments, submissions from subreddits and users across reddit right in your terminal
reddit_from_terminal stream comments, submissions from subreddits and users across reddit right in your terminal Alert! : Can't watch media contents(p
About Solve CTF offline disconnection problem - based on python3's small crawler
About Solve CTF offline disconnection problem - based on python3's small crawler, support keyword search and local map bed establishment, currently support Jianshu, xianzhi,anquanke,freebuf,seebug
A telegram bot that sends a meme a day, from reddit's top meme of the day
MemeBot A telegram bot that sends a meme a day, from reddit's top meme of the day You can use the bot either with an external scheduler (ex: pythonany
Rottentomatoes, Goodreads and IMDB sites crawler. Semantic Web final project.
Crawler Rottentomatoes, Goodreads and IMDB sites crawler. Crawler written by beautifulsoup, selenium and lxml to gather books and films information an
HackerNews and Reddit in one placce
EDIT: this project is 3.5 years old. I found it sad it's just laying around, so I did some minimal fixes and deployed it. Hope you enjoy! (PR's welcom
A python gui program to generate reddit text to speech videos from the id of any post.
Reddit text to speech generator A python gui program to generate reddit text to speech videos from the id of any post. Current functionality Generate
Posts locally saved videos to the desired subreddit
redditvideoposter posts locally saved videos to the desired subreddit ================================================================= STEPS: pip ins
Building house price data pipelines with Apache Beam and Spark on GCP
This project contains the process from building a web crawler to extract the raw data of house price to create ETL pipelines using Google Could Platform services.
Google Maps crawler using Selenium
Google Maps Crawler using Selenium Built as part of the Antifragile Dev Project Selenium crawler that browses Google Maps as a regular user and stores
A small bot to interact with the reddit API. Get top viewers and update the sidebar widget.
LiveStream_Reddit_Bot Get top twitch and facebook stream viewers for a game and update the sidebar widget and old reddit sidebar to show your communit
A relatively simple python program to generate one of those reddit text to speech videos dominating youtube.
Reddit text to speech generator A basic reddit tts video generator Current functionality Generate videos for subs based on comments,(askreddit) so rea
Python script to check if there is any differences in responses of an application when the request comes from a search engine's crawler.
crawlersuseragents This Python script can be used to check if there is any differences in responses of an application when the request comes from a se
A simple reddit scraper to get memes (only images) from r/ProgrammerHumor.
memey A simple reddit scraper to get memes (only images) from r/ProgrammerHumor. Note Only works if you have firefox installed (yet). Instructions foo
A Pixiv web crawler module
Pixiv-spider A Pixiv spider module WARNING It's an unfinished work, browsing the code carefully before using it. Features 0004 - Readme.md updated, co
Reddit bot for r/khiphop
khiphop-bot Description This project is a collection of scripts that better the state of the r/khiphop subreddit, which represents Korean Hip-Hop and
Simple Reddit bot that replies to comments containing a certain word.
reddit-replier-bot Small comment reply bot based on PRAW. This script will scan the comments of a subreddit as they come in and look for a trigger wor
Reddit comment bot emulating Telugu actor N. Bala Krishna.
Balayya-Bot Reddit comment bot emulating Telugu actor N. Bala Krishna. Project structure config.py contains Bot's higher level configuration. generate
Emo-Fun is a bot which emojifies the text you send it
About Emo-Fun is a bot which emojifies the text you send it. It is easier to understand by an example Input : Hey this is to show my working!! Output
A multithreaded tool for searching and downloading images from popular search engines. It is straightforward to set up and run!
🕳️ CygnusX1 Code by Trong-Dat Ngo. Overviews 🕳️ CygnusX1 is a multithreaded tool 🛠️ , used to search and download images from popular search engine
Crawler do site Fundamentus.com com o uso do framework scrapy, tanto da aba detalhada como a de resumo.
Crawler do site Fundamentus.com com o uso do framework scrapy, tanto da aba detalhada como a de resumo. (Todas as infomações)
PASTRIE: A Corpus of Prepositions Annotated with Supersense Tags in Reddit International English
PASTRIE Official release of the corpus described in the paper: Michael Kranzlein, Emma Manning, Siyao Peng, Shira Wein, Aryaman Arora, and Nathan Schn
Reddit bot that uses sentiment analysis
Reddit Bot Project 2: Neural Network Boogaloo Reddit bot that uses sentiment analysis from NLTK.VADER WIP_WIP_WIP_WIP_WIP_WIP Link to test subreddit:
Deep Web Miner Python | Spyder Crawler
Webcrawler written in Python. This crawler does dig in till the 3 level of inside addressed and mine the respective data accordingly
eyes is a Public Opinion Mining System focusing on taiwanese forums such as PTT, Dcard.
eyes is a Public Opinion Mining System focusing on taiwanese forums such as PTT, Dcard. Features 🔥 Article monitor: helps you capture the trend at a
Mini Tool to lovers of debe from eksisozluk (one of the most famous website -reffered as collaborative dictionary like reddit- in Turkey) for pushing debe (Most Liked Entries of Yesterday) to kindle every day via Github Actions.
debe to kindle Mini Tool to lovers of debe from eksisozluk (one of the most famous website -refered as collaborative dictionary like reddit- in Turkey
Shred your reddit comment and post history
trasheddit Shred your reddit comment and post history (x89/Shreddit replacement) Usage Simple Example Download trasheddit: git clone https://github.co
A web crawler script that crawls the target website and lists its links
A web crawler script that crawls the target website and lists its links || A web crawler script that lists links by scanning the target website.
A crawler of doubamovie
豆瓣电影 A crawler of doubamovie 一个小小的入门级scrapy框架的应用,选取豆瓣电影对排行榜前1000的电影数据进行爬取。 spider.py start_requests方法为scrapy的方法,我们对它进行重写。 def start_requests(self):
Crawler job that scrapes comments from social media posts and saves them in a S3 bucket.
Toxicity comments crawler Crawler job that scrapes comments from social media posts and saves them in a S3 bucket. Twitter Tweets and replies are scra
Audio media crawler for lbry.
Audio media crawler for lbry. Requirements Python 3.8 Poetry 1.1.7 Elasticsearch 7.14.0 Lbry-sdk 0.99.0 Development This project uses poetry as a depe
A Python package that scrapes Google News article data while remaining undetected by Google.
A Python package that scrapes Google News article data while remaining undetected by Google. Our scraper can scrape page data up until the last page and never trigger a CAPTCHA (download stats: https://pepy.tech/project/GoogleNewsScraper)
Console application for downloading images from Reddit in Python
RedditImageScraper Console application for downloading images from Reddit in Python Introduction This short Python script was created for the mass-dow
👁️ Tool for Data Extraction and Web Requests.
httpmapper 👁️ Project • Technologies • Installation • How it works • License Project 🚧 For educational purposes. This is a project that I developed,
Crawler in Python 3.7, 3.8. 3.9. Pypy3
Description Python Crawler written Python 3. (Supports major Python releases Python3.6, Python3.7 and Python 3.8) Installation and Use Setup VirtualEn
The core packages of security analyzer web crawler
Security Analyzer 🐍 A large scale web crawler (considered also as vulnerability scanner tool) to take an overview about security of Moroccan sites Cu
Universal Reddit Scraper - A comprehensive Reddit scraping command-line tool written in Python.
Universal Reddit Scraper - A comprehensive Reddit scraping command-line tool written in Python.
A replacement for Reddit /r/copypasta CummyBot2000 with extra measures to avoid it being banned.
CummyBot1984 A replacement for Reddit /r/copypasta's CummyBot2000 with extra measures to respect Reddit's API rules. Features Copies and replies to ev
An utility library to scrape data from TikTok, Instagram, Twitch, Youtube, Twitter or Reddit in one line!
Social Media Scraper An utility library to scrape data from TikTok, Instagram, Twitch, Youtube, Twitter or Reddit in one line! Go to the website » Vie
This app will let you continuously scrape certain parts of LeasePlan and extract data of cars becoming available for lease.
LeasePlan - Scraper This app will let you continuously scrape certain parts of LeasePlan and extract data of cars becoming available for lease. It has
输入Google Hacking语句,自动调用Chrome浏览器爬取结果
Google-Hacking-Crawler 该脚本可输入Google Hacking语句,自动调用Chrome浏览器爬取结果 环境配置 python -m pip install -r requirements.txt 下载Chrome浏览器
Telegram bot to scrape images from the reddit universe
Telegram bot to scrape images from the reddit universe
Create Fast and easy image datasets using reddit
Reddit-Image-Scraper Reddit Reddit is an American Social news aggregation, web content rating, and discussion website. Reddit has been devided by topi
A low-code tool that generates python crawler code based on curl or url
KKBA Intruoduction A low-code tool that generates python crawler code based on curl or url Requirement Python = 3.6 Install pip install kkba Usage Co
Python script to download all images/webms of a 4chan thread
Python3 script to continuously download all images/webms of multiple 4chan thread simultaneously - without installation
A toolkit to automatically crawl the paper list and download paper pdfs of ACL Ahthology.
ACL-Anthology-Crawler A toolkit to automatically crawl the paper list and download paper pdfs of ACL Anthology
Pixiv 爬虫,使用 Python 实现。支持批量下载、上传到图床。
用 Python 实现的 Pixiv 爬虫,支持批量下载和上传。 随机图片 API: https://loliapi.ml/ Deploy Github Action 集成部署 建议使用本方法部署,相较于本地部署,无需搭建环境,全程在线上完成。并且使用国外服务器下载、上传,网络更加通畅。 Fork
a really simple bot that send you memes from reddit to whatsapp
a really simple bot that send you memes from reddit to whatsapp want to use use it? install the dependencies with pip3 install -r requirements.txt the
A Web Scraper built with beautiful soup, that fetches udemy course information. Get udemy course information and convert it to json, csv or xml file
Udemy Scraper A Web Scraper built with beautiful soup, that fetches udemy course information. Installation Virtual Environment Firstly, it is recommen
一款利用Python来自动获取QQ音乐上某个歌手所有歌曲歌词的爬虫软件
QQ音乐歌词爬虫 一款利用Python来自动获取QQ音乐上某个歌手所有歌曲歌词的爬虫软件,默认去除了所有演唱会(Live)版本的歌曲。 使用方法 直接运行python run.py即可,然后输入你想获取的歌手名字,然后静静等待片刻。 output目录下保存生成的歌词和歌名文件。以周杰伦为例,会生成两
mlscraper: Scrape data from HTML pages automatically with Machine Learning
🤖 Scrape data from HTML websites automatically with Machine Learning
This is a cryptocurrency trading bot that analyses Reddit sentiment and places trades on Binance based on reddit post and comment sentiment. If you like this project please consider donating via brave. Thanks.
This is a cryptocurrency trading bot that analyses Reddit sentiment and places trades on Binance based on reddit post and comment sentiment. The bot f
Reddit cli to slack at work
Reddit CLI (v1.0) Introduction Why Reddit CLI? Coworker who sees me looking at something in a browser: "Glad you're not busy; I need you to do this, t
A bot that escrows crypto transactions on Reddit
EscrowBot I NEED BCH TESTNET FOR TESTING. Please send me some BCH testnet if you have some: bchtest:qz5eur3prqyvd8u77m6fzf9z6cruz9q7vq4qvgdnuk Depende
A way to export your saved reddit posts to a Notion table.
reddit-saved-to-notion A way to export your saved reddit posts and comments to a Notion table.Uses notion-sdk-py and praw for interacting with Notion
Code for our ACL 2021 (Findings) Paper - Fingerprinting Fine-tuned Language Models in the wild .
🌳 Fingerprinting Fine-tuned Language Models in the wild This is the code and dataset for our ACL 2021 (Findings) Paper - Fingerprinting Fine-tuned La
discovering subdomains, hidden paths, extracting unique links
python-website-crawler discovering subdomains, hidden paths, extracting unique links pip install -r requirements.txt discover subdomain: You can give
Automatically detect changes made to the official Telegram sites.
🕷 Telegram Web Crawler This project is developed to automatically detect changes made to the official Telegram sites. This is necessary for anticipat
Exports saved posts and comments on Reddit to a csv file.
reddit-saved-to-csv Exports saved posts and comments on Reddit to a csv file. Columns: ID, Name, Subreddit, Type, URL, NoSFW ID: Starts from 1 and inc
Bulk Downloader for Reddit
saveddit is a bulk media downloader for reddit pip3 install saveddit Setting up authorization Register an application with Reddit Write down your clie