163 Python Amazon-Crawler Libraries

Audio media crawler for lbry.

Audio media crawler for lbry. Requirements Python 3.8 Poetry 1.1.7 Elasticsearch 7.14.0 Lbry-sdk 0.99.0 Development This project uses poetry as a depe

4 Dec 3, 2022

A Python package that scrapes Google News article data while remaining undetected by Google.

A Python package that scrapes Google News article data while remaining undetected by Google. Our scraper can scrape page data up until the last page and never trigger a CAPTCHA (download stats: https://pepy.tech/project/GoogleNewsScraper)

6 Aug 10, 2022

Console application for downloading images from Reddit in Python

RedditImageScraper Console application for downloading images from Reddit in Python Introduction This short Python script was created for the mass-dow

0 Jul 4, 2021

👁️ Tool for Data Extraction and Web Requests.

httpmapper 👁️ Project • Technologies • Installation • How it works • License Project 🚧 For educational purposes. This is a project that I developed,

15 Dec 5, 2021

Crawler in Python 3.7, 3.8. 3.9. Pypy3

Description Python Crawler written Python 3. (Supports major Python releases Python3.6, Python3.7 and Python 3.8) Installation and Use Setup VirtualEn

2 Mar 12, 2022

The core packages of security analyzer web crawler

Security Analyzer 🐍 A large scale web crawler (considered also as vulnerability scanner tool) to take an overview about security of Moroccan sites Cu

10 Jul 3, 2022

This app will let you continuously scrape certain parts of LeasePlan and extract data of cars becoming available for lease.

LeasePlan - Scraper This app will let you continuously scrape certain parts of LeasePlan and extract data of cars becoming available for lease. It has

4 Nov 18, 2022

输入Google Hacking语句，自动调用Chrome浏览器爬取结果

Google-Hacking-Crawler 该脚本可输入Google Hacking语句，自动调用Chrome浏览器爬取结果环境配置 python -m pip install -r requirements.txt 下载Chrome浏览器

4 Jun 21, 2022

user friendly python script who is able to catch fish in the game New World

new-world-fishing-bot release 1.1.1 click img for demonstration Download guide Click at latest release: Download and extract bot.zip: When you run fil

297 Jan 8, 2023

A low-code tool that generates python crawler code based on curl or url

KKBA Intruoduction A low-code tool that generates python crawler code based on curl or url Requirement Python = 3.6 Install pip install kkba Usage Co

8 Sep 20, 2021

Project repository of Apache Airflow, deployed on Docker in Amazon EC2 via GitLab.

Airflow on Docker in EC2 + GitLab's CI/CD Personal project for simple data pipeline using Airflow. Airflow will be installed inside Docker container,

13 Nov 29, 2022

S3-plugin is a high performance PyTorch dataset library to efficiently access datasets stored in S3 buckets.

138 Jan 3, 2023

Django email backends and webhooks for Amazon SES, Mailgun, Mailjet, Postmark, SendGrid, Sendinblue, SparkPost and more

1.4k Jan 1, 2023

A Django email backend for Amazon's Simple Email Service

Django-SES Info: A Django email backend for Amazon's Simple Email Service Author: Harry Marr (http://github.com/hmarr, http://twitter.com/harrymarr) C

882 Dec 29, 2022

A django package which act as a gateway to send and receive email with amazon SES.

django-email-gateway: Introduction: A Simple Django app to easily send emails, receive inbound emails from users with different email vendors like AWS

28 Nov 9, 2022

Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.

Learning Opinion Summarizers by Selecting Informative Reviews This repository contains the codebase and the dataset for the corresponding EMNLP 2021

39 Jan 1, 2023

Python script to download all images/webms of a 4chan thread

Python3 script to continuously download all images/webms of multiple 4chan thread simultaneously - without installation

208 Jan 4, 2023

Patch PL to disable LK verification. Patch LK to disable boot/recovery verification.

Simple Python(3) script to disable LK verification in Amazon Preloader images and boot/recovery image verification in Amazon LK ("Little Kernel") images.

18 Mar 17, 2022

A toolkit to automatically crawl the paper list and download paper pdfs of ACL Ahthology.

ACL-Anthology-Crawler A toolkit to automatically crawl the paper list and download paper pdfs of ACL Anthology

9 Oct 9, 2022

Pixiv 爬虫，使用 Python 实现。支持批量下载、上传到图床。

用 Python 实现的 Pixiv 爬虫，支持批量下载和上传。随机图片 API: https://loliapi.ml/ Deploy Github Action 集成部署建议使用本方法部署，相较于本地部署，无需搭建环境，全程在线上完成。并且使用国外服务器下载、上传，网络更加通畅。 Fork

18 Feb 26, 2022

Automated endpoint management for Amazon Aurora Global Database

This sample code can be used to manage Aurora global database endpoints. After failover the global database writer endpoints swap from one region to the other. This solution automates creation and management of Route 53 based endpoints, so the applications don't have to change the connections strings.

13 Dec 8, 2022

Amazon Scraper: A command-line tool for scraping Amazon product data

Amazon Product Scraper: 2021 Description A command-line tool for scraping Amazon product data to CSV or JSON format(s). Requirements Python 3 pip3 Ins

49 Nov 15, 2021

A Web Scraper built with beautiful soup, that fetches udemy course information. Get udemy course information and convert it to json, csv or xml file

Udemy Scraper A Web Scraper built with beautiful soup, that fetches udemy course information. Installation Virtual Environment Firstly, it is recommen

15 May 17, 2022

HTTP Calls to Amazon Web Services Rest API for IoT Core Shadow Actions 💻🌐💡

aws-iot-shadow-rest-api HTTP Calls to Amazon Web Services Rest API for IoT Core Shadow Actions 💻 🌐 💡 This simple script implements the following aw

3 Jun 6, 2022

一款利用Python来自动获取QQ音乐上某个歌手所有歌曲歌词的爬虫软件

QQ音乐歌词爬虫一款利用Python来自动获取QQ音乐上某个歌手所有歌曲歌词的爬虫软件，默认去除了所有演唱会（Live）版本的歌曲。使用方法直接运行python run.py即可，然后输入你想获取的歌手名字，然后静静等待片刻。 output目录下保存生成的歌词和歌名文件。以周杰伦为例，会生成两

11 Jul 27, 2022

Amazon Forest Computer Vision: Satellite Image tagging code using PyTorch / Keras with lots of PyTorch tricks

Amazon Forest Computer Vision Satellite Image tagging code using PyTorch / Keras Here is a sample of images we had to work with Source: https://www.ka

360 Dec 10, 2022

Integrating Amazon API Gateway private endpoints with on-premises networks

Integrating Amazon API Gateway private endpoints with on-premises networks Read the blog about this application: Integrating Amazon API Gateway privat

12 Sep 9, 2022

mlscraper: Scrape data from HTML pages automatically with Machine Learning

🤖 Scrape data from HTML websites automatically with Machine Learning

798 Dec 29, 2022

discovering subdomains, hidden paths, extracting unique links

python-website-crawler discovering subdomains, hidden paths, extracting unique links pip install -r requirements.txt discover subdomain: You can give

4 Sep 5, 2022

Universal Command Line Interface for Amazon Web Services

This package provides a unified command line interface to Amazon Web Services.

13.3k Jan 7, 2023

Automatically detect changes made to the official Telegram sites.

🕷 Telegram Web Crawler This project is developed to automatically detect changes made to the official Telegram sites. This is necessary for anticipat

115 Dec 31, 2022

Backend, modern REST API for obtaining match and odds data crawled from multiple sites. Using FastAPI, MongoDB as database, Motor as async MongoDB client, Scrapy as crawler and Docker.

Introduction Apiestas is a project composed of a backend powered by the awesome framework FastAPI and a crawler powered by Scrapy. This project has fo

54 Dec 13, 2022

爬虫案例合集。包括但不限于《淘宝、京东、天猫、豆瓣、抖音、快手、微博、微信、阿里、头条、pdd、优酷、爱奇艺、携程、12306、58、搜狐、百度指数、维普万方、Zlibraty、Oalib、小说、招标网、采购网、小红书》

lxSpider 爬虫案例合集。包括但不限于《淘宝、京东、天猫、豆瓣、抖音、快手、微博、微信、阿里、头条、pdd、优酷、爱奇艺、携程、12306、58、搜狐、百度指数、维普万方、Zlibraty、Oalib、小说网站、招标采购网》简介：时光荏苒，记不清写了多少案例了。

793 Jan 5, 2023

Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

AWS Data Wrangler Pandas on AWS Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretMana

3.3k Dec 31, 2022

A pythonic interface to Amazon's DynamoDB

PynamoDB A Pythonic interface for Amazon's DynamoDB. DynamoDB is a great NoSQL service provided by Amazon, but the API is verbose. PynamoDB presents y

2.1k Dec 30, 2022

Amazon S3 Transfer Manager for Python

s3transfer - An Amazon S3 Transfer Manager for Python S3transfer is a Python library for managing Amazon S3 transfers. Note This project is not curren

158 Jan 7, 2023

Universal Command Line Interface for Amazon Web Services

aws-cli This package provides a unified command line interface to Amazon Web Services. Jump to: Getting Started Getting Help More Resources Getting St

13.3k Jan 1, 2023

Seamlessly serve your static assets of your Flask app from Amazon S3

flask-s3 Seamlessly serve the static assets of your Flask app from Amazon S3. Maintainers Flask-S3 is maintained by @e-dard, @eriktaubeneck and @SunDw

188 Aug 24, 2022

Web crawling framework based on asyncio.

Web crawling framework for everyone. Written with asyncio, uvloop and aiohttp. Requirements Python3.5+ Installation pip install gain pip install uvloo

2k Jan 5, 2023

Incredibly fast crawler designed for OSINT.

Photon Incredibly fast crawler designed for OSINT. Photon Wiki • How To Use • Compatibility • Photon Library • Contribution • Roadmap Key Features Dat

9.3k Jan 2, 2023

Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments)

trafilatura: Web scraping tool for text discovery and retrieval Description Trafilatura is a Python package and command-line tool which seamlessly dow

704 Jan 6, 2023

Async Python 3.6+ web scraping micro-framework based on asyncio

Ruia 🕸️ Async Python 3.6+ web scraping micro-framework based on asyncio. ⚡ Write less, run faster. Overview Ruia is an async web scraping micro-frame

1.6k Jan 1, 2023

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

AutoScraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python This project is made for automatic web scraping to make scraping easy. It

4.8k Jan 4, 2023

Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js

Gerapy Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Scrapyd-Client, Scrapyd-API, Django and Vue.js. Documentation Documentation

2.9k Jan 3, 2023

News, full-text, and article metadata extraction in Python 3. Advanced docs:

Newspaper3k: Article scraping & curation Inspired by requests for its simplicity and powered by lxml for its speed: "Newspaper is an amazing python li

12.3k Jan 7, 2023

Scrapy, a fast high-level web crawling & scraping framework for Python.

Scrapy Overview Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pag

45.5k Jan 7, 2023

scrapes medias, likes, followers, tags and all metadata. Inspired by instagram-php-scraper,bot

instagram_scraper This is a minimalistic Instagram scraper written in Python. It can fetch media, accounts, videos, comments etc. `Comment` and `Like`

2.5k Nov 16, 2022

Publicly Open Amazon AWS S3 Bucket Viewer

S3Viewer Publicly open storage viewer (Amazon S3 Bucket, Azure Blob, FTP server, HTTP Index Of/) s3viewer is a free tool for security researchers that

377 Dec 2, 2022

The algorithm performs a simple user registration (Name, CPF, E-mail and Telephone) in an Amazon RDS database and also performs the storage, training and facial recognition of the user's face to identify the users already registered in the system in a next time the user is seen.

Registration form with RDS AWS database and facial recognition via OpenCV The algorithm performs a simple user registration (Name, CPF, E-mail and Tel

59 Nov 27, 2022

Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

AWS Data Wrangler Pandas on AWS Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretMana

3.3k Jan 4, 2023

Python Amazon-Crawler Resources

Python Amazon-Crawler Libraries

Audio media crawler for lbry.

A Python package that scrapes Google News article data while remaining undetected by Google.

Console application for downloading images from Reddit in Python

👁️ Tool for Data Extraction and Web Requests.

Crawler in Python 3.7, 3.8. 3.9. Pypy3

The core packages of security analyzer web crawler

This app will let you continuously scrape certain parts of LeasePlan and extract data of cars becoming available for lease.

输入Google Hacking语句，自动调用Chrome浏览器爬取结果

user friendly python script who is able to catch fish in the game New World

A low-code tool that generates python crawler code based on curl or url

Project repository of Apache Airflow, deployed on Docker in Amazon EC2 via GitLab.

S3-plugin is a high performance PyTorch dataset library to efficiently access datasets stored in S3 buckets.

Django email backends and webhooks for Amazon SES, Mailgun, Mailjet, Postmark, SendGrid, Sendinblue, SparkPost and more

A Django email backend for Amazon's Simple Email Service

A django package which act as a gateway to send and receive email with amazon SES.

Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.

Python script to download all images/webms of a 4chan thread

Patch PL to disable LK verification. Patch LK to disable boot/recovery verification.

A toolkit to automatically crawl the paper list and download paper pdfs of ACL Ahthology.

Pixiv 爬虫，使用 Python 实现。支持批量下载、上传到图床。

Automated endpoint management for Amazon Aurora Global Database

Amazon Scraper: A command-line tool for scraping Amazon product data

A Web Scraper built with beautiful soup, that fetches udemy course information. Get udemy course information and convert it to json, csv or xml file

HTTP Calls to Amazon Web Services Rest API for IoT Core Shadow Actions 💻🌐💡

一款利用Python来自动获取QQ音乐上某个歌手所有歌曲歌词的爬虫软件

Amazon Forest Computer Vision: Satellite Image tagging code using PyTorch / Keras with lots of PyTorch tricks

Integrating Amazon API Gateway private endpoints with on-premises networks

mlscraper: Scrape data from HTML pages automatically with Machine Learning

discovering subdomains, hidden paths, extracting unique links

Universal Command Line Interface for Amazon Web Services

Automatically detect changes made to the official Telegram sites.

Backend, modern REST API for obtaining match and odds data crawled from multiple sites. Using FastAPI, MongoDB as database, Motor as async MongoDB client, Scrapy as crawler and Docker.

爬虫案例合集。包括但不限于《淘宝、京东、天猫、豆瓣、抖音、快手、微博、微信、阿里、头条、pdd、优酷、爱奇艺、携程、12306、58、搜狐、百度指数、维普万方、Zlibraty、Oalib、小说、招标网、采购网、小红书》

Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

A pythonic interface to Amazon's DynamoDB

Amazon S3 Transfer Manager for Python

Universal Command Line Interface for Amazon Web Services

Seamlessly serve your static assets of your Flask app from Amazon S3

Web crawling framework based on asyncio.

Incredibly fast crawler designed for OSINT.

Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments)

Async Python 3.6+ web scraping micro-framework based on asyncio

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js

News, full-text, and article metadata extraction in Python 3. Advanced docs:

Scrapy, a fast high-level web crawling & scraping framework for Python.

scrapes medias, likes, followers, tags and all metadata. Inspired by instagram-php-scraper,bot

Publicly Open Amazon AWS S3 Bucket Viewer

The algorithm performs a simple user registration (Name, CPF, E-mail and Telephone) in an Amazon RDS database and also performs the storage, training and facial recognition of the user's face to identify the users already registered in the system in a next time the user is seen.

Discover hidden deepweb pages

Python wrapper to access the amazon selling partner API

Python wrapper for Wikipedia

A simple library for interacting with Amazon S3.

The unofficial Amazon search CLI & Python API

A simple Python wrapper for the Amazon.com Product Advertising API ⛺

Every web site provides APIs.

News, full-text, and article metadata extraction in Python 3. Advanced docs:

A pythonic interface to Amazon's DynamoDB

A Powerful Spider(Web Crawler) System in Python.

A light-weight, modular, message representation and mail delivery framework for Python.

Official s3cmd repo -- Command line tool for managing Amazon S3 and CloudFront services

Run MapReduce jobs on Hadoop or Amazon Web Services

Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

Related tags