224 Repositories
Python amazon-scraping Libraries
Async Python 3.6+ web scraping micro-framework based on asyncio
Ruia 🕸️ Async Python 3.6+ web scraping micro-framework based on asyncio. ⚡ Write less, run faster. Overview Ruia is an async web scraping micro-frame
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
AutoScraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python This project is made for automatic web scraping to make scraping easy. It
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Parsel Parsel is a BSD-licensed Python library to extract and remove data from HTML and XML using XPath and CSS selectors, optionally combined with re
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
Pattern Pattern is a web mining module for Python. It has tools for: Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM par
Scrapy, a fast high-level web crawling & scraping framework for Python.
Scrapy Overview Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pag
Publicly Open Amazon AWS S3 Bucket Viewer
S3Viewer Publicly open storage viewer (Amazon S3 Bucket, Azure Blob, FTP server, HTTP Index Of/) s3viewer is a free tool for security researchers that
Python binding to Modest engine (fast HTML5 parser with CSS selectors).
A fast HTML5 parser with CSS selectors using Modest engine. Installation From PyPI using pip: pip install selectolax Development version from github:
Pythonic HTML Parsing for Humans™
Requests-HTML: HTML Parsing for Humans™ This library intends to make parsing HTML (e.g. scraping the web) as simple and intuitive as possible. When us
The algorithm performs a simple user registration (Name, CPF, E-mail and Telephone) in an Amazon RDS database and also performs the storage, training and facial recognition of the user's face to identify the users already registered in the system in a next time the user is seen.
Registration form with RDS AWS database and facial recognition via OpenCV The algorithm performs a simple user registration (Name, CPF, E-mail and Tel
Python wrapper to access the amazon selling partner API
PYTHON-AMAZON-SP-API Amazon Selling-Partner API If you have questions, please join on slack Contributions very welcome! Installation pip install pytho
Download song lyrics and metadata from Genius.com 🎶🎤
LyricsGenius: a Python client for the Genius.com API lyricsgenius provides a simple interface to the song, artist, and lyrics data stored on Genius.co
A simple library for interacting with Amazon S3.
BucketStore is a very simple Amazon S3 client, written in Python. It aims to be much more straight-forward to use than boto3, and specializes only in
The unofficial Amazon search CLI & Python API
amzSear The unofficial Amazon Product CLI & API. Easily search the amazon product directory from the command line without the need for an Amazon API k
A simple Python wrapper for the Amazon.com Product Advertising API ⛺
Amazon Simple Product API A simple Python wrapper for the Amazon.com Product Advertising API. Features An object oriented interface to Amazon products
Pythonic HTML Parsing for Humans™
Requests-HTML: HTML Parsing for Humans™ This library intends to make parsing HTML (e.g. scraping the web) as simple and intuitive as possible. When us
A pythonic interface to Amazon's DynamoDB
PynamoDB A Pythonic interface for Amazon's DynamoDB. DynamoDB is a great NoSQL service provided by Amazon, but the API is verbose. PynamoDB presents y
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
Pattern Pattern is a web mining module for Python. It has tools for: Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM par
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
Pattern Pattern is a web mining module for Python. It has tools for: Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM par
Visual scraping for Scrapy
Portia Portia is a tool that allows you to visually scrape websites without any programming knowledge required. With Portia you can annotate a web pag
Web Scraping Framework
Grab Framework Documentation Installation $ pip install -U grab See details about installing Grab on different platforms here http://docs.grablib.
A light-weight, modular, message representation and mail delivery framework for Python.
Marrow Mailer A highly efficient and modular mail delivery framework for Python 2.6+ and 3.2+, formerly called TurboMail. © 2006-2019, Alice Bevan-McG
Official s3cmd repo -- Command line tool for managing Amazon S3 and CloudFront services
S3cmd tool for Amazon Simple Storage Service (S3) Author: Michal Ludvig, [email protected] Project homepage (c) TGRMN Software and contributors S3tools
Run MapReduce jobs on Hadoop or Amazon Web Services
mrjob: the Python MapReduce library mrjob is a Python 2.7/3.4+ package that helps you write and run Hadoop Streaming jobs. Stable version (v0.7.4) doc
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
AWS Data Wrangler Pandas on AWS Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretMana