10485 Repositories
Python python-utility Libraries
Library to scrape and clean web pages to create massive datasets.
lazynlp A straightforward library that allows you to crawl, clean up, and deduplicate webpages to create massive monolingual datasets. Using this libr
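To illustrate the deduplication step mentioned above, here is a minimal sketch that drops exact duplicates by hashing normalized text. It uses only the standard library and is not lazynlp's own method or API; the names `normalize` and `deduplicate` are hypothetical.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting
    differences do not produce different hashes."""
    return re.sub(r"\s+", " ", text).strip().lower()


def deduplicate(pages):
    """Return pages with duplicates (after normalization) removed,
    keeping the first occurrence of each."""
    seen = set()
    unique = []
    for page in pages:
        digest = hashlib.sha256(normalize(page).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(page)
    return unique


# Hypothetical sample data: the first two pages differ only in case and spacing.
pages = ["Hello  World", "hello world", "Another page"]
unique_pages = deduplicate(pages)
print(unique_pages)
```

Hashing only catches exact matches after normalization; near-duplicate detection would need a different technique, such as comparing n-gram overlap.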
Web crawling framework based on asyncio.
Web crawling framework for everyone. Written with asyncio, uvloop and aiohttp. Requirements Python3.5+ Installation pip install gain pip install uvloo
A python module to parse the Open Graph Protocol
OpenGraph is a Python module for parsing the Open Graph Protocol. You can read more about the specification at http://ogp.me/ Installation $ pip in
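As a rough illustration of what parsing the Open Graph Protocol involves, the sketch below collects `og:` properties from `<meta>` tags using only the standard library. It does not use the OpenGraph module's API; `OpenGraphParser`, `parse_open_graph`, and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        # Keep only Open Graph properties and strip the "og:" prefix.
        if prop.startswith("og:") and "content" in attr_map:
            self.properties[prop[3:]] = attr_map["content"]


def parse_open_graph(html):
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


# Hypothetical sample page.
SAMPLE = """<html><head>
<meta property="og:title" content="The Rock" />
<meta property="og:type" content="video.movie" />
<meta name="description" content="not Open Graph" />
</head></html>"""

og = parse_open_graph(SAMPLE)
print(og)
```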
Incredibly fast crawler designed for OSINT.
Photon Incredibly fast crawler designed for OSINT. Photon Wiki • How To Use • Compatibility • Photon Library • Contribution • Roadmap Key Features Dat
A pure-python HTML screen-scraping library
Scrapely Scrapely is a library for extracting structured data from HTML pages. Given some example web pages and the data to be extracted, scrapely con
Html Content / Article Extractor, web scraping lib in Python
Python-Goose - Article Extractor Intro Goose was originally an article extractor written in Java that has most recently (Aug2011) been converted to a
Transistor, a Python web scraping framework for intelligent use cases.
Web data collection and storage for intelligent use cases. transistor About The web is full of data. Transistor is a web scraping framework for collec
Web Content Retrieval for Humans™
Lassie Lassie is a Python library for retrieving basic content from websites. Usage import lassie lassie.fetch('http://www.youtube.com/watch?v
Async Python 3.6+ web scraping micro-framework based on asyncio
Ruia 🕸️ Async Python 3.6+ web scraping micro-framework based on asyncio. ⚡ Write less, run faster. Overview Ruia is an async web scraping micro-frame
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
AutoScraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python This project is made for automatic web scraping to make scraping easy. It
A Python module to bypass Cloudflare's anti-bot page.
cloudscraper A simple Python module to bypass Cloudflare's anti-bot page (also known as "I'm Under Attack Mode", or IUAM), implemented with Requests.
a small library for extracting rich content from urls
A small library for extracting rich content from urls. what does it do? micawber supplies a few methods for retrieving rich metadata about a variety o
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Parsel Parsel is a BSD-licensed Python library to extract and remove data from HTML and XML using XPath and CSS selectors, optionally combined with re
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
Pattern Pattern is a web mining module for Python. It has tools for: Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM par
A modern CSS selector implementation for BeautifulSoup
Soup Sieve Overview Soup Sieve is a CSS selector library designed to be used with Beautiful Soup 4. It aims to provide selecting, matching, and filter
News, full-text, and article metadata extraction in Python 3. Advanced docs:
Newspaper3k: Article scraping & curation Inspired by requests for its simplicity and powered by lxml for its speed: "Newspaper is an amazing python li
Parse feeds in Python
feedparser - Parse Atom and RSS feeds in Python. Copyright 2010-2020 Kurt McKee Copyright 2002-2008 Mark Pilgrim feedparser
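For a sense of what feed parsing produces, this minimal sketch reads a small RSS 2.0 document with the standard library's `xml.etree.ElementTree`. It is not feedparser's API and handles none of the malformed-feed cases a dedicated library covers; `parse_rss` and the sample feed are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed RSS 2.0 sample.
RSS = """<?xml version="1.0"?>
<rss version="2.0"><channel>
<title>Example feed</title>
<item><title>First</title><link>http://example.com/1</link></item>
<item><title>Second</title><link>http://example.com/2</link></item>
</channel></rss>"""


def parse_rss(xml_text):
    """Return the channel title and a list of entry dicts."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    return {
        "title": channel.findtext("title"),
        "entries": [
            {"title": item.findtext("title"), "link": item.findtext("link")}
            for item in channel.findall("item")
        ],
    }


feed = parse_rss(RSS)
print(feed["title"], len(feed["entries"]))
```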
Scrapy, a fast high-level web crawling & scraping framework for Python.
Scrapy Overview Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pag
Read/sync your IMAP mailboxes (python2)
Upstream status (master branch): Upstream status (next branch): Financial contributors: Links: Official github code repository: offlineimap Website: w
A Python library to access Instagram's private API.
Instagram Private API A Python wrapper for the Instagram private API with no 3rd party dependencies. Supports both the app and web APIs. Overview I wr
Facebook open graph api implementation using the Django web framework in python
Django Facebook by Thierry Schellenbach (mellowmorning.com) Status Django and Facebook are both rapidly changing at the moment. Meanwhile, I'm caught
Python Client for Instagram API
This project is not actively maintained. Proceed at your own risk! python-instagram A Python 2/3 client for the Instagram REST and Search APIs Install
Script for downloading Coursera.org videos and naming them.
Coursera Downloader Coursera Downloader Introduction Features Disclaimer Installation instructions Recommended installation method for all Operating S
Python Script to download hundreds of images from 'Google Images'. It is a ready-to-run code!
Google Images Download Python Script for 'searching' and 'downloading' hundreds of Google images to the local hard disk! Documentation Documentation H
Utilities for Google Media Downloads and Resumable Uploads
google-resumable-media Utilities for Google Media Downloads and Resumable Uploads See the docs for examples and usage. Experimental asyncio Support Wh
Soundcloud Music Downloader
Soundcloud Music Downloader Description This script is able to download music from SoundCloud and set id3tag to the downloaded music. Compatible with
🔌 Generating short urls with python has never been easier
pyshorteners A simple URL shortening API wrapper Python library. Installing pip install pyshorteners Documentation https://pyshorteners.readthedocs.i
IMDbPY is a Python package useful to retrieve and manage the data of the IMDb movie database about movies, people, characters and companies
IMDbPY is a Python package for retrieving and managing the data of the IMDb movie database about movies, people and companies. Revamp notice Starting
Scrapes media, likes, followers, tags and all metadata. Inspired by instagram-php-scraper, bot
instagram_scraper This is a minimalistic Instagram scraper written in Python. It can fetch media, accounts, videos, comments etc. `Comment` and `Like`
Unofficial Python API client for Notion.so
notion-py Unofficial Python 3 client for Notion.so API v3. Object-oriented interface (mapping database tables to Python classes/attributes) Automatic
An unofficial client library for Google Music.
gmusicapi: an unofficial API for Google Play Music gmusicapi allows control of Google Music with Python. from gmusicapi import Mobileclient api = Mob
Scrape the Twitter Frontend API without authentication.
Twitter Scraper 🇰🇷 Read Korean Version Twitter's API is annoying to work with, and has lots of limitations — luckily their frontend (JavaScript) has
Scrapes an instagram user's photos and videos
Instagram Scraper instagram-scraper is a command-line application written in Python that scrapes and downloads an instagram user's photos and videos.
🔒 Python 2.7/3.X client for HashiCorp Vault
hvac HashiCorp Vault API client for Python 3.x Tested against the latest release, HEAD ref, and 3 previous minor versions (counting back from the late
WeChat SDK for Python
A very simple Salesforce.com REST API client for Python
Simple Salesforce Simple Salesforce is a basic Salesforce.com REST API client built for Python 3.5, 3.6, 3.7 and 3.8. The goal is to provide a very lo
Python Twitter API
Python Twitter Tools The Minimalist Twitter API for Python is a Python API for Twitter, everyone's favorite Web 2.0 Facebook-style status updater for
(unofficial) Googletrans: Free and Unlimited Google translate API for Python. Translates totally free of charge.
Googletrans Googletrans is a free and unlimited Python library that implements the Google Translate API. This uses the Google Translate Ajax API to make
Python client library for Google Maps API Web Services
Python Client for Google Maps Services Description Use Python? Want to geocode something? Looking for directions? Maybe matrices of directions? This l
Slack Developer Kit for Python
Python Slack SDK The Slack platform offers several APIs to build apps. Each Slack API delivers part of the capabilities from the platform, so that you
A Python wrapper around the Twitter API.
Python Twitter A Python wrapper around the Twitter API. By the Python-Twitter Developers Introduction This library provides a pure Python interface fo
📷 Instagram Bot - Tool for automated Instagram interactions
InstaPy Tooling that automates your social media interactions to “farm” Likes, Comments, and Followers on Instagram Implemented in Python using the Se
A Python module for communicating with the Twilio API and generating TwiML.
twilio-python The default branch name for this repository has been changed to main as of 07/27/2020. Documentation The documentation for the Twilio AP
Python Telegram bot api.
pyTelegramBotAPI A simple, but extensible Python implementation for the Telegram Bot API. Getting started. Writing your first bot Prerequisites A simp
TuShare is a utility for crawling historical data of China stocks
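TuShare Tushare Pro has been released; please visit the new official website to learn about and look up the data interfaces! https://tushare.pro TuShare is a tool that covers the whole process, from data collection through cleaning and processing to data storage, for financial data such as stocks and futures. It meets the data-acquisition needs of quantitative financial analysts and people learning data analysis. Its features are broad data coverage, interfaces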
A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube Videos.
24 July 2020 Actively soliciting contributors! Ping @ronncc if you would like to help out! pytube pytube is a very serious, lightweight, dependency-fr
Pure Python 3 MTProto API Telegram client library, for bots too!
Telethon ⭐️ Thanks everyone who has starred the project, it means a lot! Telethon is an asyncio Python 3 MTProto library to interact with Telegram's A
PRAW, an acronym for "Python Reddit API Wrapper", is a python package that allows for simple access to Reddit's API.
PRAW: The Python Reddit API Wrapper PRAW, an acronym for "Python Reddit API Wrapper", is a Python package that allows for simple access to Reddit's AP
Typed interactions with the GitHub API v3
PyGitHub PyGitHub is a Python library to access the GitHub API v3 and Github Enterprise API v3. This library enables you to manage GitHub resources su
GitPython is a python library used to interact with Git repositories.
Gitoxide: A peek into the future… I started working on GitPython in 2009, back in the days when Python was 'my thing' and I had great plans with it. O
🐍 The official Python client library for Google's discovery based APIs.
Google API Client This is the Python client library for Google's discovery based APIs. To get started, please see the docs folder. These client librar
🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.
Best-of Machine Learning with Python 🏆 A ranked list of awesome machine learning Python libraries. Updated weekly. This curated list contains 840 awe
Provides syntax for Python-Markdown which allows for the inclusion of the contents of other Markdown documents.
Markdown-Include This is an extension to Python-Markdown which provides an "include" function, similar to that found in LaTeX (and also the C pre-proc
A fast, extensible and spec-compliant Markdown parser in pure Python.
mistletoe mistletoe is a Markdown parser in pure Python, designed to be fast, spec-compliant and fully customizable. Apart from being the fastest Comm
Preview GitHub README.md files locally before committing them.
Grip -- GitHub Readme Instant Preview Render local readme files before sending off to GitHub. Grip is a command-line server application written in Pyt
Markdown parser, done right. 100% CommonMark support, extensions, syntax plugins & high speed. Now in Python!
markdown-it-py Markdown parser done right. Follows the CommonMark spec for baseline parsing Configurable syntax: you can add new rules and even replac
Awesome Django Markdown Editor, supported for Bootstrap & Semantic-UI
martor Martor is a Markdown Editor plugin for Django, supported for Bootstrap & Semantic-UI. Features Live Preview Integrated with Ace Editor Supporte
Comprehensive Markdown plugin built for Django
Django MarkdownX Django MarkdownX is a comprehensive Markdown plugin built for Django, the renowned high-level Python web framework, with flexibility,
Convert HTML to Markdown-formatted text.
html2text html2text is a Python script that converts a page of HTML into clean, easy-to-read plain ASCII text. Better yet, that ASCII also happens to
markdown2: A fast and complete implementation of Markdown in Python
Markdown is a light text markup format and a processor to convert that to HTML. The originator describes it as follows: Markdown is a text-to-HTML con
Extensions for Python Markdown
PyMdown Extensions Extensions for Python Markdown. Documentation Extension documentation is found here: https://facelessuser.github.io/pymdown-extensi
Static site generator that supports Markdown and reST syntax. Powered by Python.
Pelican Pelican is a static site generator, written in Python. Write content in reStructuredText or Markdown using your editor of choice Includes a si
A Python implementation of John Gruber’s Markdown with Extension support.
Python-Markdown This is a Python implementation of John Gruber's Markdown. It is almost completely compliant with the reference implementation, though
A test fixtures replacement for Python
factory_boy factory_boy is a fixtures replacement based on thoughtbot's factory_bot. As a fixtures replacement tool, it aims to replace static, hard t
Automated Security Testing For REST APIs
Astra REST API penetration testing is complex due to continuous changes in existing APIs and newly added APIs. Astra can be used by security engineers
Load and performance benchmark tool
Yandex Tank Yandextank has been moved to Python 3. Latest stable release for Python 2 here. Yandex.Tank is an extensible open source load testing tool
Integration layer between Requests and Selenium for automation of web actions.
Requestium is a Python library that merges the power of Requests, Selenium, and Parsel into a single integrated tool for automating web actions. The
Python Rest Testing
pyresttest Table of Contents What Is It? Status Installation Sample Test Examples Installation How Do I Use It? Running A Simple Test Using JSON Valid
Parameterized testing with any Python test framework
Parameterized testing with any Python test framework Parameterized testing in Python sucks. parameterized fixes that. For everything. Parameterized te
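For readers unfamiliar with the idea, parameterized testing runs one test body against several input/expected pairs. The sketch below shows the concept with the standard library's `unittest.subTest`; it does not use the parameterized package's API, and the case data and `PowTest` class are hypothetical.

```python
import unittest

# Hypothetical cases: (base, exponent, expected result).
CASES = [(2, 3, 8), (3, 2, 9), (10, 0, 1)]


class PowTest(unittest.TestCase):
    def test_pow(self):
        for base, exponent, expected in CASES:
            # Each case is reported separately if it fails.
            with self.subTest(base=base, exponent=exponent):
                self.assertEqual(pow(base, exponent), expected)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(PowTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

With `subTest`, all cases still live inside a single test method; libraries that generate one test per case give finer-grained reporting and selection.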
A Modular Penetration Testing Framework
fsociety A Modular Penetration Testing Framework Install pip install fsociety Update pip install --upgrade fsociety Usage usage: fsociety [-h] [-i] [-
Mockoon is the easiest and quickest way to run mock APIs locally. No remote deployment, no account required, open source.
Mockoon Mockoon is the easiest and quickest way to run mock APIs locally. No remote deployment, no account required, open source. It has been built wi
a socket mock framework - for all kinds of socket animals, web-clients included
mocket /mɔˈkɛt/ A socket mock framework for all kinds of socket animals, web-clients included - with gevent/asyncio/SSL support ...and then MicroPytho
User-oriented Web UI browser tests in Python
Selene - User-oriented Web UI browser tests in Python (Selenide port) Main features: User-oriented API for Selenium Webdriver (code like speak common
Declarative HTTP Testing for Python and anything else
Gabbi Release Notes Gabbi is a tool for running HTTP tests where requests and responses are represented in a declarative YAML-based form. The simplest
Selenium-python but lighter: Helium is the best Python library for web automation.
Selenium-python but lighter: Helium Selenium-python is great for web automation. Helium makes it easier to use. For example: Under the hood, Helium fo
A mocking library for requests
httmock A mocking library for requests for Python 2.7 and 3.4+. Installation pip install httmock Or, if you are a Gentoo user: emerge dev-python/httm
Aioresponses is a helper to mock/fake web requests in the Python aiohttp package.
aioresponses Aioresponses is a helper to mock/fake web requests in the Python aiohttp package. For the requests module there are a lot of packages that help u
Mixer is a fixtures replacement. Supports Django, Flask, SQLAlchemy and custom Python objects.
The Mixer is a helper to generate instances of Django or SQLAlchemy models. It's useful for testing and fixture replacement. Fast and convenient test-
A command-line tool and Python library and Pytest plugin for automated testing of RESTful APIs, with a simple, concise and flexible YAML-based syntax
1.0 Release See here for details about breaking changes with the upcoming 1.0 release: https://github.com/taverntesting/tavern/issues/495 Easier API t
✅ Python web automation and testing. 🚀 Fast, easy, reliable. 💠
Build fast, reliable, end-to-end tests. SeleniumBase is a Python framework for web automation, end-to-end testing, and more. Tests are run with "pytes
A set of pytest fixtures to test Flask applications
pytest-flask An extension of pytest test runner which provides a set of useful tools to simplify testing and development of the Flask extensions and a
Web testing library for Robot Framework
SeleniumLibrary Contents Introduction Keyword Documentation Installation Browser drivers Usage Extending SeleniumLibrary Community Versions History In
A Django plugin for pytest.
Welcome to pytest-django! pytest-django allows you to test your Django project/applications with the pytest testing tool. Quick start / tutorial Chang
HTTP client mocking tool for Python - inspired by Fakeweb for Ruby
HTTPretty 1.0.5 HTTP Client mocking tool for Python created by Gabriel Falcão . It provides a full fake TCP socket module. Inspired by FakeWeb Github
splinter - python test framework for web applications
splinter - python tool for testing web applications splinter is an open source tool for testing web applications using Python. It lets you automate br
Automatically mock your HTTP interactions to simplify and speed up testing
VCR.py 📼 This is a Python version of Ruby's VCR library. Source code https://github.com/kevin1024/vcrpy Documentation https://vcrpy.readthedocs.io/ R
A utility for mocking out the Python Requests library.
Responses A utility library for mocking out the requests Python library. Note Responses requires Python 2.7 or newer, and requests >= 2.0 Installing p
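The general idea behind HTTP mocking can be sketched with the standard library alone: replace the function that performs the network call with a stand-in that returns a canned reply. This example patches `urllib.request.urlopen` via `unittest.mock`; it is not the Responses API, and `fetch_json` and the URL are hypothetical.

```python
import json
import urllib.request
from unittest import mock


def fetch_json(url):
    """Fetch a URL and decode its body as JSON."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


# Fake response object that works as a context manager and returns canned bytes.
fake_response = mock.MagicMock()
fake_response.read.return_value = b'{"status": "ok"}'
fake_response.__enter__.return_value = fake_response

# While the patch is active, no real network request is made.
with mock.patch("urllib.request.urlopen", return_value=fake_response) as fake_urlopen:
    payload = fetch_json("http://example.com/api")

print(payload)
```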
A browser automation framework and ecosystem.
Selenium Selenium is an umbrella project encapsulating a variety of tools and libraries enabling web browser automation. Selenium specifically provide
Scalable user load testing tool written in Python
Locust Locust is an easy to use, scriptable and scalable performance testing tool. You define the behaviour of your users in regular Python code, inst
Code coverage measurement for Python
Coverage.py Code coverage testing for Python. Coverage.py measures code coverage, typically during test execution. It uses the code analysis tools and
An interactive TLS-capable intercepting HTTP proxy for penetration testers and software developers.
mitmproxy mitmproxy is an interactive, SSL/TLS-capable intercepting proxy with a console interface for HTTP/1, HTTP/2, and WebSockets. mitmdump is the
The easy-to-use and developer-friendly CMS
django CMS Open source enterprise content management system based on the Django framework and backed by the non-profit django CMS Association. Get inv
A Django content management system focused on flexibility and user experience
Wagtail is an open source content management system built on Django, with a strong community and commercial support. It's focused on user experience,
ASGI support for the Tartiflette GraphQL engine
tartiflette-asgi is a wrapper that provides ASGI support for the Tartiflette Python GraphQL engine. It is ideal for serving a GraphQL API over HTTP, o
tartiflette-aiohttp is a wrapper of aiohttp which includes the Tartiflette GraphQL Engine. Do not hesitate to take a look at the Tartiflette project.
tartiflette-aiohttp is a wrapper of aiohttp which includes the Tartiflette GraphQL Engine. You can take a look at the Tartiflette API documentation. U
A new GraphQL library for Python 🍓
Strawberry GraphQL Python GraphQL library based on dataclasses Installation ( Quick Start ) The quick start method provides a server and CLI to get go
Django registration and authentication with GraphQL.
Django GraphQL Auth Django registration and authentication with GraphQL. Demo About Abstract all the basic logic of handling user accounts out of your
GraphQL Engine built with Python 3.6+ / asyncio
Tartiflette is a GraphQL Server implementation built with Python 3.6+. Summary Motivation Status Usage Installation Installation dependencies Tartifle
Adds GraphQL support to your Flask application.
Flask-GraphQL Adds GraphQL support to your Flask application. Usage Just use the GraphQLView view from flask_graphql from flask import Flask from flas