392 Repositories
Python aws-cdk-pipelines-datalake-infrastructure Libraries
AWS Quick Start Team
EKS CDK Quick Start (in Python) DEVELOPER PREVIEW NOTE: Thise project is currently available as a preview and should not be considered for production
Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code
Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tuplex has similar Python APIs to Apache Spark or Dask, but rather than invoking the Python interpreter, Tuplex generates optimized LLVM bytecode for the given pipeline and input data set.
Fetch the details of assets hosted on AWS.
onaws onaws is a simple tool to check if an IP/hostname belongs to the AWS IP space or not. It uses the AWS IP address ranges data published by AWS to
Playing videos through S3 buckets (Wasabi, AWS, etc.) through client-side VideoJS player
Playing videos through S3 buckets (Wasabi, AWS, etc.) through client-side VideoJS player without incurring ingress/egree traffic on EC2 Instance.
SmartSim Infrastructure Library.
Home Install Documentation Slack Invite Cray Labs SmartSim SmartSim makes it easier to use common Machine Learning (ML) libraries like PyTorch and Ten
Create a Neo4J graph of users and roles trust policies within an AWS Organization.
AWS_ORG_MAPPER This tool uses sso-oidc to authenticate to the AWS organization. Once authenticated the tool will attempt to enumerate all users and ro
An experiment to deploy a serverless infrastructure for a scrapy project.
Serverless Scrapy project This project aims to evaluate the feasibility of an architecture based on serverless technology for a web crawler using scra
The elegance of Airflow + the power of AWS
Orkestra The elegance of Airflow + the power of AWS
An integration of several popular automatic augmentation methods, including OHL (Online Hyper-Parameter Learning for Auto-Augmentation Strategy) and AWS (Improving Auto Augment via Augmentation Wise Weight Sharing) by Sensetime Research.
An integration of several popular automatic augmentation methods, including OHL (Online Hyper-Parameter Learning for Auto-Augmentation Strategy) and AWS (Improving Auto Augment via Augmentation Wise Weight Sharing) by Sensetime Research.
Create Multiple CF entry for multiple websites
AWS-CloudFront Problem: Deploy multiple CloudFront for account with multiple domains. Functionality: Running this script in loop and deploy CloudFront
A fire and forget command-line tool to allow for easy transitions of VPN connections between a pool of AWS machines.
VPN Swapper A fire and forget command-line tool to allow for easy transitions of VPN connections between a pool of AWS machines. Dependencies poetry -
Get an SNS alert for High Severity GuardDuty findings
Automation AWS-GuardDuty findings Get an SNS alert for High Severity GuardDuty findings Problem: Getting notified when there is Red finding in AWS Gua
Build tensorflow keras model pipelines in a single line of code. Created by Ram Seshadri. Collaborators welcome. Permission granted upon request.
deep_autoviml Build keras pipelines and models in a single line of code! Table of Contents Motivation How it works Technology Install Usage API Image
A comprehensive reference for all topics related to building and maintaining microservices
This pandect (πανδέκτης is Ancient Greek for encyclopedia) was created to help you find and understand almost anything related to Microservices that i
This repository contains free labs for setting up an entire workflow and DevOps environment from a real-world perspective in AWS
DevOps-The-Hard-Way-AWS This tutorial contains a full, real-world solution for setting up an environment that is using DevOps technologies and practic
Orchest is a browser based IDE for Data Science.
Orchest is a browser based IDE for Data Science. It integrates your favorite Data Science tools out of the box, so you don’t have to. The application is easy to use and can run on your laptop as well as on a large scale cloud cluster.
Declarative assertions for AWS
AWSsert AWSsert is a Python library providing declarative assertions about AWS resources to your tests. Installation Use the package manager pip to in
Cloud-native, data onboarding architecture for the Google Cloud Public Datasets program
Public Datasets Pipelines Cloud-native, data pipeline architecture for onboarding datasets to the Google Cloud Public Datasets Program. Overview Requi
This is a repository for the Duke University Cloud Computing course project on Serveless Data Engineering Pipeline. For this project, I recreated the below pipeline.
AWS Data Engineering Pipeline This is a repository for the Duke University Cloud Computing course project on Serverless Data Engineering Pipeline. For
Python utility function to communicate with a subprocess using iterables: for when data is too big to fit in memory and has to be streamed
iterable-subprocess Python utility function to communicate with a subprocess using iterables: for when data is too big to fit in memory and has to be
Python function to stream unzip all the files in a ZIP archive: without loading the entire ZIP file or any of its files into memory at once
Python function to stream unzip all the files in a ZIP archive: without loading the entire ZIP file or any of its files into memory at once
A toolkit for developing and deploying serverless Python code in AWS Lambda.
Python-lambda is a toolset for developing and deploying serverless Python code in AWS Lambda. A call for contributors With python-lambda and pytube bo
framework providing automatic constructions of vulnerable infrastructures
中文 | English 1 Introduction Metarget = meta- + target, a framework providing automatic constructions of vulnerable infrastructures, used to deploy sim
AWS Auto Inventory allows you to quickly and easily generate inventory reports of your AWS resources.
Photo by Denny Müller on Unsplash AWS Automated Inventory ( aws-auto-inventory ) Automates creation of detailed inventories from AWS resources. Table
Ethereum ETL lets you convert blockchain data into convenient formats like CSVs and relational databases.
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions.
🤫 Easily manage configs and secrets in your Python projects (with CLI support)
Installation pip install confidential How does it work? Confidential manages secrets for your project, using AWS Secrets Manager. First, store a secr
tfquery: Run SQL queries on your Terraform infrastructure. Query resources and analyze its configuration using a SQL-powered framework.
🌩️ tfquery 🌩️ Run SQL queries on your Terraform infrastructure. Ask questions that are hard to answer 🚀 What is tfquery? tfquery is a framework tha
TorchFlare is a simple, beginner-friendly, and easy-to-use PyTorch Framework train your models effortlessly.
TorchFlare TorchFlare is a simple, beginner-friendly and an easy-to-use PyTorch Framework train your models without much effort. It provides an almost
Universal Command Line Interface for Amazon Web Services
This package provides a unified command line interface to Amazon Web Services.
SSH-Restricted deploys an SSH compliance rule (AWS Config) with auto-remediation via AWS Lambda if SSH access is public.
SSH-Restricted SSH-Restricted deploys an SSH compliance rule with auto-remediation via AWS Lambda if SSH access is public. SSH-Auto-Restricted checks
AHA is an incident management & communication framework to provide real-time alert customers when there are active AWS event(s). For customers with AWS Organizations, customers can get aggregated active account level events of all the accounts in the Organization. Customers not using AWS Organizations still benefit alerting at the account level.
Table of Contents Introduction Architecture Configuring an Endpoint Creating a Amazon Chime Webhook URL Creating a Slack Webhook URL Creating a Micros
Validate all your Customer IAM Policies against AWS Access Analyzer - Policy Validation
✅ Access Analyzer - Batch Policy Validator This script will analyze using AWS Access Analyzer - Policy Validation all your account customer managed IA
Policy and data administration, distribution, and real-time updates on top of Open Policy Agent
⚡ OPAL ⚡ Open Policy Administration Layer OPAL is an administration layer for Open Policy Agent (OPA), detecting changes to both policy and policy dat
Download and process satellite imagery in Python using Sentinel Hub services.
Description The sentinelhub Python package allows users to make OGC (WMS and WCS) web requests to download and process satellite images within your Py
geobeam - adds GIS capabilities to your Apache Beam and Dataflow pipelines.
geobeam adds GIS capabilities to your Apache Beam pipelines. What does geobeam do? geobeam enables you to ingest and analyze massive amounts of geospa
💻 A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline!
LocalStack - A fully functional local AWS cloud stack LocalStack provides an easy-to-use test/mocking framework for developing Cloud applications. Cur
Lightspin AWS IAM Vulnerability Scanner
Red-Shadow Lightspin AWS IAM Vulnerability Scanner Description Scan your AWS IAM Configuration for shadow admins in AWS IAM based on misconfigured den
Extra blocks for scikit-learn pipelines.
scikit-lego We love scikit learn but very often we find ourselves writing custom transformers, metrics and models. The goal of this project is to atte
(AAAI' 20) A Python Toolbox for Machine Learning Model Combination
combo: A Python Toolbox for Machine Learning Model Combination Deployment & Documentation & Stats Build Status & Coverage & Maintainability & License
Yes, it's true :purple_heart: This repository has 353 stars.
Yes, it's true! Inspired by a similar repository from @RealPeha, but implemented using a webhook on AWS Lambda and API Gateway, so it's serverless! If
Yes, it's true :orange_heart: This repository has 346 stars.
Yes, it's true! Inspired by a similar repository from @RealPeha, but implemented using a webhook on AWS Lambda and API Gateway, so it's serverless! If
A honey token manager and alert system for AWS.
SpaceSiren SpaceSiren is a honey token manager and alert system for AWS. With this fully serverless application, you can create and manage honey token
A honey token manager and alert system for AWS.
SpaceSiren SpaceSiren is a honey token manager and alert system for AWS. With this fully serverless application, you can create and manage honey token
IP address management (IPAM) and data center infrastructure management (DCIM) tool.
NetBox is an IP address management (IPAM) and data center infrastructure management (DCIM) tool. Initially conceived by the network engineering team a
Yes, it's true :heartbeat: This repository has 337 stars.
Yes, it's true! Inspired by a similar repository from @RealPeha, but implemented using a webhook on AWS Lambda and API Gateway, so it's serverless! If
Automatically compile an AWS Service Control Policy that ONLY allows AWS services that are compliant with your preferred compliance frameworks.
aws-allowlister Automatically compile an AWS Service Control Policy that ONLY allows AWS services that are compliant with your preferred compliance fr
Yes, it's true :yellow_heart: This repository has 326 stars.
Yes, it's true! Inspired by a similar repository from @RealPeha, but implemented using a webhook on AWS Lambda and API Gateway, so it's serverless! If
Build, test, deploy, iterate - Dev and prod tool for data science pipelines
Prodmodel is a build system for data science pipelines. Users, testers, contributors are welcome! Motivation · Concepts · Installation · Usage · Contr
Easy pipelines for pandas DataFrames.
pdpipe ˨ Easy pipelines for pandas DataFrames (learn how!). Website: https://pdpipe.github.io/pdpipe/ Documentation: https://pdpipe.github.io/pdpipe/d
A collection of infrastructure and tools for research in neural network interpretability.
Lucid Lucid is a collection of infrastructure and tools for research in neural network interpretability. We're not currently supporting tensorflow 2!
Machine Learning Platform for Kubernetes
Reproduce, Automate, Scale your data science. Welcome to Polyaxon, a platform for building, training, and monitoring large scale deep learning applica
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
Master status: Development status: Package information: TPOT stands for Tree-based Pipeline Optimization Tool. Consider TPOT your Data Science Assista
Yes, it's true :two_hearts: This repository has 316 stars.
Yes, it's true! Inspired by a similar repository from @RealPeha, but implemented using a webhook on AWS Lambda and API Gateway, so it's serverless! If
Yes, it's true :revolving_hearts: This repository has 301 stars.
Yes, it's true! Inspired by a similar repository from @RealPeha, but implemented using a webhook on AWS Lambda and API Gateway, so it's serv
Yes, it's true :revolving_hearts: This repository has 301 stars.
Yes, it's true! Inspired by a similar repository from @RealPeha, but implemented using a webhook on AWS Lambda and API Gateway, so it's serv
A tool that helps keeping track of your AWS quota utilization
aws-quota-checker A tool that helps keeping track of your AWS quota utilization. It'll determine the limits of your AWS account and compare them to th
An AWS Pentesting tool that lets you use one-liner commands to backdoor an AWS account's resources with a rogue AWS account - or share the resources with the entire internet 😈
An AWS Pentesting tool that lets you use one-liner commands to backdoor an AWS account's resources with a rogue AWS account - or share the resources with the entire internet 😈
Cookiecutter template for FastAPI projects using: Machine Learning, Poetry, Azure Pipelines and Pytests
cookiecutter-fastapi In order to create a template to FastAPI projects. 🚀 Important To use this project you don't need fork it. Just run cookiecutter
Serverless Python
Zappa - Serverless Python About Installation and Configuration Running the Initial Setup / Settings Basic Usage Initial Deployments Updates Rollback S
Run your jupyter notebooks as a REST API endpoint. This isn't a jupyter server but rather just a way to run your notebooks as a REST API Endpoint.
Jupter Notebook REST API Run your jupyter notebooks as a REST API endpoint. This isn't a jupyter server but rather just a way to run your notebooks as
A tool to convert AWS EC2 instances back and forth between On-Demand and Spot billing models.
ec2-spot-converter This tool converts existing AWS EC2 instances back and forth between On-Demand and 'persistent' Spot billing models while preservin
Serverless Python
Zappa - Serverless Python About Installation and Configuration Running the Initial Setup / Settings Basic Usage Initial Deployments Updates Rollback S
Testinfra test your infrastructures
Testinfra test your infrastructure Latest documentation: https://testinfra.readthedocs.io/en/latest About With Testinfra you can write unit tests in P
Command line driven CI frontend and development task automation tool.
tox automation project Command line driven CI frontend and development task automation tool At its core tox provides a convenient way to run arbitrary
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
AWS Data Wrangler Pandas on AWS Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretMana
MinIO Client SDK for Python
MinIO Python SDK for Amazon S3 Compatible Cloud Storage MinIO Python SDK is Simple Storage Service (aka S3) client to perform bucket and object operat
A pythonic interface to Amazon's DynamoDB
PynamoDB A Pythonic interface for Amazon's DynamoDB. DynamoDB is a great NoSQL service provided by Amazon, but the API is verbose. PynamoDB presents y
Amazon S3 Transfer Manager for Python
s3transfer - An Amazon S3 Transfer Manager for Python S3transfer is a Python library for managing Amazon S3 transfers. Note This project is not curren
AWS SDK for Python
Boto3 - The AWS SDK for Python Boto3 is the Amazon Web Services (AWS) Software Development Kit (SDK) for Python, which allows Python developers to wri
Universal Command Line Interface for Amazon Web Services
aws-cli This package provides a unified command line interface to Amazon Web Services. Jump to: Getting Started Getting Help More Resources Getting St
Publicly Open Amazon AWS S3 Bucket Viewer
S3Viewer Publicly open storage viewer (Amazon S3 Bucket, Azure Blob, FTP server, HTTP Index Of/) s3viewer is a free tool for security researchers that
AWS Lambda & API Gateway support for ASGI
Mangum Mangum is an adapter for using ASGI applications with AWS Lambda & API Gateway. It is intended to provide an easy-to-use, configurable wrapper
A Blazing fast Security Auditing tool for Kubernetes
A Blazing fast Security Auditing tool for kubernetes!! Basic Overview Kubestriker performs numerous in depth checks on kubernetes infra to identify th
AWS DeepRacer Free Student Workshop: Run faster by using your custom waypoints
AWS DeepRacer Free Student Workshop: Run faster by using your custom waypoints Reward Function Template for waypoints def reward_function(params):
Harvis is designed to automate your C2 Infrastructure.
Harvis Harvis is designed to automate your C2 Infrastructure, currently using Mythic C2. 📌 What is it? Harvis is a python tool to help you create mul
The algorithm performs a simple user registration (Name, CPF, E-mail and Telephone) in an Amazon RDS database and also performs the storage, training and facial recognition of the user's face to identify the users already registered in the system in a next time the user is seen.
Registration form with RDS AWS database and facial recognition via OpenCV The algorithm performs a simple user registration (Name, CPF, E-mail and Tel
A simple library for interacting with Amazon S3.
BucketStore is a very simple Amazon S3 client, written in Python. It aims to be much more straight-forward to use than boto3, and specializes only in
AWS SDK for Python
Boto3 - The AWS SDK for Python Boto3 is the Amazon Web Services (AWS) Software Development Kit (SDK) for Python, which allows Python developers to wri
A pythonic interface to Amazon's DynamoDB
PynamoDB A Pythonic interface for Amazon's DynamoDB. DynamoDB is a great NoSQL service provided by Amazon, but the API is verbose. PynamoDB presents y
Determined: Deep Learning Training Platform
Determined: Deep Learning Training Platform Determined is an open-source deep learning training platform that makes building models fast and easy. Det
Accelerated deep learning R&D
Accelerated deep learning R&D PyTorch framework for Deep Learning research and development. It focuses on reproducibility, rapid experimentation, and
Model Serving Made Easy
The easiest way to build Machine Learning APIs BentoML makes moving trained ML models to production easy: Package models trained with any ML framework
Parris, the automated infrastructure setup tool for machine learning algorithms.
README Parris, the automated infrastructure setup tool for machine learning algorithms. What Is This Tool? Parris is a tool for automating the trainin
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
Master status: Development status: Package information: TPOT stands for Tree-based Pipeline Optimization Tool. Consider TPOT your Data Science Assista
[UNMAINTAINED] Automated machine learning for analytics & production
auto_ml Automated machine learning for production and analytics Installation pip install auto_ml Getting started from auto_ml import Predictor from au
Unified Interface for Constructing and Managing Workflows on different workflow engines, such as Argo Workflows, Tekton Pipelines, and Apache Airflow.
Couler What is Couler? Couler aims to provide a unified interface for constructing and managing workflows on different workflow engines, such as Argo
The easiest way to automate your data
Hello, world! 👋 We've rebuilt data engineering for the data science era. Prefect is a new workflow management system, designed for modern infrastruct
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Luigi is a Python (3.6, 3.7 tested) package that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow managemen
Software to automate the management and configuration of any infrastructure or application at scale. Get access to the Salt software package repository here:
Latest Salt Documentation Open an issue (bug report, feature request, etc.) Salt is the world’s fastest, most intelligent and scalable automation engi
pyinfra automates infrastructure super fast at massive scale. It can be used for ad-hoc command execution, service deployment, configuration management and more.
pyinfra automates/provisions/manages/deploys infrastructure super fast at massive scale. It can be used for ad-hoc command execution, service deployme
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
AWS Data Wrangler Pandas on AWS Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretMana
A supercharged AWS command line interface (CLI).
SAWS Motivation AWS CLI Although the AWS CLI is a great resource to manage your AWS-powered services, it's tough to remember usage of: 70+ top-level c