Parris, the automated infrastructure setup tool for machine learning algorithms.


What Is This Tool?

Parris is a tool for automating the training of machine learning algorithms. If you work on ML algorithms and spend too much time setting up a server to run them on, logging into it to monitor progress, and so on, then you will find this tool helpful. No need to SSH into instances to get your training jobs done!

Setup

You'll need an AWS account, AWS credentials loaded on your workstation (set up through $ aws configure), a machine learning algorithm to train, and a dataset to train it on. You'll also likely want an S3 bucket or some other storage location for your algorithm's training results.
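
As a quick preflight for the credentials prerequisite, the sketch below checks whether AWS keys are discoverable before you run anything. It is not part of Parris: the function name `credentials_available` is hypothetical, and it assumes the usual places the AWS CLI uses (the `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY` environment variables, or the shared credentials file written by `aws configure`). Other credential sources, such as SSO or instance roles, are not covered.

```python
import configparser
import os
from pathlib import Path


def credentials_available(profile="default", environ=None, credentials_file=None):
    """Return True if AWS keys appear to be configured for `profile`.

    Hypothetical helper for illustration; it only checks that keys are
    present, not that they are valid.
    """
    environ = os.environ if environ is None else environ

    # 1. Keys exported as environment variables.
    if environ.get("AWS_ACCESS_KEY_ID") and environ.get("AWS_SECRET_ACCESS_KEY"):
        return True

    # 2. Keys stored in the shared credentials file (default: ~/.aws/credentials).
    path = Path(
        credentials_file
        or environ.get("AWS_SHARED_CREDENTIALS_FILE")
        or Path.home() / ".aws" / "credentials"
    )
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file is silently ignored
    return parser.has_section(profile) and parser.has_option(profile, "aws_access_key_id")
```

Calling `credentials_available()` with no arguments checks the current environment and the default profile; a `False` result suggests running `aws configure` first.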

UNIX/Linux:

$ git clone https://github.com/jgreenemi/parris.git && cd parris
$ virtualenv -p python3 env
$ source env/bin/activate
(env) $ pip --version
pip 9.0.1 from .../env/lib/python3.6/site-packages (python 3.6)
(env) $ pip install -r requirements.txt 

Windows:

$ git clone https://github.com/jgreenemi/parris.git && cd parris
$ virtualenv -p python3.exe env
$ env\Scripts\activate
(env) $ pip --version
pip 9.0.1 from ...\python\python36\lib\site-packages (python 3.6)
(env) $ pip install -r requirements.txt 

How To Use

To use Parris, follow the Getting Started guide, which takes you from setup all the way to launching your first ML training stack. While getting familiar with the tool, also consult the Configuration guide alongside it to better understand what options are available to you.

FAQ

Consult the FAQ page in the documentation, as many questions are answered there. If your question is not answered, please get in touch, either via a new GitHub Issue (preferred) or via email (see Contact). An Issue is preferred because others with the same question can benefit from seeing the answer posted publicly.

Contributions

This tool is an open source project released under the Apache 2.0 license. Contributions from the community are more than welcome! Do consult the Issues page for known feature requests, roadmap items, and bugs you can work on.

Contact

Comments
  • Add support for AWS Spot Instances/other options besides On-Demand instances.

    The tool currently launches servers as On-Demand instances, which is the most expensive way to go about it. This should be changed so that users can opt into Spot Instances to save on costs, or even take advantage of Reserved Instances if they already have RI capacity set aside. I'm unfamiliar with how the latter works when used with CFN stacks, so this will require additional research.

    enhancement help wanted 
    opened by jgreenemi 0
  • Update setup.py to keep statements in a separate function for extensibility.

    Move the Lambda setup statements into their own function. lambda_creation would be a great name, though it's already in use, so consider whether a refactor is necessary. Also rename internal functions per PEP 8. All this would bring the tool closer to pipeline integration; in its current state one can't (easily) import it into an existing tool as a dependency.

    help wanted good first issue 
    opened by jgreenemi 0
  • Update logging levels for more appropriate log entries.

    Go through the package and change the logger messages so that not everything is logged at the Warning level (logging.warning('')). This was done so that key stages could be identified in testing, because I had not set up the StreamHandler beyond the defaults. Many of the Warning messages should be Info level, and the StreamHandler should be set to display them so that running $ python setup.py gives useful output.

    We could also give the user the option of a FileHandler so that logging output is kept in a file rather than just in the terminal buffer, but that depends on whether this is a big user ask. It could be overkill, since the most useful logging messages will be errors, which display with the default settings anyway.

    help wanted good first issue 
    opened by jgreenemi 0
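
One way the logging change described in this issue might look, using only the standard `logging` module. This is a sketch, not the project's code: the logger name `"parris"`, the function `configure_logging`, and the sample messages are assumptions made for illustration.

```python
import logging
import sys


def configure_logging(level=logging.INFO, log_file=None):
    """Attach a console handler (and optionally a file handler) to a logger.

    The logger name "parris" and this function are hypothetical.
    """
    logger = logging.getLogger("parris")
    logger.setLevel(level)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep messages from also reaching the root logger

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")

    # StreamHandler set to the chosen level so INFO messages reach the terminal.
    console = logging.StreamHandler(sys.stdout)
    console.setLevel(level)
    console.setFormatter(formatter)
    logger.addHandler(console)

    # Optional FileHandler for users who want a persistent log.
    if log_file is not None:
        file_handler = logging.FileHandler(log_file)
        file_handler.setLevel(level)
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)

    return logger


logger = configure_logging()
logger.info("Creating Lambda function.")          # routine progress: INFO
logger.warning("Existing function overwritten.")  # noteworthy but recoverable: WARNING
```

Passing `log_file="parris.log"` (an arbitrary example path) would add the file output discussed above without changing what appears in the terminal.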
  • Add `no-clobber` option for Lambda function update.

    Allow lambda-config.json to specify whether the function should be replaced when setup.py is run and an existing function is found. At present it just overwrites whatever code is already there to make updates seamless, but the user may not want their Lambda function to be so easily overwritten.

    help wanted good first issue 
    opened by jgreenemi 0
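
The decision this issue asks for can be sketched as a small pure function. The `no-clobber` key name and the function `should_update_function` are hypothetical; how setup.py actually detects an existing function is not shown in the source, so that check is passed in as a boolean here.

```python
import json


def should_update_function(config, function_exists):
    """Decide whether setup should write the Lambda function code.

    `no-clobber` is a hypothetical key. It defaults to False so that the
    current overwrite-on-update behaviour is preserved.
    """
    if not function_exists:
        return True  # nothing to clobber, so always create
    return not config.get("no-clobber", False)


# Illustrative lambda-config.json contents (only the hypothetical key is shown).
config = json.loads('{"no-clobber": true}')

print(should_update_function(config, function_exists=True))   # False
print(should_update_function(config, function_exists=False))  # True
print(should_update_function({}, function_exists=True))       # True
```

Defaulting to overwrite keeps existing configs working unchanged; users who want protection opt in explicitly.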
  • Add in-script IAM role creation for each training job.

    Make the IAM role configurable in the config (and created just in time as the Lambda is launched) according to what the user needs from it, rather than asking that it be set up ahead of time. We can offer preset IAM options for things like S3 read, S3 write, EC2 tag updates, and EC2 termination, so the user gives a couple of keywords in the config and the script auto-generates the necessary IAM roles on the fly.

    The training config will need to be updated so that an existing IAM role is used when one is given; otherwise a new one is auto-created.

    help wanted 
    opened by jgreenemi 0
  • Allow multiple AWS account credential sources

    Customize the method by which the AWS keys are chosen. The user may want to create their training stack under a different AWS account than the one their environment variables point to, and the tool is currently set to always launch in the default credentials' account.

    This can be introduced as part of a config file that sets the credential retrieval method and the ACCESS_KEY to use. Make sure not to ask that the SECRET_KEY be loaded into a config file, as this is not secure by default and we shouldn't encourage users to keep plaintext creds on their computers.

    enhancement help wanted 
    opened by jgreenemi 0
Owner
Joseph Greene
I work on machine learning and software development challenges. Python, Kotlin. Formerly Amazon Halo, Amazon Go, and AWS!