Traingenerator 🧙 A web app to generate template code for machine learning ✨

Overview




🎉 Traingenerator is now live! 🎉

Try it out:
https://traingenerator.jrieke.com


Generate custom template code for PyTorch & sklearn, using a simple web UI built with streamlit. Traingenerator offers multiple options for preprocessing, model setup, training, and visualization (using TensorBoard or comet.ml). It exports to .py, Jupyter Notebook, or Google Colab. The perfect tool to jumpstart your next machine learning project!


For updates, follow me on Twitter, and if you like this project, please consider sponsoring ☺




Adding new templates

You can add your own template in 4 easy steps (see below), without changing any code in the app itself. Your new template will be automatically discovered by Traingenerator and shown in the sidebar. That's it! 🎈

Want to share your magic? 🧙 PRs are welcome! Please have a look at CONTRIBUTING.md and write on Gitter.

Some ideas for new templates: Keras/TensorFlow, PyTorch Lightning, object detection, segmentation, text classification, ...

  1. Create a folder under ./templates. The folder name should be the task that your template solves (e.g. Image classification). Optionally, you can add a framework name (e.g. Image classification_PyTorch). Both names are automatically shown in the first two dropdowns in the sidebar (see image). ✨ Tip: Copy the example template to get started more quickly.
  2. Add a file sidebar.py to the folder (see example). It needs to contain a method show(), which displays all template-specific streamlit components in the sidebar (i.e. everything below Task) and returns a dictionary of user inputs.
  3. Add a file code-template.py.jinja to the folder (see example). This Jinja2 template is used to generate the code. You can write normal Python code in it and modify it (through Jinja) based on the user inputs in the sidebar (e.g. insert a parameter value from the sidebar or show different code parts based on the user's selection).
  4. Optional: Add a file test-inputs.yml to the folder (see example). This simple YAML file should define a few possible user inputs that can be used for testing. If you run pytest (see below), it will automatically pick up this file, render the code template with its values, and check that the generated code runs without errors. This file is optional – but it's required if you want to contribute your template to this repo.
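To illustrate the folder-name convention from step 1, here is a minimal, stdlib-only sketch of how such folders could be mapped to the two sidebar dropdowns (`discover_templates` is a hypothetical helper for illustration, not the app's actual discovery code):

```python
from pathlib import Path

def discover_templates(templates_dir: str = "templates") -> dict:
    """Map each task to its frameworks, based on folder names.

    Folders follow the "<Task>" or "<Task>_<Framework>" convention,
    e.g. "Image classification_PyTorch".
    """
    templates: dict = {}
    for folder in sorted(Path(templates_dir).iterdir()):
        if not folder.is_dir():
            continue
        # Split "Image classification_PyTorch" into task and framework.
        task, _, framework = folder.name.partition("_")
        templates.setdefault(task, []).append(framework or None)
    return templates
```

The first dropdown would then list the keys (tasks) and the second dropdown the frameworks found for the selected task.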

Installation

Note: You only need to install Traingenerator if you want to contribute or run it locally. If you just want to use it, go here.

git clone https://github.com/jrieke/traingenerator.git
cd traingenerator
pip install -r requirements.txt

Optional: For the "Open in Colab" button to work, you need to set up a GitHub repo where the notebook files can be stored (Colab can only open public files if they are on GitHub). After setting up the repo, create a file .env with the following content:

GITHUB_TOKEN=<your-github-access-token>
REPO_NAME=<user/notebooks-repo>

If you don't set this up, the app will still work but the "Open in Colab" button will only show an error message.
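The two entries in .env are plain KEY=value pairs. As an illustration only (the app may well use a full dotenv library instead), a tiny stdlib-only reader could parse them like this:

```python
from pathlib import Path

def read_env(path: str = ".env") -> dict:
    """Parse simple KEY=value lines from a .env file.

    A minimal stand-in for a full dotenv library: blank lines and
    lines starting with '#' are skipped.
    """
    env = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env
```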

Running locally

streamlit run app/main.py

Always run this from the traingenerator dir (not from the app dir); otherwise, the app will not be able to find the templates.

Deploying to Heroku

First, install the Heroku CLI and log in. To create a new deployment, run inside traingenerator:

heroku create
git push heroku main
heroku open

To update the deployed app, commit your changes and run:

git push heroku main

Optional: If you set up a Github repo to enable the "Open in Colab" button (see above), you also need to run:

heroku config:set GITHUB_TOKEN=<your-github-access-token>
heroku config:set REPO_NAME=<user/notebooks-repo>

Testing

First, install pytest and required plugins via:

pip install -r requirements-dev.txt

To run all tests:

pytest ./tests

Note that this only tests the code templates (i.e. it renders them with different input values and makes sure that the code executes without error). The streamlit app itself is not tested at the moment.

You can also test an individual template by passing the name of the template dir to --template, e.g.:

pytest ./tests --template "Image classification_scikit-learn"
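Conceptually, each template test renders the code template with the inputs from test-inputs.yml and checks that the generated code executes without error. A toy, stdlib-only sketch of that idea (using string.Template in place of Jinja2 and an inline dict in place of the YAML file, so it is not the real harness):

```python
import string

def check_template(template_source: str, inputs: dict) -> None:
    """Render a code template with the given inputs and make sure
    the generated code runs without raising an exception.

    A simplified stand-in for the real pytest harness, which uses
    Jinja2 templates and values from test-inputs.yml.
    """
    code = string.Template(template_source).substitute(inputs)
    exec(compile(code, "<generated>", "exec"), {})

# A toy template and one set of "user inputs":
toy_template = "result = sum(range($n))\nassert result == 45\n"
check_template(toy_template, {"n": 10})  # raises if the generated code fails
```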

The mage image used in Traingenerator is from Twitter's Twemoji library and released under Creative Commons Attribution 4.0 International Public License.

Comments
  • New Feature: Semantic Segmentation

    I started to work on semantic segmentation with TF/Keras. I will mainly use the segmentation models library; if you have any suggestions or want to help, feel free to contact me!

    new template 
    opened by erentknn 3
  • Arbitrary class training in Pytorch #5

    Summary

    This PR adds support for declaring the number of classes for Image Classification PyTorch models.

    Details

    • Adds a parameter in the sidebar below the "Use pre-trained model" checkbox
    • Default value is 1000 (what torchvision uses)
    • If pre-trained is True, ImageNet weights are loaded and the last fc layer is redefined

    Checklist

    • [X] all tests are passing (see README.md on how to run tests)
    • [ ] if you created a new template: it contains a file test-inputs.yml, which specifies a few input values to test the code template (the test is then automatically run by pytest)
    • [X] you formatted all code with black
    • [X] you checked all new functionality live, i.e. in the running web app
    • [X] any generated code is formatted nicely, both in .py and in .ipynb ("nicely" = comparable to the existing templates)
    • [X] you added comments in your code that explain what it does
    • [X] the PR explains in detail what's new
    opened by murthy95 1
  • Add Aim integration to PyTorch image classification template


    Summary

    Added Aim to PyTorch image classification template as an experiment logger option.

    Details

    What was added:

    • Initialization of aim.Session
    • Logging parameters with Aim
    • Logging metrics with Aim
    • Aim as a logger option in the sidebar
    • Experiment name input in the sidebar
    • Markdown explaining how to run the UI, plus a link to the full documentation

    Checklist

    • [x] all tests are passing (see README.md on how to run tests)
    • [x] you formatted all code with black
    • [x] you checked all new functionality live, i.e. in the running web app
    • [x] any generated code is formatted nicely, both in .py and in .ipynb ("nicely" = comparable to the existing templates)
    • [x] you added comments in your code that explain what it does
    • [x] the PR explains in detail what's new
    opened by gorarakelyan 0
  • [WIP] Refactor to make templates more independent from app

    Summary

    This PR tries to make templates (i.e. what is rendered as code) completely independent from the core app (i.e. the website, buttons, logic behind the rendering). This will enable contributors to more easily create templates for new tasks/frameworks, without having to understand or modify the app itself.

    Details

    Each template is now one directory under ./templates with 3 files:

    • code-template.py.jinja: The template for the code that will be rendered (as before, as a jinja template)
    • sidebar.py: Needs to contain a method show(), which renders all template-specific streamlit components into the sidebar
    • test-inputs.yml: A yml file that defines some input values for the code template (used by pytest to check that the template runs w/o errors)

    Existing templates in ./templates are automatically detected by the app. Contributing a new template does not require a single change in ./app nor in any other template. Also, pytest will automatically pick up the new template and its test-inputs.yml file and make sure that the template renders and runs without errors (i.e. the user doesn't have to add any specific code for testing).

    This PR also changes the ordering of components in the sidebar slightly and fixes a few minor bugs in the code templates that I found during refactoring.

    opened by jrieke 0
  • Update requirements.txt


    Issues in installing


    Checklist

    • [ ] all tests are passing (see README.md on how to run tests)
    • [ ] if you created a new template: it contains a file test-inputs.yml, which specifies a few input values to test the code template (the test is then automatically run by pytest)
    • [ ] you formatted all code with black
    • [ ] you checked all new functionality live, i.e. in the running web app
    • [ ] any generated code is formatted nicely, both in .py and in .ipynb ("nicely" = comparable to the existing templates)
    • [ ] you added comments in your code that explain what it does
    • [x] the PR explains in detail what's new
    opened by tanujdhiman 2
  • Add Mlflow tracking


    Summary

    As mentioned in #17, this adds training run and experiment tracking capabilities using MLflow for the following templates:

    • templates/Image classification_PyTorch
    • templates/Image classification_scikit-learn

    Details

    This PR implements basic metric/model info logging features. As a second logical next step, model artifact logging (checkpoints and final model) will be added.

    Checklist

    • [x] all tests are passing (see README.md on how to run tests)
    • [ ] if you created a new template: it contains a file test-inputs.yml, which specifies a few input values to test the code template (the test is then automatically run by pytest)
    • [x] you formatted all code with black
    • [x] you checked all new functionality live, i.e. in the running web app
    • [x] any generated code is formatted nicely, both in .py and in .ipynb ("nicely" = comparable to the existing templates)
    • [ ] you added comments in your code that explain what it does
    • [x] the PR explains in detail what's new
    opened by andodet 0
  • Add mlflow tracking


    First of all thanks for the project, it's an interesting way to take a stab at reducing the amount of boilerplate needed even for fairly simple models. Secondly, it would be interesting to implement experiment/run tracking using MLflow.

    I have a working example on the Image classification_PyTorch/ template and would be happy to submit a PR if you consider this of any interest.

    existing template 
    opened by andodet 1
  • Style Transfer Template


    Summary

    I just added the style transfer template.

    Checklist

    • [ ] all tests are passing (see README.md on how to run tests)
    • [ ] if you created a new template: it contains a file test-inputs.yml, which specifies a few input values to test the code template (the test is then automatically run by pytest)
    • [ ] you formatted all code with black
    • [ ] you checked all new functionality live, i.e. in the running web app
    • [ ] any generated code is formatted nicely, both in .py and in .ipynb ("nicely" = comparable to the existing templates)
    • [ ] you added comments in your code that explain what it does
    • [ ] the PR explains in detail what's new
    opened by The-ML-Hero 0
  • New Template: Semantic Segmentation; Generative Adversarial Networks; Image Caption; Style Transfer;


    @jrieke

    Hi everybody, I'm looking forward to contributing to this project. Can I add

    1. Semantic Segmentation
    2. Generative Adversarial Networks
    3. Image Caption
    4. Style Transfer

    as I think this would greatly help beginners who are starting out their deep learning journey and want to do interesting things like these.

    PS: Can I also work on object detection using the YOLOv5 architecture?

    Thanks a lot, and please reply as soon as possible.

    new template 
    opened by The-ML-Hero 2
  • Text classification template with scikit-learn


    Summary

    This PR consists of a text classification template using scikit-learn as stated in #4.

    Details

    • Scikit-learn
    • NLTK for stemming and lemmatization
    • Example text files provided in the data/ folder (in Spanish)
    • test.py file added in order to check the functionality of the template code itself (needs to be removed before merging)

    Checklist

    • [x] all tests are passing (see README.md on how to run tests)
    • [x] if you created a new template: it contains a file test-inputs.yml, which specifies a few input values to test the code template (the test is then automatically run by pytest)
    • [x] you formatted all code with black
    • [x] you checked all new functionality live, i.e. in the running web app
    • [x] any generated code is formatted nicely, both in .py and in .ipynb ("nicely" = comparable to the existing templates)
    • [x] you added comments in your code that explain what it does
    • [x] the PR explains in detail what's new
    opened by themrcesi 3
Owner
Johannes Rieke
Product manager dev experience @streamlit