CrayLabs and user contibuted examples of using SmartSim for various simulation and machine learning applications.

Overview

SmartSim Example Zoo

This repository contains CrayLabs and user contibuted examples of using SmartSim for various simulation and machine learning applications.

The CrayLabs team will attempt to keep examples updated with current releases but all user contibuted examples should specify the release they were created with.

Contibuting Examples

We welcome any and all contibutions to this repository. The CrayLabs team will do their best to review in a timely manner. We ask that, if you contribute examples, please include a description and all references to code and relavent previous implemenations or open source code that the work is based off of for the benefit of anyone who would like to try out your example.

Examples by Paper

The following examples are implemented based on existing research papers. Each example lists the paper, previous works, and links to the implementation (possibly stored within this repository or a seperate repository)

1. DeepDriveMD

  • Contibuting User: CrayLabs
  • Tags: OpenMM, CVAE, online inference, unsupervised online learning, PyTorch, ensemble

This use case highlights many features of SmartSim and SmartRedis and together they can be used to orchestrate complex workflows with coupled applications without using the filesystem for exchanging information.

More specifically, this use case is based on the original DeepDriveMD work. DeepDriveMD was furthered with an asynchronous streaming version. SmartSim extends the streaming implementation through the use of the SmartSim architecture. The main difference between the SmartSim implementation and the previous implementations, is that neither ML models, nor Molecular Dynamics (MD) intermediate results are stored on the file system. Additionally, the inference portion of the workflow takes place inside the database instead of a seperate task launched on the system.

2. TensorFlowFoam

  • Contributing User: CrayLabs
  • Tags: Online Inference, TensorFlow, OpenFOAM, supervised learning

This example shows how to use TensorFlow inside of OpenFOAM simulations using SmartSim.

More specifically, this SmartSim use case adapts the TensorFlowFoam work which utilized a deep neural network to predict steady-state turbulent viscosities of the Spalart-Allmaras (SA) model. This use case highlights that a machine learning model can be evaluated using SmartSim from within a simulation with minimal external library code. For the OpenFOAM use case herein, only four SmartRedis client API calls are needed to initialize a client connection, send tensor data for evaluation, execute the TensorFlow model, and retrieve the model inference result.

In general, this example provides a useful driver script for those looking to run OpenFOAM with SmartSim.

3. ML-EKE

  • Contributing User: CrayLabs
  • Tags: Online inference, MOM6, climate modeling, ensemble, parameterization replacement

This example was a collaboration between CrayLabs (HPE), NCAR, and the university of Victoria. Using SmartSim, this example shows how to run an ensemble of simulations all using the SmartSim architecture to replace a parameterization (MEKE) within each global ocean simulation (MOM6).

Paper Abstract:

We demonstrate the first climate-scale, numerical ocean simulations improved through distributed, online inference of Deep Neural Networks (DNN) using SmartSim. SmartSim is a library dedicated to enabling online analysis and Machine Learning (ML) for traditional HPC simulations. In this paper, we detail the SmartSim architecture and provide benchmarks including online inference with a shared ML model on heterogeneous HPC systems. We demonstrate the capability of SmartSim by using it to run a 12-member ensemble of global-scale, high-resolution ocean simulations, each spanning 19 compute nodes, all communicating with the same ML architecture at each simulation timestep. In total, 970 billion inferences are collectively served by running the ensemble for a total of 120 simulated years. Finally, we show our solution is stable over the full duration of the model integrations, and that the inclusion of machine learning has minimal impact on the simulation runtimes.

Since this is original research done by CrayLabs, there is no previous implementation.

Examples by Simulation Model

LAMMPS

SmartSim examples with LAMMPS which is a Molecular Dynamics simulation model.

1. Online Analysis of Atom Position

  • Contibuting User: CrayLabs
  • Tags: Molecular Dynamics, online analysis, visualizations.

LAMMPS has dump styles which are custom I/O methods that can be implmentated by users. CrayLabs implemented a SMARTSIM dump style which uses the SmartRedis clients to stream data to an Orchestrator database created by SmartSim.

Once the data is in the database, any application with a SmartRedis client can consume that data. For this example, we have a simple Python script that uses iPyVolume to plot the data every 100 iterations.

Examples by System

High Performance Computing Systems are a bit like snowflakes, they are all different. Since each one has their own quirks, some examples for specific and popular systems can be of benefit to new users.

National Center for Atmospheric Research (NCAR)

1. Cheyenne

  • Contibuting User: CrayLabs
  • implementation (this repo)
  • WLM: PBSPro
  • System: SGI 8600
  • CPU: intel
  • GPU: None

2. Casper

  • Contibuting user: @jedwards4b
  • Implementation (this repo)
  • WLM: PBSPro
  • GPU: Nvidia
  • CPU: Intel
  • SmartSim Version: 0.3.2
  • SmartRedis Version: 0.2.0

Oak Ridge National Lab

1. Summit

  • Contributing user: CrayLabs
  • implementation (this repo)
  • System:
  • OS: Red Hat Enterprise Linux (RHEL)
  • CPU: Power9
  • GPU: Nvidia V100
You might also like...
Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

Python Extreme Learning Machine (ELM) Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

CD) in machine learning projectsImplementing continuous integration & delivery (CI/CD) in machine learning projects

CML with cloud compute This repository contains a sample project using CML with Terraform (via the cml-runner function) to launch an AWS EC2 instance

Create large-scale ML-driven multiscale simulation ensembles to study the interactions

MuMMI RAS v0.1 Released: Nov 16, 2021 MuMMI RAS is the application component of the MuMMI framework developed to create large-scale ML-driven multisca

2D fluid simulation implementation of Jos Stam paper on real-time fuild dynamics, including some suggested extensions.
2D fluid simulation implementation of Jos Stam paper on real-time fuild dynamics, including some suggested extensions.

Fluid Simulation Usage Download this repo and store it in your computer. Open a terminal and go to the root directory of this folder. Make sure you ha

To design and implement the Identification of Iris Flower species using machine learning using Python and the tool Scikit-Learn.

To design and implement the Identification of Iris Flower species using machine learning using Python and the tool Scikit-Learn.

machine learning model deployment project of Iris classification model in a minimal UI using flask web framework and deployed it in Azure cloud using Azure app service
machine learning model deployment project of Iris classification model in a minimal UI using flask web framework and deployed it in Azure cloud using Azure app service

This is a machine learning model deployment project of Iris classification model in a minimal UI using flask web framework and deployed it in Azure cloud using Azure app service. We initially made this project as a requirement for an internship at Indian Servers. We are now making it open to contribution.

An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

Ray provides a simple, universal API for building distributed applications. Ray is packaged with the following libraries for accelerating machine lear

Home repository for the Regularized Greedy Forest (RGF) library. It includes original implementation from the paper and multithreaded one written in C++, along with various language-specific wrappers.

Regularized Greedy Forest Regularized Greedy Forest (RGF) is a tree ensemble machine learning method described in this paper. RGF can deliver better r

Comments
  • Update Zoo examples for latest Release

    Update Zoo examples for latest Release

    Description

    Update to SmartSim 0.4.0 and SmartRedis 0.3.0 after the release.

    Acceptance Criteria

    Run through each Zoo example updating outdated API calls with a focus on portability functions and the system examples.

    • [x] SmartSim-openmm
    • [x] SmartSim-lammps
    • [x] SmartSim-openfoam
    opened by Spartee 0
  • Update examples to use new API

    Update examples to use new API

    Description

    The examples need to be updated to use the tweaked API following updates in the SmartRedis library. This has two main sources:

    1. poll_dataset() has been added and should now be used to check for dataset existence instead of poll_tensor().
    2. C and Fortran API calls now return an error code that should be checked and addressed appropriately when not success.

    Acceptance Criteria

    1. Calls to poll_dataset() should be implemented where appropriate
    2. Zoo examples in C and Fortran should build and run correctly with the latest SmartRedis library APIs.
    opened by billschereriii 0
  • Including ALCF online training example from Riccardo and Filippo

    Including ALCF online training example from Riccardo and Filippo

    Hi, I included our example in the Theta directory. I didn't add any code, but instead added the description with links to our directory which has the code and also build instructions.

    opened by rickybalin 0
  • Add Theta & Thetagpu

    Add Theta & Thetagpu

    This PR adds Theta and Theta GPU examples to the zoo.

    • Theta examples use aprun
    • ThetaGPU examples use mpirun
    • Two installers are included, one for Theta and one for ThetaGPU
    opened by al-rigazzi 0
Owner
Cray Labs
Cray Labs
Examples and code for the Practical Machine Learning workshop series

Practical Machine Learning Workshop Series Practical Machine Learning for Quantitative Finance Post conference workshop at the WBS Spring Conference D

CompatibL 21 Jun 25, 2022
A toolkit for making real world machine learning and data analysis applications in C++

dlib C++ library Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real worl

Davis E. King 11.6k Jan 2, 2023
Machine learning that just works, for effortless production applications

Machine learning that just works, for effortless production applications

Elisha Yadgaran 16 Sep 2, 2022
AutoTabular automates machine learning tasks enabling you to easily achieve strong predictive performance in your applications.

AutoTabular automates machine learning tasks enabling you to easily achieve strong predictive performance in your applications. With just a few lines of code, you can train and deploy high-accuracy machine learning and deep learning models tabular data.

Robin 55 Dec 27, 2022
AutoTabular automates machine learning tasks enabling you to easily achieve strong predictive performance in your applications.

AutoTabular AutoTabular automates machine learning tasks enabling you to easily achieve strong predictive performance in your applications. With just

wenqi 2 Jun 26, 2022
Automated Machine Learning Pipeline for tabular data. Designed for predictive maintenance applications, failure identification, failure prediction, condition monitoring, etc.

Automated Machine Learning Pipeline for tabular data. Designed for predictive maintenance applications, failure identification, failure prediction, condition monitoring, etc.

Amplo 10 May 15, 2022
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Master status: Development status: Package information: TPOT stands for Tree-based Pipeline Optimization Tool. Consider TPOT your Data Science Assista

Epistasis Lab at UPenn 8.9k Jan 9, 2023
We have a dataset of user performances. The project is to develop a machine learning model that will predict the salaries of baseball players.

Salary-Prediction-with-Machine-Learning 1. Business Problem Can a machine learning project be implemented to estimate the salaries of baseball players

Ayşe Nur Türkaslan 9 Oct 14, 2022
Breast-Cancer-Classification - Using SKLearn breast cancer dataset which contains 569 examples and 32 features classifying has been made with 6 different algorithms

Breast-Cancer-Classification - Using SKLearn breast cancer dataset which contains 569 examples and 32 features classifying has been made with 6 different algorithms

Mert Sezer Ardal 1 Jan 31, 2022
Simulation of early COVID-19 using SIR model and variants (SEIR ...).

COVID-19-simulation Simulation of early COVID-19 using SIR model and variants (SEIR ...). Made by the Laboratory of Sustainable Life Assessment (GYRO)

José Paulo Pereira das Dores Savioli 1 Nov 17, 2021