Basic repository showing how to use Hydra + Hydra launchers on SLURM cluster

Overview

Slurm-Hydra-Submitit

This repository is a minimal working example on how to:

Set up Hydra

⚠️ You need to install hydra-core for this step.

Hydra is fairly easy to set-up:

By simply running python slurm_hydra_submitit/script.py, you'll see how the main function takes the arguments from the configuration file and pass them to the following underlying functions.

Launch jobs on a SLURM cluster with Hydra submitit launcher

Launch a job on the cluster

⚠️ You need to install hydra-submitit-launcher for this step.

Now that our Hydra conf is setup, we want to run the job on a SLURM cluster instead of our local computer. For that, we need to:

  • specify the hydra launcher to work on the SLURM cluster
  • specify the hardware specifications for the SLURM job

If you connect to your SLURM cluster scheduler node, just by installing hydra-submitit-launcher, you can already launch jobs on the cluster with:

python slurm_hydra_submitit/script.py --multirun hydra/launcher=submitit_slurm

To test locally before sending to the cluster, you can switch the hydra/launcher argument to submitit_local.

Adapt node parameters

You can easily adapt the SLURM parameters by modifying the following arguments SLURM launcher arguments.

For example, the following script is executed on nodes with 10 CPUs: python slurm_hydra_submitit/script.py --multirun hydra/launcher=submitit_slurm hydra.launcher.cpus_per_task=10

Launch array of jobs on the SLURM cluster

Grid Search

You can launch multiple jobs at once by specifying their values in the launch command.

For example, the following command launches 4 jobs which corresponds to all the possible combinations of arguments.

python slurm_hydra_submitit/script.py --multirun hydra/launcher=submitit_slurm project_name=P1,P2 train.epochs=30,40

Specific Parameters Combinations

Alternatively, you can pass sets of parameters to test together:

python slurm_hydra_submitit/script.py --multirun hydra/launcher=submitit_slurm +compile="{project_name:P1,train.epochs:30}, {project_name:P2,train.epochs:40}"

To clean this command a bit, we can create a bash script similar to this:

#!/bin/bash
params=(
    '{project_name:P1,train.epochs:10},'
    '{project_name:P2,train.epochs:20}'
)

slurm_hydra_submitit/script.py --multirun hydra/launcher=submitit_slurm +compile="${params[*]}"
You might also like...
A basic python project which replicates the functionalities on an 8 Ball.
A basic python project which replicates the functionalities on an 8 Ball.

Magic-8-Ball To the people who wish to make decisions using a Magic 8 Ball but can't get one? I gotchu. This is a basic python project which replicate

basic tool for NFT. let's spam, this is the easiest way to generate a hell lotta image

NFT generator this is the easiest way to generate a hell lotta image buckle up and follow me! how to first have your image in .png (transparent backgr

A basic notes app to store your notes.

Notes Webapp A basic notes webapp to keep your notes.You can add, edit and delete notes after signing up. To add a note type your note in the text box

Basic Clojure REPL for Sublime Text

Basic Clojure REPL for Sublime Text Goals: Decomplected: just REPL, nothing more Zero dependencies: works directly with pREPL Compact: Display code ev

This is a pretty basic but relatively nice looking Python Pomodoro Timer.
This is a pretty basic but relatively nice looking Python Pomodoro Timer.

Python Pomodoro-Timer This is a pretty basic but relatively nice looking Pomodoro Timer. Currently its set to a very basic mode, but the funcationalit

B-Pkg is a simple tool in python for installing all basic package in termux

Basic-Pkg 👉🏻 Basic-Pkg 👈🏻 B-Pkg is a simple tool in python for installing all basic package in termux This is my first tool, I hope you will like

A basic ticketing software.
A basic ticketing software.

Ticketer A basic ticketing software. Screenshots Program Launched Issuing Ticket Show your Ticket Entry Done Program Exited Code Features to implement

A basic layout of atm working of my local database
A basic layout of atm working of my local database

Software for working Banking service 😄 This project was developed for Banking service. mysql server is required To have mysql server on your system u

A basic layout of atm working of my local database
A basic layout of atm working of my local database

Software for working Banking service 😄 This project was developed for Banking service. mysql server is required To have mysql server on your system u

Owner
Raphael Meudec
PhD candidate @Parietal-INRIA
Raphael Meudec
easy_sbatch - Batch submitting Slurm jobs with script templates

easy_sbatch - Batch submitting Slurm jobs with script templates

Wei Shen 13 Oct 11, 2022
A simple website-based resource monitor for slurm system.

Slurm Web A simple website-based resource monitor for slurm system. Screenshot Required python packages flask, colored, humanize, humanfriendly, beart

Tengda Han 17 Nov 29, 2022
An example file showing a simple endpoints like a login/logout function and maybe some others.

Flask API Example An example project showing a simple endpoints like a login/logout function and maybe some others. How to use: Open up your IDE (or u

Kevin 1 Oct 27, 2021
Serverless demo showing users how they can capture (and obfuscate) their Lambda payloads in Datadog APM

Serverless-capture-lambda-payload-demo Serverless demo showing users how they can capture (and obfuscate) their Lambda payloads in Datadog APM This wi

Datadog, Inc. 1 Nov 2, 2021
A step-by-step tutorial for how to work with some of the most basic features of Nav2 using a Jupyter Notebook in a warehouse environment to create a basic application.

This project has a step-by-step tutorial for how to work with some of the most basic features of Nav2 using a Jupyter Notebook in a warehouse environment to create a basic application.

Steve Macenski 49 Dec 22, 2022
Running a complete single-node all-in-one cluster instance of TIBCO ActiveMatrix™ BusinessWorks 6.8.0.

TIBCO ActiveMatrix™ BusinessWorks 6.8 Docker Image Image for running a complete single-node all-in-one cluster instance of TIBCO ActiveMatrix™ Busines

Federico Alpi 1 Dec 10, 2021
Distribute PySPI jobs across a PBS cluster

Distribute PySPI jobs across a PBS cluster This repository contains scripts for distributing PySPI jobs across a PBS-type cluster. Each job will conta

Oliver Cliff 1 Feb 10, 2022
This repository requires you to solve a problem by writing some basic python code.

Can You Solve a Problem? A beginner friendly repository that requires you to solve familiar problems with python. This could be as simple as implement

Precious Kolawole 11 Nov 30, 2022
A curses based mpd client with basic functionality and album art.

Miniplayer A curses based mpd client with basic functionality and album art. After installation, the player can be opened from the terminal with minip

Tristan Ferrua 102 Dec 24, 2022
Simple GUI menu for micropython using a rotary encoder and basic display.

Micropython encoder based menu This is a simple menu system written in micropython. It uses a switch, a rotary encoder and an OLED display.

null 80 Jan 7, 2023