Custom SLURM wrapper scripts that make job histories and system resource usage easier to find

Overview

SLURM Wrappers

Executables

job-history

A simple wrapper for grabbing data for completed and running jobs.

nodes-busy

Developed for the HPC systems at the University of Arizona. High-memory nodes are differentiated from standard nodes with AvailableFeatures=hi_mem.
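As a rough illustration of how a feature tag such as hi_mem can be used to tell node types apart, the sketch below parses one-line "Key=Value" node records of the kind scontrol prints. The node names, the sample records, and the helper names parse_node_line and is_high_memory are all hypothetical; this is not the code nodes-busy itself uses.

```python
def parse_node_line(line):
    """Turn one 'Key=Value Key=Value ...' record into a dict."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not Key=Value pairs
            fields[key] = value
    return fields


def is_high_memory(node):
    """True if the node advertises the hi_mem feature (site-specific tag)."""
    features = node.get("AvailableFeatures", "")
    return "hi_mem" in features.split(",")


# Hypothetical sample records; real output has many more fields.
SAMPLE = """\
NodeName=node001 CPUTot=94 AvailableFeatures=(null) RealMemory=515000
NodeName=node002 CPUTot=94 AvailableFeatures=hi_mem RealMemory=3094000
NodeName=node003 CPUTot=94 AvailableFeatures=gpu,hi_mem RealMemory=3094000
"""

nodes = [parse_node_line(line) for line in SAMPLE.splitlines() if line.strip()]
high_mem = [n["NodeName"] for n in nodes if is_high_memory(n)]
print(high_mem)
```

On a cluster that tags its large-memory nodes differently, the string "hi_mem" would need to be changed to match the local feature name.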

Comments
  • KeyError: GPUAlloc error when running nodes-busy

    Howdy,

    Blake J pointed us (UAB Research Computing) to your nice slurm-wrapper repo :-)

    While job-history and system-busy work great on our cluster, nodes-busy is crashing on our cluster with the following:

    ❯ ./bin/nodes-busy
    
    nodes-busy: visualize live system resource usage.
    Trouble seeing the output? Try 'nodes-busy --ascii'
    
    Traceback (most recent call last):
      File "./bin/nodes-busy", line 766, in <module>
        merged = merge(job_data, node_data)
      File "./bin/nodes-busy", line 330, in merge
        nodes_dictionary[node]["JOBS"][job] = {"CPUs":cpus, "GPUs":jobs_dictionary[job]["GPUAlloc"],"EndTime":jobs_dictionary[job]["EndTime"],"Partition":jobs_dictionary[job]["Partition"],"Restarts":jobs_dictionary[job]["Restarts"]}
    KeyError: 'GPUAlloc'
    

    I'm digging through the code, but figured I'd post this for tracking purposes.

    Thanks, Mike

    opened by flakrat 7
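The traceback above comes from indexing a job dictionary with a key that not every cluster reports. One defensive pattern, shown here only as a sketch, is to read optional fields with dict.get and a fallback. The function name summarize_job and the fallback values (0 and "Unknown") are assumptions for illustration, not the fix adopted in the repository.

```python
def summarize_job(job, cpus):
    """Build the per-job record, tolerating fields a cluster may not report."""
    return {
        "CPUs": cpus,
        # Clusters without GPU accounting may never set GPUAlloc,
        # so fall back to 0 instead of raising KeyError (assumed default).
        "GPUs": job.get("GPUAlloc", 0),
        "EndTime": job.get("EndTime", "Unknown"),
        "Partition": job.get("Partition", "Unknown"),
        "Restarts": job.get("Restarts", "0"),
    }


# Hypothetical job record with no GPUAlloc key at all.
job_without_gpu = {"EndTime": "2022-06-02T10:00:00", "Partition": "express", "Restarts": "0"}
record = summarize_job(job_without_gpu, cpus=4)
print(record["GPUs"])
```

Whether a missing GPUAlloc should be shown as zero GPUs or flagged in some other way is a design choice for the maintainer.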
  • nodes-busy "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xad in position 53196: invalid start byte"

    Howdy, noticed the following error today when running nodes-busy on our cluster.

    > nodes-busy
    
    Traceback (most recent call last):
      File "/share/apps/rc/bin/nodes-busy", line 1177, in <module>
        job_data = get_scontrol_job_data()
      File "/share/apps/rc/bin/nodes-busy", line 370, in get_scontrol_job_data
        output = out.decode('utf-8').split("\n")
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xad in position 53196: invalid start byte
    

    I have no idea what's in position 53196, but I fixed the issue by passing 'ignore' as the error handler to out.decode:

    diff --git a/bin/nodes-busy b/bin/nodes-busy
    index 6fb1744..aee728f 100755
    --- a/bin/nodes-busy
    +++ b/bin/nodes-busy
    @@ -367,7 +367,7 @@ def get_scontrol_job_data(target_job = None):
             sys.exit(1)
    
         # Split up space-delimited output into a job dictionary
    -    output = out.decode('utf-8').split("\n")
    +    output = out.decode('utf-8', 'ignore').split("\n")
         for job in output:
             details = job.split(' ')
             for i in details:
    
    opened by flakrat 3
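To show what the 'ignore' handler in the diff above does, the sketch below decodes a small hypothetical byte string containing the same 0xad byte. In Latin-1 that byte is a soft hyphen, so one plausible cause is a job field written in a non-UTF-8 encoding, but that is a guess. The 'replace' handler is shown alongside as an alternative that keeps a visible marker where bytes were dropped.

```python
# Hypothetical scontrol-style output with one byte that is not valid UTF-8.
raw = b"JobId=1 JobName=soft\xadhyphen UserId=alice"

try:
    raw.decode("utf-8")
    strict_failed_at = None
except UnicodeDecodeError as err:
    strict_failed_at = err.start  # byte offset of the offending byte

# 'ignore' silently drops undecodable bytes (the approach in the diff).
ignored = raw.decode("utf-8", "ignore")

# 'replace' substitutes U+FFFD so the damage stays visible in the output.
replaced = raw.decode("utf-8", "replace")

print(strict_failed_at)
print(ignored)
print(replaced)
```

Either handler keeps the script running. 'ignore' gives cleaner output, while 'replace' makes it easier to spot which job record carried the bad bytes.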
  • Feature added user arg to past-jobs

    Added new argument to past-jobs to support selecting jobs by someone other than the user running the script:

    $ past-jobs -d 30 -u flakrat
    
                             Jobs submitted by user flakrat in last 30 days.
    
    JobID    Start       User            JobName         Partition  Account    State       ExitCode
    ------------------------------------------------------------------------------------------------
    14364561 2022-06-02  flakrat         hostname        express    flakrat    CANCELLED        0:0
    14385956 2022-06-05  flakrat         hostname        express    flakrat    COMPLETED        0:0
    14850816 2022-06-12  flakrat         hostname        express    flakrat    COMPLETED        0:0
    
    opened by flakrat 0
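A user-selection flag like the one described in the last comment can be sketched with argparse. The short flags -d and -u come from the example invocation. The long option names, the default of 1 day, and the fallback to the invoking user are assumptions for illustration and may differ from the actual past-jobs script.

```python
import argparse
import getpass


def build_parser():
    """Argument parser sketch for a past-jobs style command."""
    parser = argparse.ArgumentParser(prog="past-jobs")
    parser.add_argument("-d", "--days", type=int, default=1,
                        help="how many days back to search (assumed default: 1)")
    parser.add_argument("-u", "--user", default=None,
                        help="user whose jobs to list (default: the invoking user)")
    return parser


def resolve_user(args):
    """Use the requested user, or fall back to whoever runs the script."""
    return args.user or getpass.getuser()


# Parse the same arguments as the example invocation above.
args = build_parser().parse_args(["-d", "30", "-u", "flakrat"])
header = f"Jobs submitted by user {resolve_user(args)} in last {args.days} days."
print(header)
```

Resolving the default user lazily, rather than in the parser definition, avoids looking up the current account when -u is given explicitly.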
Releases(v0.2.0)
  • v0.2.0(May 11, 2022)

  • v0.1.0(Dec 9, 2021)

    The scripts in this repository have some bits and pieces that are specific to the HPC setup at University of Arizona. These parts are being cleaned up.

    Source code(tar.gz)
    Source code(zip)
Owner
Sara
HPC Consultant at University of Arizona, UITS