Jug: A Task-Based Parallelization Framework

Luis Pedro Coelho

Last update: Dec 21, 2022

Related tags

Concurrency and Parallelism python workflow hpc workflow-engine parallel-computing python-3 python-2

Overview

Jug: A Task-Based Parallelization Framework

Jug allows you to write code that is broken up into tasks and run different tasks on different processors.

Join the chat at https://gitter.im/luispedro/jug

It uses the filesystem to communicate between processes and works correctly over NFS, so you can coordinate processes on different machines.

Jug is a pure Python implementation and should work on any platform.

Python versions 3.5 and above are supported.

Website: http://luispedro.org/software/jug

Documentation: https://jug.readthedocs.org/

Video: On vimeo or showmedo

Mailing List: http://groups.google.com/group/jug-users

Testimonials

"I've been using jug with great success to distribute the running of a reasonably large set of parameter combinations" - Andreas Longva

Install

You can install Jug with pip:

pip install Jug

or use, if you are using conda, you can install jug from conda-forge using the following commands:

conda config --add channels conda-forge
conda install jug

Citation

If you use Jug to generate results for a scientific publication, please cite

Coelho, L.P., (2017). Jug: Software for Parallel Reproducible Computation in Python. Journal of Open Research Software. 5(1), p.30.

http://doi.org/10.5334/jors.161

Short Example

Here is a one minute example. Save the following to a file called primes.py (if you have installed jug, you can obtain a slightly longer version of this example by running jug demo on the command line):

from jug import TaskGenerator
from time import sleep

@TaskGenerator
def is_prime(n):
    sleep(1.)
    for j in range(2,n-1):
        if (n % j) == 0:
            return False
    return True

primes100 = [is_prime(n) for n in range(2,101)]

This is a brute-force way to find all the prime numbers up to 100. Of course, this is only for didactical purposes, normally you would use a better method. Similarly, the sleep function is so that it does not run too fast. Still, it illustrates the basic functionality of Jug for embarassingly parallel problems.

Type jug status primes.py to get:

Task name                  Waiting       Ready    Finished     Running
----------------------------------------------------------------------
primes.is_prime                  0          99           0           0
......................................................................
Total:                           0          99           0           0

This tells you that you have 99 tasks called primes.is_prime ready to run. So run jug execute primes.py &. You can even run multiple instances in the background (if you have multiple cores, for example). After starting 4 instances and waiting a few seconds, you can check the status again (with jug status primes.py):

Task name                  Waiting       Ready    Finished     Running
----------------------------------------------------------------------
primes.is_prime                  0          63          32           4
......................................................................
Total:                           0          63          32           4

Now you have 32 tasks finished, 4 running, and 63 still ready. Eventually, they will all finish and you can inspect the results with jug shell primes.py. This will give you an ipython shell. The primes100 variable is available, but it is an ugly list of jug.Task objects. To get the actual value, you call the value function:

In [1]: primes100 = value(primes100)

In [2]: primes100[:10]
Out[2]: [True, True, False, True, False, True, False, False, False, True]

What's New

Version 2.1.1 (18 March 2021)

Include requirements files in distribution

Version 2.1.0 (18 March 2021)

Improvements to webstatus (by Robert Denham)
Removed Python 2.7 support
Fix output encoding for Python 3.8
Fix bug mixing mapreduce() & status --cache
Make block_access (used in mapreduce()) much faster (20x)
Fix important redis bug
More precise output in cleanup command

Version 2.0.2 (Thu Jun 11 2020)

Fix command line argument parsing

Version 2.0.1 (Thu Jun 11 2020)

Fix handling of JUG_EXIT_IF_FILE_EXISTS environmental variable
Fix passing an argument to jug.main() function
Extend --pdb to exceptions raised while importing the jugfile (issue #79)

version 2.0.0 (Fri Feb 21 2020)

jug.backend.base_store has 1 new method 'listlocks'
jug.backend.base_lock has 2 new methods 'fail' and 'is_failed'
Add 'jug execute --keep-failed' to preserve locks on failing tasks.
Add 'jug cleanup --failed-only' to remove locks from failed tasks
'jug status' and 'jug graph' now display failed tasks
Check environmental exit variables by default (suggested by Renato Alves, issue #66)
Fix 'jug sleep-until' in the presence of barrier() (issue #71)

For older version see ChangeLog file or the full history.

Comments

Jug doesn't like jobs when they're ended by SIGTERM

There seems to be a weird interaction when you run jug cleanup while jobs are still running, but I could be wrong, but in addition, when a job is ended by sigterm jobs seem to enter the completed stage and never finish.

opened by danpf 12
Implement a wrapper to run jug with gridmap
A light-weight wrapper to run jug with gridmap on a Grid Engine cluster

Get the best of both worlds of gridmap and jug

http://pythonhosted.org/Jug/

http://gridmap.readthedocs.org/en/latest/

This implements jug.grid_jug to run jug jobs with all the comfortable features of gridmap.

As I am not sure whether this belongs into the gridmap or the jug package, I create a pull request for both to foster discussion between the two projects. The file jug/grid.py is identical with the file gridmap/jug.py .
opened by andsor 9
Processes only start working after checking status

When using subprocess.Popen(["jug", "execute", "exmp.py"], stderr=STDOUT, creationflags=CREATE_NEW_CONSOLE) runing jug process, the subprocess will only starts after checking status. The same thing happens when running by subprocess.Popen(["jug", "execute", "exmp.py"], stderr=STDOUT, shell=True)

This problem doesn't appear when using subprocess.Popen(["jug", "execute", "exmp.py"], stderr=STDOUT)

Would you mind to help?

opened by chenxi-shi 7
Failed to assign tasks correctly

Hi, I wrote a small piece of python code on a macbook which has 10 + 1 tasks. 10 similar sleeping task and 1 joint task. I am running it by 4 processes and found that sometime Jug cannot assign task correctly. It happens like one process does all tasks along and, at the same time, another 3 share the 11 tasks.

opened by chenxi-shi 6
Raises error if jugdir is not defined
If jugdir is not provided on command-line to "jug execute", and there is no jug config file, the program errors out, since this is prior to getting to the call to init() which handles the missing jugdir.

Just define a default:

parser.add_option('--jugdir', action='store', dest='jugdir', default='jugdir', # define default value help='Where to save intermediate results')
opened by rjplevin 6
IterateTask as a decorator
I have been using a decorator to provide the functionality of iteratetask instead of the function.

The reason I like this is that it specifies how to handle the output (i.e. how many items in the tuple) closer to the place the output is specified (i.e. at the end of the function) rather than where the workflow is specified (which is usually a different file).

So it looks like:

@IterateTask(n=3) @jug.TaskGenerator def do_thing(args): return a, b, c

I've been using the following for a few weeks without incident:

import jug from functools import wraps def IterateTask(n): def decorator(f): @wraps(f) def wrapper(*args, **kwargs): return jug.iteratetask(f(*args, **kwargs), n=n) return wrapper return decorator

Is this something you'd be interested in adding? Is there any reason I haven't foreseen that this isn't a good idea?
opened by justinrporter 5
The right way to use Jug with class object
Hi, I defined a class which has a init. There are some methods in this class is decorated with @TaskGenerator. These methods cannot get class instance correctly. I tried 2 ways to solve it and they all don't work.

decorate init with @TaskGenerator then init() return's a jug Task instead of None

put a barrier() after defining the class object then got TypeError: cannot serialize '_io.TextIOWrapper' object

Would you mind to tell me the correct way to do it?
opened by chenxi-shi 5
Backwards compatibility fixes

sys.argv now contains jugfile and unparsed arguments (i.e. after --) The subcommand api tests were also leaking causing other parts of the testsuite to fail.

opened by unode 5

`RuntimeError: maximum recursion depth exceeded while calling a Python object`

Came across this error today:

CRITICAL:root:Exception while running tangent_lstm.test_lstm: maximum recursion depth exceeded while calling a Python object
Traceback (most recent call last):
  File "/gpfs/main/home/skainswo/Research/kaggle_gal/venv/bin/jug", line 5, in <module>
    main()
  File "/gpfs/main/home/skainswo/Research/kaggle_gal/venv/local/lib/python2.7/site-packages/jug/jug.py", line 415, in main
    execute(options)
  File "/gpfs/main/home/skainswo/Research/kaggle_gal/venv/local/lib/python2.7/site-packages/jug/jug.py", line 254, in execute
    execution_loop(tasks, options)
  File "/gpfs/main/home/skainswo/Research/kaggle_gal/venv/local/lib/python2.7/site-packages/jug/jug.py", line 163, in execution_loop
    t.run(debug_mode=options.debug)
  File "/gpfs/main/home/skainswo/Research/kaggle_gal/venv/local/lib/python2.7/site-packages/jug/task.py", line 95, in run
    self.store.dump(self._result, name)
  File "/gpfs/main/home/skainswo/Research/kaggle_gal/venv/local/lib/python2.7/site-packages/jug/backends/file_store.py", line 139, in dump
    encode_to(object, output)
  File "/gpfs/main/home/skainswo/Research/kaggle_gal/venv/local/lib/python2.7/site-packages/jug/backends/encode.py", line 77, in encode_to
    write(object, stream)
RuntimeError: maximum recursion depth exceeded while calling a Python object

I'm getting this error when creating a Task that wraps a function which returns a Keras model.

opened by samuela 5

A few Fixes to the redis backend and the hash function, probably worth to merge, probably not.
The url decomposition regexp for the redis back-end apparently had a problem.

The hash-from-arguments approach didn't work for my particular problem: a type can be 'pickle-able' without the pickle being unique, something that makes tasks non-uniquely identifiable. So, made it possible for the user to control the hash somehow.
opened by alcidesv 5
Allow cleanup of only locks

I recently had an issue with a few of my jug processes being killed and therefore locks being kept on some active tasks. Normally, I would just delete the lock files in the jugdir, but I was using the Redis backend, so this was not possible. As suggested in the docs, I ran jug cleanup, expecting only the locks to be cleared; I was surprised to find the completed jobs cleared as well.

Is there any way to have jug cleanup only clear the locks? If not, can there be an option for this?

opened by freemansw1 4

Invalidate --target is too greedy

-------------------------------------------------------------------------------------
           0           0          1           0           0  jugfile.function
           0           0          1           0           0  jugfile.function_other
.....................................................................................
           0           0          2           2           0  Total

Running jug invalidate --target jugfile.function invalidates everything.

opened by unode 4

Releases(v2.2.2)

v2.2.2(Jul 18, 2022)

Fix jug cleanup when packs are being used (after jug pack)
Source code(tar.gz)
Source code(zip)
v2.2.0(May 2, 2022)

Major change is addition of jug pack subcommand.

Full Changelog: https://github.com/luispedro/jug/compare/v2.1.1...v2.2.0
Source code(tar.gz)
Source code(zip)
v2.1.1(Mar 21, 2021)

Important are the fixes for Python 3.8 and redis.

Compared to 2.1.0, this patch release includes some missing files in the package.

Full list of changes * Removed Python 2.7 support * Fix output encoding for Python 3.8 * Fix bug mixing mapreduce() & status --cache * Make block_access (used in mapreduce()) much faster (20x) * Fix important redis bug * More precise output in cleanup command
Source code(tar.gz)
Source code(zip)
v2.0.2(Jun 12, 2020)
Bugfix release.

Compared to 2.0.0:

Fix handling of JUG_EXIT_IF_FILE_EXISTS environmental variable

Fix passing an argument to jug.main() function

Extend --pdb to exceptions raised while importing the jugfile (issue #79)

Source code(tar.gz)
Source code(zip)
v2.0.1(Jun 12, 2020)

This version is not recommended: use v2.0.2
Source code(tar.gz)
Source code(zip)
v2.0.0(Jun 12, 2020)
The big changes are failed tasks (contributed by @unode).

Also, environmental variables are now always checked and creating a file called __jug_please_stop.txt will stop a jug execute run in a clean way.

Full ChangeLog

jug.backend.base_store has 1 new method listlocks

jug.backend.base_lock has 2 new methods fail and is_failed

Add 'jug execute --keep-failed' to preserve locks on failing tasks.

Add 'jug cleanup --failed-only' to remove locks from failed tasks

'jug status' and 'jug graph' now display failed tasks

Check environmental exit variables by default (suggested by Renato Alves, issue #66)

Fix 'jug sleep-until' in the presence of barrier() (issue #71)

Source code(tar.gz)
Source code(zip)
v2.0.0rc0(Jan 31, 2020)

This is a major new release

The big changes are failed tasks (contributed by @unode).
Source code(tar.gz)
Source code(zip)
v1.6.9(Jan 31, 2020)

Include a compatibility fix for some versions of numpy

See https://github.com/numpy/numpy/pull/14063
Source code(tar.gz)
Source code(zip)

v1.6.3(Nov 1, 2017)

Add a polite request for citations to the Jug paper:

Coelho, L.P., (2017). Jug: Software for Parallel Reproducible Computation in
Python. Journal of Open Research Software. 5(1), p.30.

http://doi.org/10.5334/jors.161

Source code(tar.gz)
Source code(zip)

v1.6.2(Nov 1, 2017)
Minor improvements.

Full ChangeLog:

Add return_value argument to jug_execute

Add exit_env_vars() function

Source code(tar.gz)
Source code(zip)
v1.6.1(Nov 1, 2017)

Fix bug with invalidate() inside jug shell
Source code(tar.gz)
Source code(zip)
v1.6.0(Aug 24, 2017)
Release 1.6.0

Adds the jug graph subcommand to generate a little graph with tasks and their state.

Add jug graph subcommand

Generates a graph of tasks

jug execute --keep-going now ends with non-zero exit code in case of failures

Fix bug with cleanup in dict_store not providing the number of removed records

Add jug cleanup --keep-locks to remove obsolete results without affecting locks

Almost all of the credit for changes since 1.5.0 goes to Renato Alves (@unode on github).
Source code(tar.gz)
Source code(zip)
v1.5.0(Jul 16, 2017)
Several new functions (is_jug_running, invalidate in shell, ...) and a new internal structure which allows for extensions (work by Renato Alves, aka @Unode).

Full ChangeLog:

Add demo subcommand

Add is_jug_running() function

Fix bug in finding config files

Improved --debug mode: check for unsupported recursive task creation

Add invalidate() to shell environment

Use ~/.config/jug/jugrc as configuration file

Add experimental support for extensible commands, use ~/.config/jug/jug_user_commands.py

jugrc: execute_wait_cycle_time_secs is now execute_wait_cycle_time

Expose sync_move in jug.utils

Source code(tar.gz)
Source code(zip)
v1.3.0(Nov 1, 2016)
Major improvements are IPython 5 compatibility, jug_execute, and timing.

Full ChangeLog:

Update shell subcommand to IPython 5

Use ~/.config/jugrc as configuration file

Cleanup usage string

Use bottle instead of web.py for webstatus subcommand

Add jug_execute function

Add timing functionality

Source code(tar.gz)
Source code(zip)

release-1.2.1(Jun 8, 2016)

A bugfix release, especially fixing jug on Windows (previously unusable as Windows does not support fsync on directories).

Full ChangeLog:

* Changed execution loop to ensure that all tasks are checked (issue #33
on github)
* Fixed bug that made 'check' or 'sleep-until' slower than necessary
* Fixed jug on Windows (which does not support fsync on directories)
* Made Tasklets use slightly less memory

Source code(tar.gz)
Source code(zip)

release-1.2.0(Jun 8, 2016)
Several optimizations in code and minor fixes made a new release worthwhile. Because the hashes have been changed, this needs to be a new (minor) version (see issue #25): upgrading will require you to rerun any tasks that use kwargs.

Full ChangeLog

Use HIGHEST_PROTOCOL when pickle()ing

Add compress_numpy option to file_store

Add register_hook_once function

Optimize case when most (or all) tasks are already run

Add --short option to jug status and jug execute

Fix bug with dictionary order in kwargs (fix by Andreas Sorge)

Fix ipython colors (fix by Andreas Sorge)

Sort tasks in jug status

Source code(tar.gz)
Source code(zip)