SCOOP (Scalable COncurrent Operations in Python)

Overview

SCOOP logo

SCOOP (Scalable COncurrent Operations in Python) is a distributed task module that enables concurrent parallel programming on various environments, from heterogeneous grids to supercomputers. Its documentation is available at http://scoop.readthedocs.org/ .

Philosophy

SCOOP was designed from the following ideas:

  • The future is parallel;
  • Simple is beautiful;
  • Parallelism should be simpler.

These tenets translate concretely into a minimal set of functions that allow maximum parallel efficiency while requiring a minimum of inner knowledge to use. SCOOP is implemented with Python 3 in mind while remaining compatible with Python 2.6+, allowing fast prototyping without sacrificing efficiency and speed.


Features

SCOOP features and advantages over futures, multiprocessing and similar modules are as follows:

  • Harnesses the power of multiple computers over a network;
  • Lets tasks spawn multiple subtasks of their own;
  • Offers an API compatible with PEP-3148;
  • Parallelizes serial code with only minor modifications (see the sketch below);
  • Performs efficient load-balancing.
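
For instance, the following minimal sketch (the script and function names are hypothetical) turns a serial map into a parallel one; it must be launched through the SCOOP launcher rather than the bare interpreter:

    # squares.py -- a minimal sketch (hypothetical names), launched with:
    #   python -m scoop squares.py
    from scoop import futures

    def square(x):
        return x * x

    if __name__ == "__main__":
        # Serial version: results = list(map(square, range(16)))
        # The parallel version only swaps in the PEP-3148-style futures.map:
        results = list(futures.map(square, range(16)))
        print(results)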

Anatomy of a SCOOPed program

SCOOP can handle multiple diversified, multi-layered tasks. With it, you can submit your different functions and data simultaneously and effortlessly while the framework executes them locally or remotely. Unlike most multiprocessing frameworks, it lets you launch subtasks within tasks.

http://scoop.readthedocs.org/en/latest/_images/introductory_tree.png

Through SCOOP, you can simultaneously execute tasks that differ in nature (shown by the task color in the figure above) or in complexity (shown by the task radius). The module handles the physical considerations of parallelization for you, such as distributing tasks over your resources (load balancing), communications, and so on.
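
As an illustration, here is a sketch of such a nested workload (the helper names are hypothetical); submitting work from inside a task is what builds a multi-layered tree like the one pictured above:

    # nested.py -- a sketch of tasks spawning subtasks (hypothetical names),
    # launched with: python -m scoop nested.py
    from scoop import futures

    def leaf(x):
        return x * x

    def branch(chunk):
        # A task may itself submit subtasks, which most multiprocessing
        # frameworks disallow but SCOOP supports natively.
        return sum(futures.map(leaf, chunk))

    if __name__ == "__main__":
        chunks = [range(0, 10), range(10, 20), range(20, 30)]
        print(list(futures.map(branch, chunks)))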

Applications

Common applications of SCOOP include, but are not limited to:

  • Evolutionary Algorithms
  • Monte Carlo simulations
  • Data mining
  • Data processing
  • Graph traversal

Citing SCOOP

Authors of scientific papers including results generated using SCOOP are encouraged to cite the following paper.

    @inproceedings{SCOOP_XSEDE2014,
      title        = {Once you SCOOP, no need to fork},
      author       = {Hold-Geoffroy, Yannick and Gagnon, Olivier and Parizeau, Marc},
      booktitle    = {Proceedings of the 2014 Annual Conference on Extreme Science and Engineering Discovery Environment},
      pages        = {60},
      year         = {2014},
      organization = {ACM}
    }

Useful links

You can download the latest stable version, check the project documentation, post to the mailing list, or submit an issue if you've found one.

Comments
  • Installation fails

    Installation fails

    Output of pip install scoop:

    Collecting scoop
      Could not find a version that satisfies the requirement scoop (from versions: 0.7.0.release, 0.7.1.release)
      Some externally hosted files were ignored as access to them may be unreliable (use --allow-external scoop to allow).
      No matching distribution found for scoop
    

    Python version is 2.7.9

    opened by sesslerp 7
  • Failure trying to distribute work in HPC environment

    Failure trying to distribute work in HPC environment

    This error is repeated for each node I am trying to distribute work to.

    [2019-07-17 14:34:28,536] workerLaunch (127.0.0.1:36053) WARNING Could not successfully launch the remote worker on pn033.
    Requested remote group process id, received:
    b''
    Group id decoding error:
    invalid literal for int() with base 10: ''
    SSH process stderr:
    
    /.../venv/bin/python: error while loading shared libraries: libpython3.6m.so.1.0: cannot open shared object file: No such file or directory
    /.../venv/bin/python: error while loading shared libraries: libpython3.6m.so.1.0: cannot open shared object file: No such file or directory
    /.../venv/bin/python: error while loading shared libraries: libpython3.6m.so.1.0: cannot open shared object file: No such file or directory
    /.../venv/bin/python: error while loading shared libraries: libpython3.6m.so.1.0: cannot open shared object file: No such file or directory
    
    opened by rgov 4
  • Error with SLURM

    Error with SLURM

    I'm trying to use scoop on a cluster that uses SLURM. I'm trying to run the example you provide in the documentation (the helloworld example). I've run the example on the head node with a few CPUs and it works (so it seems the installation is correct, at least up to some level), but when I run it through sbatch it returns the following error:

    EXECUTE PYTHON .PY FILE
    Traceback (most recent call last):
      File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
        "__main__", fname, loader, pkg_name)
      File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
        exec code in run_globals
      File "/home/user/.local/lib/python2.7/site-packages/scoop/__main__.py", line 21, in <module>
        main()
      File "/home/user/.local/lib/python2.7/site-packages/scoop/launcher.py", line 454, in main
        args.external_hostname = [utils.externalHostname(hosts)]
      File "/home/user/.local/lib/python2.7/site-packages/scoop/utils.py", line 101, in externalHostname
        hostname = hosts[0][0]
    IndexError: list index out of range
    END OF JOBS

    In the documentation I read that scoop is compatible with SLURM; is there a particular configuration step that is not documented? (The SSH keys are already configured.)

    Thanks,

    opened by pmolea 4
  • Scoop does not find zmq

    Scoop does not find zmq

    What version of the product are you using? On what operating system?
    
    $ python -V
    Python 2.7.7
    
    $ lsb_release -a
    LSB Version:    :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
    Distributor ID: RedHatEnterpriseServer
    Description:    Red Hat Enterprise Linux Server release 6.4 (Santiago)
    Release:        6.4
    Codename:       Santiago
    
    What steps will reproduce the problem?
    1. Download 0.7.1 tarball, open it and cd into the directory 
    
    2. Verify that pyzmq is available:
    
    python -c "import zmq; print 'It is there'"
    
    3. Try to install scoop with:
    
    python setup.py install --prefix=$MY_INSTALL_DIR
    
    What is the expected output? What do you see instead?
    
    (....a bunch of irrelevant output....)
    
    error: command 'gcc' failed with exit status 1
    
    Failed with default libzmq, trying again with /usr/local
    ************************************************
    Configure: Autodetecting ZMQ settings...
        Custom ZMQ dir:       /usr/local
    Assembler messages:
    Fatal error: can't create build/temp.linux-x86_64-2.7/scratch/tmp/easy_install-L0czGy/pyzmq-14.3.1/temp/timer_create9prql0.o: No such file or directory
    build/temp.linux-x86_64-2.7/scratch/vers.c:4:17: fatal error: zmq.h: No such file or directory
     #include "zmq.h"
                     ^
    compilation terminated.
    
    The problem is that zmq.h is in a non-standard location, available to python but not to scoop.
    
    What is the recommended way of telling scoop where zmq is located?
    
    PS: this is related to issue 8, but that one was closed without anybody being assigned to it and was just dismissed as a pyzmq installation problem (which it is not: both pyzmq and zmq are correctly installed and used by other software on this system).
    
    

    Original issue reported on code.google.com by [email protected] on 15 Jul 2014 at 5:04

    Type-Defect Priority-Medium auto-migrated 
    opened by GoogleCodeExporter 4
  • [**Time Sensitive**] pyscoop.org domain expired

    [**Time Sensitive**] pyscoop.org domain expired

    The http://pyscoop.org/ URL currently redirects to a page saying the domain has expired and is up for renewal. It is currently in the autoRenewPeriod status.

    @soravux Thank you for building this amazing library that has made my life as well as many others' lives much easier. I hope you can continue to maintain and contribute to this very useful piece of software. However, if you have decided to abandon this project and no longer intend to renew the domain name, please contact me ASAP. I would be happy to take over the domain name registration and the full-time maintenance of the project if needed. I have a vested interest in the continued survival of this project.

    opened by islamelnabarawy 3
  • Is there any possibility to support the IBM LSF system?

    Is there any possibility to support the IBM LSF system?

    Hi, I have developed my algorithm using DEAP with SCOOP and tested it on our little cluster, which uses the PBS system. Now I want to run it on our big cluster, which has IBM LSF installed. Since I cannot specify the accessible hosts, I found that SCOOP cannot detect the LSF system. Then I found the following instruction:

    SCOOP natively supports Sun Grid Engine (SGE), Torque (PBS-compatible, Moab, Maui) and SLURM. That means that a minimum launch file is needed while the framework recognizes automatically the nodes assigned to your task.

    So, I wonder: is there any possibility of supporting the IBM LSF system?

    Thank you very much!

    opened by crazyzlj 3
  • scoop confused by bashrc echo

    scoop confused by bashrc echo

    I have noted the following problem.

    I have code in my .bashrc that echoes a status message. When ssh-ing to this machine while dispatching jobs, SCOOP appears to be confused by this message and unable to launch jobs on this host effectively (I get an error message specifically mentioning the bashrc status message). From what I could see, only one job per node could be launched in such a case. Is there any way to fix this without having to disable the status message? (The message is rather useful when sshing to the machine in other contexts.) Upon disabling the output, the error message disappears and the scheduling works fine.

    opened by maharjun 3
  • Allocation issue with SLURM

    Allocation issue with SLURM

    When $SLURM_JOB_NODELIST is e.g. "nodes[006,011]" I get the following error:

    File "/python27/lib/python2.7/site-packages/scoop/utils.py", line 209, in parseSLURM bmin,bmax = rng.split('-') ValueError: need more than 1 value to unpack

    If $SLURM_JOB_NODELIST is e.g. "nodes[021-022]" the workers are deployed over the two hosts.
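
    For illustration, a sketch of a parser that accepts both forms (a hypothetical helper, not SCOOP's actual parseSLURM code):

    import re

    def expand_slurm_nodelist(nodelist):
        # Handles both "nodes[006,011]" (comma list) and "nodes[021-022]" (range).
        match = re.match(r"(\w+)\[([\d,\-]+)\]$", nodelist)
        if not match:
            return [nodelist]
        prefix, spec = match.groups()
        hosts = []
        for rng in spec.split(','):
            if '-' in rng:
                bmin, bmax = rng.split('-')
                hosts.extend("{}{:0{}d}".format(prefix, i, len(bmin))
                             for i in range(int(bmin), int(bmax) + 1))
            else:
                hosts.append(prefix + rng)
        return hosts

    print(expand_slurm_nodelist("nodes[006,011]"))  # ['nodes006', 'nodes011']
    print(expand_slurm_nodelist("nodes[021-022]"))  # ['nodes021', 'nodes022']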

    opened by croessert 3
  • Is this project abandoned?

    Is this project abandoned?

    It looks like this project has not received any updates recently, and the web page http://pyscoop.org/ , which is in the description and the README, redirects to some news site. If the authors are still active on GitHub, could they please comment on the status of the project?

    opened by johann-petrak 2
  • Using scoop with SLURM

    Using scoop with SLURM

    Is there any documentation for how to use scoop with SLURM?

    One of the main things I'm wondering about is whether to provide a hosts file to scoop or not when running it from SLURM. Does it automatically figure out the hosts and run simulations on them otherwise?

    #!/bin/bash
    #SBATCH [email protected]
    #SBATCH --mail-type=ALL
    #SBATCH --nodes=7
    #SBATCH --ntasks=72
    #SBATCH --time=99:00:00
    #SBATCH --mem=10G
    #SBATCH --output=python_job_slurm.out
    
    # Which one is correct?
    python -m scoop --hostfile hosts.txt my-script.py
    python -m scoop my-script.py
    

    and I run it with sbatch python.slurm

    opened by anandtrex 2
  • socket.gaierror: [Errno -2] Name or service not known

    socket.gaierror: [Errno -2] Name or service not known

    I couldn't use an ssh host name.

    I checked the parameters passed to the getaddrinfo method.

    scoop.BROKER.externalHostname returned thx and scoop.BROKER.task_port returned a random int.

    How can I use another computer through ssh with a ~/.ssh/config file? It doesn't look supported, as far as I can tell from the process.

    Environment

    • Ubuntu 16.04 x64
    • Python 2.7.12
    • Scoop 0.7.2.0
    • ssh thx is properly working.
    thx@thx-Prime:~/workspace/scoop$ python -m scoop --host thx -vv scoop_test.py
    [2016-12-21 12:11:44,449] launcher  INFO    SCOOP 0.7 2.0 on linux2 using Python 2.7.12 (default, Oct 21 2016, 22:26:43) [GCC 5.4.0 20160609], API: 1013
    [2016-12-21 12:11:44,449] launcher  INFO    Deploying 1 worker(s) over 1 host(s).
    [2016-12-21 12:11:44,449] launcher  DEBUG   Using hostname/ip: "thx" as external broker reference.
    [2016-12-21 12:11:44,449] launcher  DEBUG   The python executable to execute the program with is: /home/thx/.pyenv/versions/2.7.12/bin/python.
    [2016-12-21 12:11:44,449] launcher  INFO    Worker distribution:
    [2016-12-21 12:11:44,449] launcher  INFO       thx:     0 + origin
    [2016-12-21 12:11:44,449] brokerLaunch DEBUG   Launching remote broker: ssh -tt -x -oStrictHostKeyChecking=no -oBatchMode=yes -oUserKnownHostsFile=/dev/null -oServerAliveInterval=300 thx /home/thx/.pyenv/versions/2.7.12/bin/python -m scoop.broker.__main__ --echoGroup --echoPorts --backend ZMQ
    [2016-12-21 12:11:44,750] brokerLaunch DEBUG   Foreign broker launched on ports 46544, 45126 of host thx.
    [2016-12-21 12:11:44,750] launcher  DEBUG   Initialising remote origin worker 1 [thx].
    [2016-12-21 12:11:44,751] launcher  DEBUG   thx: Launching '/home/thx/.pyenv/versions/2.7.12/bin/python -m scoop.launch.__main__ 1 3 --size 1 --workingDirectory "/home/thx/workspace/scoop" --brokerHostname 127.0.0.1 --externalBrokerHostname thx --taskPort 46544 --metaPort 45126 --origin --backend=ZMQ -vvv scoop_test.py'
    Warning: Permanently added '192.168.21.10' (ECDSA) to the list of known hosts.
    Launching 1 worker(s) using /bin/bash.
    Executing '['/home/thx/.pyenv/versions/2.7.12/bin/python', '-m', 'scoop.bootstrap.__main__', '--size', '1', '--workingDirectory', '/home/thx/workspace/scoop', '--brokerHostname', '127.0.0.1', '--externalBrokerHostname', 'thx', '--taskPort', '46544', '--metaPort', '45126', '--origin', '--backend=ZMQ', '-vvv', 'scoop_test.py']'...
    Traceback (most recent call last):
      File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/runpy.py", line 174, in _run_module_as_main
        "__main__", fname, loader, pkg_name)
      File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/runpy.py", line 72, in _run_code
        exec code in run_globals
      File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 298, in <module>
        b.main()
      File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
        self.run()
      File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 285, in run
        futures_startup()
      File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 266, in futures_startup
        run_name="__main__"
      File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/futures.py", line 65, in _startup
        result = _controller.switch(rootFuture, *args, **kargs)
      File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/_control.py", line 199, in runController
        execQueue = FutureQueue()
      File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/_types.py", line 264, in __init__
        self.socket = Communicator()
      File "/home/thx/.pyenv/versions/2.7.12/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 70, in __init__
        info = socket.getaddrinfo(scoop.BROKER.externalHostname, scoop.BROKER.task_port)[0]
    socket.gaierror: [Errno -2] Name or service not known
    Exception AttributeError: "'FutureQueue' object has no attribute 'socket'" in <bound method FutureQueue.__del__ of <scoop._types.FutureQueue object at 0x7f5abe703a90>> ignored
    Connection to 192.168.21.10 closed.
    [2016-12-21 12:11:45,344] launcher  INFO    Root process is done.
    [2016-12-21 12:11:45,345] workerLaunch DEBUG   Closing workers on thx (1 workers).
    [2016-12-21 12:11:45,345] brokerLaunch DEBUG   Closing broker on host thx.
    Warning: Permanently added '192.168.21.10' (ECDSA) to the list of known hosts.
    
    opened by fx-kirin 2
  • Python 3.10 Cannot import name 'Iterable' from 'collections'

    Python 3.10 Cannot import name 'Iterable' from 'collections'

    In Python 3.10 this class has been moved.

    from collections import Iterable is now from collections.abc import Iterable

    Error message:

    Traceback (most recent call last):
      File ".../app/onemax_island_scoop.py", line 28, in <module>
        from scoop import futures
      File ".../venv/lib/python3.10/site-packages/scoop/futures.py", line 19, in <module>
        from collections import namedtuple, Iterable
    ImportError: cannot import name 'Iterable' from 'collections' (/opt/homebrew/Cellar/[email protected]/3.10.1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/collections/__init__.py)
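
    A version-tolerant import shim along the lines of the fix described above might look like this (a sketch, not the project's actual patch):

    try:
        # Python 3.3+ (required on 3.10, where the collections alias was removed)
        from collections.abc import Iterable
    except ImportError:
        # Legacy fallback for older Pythons
        from collections import Iterable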
    
    opened by remibaar 0
  • requested remote group process id received:b''

    requested remote group process id received:b''

    I used OpenSSH to connect two Windows 10 computers. The connection works over OpenSSH, but when I use the remote computer I get an error:

    Requested remote group process id, received:
    b''
    Group id decoding error:
    invalid literal for int() with base 10: ''
    SSH process stderr:
    b'\xb4\xcb\xca\xb1\xb2\xbb\xd3\xa6\xd3\xd0 )\xa1\xa3\r\n'

    (The stderr bytes are GBK-encoded Chinese for the Windows error ") was unexpected at this time.")


    My command is: python -m scoop --hostfile hostsUse -vv -n 8 SCOOPTest.py

    opened by kukuwa 0
  • cpu_usage 100% In centos

    cpu_usage 100% In centos

    When I use scoop on CentOS, no matter what number I set with -n, my CPU usage is 100%, while the CPU load is perfectly normal on macOS. P.S. My CentOS machine has 40 cores.

    How does this happen?

    opened by eheroqiu 0
  • Another question, help me please.

    Another question, help me please.

    Hello! I ran the example in the documentation; the code is as follows:

    """test.py"""
    from math import hypot
    from random import random
    from scoop import futures
    import time

    def test(tries):
        return sum(hypot(random(), random()) < 1 for _ in range(tries))

    def calcPi(nbFutures, tries):
        expr = futures.map(test, [tries] * nbFutures)
        return 4. * sum(expr) / float(nbFutures * tries)

    if __name__ == "__main__":
        bt = time.time()
        print("pi = {}".format(calcPi(3000, 5000)))
        print('time:', time.time() - bt)

    I ran this code in the Windows cmd with the command "python -m scoop test.py"; however, it cost 33 seconds. Then I used "map" instead of "futures.map", and it cost just 4 seconds.

    What should I do to let SCOOP be able to speed up my code?

    Thanks.

    opened by sc1101 0
  • Why is it much slower when using python -m scoop deap_ga_onemax.py?

    Why is it much slower when using python -m scoop deap_ga_onemax.py?

    Hello! When I used "python -m scoop deap_ga_onemax.py" to run the example file "deap_ga_onemax.py", I found it is much slower than just using "python deap_ga_onemax.py" to run it. Why does that happen? What should I do to use scoop to speed up the code in "deap_ga_onemax.py"?

    opened by sc1101 0
  • How to wait for the finish of subprocess.run(["./executer"]) when launched by SCOOP?

    How to wait for the finish of subprocess.run(["./executer"]) when launched by SCOOP?

    I am trying to use SCOOP and DEAP for the optimization of an FEA model. In Python the model is run by subprocess.run(["./executer"]). Using Python's multiprocessing.Pool, I can successfully run the external executor using subprocess.run. But when launched by SCOOP, subprocess.run returns immediately without waiting for the modeling result, though I can see the external executor running in the background. There is no error information while running. Please tell me how to solve this? Thank you!

    opened by esiwgnahz 1