dsub: simple batch jobs with Docker

Overview

dsub is a command-line tool that makes it easy to submit and run batch scripts in the cloud.

The dsub user experience is modeled after traditional high-performance computing job schedulers like Grid Engine and Slurm. You write a script and then submit it to a job scheduler from a shell prompt on your local machine.

Today dsub supports Google Cloud as the backend batch job runner, along with a local provider for development and testing. With help from the community, we'd like to add other backends, such as Grid Engine, Slurm, Amazon Batch, and Azure Batch.

Getting started

You can install dsub from PyPI, or you can clone and install from GitHub.

Sunsetting Python 2 support

Python 2 support ended in January 2020. See Python's official Sunsetting Python 2 announcement for details.

Automated dsub tests running on Python 2 have been disabled. Release 0.3.10 is the last version of dsub that supports Python 2.

Use Python 3.6 or greater. For earlier versions of Python 3, use dsub 0.4.1.

Pre-installation steps

This step is optional, but whether you install from PyPI or from GitHub, you are encouraged to use a Python virtual environment.

You can do this in a directory of your choosing.

    python3 -m venv dsub_libs
    source dsub_libs/bin/activate

Using a Python virtual environment isolates dsub library dependencies from other Python applications on your system.

Activate this virtual environment in any shell session before running dsub. To deactivate the virtual environment in your shell, run the command:

    deactivate

Alternatively, a set of convenience scripts is provided that activates the virtualenv before calling dsub, dstat, and ddel. They are in the bin directory. You can use these scripts if you don't want to activate the virtualenv explicitly in your shell.
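
For example, from the root of a cloned repository you might invoke the wrapper directly (a sketch, assuming the wrapper scripts keep the same names as the commands they wrap):

    # Runs dsub with the virtualenv activated for you by the wrapper script.
    ./bin/dsub --help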

Install dsub

Choose one of the following:

Install from PyPI

  1. If necessary, install pip.

  2. Install dsub

     pip install dsub
    

Install from github

  1. Be sure you have git installed

    Instructions for your environment can be found on the git website.

  2. Clone this repository.

    git clone https://github.com/DataBiosphere/dsub
    cd dsub
    
  3. Install dsub (this will also install the dependencies)

    python setup.py install
    
  4. Set up Bash tab completion (optional).

    source bash_tab_complete
    

Post-installation steps

  1. Minimally verify the installation by running:

    dsub --help
    
  2. (Optional) Install Docker.

    This is necessary only if you're going to create your own Docker images or use the local provider.

Makefile

After cloning the dsub repo, you can also use the Makefile by running:

    make

This will create a Python virtual environment and install dsub into a directory named dsub_libs.
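
To run dsub afterwards, activate that virtual environment in your shell just as you would a manually created one (a minimal sketch, assuming the Makefile places the environment in dsub_libs as described above):

    source dsub_libs/bin/activate
    dsub --help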

Getting started with the local provider

We think you'll find the local provider to be very helpful when building your dsub tasks. Instead of submitting a request to run your command on a cloud VM, the local provider runs your dsub tasks on your local machine.

The local provider is not designed for running at scale. It is designed to emulate running on a cloud VM such that you can rapidly iterate. You'll get quicker turnaround times and won't incur cloud charges using it.

  1. Run a dsub job and wait for completion.

    Here is a very simple "Hello World" test:

    "${OUT}"' \ --wait ">
     dsub \
       --provider local \
       --logging "${TMPDIR:-/tmp}/dsub-test/logging/" \
       --output OUT="${TMPDIR:-/tmp}/dsub-test/output/out.txt" \
       --command 'echo "Hello World" > "${OUT}"' \
       --wait
    

    Note: TMPDIR is commonly set to /tmp by default on most Unix systems, although it is also often left unset. On some versions of macOS, TMPDIR is set to a location under /var/folders.

    Note: The above syntax ${TMPDIR:-/tmp} is known to be supported by Bash, zsh, and ksh. The shell will expand TMPDIR, but if it is unset, /tmp will be used.

  2. View the output file.

     cat "${TMPDIR:-/tmp}/dsub-test/output/out.txt"
    

Getting started on Google Cloud

dsub supports the use of two different APIs from Google Cloud for running tasks. Google Cloud is transitioning from Genomics v2alpha1 to Cloud Life Sciences v2beta.

dsub supports both APIs with the (old) google-v2 and (new) google-cls-v2 providers respectively. google-v2 is the current default provider. dsub will be transitioning to make google-cls-v2 the default in coming releases.

The steps for getting started differ slightly as indicated in the steps below:

  1. Sign up for a Google account and create a project.

  2. Enable the APIs:

    • For the v2alpha1 API (provider: google-v2):

    Enable the Genomics, Storage, and Compute APIs.

    • For the v2beta API (provider: google-cls-v2):

    Enable the Cloud Life Sciences, Storage, and Compute APIs.

  3. Install the Google Cloud SDK and run

    gcloud init
    

    This will set up your default project and grant credentials to the Google Cloud SDK. Now provide credentials so dsub can call Google APIs:

    gcloud auth application-default login
    
  4. Create a Google Cloud Storage bucket.

    The dsub logs and output files will be written to a bucket. Create a bucket using the storage browser or run the command-line utility gsutil, included in the Cloud SDK.

    gsutil mb gs://my-bucket
    

    Change my-bucket to a unique name that follows the bucket-naming conventions.

    (By default, the bucket will be in the US, but you can change or refine the location setting with the -l option.)
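
    For example, to create the bucket in a specific region rather than the default US multi-region, you might run something like the following (the location value is illustrative):

     gsutil mb -l us-central1 gs://my-bucket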

  5. Run a very simple "Hello World" dsub job and wait for completion.

    • For the v2alpha1 API (provider: google-v2):

      "${OUT}"' \ --wait ">
        dsub \
          --provider google-v2 \
          --project my-cloud-project \
          --regions us-central1 \
          --logging gs://my-bucket/logging/ \
          --output OUT=gs://my-bucket/output/out.txt \
          --command 'echo "Hello World" > "${OUT}"' \
          --wait
      

    Change my-cloud-project to your Google Cloud project, and my-bucket to the bucket you created above.

    • For the v2beta API (provider: google-cls-v2):

      "${OUT}"' \ --wait ">
        dsub \
          --provider google-cls-v2 \
          --project my-cloud-project \
          --regions us-central1 \
          --logging gs://my-bucket/logging/ \
          --output OUT=gs://my-bucket/output/out.txt \
          --command 'echo "Hello World" > "${OUT}"' \
          --wait
      

    Change my-cloud-project to your Google Cloud project, and my-bucket to the bucket you created above.

    The output of the script command will be written to the OUT file in Cloud Storage that you specify.

  6. View the output file.

     gsutil cat gs://my-bucket/output/out.txt
    

Backend providers

Where possible, dsub tries to support developing and testing locally (for faster iteration) and then progressing to running at scale.

To this end, dsub provides multiple "backend providers", each of which implements a consistent runtime environment. The current providers are:

  • local
  • google-v2 (the default)
  • google-cls-v2 (new)

More details on the runtime environment implemented by the backend providers can be found in dsub backend providers.

Differences between google-v2 and google-cls-v2

The google-cls-v2 provider is built on the Cloud Life Sciences v2beta API. This API is very similar to its predecessor, the Genomics v2alpha1 API. Details of the differences can be found in the Migration Guide.

dsub largely hides the differences between the two APIs, but there are a few differences to note:

  • v2beta is a regional service, v2alpha1 is a global service

What this means is that with v2alpha1, the metadata about your tasks (called "operations") is stored in a global database, while with v2beta, the metadata about your tasks is stored in a regional database. If your operation information needs to stay in a particular region, use the v2beta API (the google-cls-v2 provider), and specify the --location where your operation information should be stored.

  • The --regions and --zones flags can be omitted when using google-cls-v2

The --regions and --zones flags for dsub specify where the tasks should run. More specifically, this specifies what Compute Engine Zones to use for the VMs that run your tasks.

With the google-v2 provider, there is no default region or zone, and thus one of the --regions or --zones flags is required.

With google-cls-v2, the --location flag defaults to us-central1, and if the --regions and --zones flags are omitted, the location will be used as the default regions list.
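
For example, a minimal google-cls-v2 invocation that relies on these defaults might look like the following sketch (project, bucket, and command are placeholders):

dsub \
    --provider google-cls-v2 \
    --project my-cloud-project \
    --location us-central1 \
    --logging gs://my-bucket/logging/ \
    --command 'echo "Hello World"' \
    --wait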

dsub features

The following sections show how to run more complex jobs.

Defining what code to run

You can provide a shell command directly in the dsub command-line, as in the hello example above.

You can also save your script to a file, like hello.sh. Then you can run:

dsub \
    ... \
    --script hello.sh
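
For reference, a hello.sh for this example might be as simple as the following (a hypothetical script, shown only to make the example concrete):

#!/bin/bash
# Print a greeting; use --output parameters if you want to capture results in Cloud Storage.
echo "Hello from a dsub script"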

If your script has dependencies that are not stored in your Docker image, you can transfer them to the local disk. See the instructions below for working with input and output files and folders.

Selecting a Docker image

To get started more easily, dsub uses a stock Ubuntu Docker image. This default image may change at any time in future releases, so for reproducible production workflows, you should always specify the image explicitly.

You can change the image by passing the --image flag.

dsub \
    ... \
    --image ubuntu:16.04 \
    --script hello.sh

Note: your --image must include the Bash shell interpreter.

For more information on using the --image flag, see the image section in Scripts, Commands, and Docker

Passing parameters to your script

You can pass environment variables to your script using the --env flag.

dsub \
    ... \
    --env MESSAGE=hello \
    --command 'echo ${MESSAGE}'

The environment variable MESSAGE will be assigned the value hello when your Docker container runs.

Your script or command can reference the variable like any other Linux environment variable, as ${MESSAGE}.

Be sure to enclose your command string in single quotes and not double quotes. If you use double quotes, the command will be expanded in your local shell before being passed to dsub. For more information on using the --command flag, see Scripts, Commands, and Docker

To set multiple environment variables, you can repeat the flag:

--env VAR1=value1 \
--env VAR2=value2

You can also set multiple variables, space-delimited, with a single flag:

--env VAR1=value1 VAR2=value2

Working with input and output files and folders

dsub mimics the behavior of a shared file system using cloud storage bucket paths for input and output files and folders. You specify the cloud storage bucket path. Paths can be:

  • file paths like gs://my-bucket/my-file
  • folder paths like gs://my-bucket/my-folder
  • wildcard paths like gs://my-bucket/my-folder/*

See the inputs and outputs documentation for more details.
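
For example, a wildcard input might look like the following sketch (placeholder paths). The matching files are copied to the data disk and ${INPUT_FILES} is set to the corresponding local path, wildcard included, so it is left unquoted here to let the shell expand it inside the container:

dsub \
    ... \
    --input INPUT_FILES=gs://my-bucket/my-folder/*.txt \
    --command 'wc -l ${INPUT_FILES}'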

Transferring input files to a Google Cloud Storage bucket

If your script expects to read local input files that are not already contained within your Docker image, the files must be available in Google Cloud Storage.

If your script has dependent files, you can make them available to your script by:

  • Building a private Docker image with the dependent files and publishing the image to a public site, or privately to Google Container Registry
  • Uploading the files to Google Cloud Storage

To upload the files to Google Cloud Storage, you can use the storage browser or gsutil. You can also run on data that's public or shared with your service account, an email address that you can find in the Google Cloud Console.
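
For example, dependent files might be staged with gsutil before submitting the job (paths here are placeholders):

# Copy a single dependent file.
gsutil cp my-reference.txt gs://my-bucket/inputs/my-reference.txt

# Copy a whole folder of dependencies.
gsutil -m cp -r my-dependencies gs://my-bucket/inputs/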

Files

To specify input and output files, use the --input and --output flags:

"${OUTPUT_FILE}"' ">
dsub \
    ... \
    --input INPUT_FILE_1=gs://my-bucket/my-input-file-1 \
    --input INPUT_FILE_2=gs://my-bucket/my-input-file-2 \
    --output OUTPUT_FILE=gs://my-bucket/my-output-file \
    --command 'cat "${INPUT_FILE_1}" "${INPUT_FILE_2}" > "${OUTPUT_FILE}"'

In this example:

  • a file will be copied from gs://my-bucket/my-input-file-1 to a path on the data disk
  • the path to the file on the data disk will be set in the environment variable ${INPUT_FILE_1}
  • a file will be copied from gs://my-bucket/my-input-file-2 to a path on the data disk
  • the path to the file on the data disk will be set in the environment variable ${INPUT_FILE_2}

The --command can reference the file paths using the environment variables.

Also in this example:

  • a path on the data disk will be set in the environment variable ${OUTPUT_FILE}
  • the output file will be written to the data disk at the location given by ${OUTPUT_FILE}

After the --command completes, the output file will be copied to the bucket path gs://my-bucket/my-output-file

Multiple --input and --output parameters can be specified, and they can be specified in any order.

Folders

To copy folders rather than files, use the --input-recursive and --output-recursive flags:

dsub \
    ... \
    --input-recursive FOLDER=gs://my-bucket/my-folder \
    --command 'find ${FOLDER} -name "foo*"'

Multiple --input-recursive and --output-recursive parameters can be specified, and they can be specified in any order.
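
An output folder works similarly in the other direction; here is a minimal sketch with placeholder paths:

dsub \
    ... \
    --output-recursive OUTPUT_FOLDER=gs://my-bucket/my-output-folder \
    --command 'mkdir -p "${OUTPUT_FOLDER}/results" && date > "${OUTPUT_FOLDER}/results/date.txt"'

After the command completes, everything written under ${OUTPUT_FOLDER} on the data disk is copied to gs://my-bucket/my-output-folder.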

Mounting "resource data"

If you have one of the following:

  1. A large set of resource files, your code only reads a subset of those files, and the decision of which files to read is determined at runtime, or
  2. A large input file over which your code makes a single read pass or only needs to read a small range of bytes,

then you may find it more efficient at runtime to access this resource data via mounting a Google Cloud Storage bucket read-only or mounting a persistent disk created from a Compute Engine Image read-only.

The google-v2 and google-cls-v2 providers support these two methods of providing access to resource data. The local provider supports mounting a local directory in a similar fashion to support your local development.

To have the google-v2 or google-cls-v2 provider mount a Cloud Storage bucket using Cloud Storage FUSE, use the --mount command line flag:

--mount MYBUCKET=gs://mybucket

The bucket will be mounted into the Docker container running your --script or --command and the location made available via the environment variable ${MYBUCKET}. Inside your script, you can reference the mounted path using the environment variable. Please read Key differences from a POSIX file system and Semantics before using Cloud Storage FUSE.
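
For example, a task might list or read files directly from the mounted bucket (the object path below is a placeholder):

dsub \
    ... \
    --mount MYBUCKET=gs://my-bucket \
    --command 'ls "${MYBUCKET}/reference"'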

To have the google-v2 or google-cls-v2 provider mount a persistent disk created from an image, use the --mount command line flag and the url of the source image and the size (in GB) of the disk:

--mount MYDISK="https://www.googleapis.com/compute/v1/projects/your-project/global/images/your-image 50"

The image will be used to create a new persistent disk, which will be attached to a Compute Engine VM. The disk will be mounted into the Docker container running your --script or --command and the location made available by the environment variable ${MYDISK}. Inside your script, you can reference the mounted path using the environment variable.

To create an image, see Creating a custom image.

To have the local provider mount a directory read-only, use the --mount command line flag and a file:// prefix:

--mount LOCAL_MOUNT=file://path/to/my/dir

The local directory will be mounted into the Docker container running your --script or --command and the location made available via the environment variable ${LOCAL_MOUNT}. Inside your script, you can reference the mounted path using the environment variable.

Setting resource requirements

dsub tasks run using the local provider will use the resources available on your local machine.

dsub tasks run using the google, google-v2, or google-cls-v2 providers can take advantage of a wide range of CPU, RAM, disk, and hardware accelerator (e.g. GPU) options.

See the Compute Resources documentation for details.
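
As a sketch, resource flags are passed directly on the dsub command line; the values below are illustrative, and the accelerator flags apply only to the Google providers:

dsub \
    ... \
    --min-cores 4 \
    --min-ram 16 \
    --disk-size 200 \
    --accelerator-type nvidia-tesla-t4 \
    --accelerator-count 1 \
    ...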

Submitting a batch job

Each of the examples above has demonstrated submitting a single task with a single set of variables, inputs, and outputs. If you have a batch of inputs and you want to run the same operation over them, dsub allows you to create a batch job.

Instead of calling dsub repeatedly, you can create a tab-separated values (TSV) file containing the variables, inputs, and outputs for each task, and then call dsub once. The result will be a single job-id with multiple tasks. The tasks will be scheduled and run independently, but can be monitored and deleted as a group.

Tasks file format

The first line of the TSV file specifies the names and types of the parameters. For example:

--env SAMPLE_ID    --input VCF_FILE    --output OUTPUT_PATH

Each additional line in the file should provide the variable, input, and output values for each task. Each line beyond the header represents the values for a separate task.

Multiple --env, --input, and --output parameters can be specified and they can be specified in any order. For example:

--env SAMPLE    --input A           --input B           --env REFNAME    --output O
S1              gs://path/A1.txt    gs://path/B1.txt    R1               gs://path/O1.txt
S2              gs://path/A2.txt    gs://path/B2.txt    R2               gs://path/O2.txt

Tasks parameter

Pass the TSV file to dsub using the --tasks parameter. This parameter accepts both the file path and optionally a range of tasks to process. The file may be read from the local filesystem (on the machine you're calling dsub from), or from a bucket in Google Cloud Storage (file name starts with "gs://").

For example, suppose my-tasks.tsv contains 101 lines: a one-line header and 100 lines of parameters for tasks to run. Then:

dsub ... --tasks ./my-tasks.tsv

will create a job with 100 tasks, while:

dsub ... --tasks ./my-tasks.tsv 1-10

will create a job with 10 tasks, one for each of lines 2 through 11.

The task range values can take any of the following forms:

  • m indicates to submit task m (line m+1)
  • m- indicates to submit all tasks starting with task m
  • m-n indicates to submit all tasks from m to n (inclusive).

Logging

The --logging flag points to a location for dsub task log files. For details on how to specify your logging path, see Logging.

Job control

It's possible to wait for a job to complete before starting another. For details, see job control with dsub.
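
For example, a common pattern is to capture the job-id that dsub prints to stdout and pass it to a follow-on job with --after (a sketch; see the job control documentation for the authoritative workflow):

# Launch the first job and capture the job-id that dsub prints to stdout.
JOB_A=$(dsub \
    --provider google-v2 \
    --project my-cloud-project \
    --regions us-central1 \
    --logging gs://my-bucket/logging/ \
    --command 'echo "step 1"')

# Launch a second job that runs only after the first completes successfully.
dsub \
    --provider google-v2 \
    --project my-cloud-project \
    --regions us-central1 \
    --logging gs://my-bucket/logging/ \
    --after "${JOB_A}" \
    --command 'echo "step 2"' \
    --wait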

Retries

It is possible for dsub to automatically retry failed tasks. For details, see retries with dsub.
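
For example, a minimal sketch that resubmits failed tasks up to three times (note that the dsub process must keep running, for example with --wait, for retries to be issued):

dsub \
    ... \
    --retries 3 \
    --wait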

Labeling jobs and tasks

You can add custom labels to jobs and tasks, which allows you to monitor and cancel tasks using your own identifiers. In addition, with the Google providers, labeling a task will label associated compute resources such as virtual machines and disks.
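
For example, a label might be attached at submission time and then used to filter dstat (a sketch; the label key and value are placeholders):

# Submit a task with a custom label.
dsub \
    ... \
    --label batch=test-run-1 \
    --command '...'

# List only the tasks that carry that label.
dstat --provider google-v2 --project my-cloud-project --label batch=test-run-1 --status '*'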

For more details, see Checking Status and Troubleshooting Jobs

Viewing job status

The dstat command displays the status of jobs:

dstat --provider google-v2 --project my-cloud-project

With no additional arguments, dstat will display a list of running jobs for the current USER.

To display the status of a specific job, use the --jobs flag:

dstat --provider google-v2 --project my-cloud-project --jobs job-id

For a batch job, the output will list all running tasks.
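
To see the complete record for a task (events, status detail, logging path, and so on), add the --full (or -f) flag:

dstat --provider google-v2 --project my-cloud-project --jobs job-id --full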

Each job submitted by dsub is given a set of metadata values that can be used for job identification and job control. The metadata associated with each job includes:

  • job-name: defaults to the name of your script file or the first word of your script command; it can be explicitly set with the --name parameter.
  • user-id: the USER environment variable value.
  • job-id: takes the form job-name--userid--timestamp where the job-name is truncated at 10 characters and the timestamp is of the form YYMMDD-HHMMSS-XX, unique to hundredths of a second.
  • task-id: if the job is submitted with the --tasks parameter, each task gets a sequential value of the form "task-n" where n is 1-based.

Note that the job metadata values will be modified to conform with the "Label Restrictions" listed in the Checking Status and Troubleshooting Jobs guide.

Metadata can be used to cancel a job or individual tasks within a batch job.

For more details, see Checking Status and Troubleshooting Jobs

Summarizing job status

By default, dstat outputs one line per task. If you're using a batch job with many tasks then you may benefit from --summary.

$ dstat --provider google-v2 --project my-project --status '*' --summary

Job Name        Status         Task Count
-------------   -------------  -------------
my-job-name     RUNNING        2
my-job-name     SUCCESS        1

In this mode, dstat prints one line per (job name, task status) pair. You can see at a glance how many tasks have finished, how many are still running, and how many have failed or been canceled.

Deleting a job

The ddel command will delete running jobs.

By default, only jobs submitted by the current user will be deleted. Use the --users flag to specify other users, or '*' for all users.

To delete a running job:

ddel --provider google-v2 --project my-cloud-project --jobs job-id

If the job is a batch job, all running tasks will be deleted.

To delete specific tasks:

ddel \
    --provider google-v2 \
    --project my-cloud-project \
    --jobs job-id \
    --tasks task-id1 task-id2

To delete all running jobs for the current user:

ddel --provider google-v2 --project my-cloud-project --jobs '*'

Service Accounts and Scope (Google providers only)

When you run the dsub command with the google-v2 or google-cls-v2 provider, there are two different sets of credentials to consider:

  • Account submitting the pipelines.run() request to run your command/script on a VM
  • Account accessing Cloud resources (such as files in GCS) when executing your command/script

The pipelines.run() request is typically submitted using your end-user credentials. You would have set this up by running:

gcloud auth application-default login

The account used on the VM is a service account. The image below illustrates this:

Pipelines Runner Architecture

By default, dsub will use the default Compute Engine service account as the authorized service account on the VM instance. You can choose to specify the email address of another service account using --service-account.

By default, dsub will grant the following access scopes to the service account:

In addition, the API will always add this scope:

You can choose to specify scopes using --scopes.

Recommendations for service accounts

While it is straightforward to use the default service account, this account also has broad privileges granted to it by default. Following the Principle of Least Privilege, you may want to create and use a service account that has only sufficient privileges granted to run your dsub command/script.

To create a new service account, follow the steps below:

  1. Execute the gcloud iam service-accounts create command. The email address of the service account will be sa-name@project-id.iam.gserviceaccount.com.

     gcloud iam service-accounts create "sa-name"
    
  2. Grant IAM access on buckets, etc. to the service account.

     gsutil iam ch serviceAccount:sa-name@project-id.iam.gserviceaccount.com:roles/storage.objectAdmin gs://bucket-name
    
  3. Update your dsub command to include --service-account

     dsub \
       --service-account sa-name@project-id.iam.gserviceaccount.com
       ...
    

What next?

Comments
  • NO_JOB eventhough nothing ran

    NO_JOB eventhough nothing ran

    I am trying to submit a dsub job and i am not getting the output. I am getting no_job and i am sure the input and output had run before. Can someone help me wi

    #!/usr/bin/python
    
    PROJECT_PATH="xyz"
    
    # There is a manual step: please create a tab-delimited phenotype file at
    # $PROJECT_PATH/pheno.tsv . Output from this project will ultimately go to
    # $PROJECT_PATH/output/* .
    
    # Leave one chromosome out?
    USE_LOCO="TRUE"
    TASK_DEFINITION_FILE="xyz/task1.tsv"
    
    MAX_PREEMPTION=6
    
    HAIL_DOCKER_IMAGE="gcr.io/jhs-project-243319/hail_latest:latest"
    
    # Enable exit on error
    set -o errexit
    
    # Create the mytasks.tsv from our template
    gsutil cat ${TASK_DEFINITION_FILE} | sed -e "s%gs://%${PROJECT_PATH}/output%g" > my.tasks.tsv
    
    # Check for errors when we can
    echo "Checking to make sure that ${PROJECT_PATH}/pheno.tsv exists"
    gsutil ls ${PROJECT_PATH}/jhs.protOI.batch123.ALL.tab
    
    # Launch step 1 and block until completion
    echo "Test"
    dsub \
       --project jhs-project-243319 \
       --provider google-v2 \
       --use-private-address \
       --regions us-central1 us-east1 us-west1 \
       --disk-type local-ssd \
       --disk-size 375 \
       --min-cores 64 \
       --min-ram 64 \
       --image ${HAIL_DOCKER_IMAGE} \
       --retries 1 \
       --skip \
       --wait \
       --logging ${PROJECT_PATH}/dsub-logs \
       --input PHENO_FILE=${PROJECT_PATH}/jhs.protOI.batch123.ALL.tab \
       --input HAIL_PATH=${PROJECT_PATH}/topmed_6a_pass_2k_minDP10_sQC_vQC_AF01_jhsprot.mt \
       --output-recursive OUTPUT_PATH=${PROJECT_PATH}/logs \
       --env LOCO=${USE_LOCO} \
       --timeout '12w' \
       --name test3 \
       --script /home/akhil/anaconda3/lib/python3.7/site-packages/dsub/commands/phewas_jhs_lmm.py \
    opened by apampana 13
  • Silent delocalizing failure

    Silent delocalizing failure

    Hello! I'm trying to use dsub with the --tasks option to run an analysis in 20 chunks. Curiously, the *.logs indicate that the script runs to completion for every task, but only some random subset execute the delocalizing. Furthermore, the tasks that don't delocalize don't throw any kind of error captured in the *.logs. dstat -f, however, identifies the tasks that failed.

    Here's an example of a success:

    - create-time: '2019-07-25 02:16:39.297447'
      dsub-version: v0-3-2
      end-time: '2019-07-25 02:32:30.556849'
      envs:
        CHUNK: '3'
      events:
      - name: start
        start-time: 2019-07-25 06:16:42.171100+00:00
      - name: pulling-image
        start-time: 2019-07-25 06:17:32.995391+00:00
      - name: localizing-files
        start-time: 2019-07-25 06:18:34.308943+00:00
      - name: running-docker
        start-time: 2019-07-25 06:18:36.658863+00:00
      - name: delocalizing-files
        start-time: 2019-07-25 06:32:24.497567+00:00
      - name: ok
        start-time: 2019-07-25 06:32:30.556849+00:00
      input-recursives: {}
      inputs:
        INFILE: gs://haddath/sgosai/hff/data/FADS1_rep8detailed.txt
      internal-id: projects/sabeti-encode/operations/1351805964445161078
      job-id: python--sagergosai--190725-021637-18
      job-name: python
      labels: {}
      last-update: '2019-07-25 02:32:30.556849'
      logging: gs://haddath/sgosai/hff/logs/python--sagergosai--190725-021637-18.4.1.log
      mounts: {}
      output-recursives: {}
      outputs:
        OUTFILE: gs://haddath/sgosai/hff/data/FADS1_rep8__3_20.bed
      provider: google-v2
      provider-attributes:
        accelerators: []
        boot-disk-size: 250
        cpu_platform: ''
        disk-size: 200
        disk-type: pd-standard
        enable-stackdriver-monitoring: false
        instance-name: google-pipelines-worker-fae4230d454b3f6e1038535cbcb0da50
        machine-type: n1-standard-8
        network: ''
        preemptible: true
        regions: []
        service-account: default
        subnetwork: ''
        use_private_address: false
        zone: us-west2-c
        zones:
        - us-central1-a
        - us-central1-b
        - us-central1-c
        - us-central1-f
        - us-east1-b
        - us-east1-c
        - us-east1-d
        - us-east4-a
        - us-east4-b
        - us-east4-c
        - us-west1-a
        - us-west1-b
        - us-west1-c
        - us-west2-a
        - us-west2-b
        - us-west2-c
      script: |-
        #!/usr/bin/env bash
        python /app/hcr-ff/call_peaks.py ${INFILE} ${OUTFILE} -ji ${CHUNK} -jr 20 -ws 100 -ss 100
      script-name: python
      start-time: '2019-07-25 02:16:42.171100'
      status: SUCCESS
      status-detail: Success
      status-message: Success
      task-attempt: 1
      task-id: '4'
      user-id: sagergosai
    

    And a failure:

    - create-time: '2019-07-25 02:16:39.576571'
      dsub-version: v0-3-2
      end-time: '2019-07-25 02:52:45.047989'
      envs:
        CHUNK: '4'
      events:
      - name: start
        start-time: 2019-07-25 06:16:42.182994+00:00
      - name: pulling-image
        start-time: 2019-07-25 06:17:41.422799+00:00
      - name: localizing-files
        start-time: 2019-07-25 06:18:41.913631+00:00
      - name: running-docker
        start-time: 2019-07-25 06:18:44.379215+00:00
      - name: The assigned worker has failed to complete the operation
        start-time: 2019-07-25 06:52:43.907976+00:00
      input-recursives: {}
      inputs:
        INFILE: gs://haddath/sgosai/hff/data/FADS1_rep8detailed.txt
      internal-id: projects/sabeti-encode/operations/8834123416523977731
      job-id: python--sagergosai--190725-021637-18
      job-name: python
      labels: {}
      last-update: '2019-07-25 02:52:45.047989'
      logging: gs://haddath/sgosai/hff/logs/python--sagergosai--190725-021637-18.5.1.log
      mounts: {}
      output-recursives: {}
      outputs:
        OUTFILE: gs://haddath/sgosai/hff/data/FADS1_rep8__4_20.bed
      provider: google-v2
      provider-attributes:
        accelerators: []
        boot-disk-size: 250
        cpu_platform: ''
        disk-size: 200
        disk-type: pd-standard
        enable-stackdriver-monitoring: false
        instance-name: google-pipelines-worker-1d27f8b0a26375721946e521a550105a
        machine-type: n1-standard-8
        network: ''
        preemptible: true
        regions: []
        service-account: default
        subnetwork: ''
        use_private_address: false
        zone: us-east1-b
        zones:
        - us-central1-a
        - us-central1-b
        - us-central1-c
        - us-central1-f
        - us-east1-b
        - us-east1-c
        - us-east1-d
        - us-east4-a
        - us-east4-b
        - us-east4-c
        - us-west1-a
        - us-west1-b
        - us-west1-c
        - us-west2-a
        - us-west2-b
        - us-west2-c
      script: |-
        #!/usr/bin/env bash
        python /app/hcr-ff/call_peaks.py ${INFILE} ${OUTFILE} -ji ${CHUNK} -jr 20 -ws 100 -ss 100
      script-name: python
      start-time: '2019-07-25 02:16:42.182994'
      status: FAILURE
      status-detail: The assigned worker has failed to complete the operation
      status-message: The assigned worker has failed to complete the operation
      task-attempt: 1
      task-id: '5'
      user-id: sagergosai
    

    dsub version: 0.3.2

    opened by sjgosai 13
  • dsub 0.4.3 crashes on ubuntu 20.04

    dsub 0.4.3 crashes on ubuntu 20.04

    When I run the hello world example

    dsub \
       --provider local \
       --logging "${TMPDIR:-/tmp}/dsub-test/logging/" \
       --output OUT="${TMPDIR:-/tmp}/dsub-test/output/out.txt" \
       --command 'echo "Hello World" > "${OUT}"' \
       --wait
    

    it crashes as follows:

    ***WARNING: No Docker image specified. The default, `ubuntu:14.04` will be used.
    ***WARNING: For reproducible pipelines, specify an image with the `--image` flag.
    Job properties:
      job-id: echo--hylke--201220-165144-45
      job-name: echo
      user-id: hylke
    Launched job-id: echo--hylke--201220-165144-45
    To check the status, run:
      dstat --provider local --jobs 'echo--hylke--201220-165144-45' --users 'hylke' --status '*'
    To cancel the job, run:
      ddel --provider local --jobs 'echo--hylke--201220-165144-45' --users 'hylke'
    Waiting for job to complete...
    Waiting for: echo--hylke--201220-165144-45.
    Traceback (most recent call last):
      File "/home/hylke/.local/bin/dsub", line 8, in <module>
        sys.exit(main())
      File "/home/hylke/.local/lib/python3.8/site-packages/dsub/commands/dsub.py", line 1106, in main
        dsub_main(prog, argv)
      File "/home/hylke/.local/lib/python3.8/site-packages/dsub/commands/dsub.py", line 1091, in dsub_main
        launched_job = run_main(args)
      File "/home/hylke/.local/lib/python3.8/site-packages/dsub/commands/dsub.py", line 1168, in run_main
        return run(
      File "/home/hylke/.local/lib/python3.8/site-packages/dsub/commands/dsub.py", line 1322, in run
        error_messages = _wait_after(provider, [job_metadata['job-id']],
      File "/home/hylke/.local/lib/python3.8/site-packages/dsub/commands/dsub.py", line 786, in _wait_after
        jobs_left = _wait_for_any_job(provider, job_ids_to_check, poll_interval,
      File "/home/hylke/.local/lib/python3.8/site-packages/dsub/commands/dsub.py", line 1015, in _wait_for_any_job
        tasks = provider.lookup_job_tasks({'*'}, job_ids=job_ids)
      File "/home/hylke/.local/lib/python3.8/site-packages/dsub/providers/local.py", line 519, in lookup_job_tasks
        task = self._get_task_from_task_dir(j, u, task_id, task_attempt)
      File "/home/hylke/.local/lib/python3.8/site-packages/dsub/providers/local.py", line 669, in _get_task_from_task_dir
        end_time = self._get_end_time_from_task_dir(task_dir)
      File "/home/hylke/.local/lib/python3.8/site-packages/dsub/providers/local.py", line 583, in _get_end_time_from_task_dir
        datetime.datetime.strptime(f.readline().strip(),
      File "/usr/lib/python3.8/_strptime.py", line 568, in _strptime_datetime
        tt, fraction, gmtoff_fraction = _strptime(data_string, format)
      File "/usr/lib/python3.8/_strptime.py", line 349, in _strptime
        raise ValueError("time data %r does not match format %r" %
    ValueError: time data '' does not match format '%Y-%m-%d %H:%M:%S.%f
    

    Any ideas how to get the Hello World example to work?

    Versions: dsub: 0.4.3 Ubuntu: 20.04 Python: 3.8.5 Docker: 19.03.11

    Thanks,

    Hylke

    opened by hylkedonker 12
  • Failure message: wrapping host binaries: pulling image: retry budget exhausted (10 attempts): running [

    Failure message: wrapping host binaries: pulling image: retry budget exhausted (10 attempts): running ["docker" "pull" "bash"]: exit status 1 (standard error: "Error response from daemon: Get https://registry-1.docker.io/v2/

    This is a new error message for me, and I checked the GCP status page in case it was transient but don't see any active issues.

    I am launching a bunch of tasks across a bunch of zones (within a single job) using the google-cls-v2 API with dsub 0.4.3. I am getting the following error, but only on some tasks within this job (while others within the same job are launching and succeeding):

    Failure message: wrapping host binaries: pulling image: retry budget exhausted (10 attempts): running ["docker" "pull" "bash"]: exit status 1 (standard error: "Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)\n")
    

    This strikes me as an odd message, because (1) I'm running these machines without a public IP, and so (2) I'm only using gcr.io Docker instances with them. I don't know why they'd be pointing to registry-1.docker.io.

    If this isn't a dsub issue, I can point this to the mailing list. (And if there is a quick way for me to get a flavor for "dsub issue vs not dsub issue", just let me know so I can self-triage.)

    Thanks!

    opened by carbocation 12
  • Add a verbose mode option flag to dsub

    Add a verbose mode option flag to dsub

    I noticed that the providers have a verbose mode object variable, and that object variable is potentially set via the args :

    $ git grep "getattr(args, 'verbose'"
    dsub/providers/provider_base.py:        getattr(args, 'verbose', False),
    dsub/providers/provider_base.py:        getattr(args, 'verbose', False), getattr(args, 'dry_run', False),
    

    but there was no way to toggle that option via the CLI. This change adds a --verbose option to the command line.

    opened by indraniel 12
  • The NVIDIA driver on your system is too old (found version 10020).

    The NVIDIA driver on your system is too old (found version 10020).

    Not actually 100% sure that this is a dsub issue, but I'm trying to run a Docker image which is based on gcr.io/deeplearning-platform-release/pytorch-gpu.1-6:latest. When I execute python, I get the following error in dsub:

    Failure message: Stopped running "user-command": exit status 1: /site-packages/torch/nn/modules/module.py", line 225, in _apply
        module._apply(fn)
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 247, in _apply
        param_applied = fn(param)
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 463, in convert
        return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
      File "/opt/conda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 150, in _lazy_init
        _check_driver()
      File "/opt/conda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 63, in _check_driver
        of the CUDA driver.""".format(str(torch._C._cuda_getDriverVersion())))
    AssertionError: 
    The NVIDIA driver on your system is too old (found version 10020).
    Please update your GPU driver by downloading and installing a new
    version from the URL: http://www.nvidia.com/Download/index.aspx
    Alternatively, go to: https://pytorch.org to install
    a PyTorch version that has been compiled with your version
    of the CUDA driver.
    

    I believe this is mapped via dsub and so this isn't something I can fix on my end. Is that accurate?

    opened by carbocation 11
  • Error when trying to use requester pays buckets

    Error when trying to use requester pays buckets

    Testing whether or not I can use dsub on files stored in a requester pays bucket returned the following error:

    [u'Error in job wc--jslagel--171129-194301-06 - code 5: 9: Failed to localize files: failed to copy the following files: "gs://xxx-test-requester-pay/xxx.txt -> /mnt/datadisk/input/gs/xxx-test-requester-pay/xxx.txt (cp failed: gsutil -q -m cp gs://xxx-test-requester-pay/xxx.txt /mnt/datadisk/input/gs/xxx-test-requester-pay/xxx.txt, command failed: BadRequestException: 400 Bucket is requester pays bucket but no user project provided.\nCommandException: 1 file/object could not be transferred.\n)"'] JobExecutionError: One or more jobs finished with status FAILURE or CANCELED during wait.

    On the surface it appears that the gsutil command is missing the new '-u ' argument.

    opened by slagelwa 11
  • Does mounting disk images actually work?

    Does mounting disk images actually work?

    I am trying to mount a disk image (ideally, I'd mount a disk snapshot, but that's a secondary issue). However, I can't seem to get the mount to work. Error message is below. Presumably I'm passing the argument incorrectly or missing something in my incantation, but it's not obvious to me what I've done wrong. It seems that the python code being executed is invalid, but maybe this is an issue with my local python installation. Any pointers?

    $ python --version
    Python 2.7.16 :: Anaconda, Inc.
    
    $ dsub --version
    dsub version: 0.3.5
    
    $ dsub --provider google-v2 --project broad-ml4cvd --image ubuntu:18.04 --command '/bin/ls ${MYDISK}' --mount MYDISK=https://www.googleapis.com/compute/v1/projects/broad-ml4cvd/global/images/dl-image-2019-05-13 1000 --logging gs://ukbb_v2/projects/jamesp/tmp/dsub --regions us-central1 --wait
    Traceback (most recent call last):
      File "/Users/jamesp/anaconda2/bin/dsub", line 11, in <module>
        load_entry_point('dsub==0.3.5', 'console_scripts', 'dsub')()
      File "/Users/jamesp/anaconda2/lib/python2.7/site-packages/dsub-0.3.5-py2.7.egg/dsub/commands/dsub.py", line 998, in main
        dsub_main(prog, argv)
      File "/Users/jamesp/anaconda2/lib/python2.7/site-packages/dsub-0.3.5-py2.7.egg/dsub/commands/dsub.py", line 983, in dsub_main
        launched_job = run_main(args)
      File "/Users/jamesp/anaconda2/lib/python2.7/site-packages/dsub-0.3.5-py2.7.egg/dsub/commands/dsub.py", line 1030, in run_main
        output_file_param_util, mount_param_util)
      File "/Users/jamesp/anaconda2/lib/python2.7/site-packages/dsub-0.3.5-py2.7.egg/dsub/lib/param_util.py", line 681, in args_to_job_params
        mount_data.add(mount_param_util.make_param(name, value, disk_size=None))
      File "/Users/jamesp/anaconda2/lib/python2.7/site-packages/dsub-0.3.5-py2.7.egg/dsub/lib/param_util.py", line 273, in make_param
        if raw_uri.startswith('https://www.googleapis.com/compute'):
    AttributeError: 'NoneType' object has no attribute 'startswith'
    
    opened by carbocation 10
  • dstat returns nothing for jobs submitted from cloud shell

    dstat returns nothing for jobs submitted from cloud shell

    I run jobs with dsub both from the Google Cloud Shell and from the Google Cloud SDK installed on my desktop. When I submit a job from my desktop, dstat works as expected. But when I submit a job from the Cloud Shell, dstat returns nothing (when called from either Cloud Shell or my desktop SDK).

    Below is a screenshot of my cloud console (in which I'd previously installed dsub using sudo pip install dsub), showing job submission and attempt to call dstat. dsub_1

    And below is my desktop command line (iTerm on OSX, with dsub installed via cloning this repo and running python setup.py --install), showing dstat working effectively on a job I'd previously submitted from desktop, but returning nothing on the job I ran from the cloud console: dsub_2

    Thanks in advance for your advice.

    opened by bertozzivill 10
  • dsub with a VPC

    dsub with a VPC

    Is it possible to use dsub with a VPC? I can't seem to find a way to specify network/subnet compute resources to pass along to Google pipelines (which in itself is problematic to specify...).

    google-v1-wontfix google-v2 
    opened by slagelwa 9
  • ddel: AttributeError: type object 'HttpError' has no attribute 'resp'

    ddel: AttributeError: type object 'HttpError' has no attribute 'resp'

    I've created too many jobs "by accident" (or rather I hoped it would re-use compute engines). When I tried to delete all of them using: ddel --provider google-v2 --project my-project-name --jobs '*'

    I am getting the following exception at some point:

    Traceback (most recent call last):
      File "/path/to/venv/bin/ddel", line 11, in <module>
        sys.exit(main())
      File "/path/to/venv/local/lib/python2.7/site-packages/dsub/commands/ddel.py", line 137, in main
        create_time_min=create_time)
      File "/path/to/venv/local/lib/python2.7/site-packages/dsub/commands/ddel.py", line 184, in ddel_tasks
        user_ids, job_ids, task_ids, labels, create_time_min, create_time_max)
      File "/path/to/venv/local/lib/python2.7/site-packages/dsub/providers/google_v2.py",line 1069, in delete_jobs
        tasks)
      File "/path/to/venv/local/lib/python2.7/site-packages/dsub/providers/google_base.py", line 445, in cancel
        batch_fn, cancel_fn, ops[first_op:first_op + max_batch])
      File "/path/to/venv/local/lib/python2.7/site-packages/dsub/providers/google_base.py", line 409, in _cancel_batch
        batch.execute()
      File "/path/to/venv/local/lib/python2.7/site-packages/dsub/providers/google_v2.py",line 400, in execute
        self._response_handler(request_id, response, exception)
      File "/path/to/venv/local/lib/python2.7/site-packages/dsub/providers/google_base.py", line 383, in handle_cancel_response
        msg = 'error %s: %s' % (exception.resp.status, exception.resp.reason)
    AttributeError: type object 'HttpError' has no attribute 'resp'
    

    It could be that there are so many tasks to delete. HttpError should have the resp set in the constructor, not sure why it hasn't in that case. Maybe it's a different object (although the full classname was googleapiclient.errors.HttpError).

    I got around it by putting a try/catch google_base, something like (which is obviously a workaround, not a proper solution, but enough to get everything deleted - which took a while):

          try:
            msg = 'error %s: %s' % (exception.resp.status, exception.resp.reason)
            if exception.resp.status == FAILED_PRECONDITION_CODE:
              detail = json.loads(exception.content)
              status = detail.get('error', {}).get('status')
              if status == FAILED_PRECONDITION_STATUS:
                msg = 'Not running'
          except AttributeError:
            msg = 'error %s' % exception
    
    opened by de-code 8
  • Support request for nvidia-a100-80g

    Support request for nvidia-a100-80g

    I believe this is an issue for the life sciences API devs, however not sure where to ask.

    Seeing this error when I request machine a2-ultrapu-4g with accelerator_type: nvidia-a100-80g

    "Error: validating pipeline: unsupported accelerator: "nvidia-a100-80g"". Details: "Error: validating pipeline: unsupported accelerator: "nvidia-a100-80g""
    

    Can you please either request these machines to be made available or please direct me as to best place to ask and I will do so. Thank you.

    opened by rivershah 1
  • Mounting a writable existing persistent disk?

    Mounting a writable existing persistent disk?

    I've had some good successes with mounting existing read-only persistent disks to the VM running a dsub job, and its very cool that one can do this. However I was wondering about attached writable disks. According to the Life Science API documentation:

    If all Mount references to this disk have the readOnly flag set to true, the disk will be attached in read-only mode and can be shared with other instances. Otherwise, the disk will be available for writing but cannot be shared.

    I'm not exactly sure what they mean by Mount references. Do they mean that the disk is attached to zero or more VMs in read only mode? As that would seem to be what is implied by the description. (I'm not sure how outside of the VM that GCP would explicitly know how the disk is actually mounted). I've done some testing with a persistent disk that's unattached to any VMs, and one that was already attached in read only mode to a VM and in either case when I launch a dsub job the persistent disk is always attached in read only mode regardless.

    opened by slagelwa 2
  • ERROR: gcloud crashed (TypeError): a bytes-like object is required, not 'str'

    ERROR: gcloud crashed (TypeError): a bytes-like object is required, not 'str'

    When I run the "Hello World" test for the local provider, it works. When I run it on a custom image, it also works. But when I try to make it run on on any gcr.io/ image, it does not work. Instead, I get the following message in the runner-log.txt file:

    WARNING: `gcloud docker` will not be supported for Docker client versions above 18.03.
    
    As an alternative, use `gcloud auth configure-docker` to configure `docker` to
    use `gcloud` as a credential helper, then use `docker` as you would for non-GCR
    registries, e.g. `docker pull gcr.io/project-id/my-image`. Add
    `--verbosity=error` to silence this warning: `gcloud docker
    --verbosity=error -- pull gcr.io/project-id/my-image`.
    
    See: https://cloud.google.com/container-registry/docs/support/deprecation-notices#gcloud-docker
    
    ERROR: gcloud crashed (TypeError): a bytes-like object is required, not 'str'
    
    If you would like to report this issue, please run the following command:
      gcloud feedback
    
    To check gcloud for common problems, please run the following command:
      gcloud info --run-diagnostics
    

    Given the contents of the error message, and the fact that I have docker version 20 installed, I think this error is not surprising. But is there a way to bypass dsub calling gcloud docker when running gcr.io images with the --local provider so that those of us with docker > 18.03 can use the --local provider?

    opened by carbocation 1
  • Feature request TPU v4 support

    Feature request TPU v4 support

    With tpu v4, google has really cleaned up the user experience around tpu vms. Does the google-cls-v2 provider allow provisioning of tpu v4 machines? If so, is there any example that can please be shown that illustrates provisioning, and loading up any drivers to make the tpu v4 accelerator types visible to jobs submitted via dsub.

    opened by rivershah 3
  • Use gcloud storage instead of gsutil

    Use gcloud storage instead of gsutil

    It seems that gcloud storage will be substantially faster for localization/delocalization vs gsutil. Seems like it would make sense to either apply the shim or to transition to using gcloud storage in place of gsutil in dsub.

    opened by carbocation 8
  • Upgrade dsub dependencies

    Upgrade dsub dependencies

    A project is having trouble resolving dependencies. Could we please consider relaxing dsub dependencies in the next release:

    The conflict is caused by:
        The user requested google-api-python-client==2.52.0
        dsub 0.4.7 depends on google-api-python-client<=2.47.0
    
    opened by rivershah 1
Releases
  • v0.4.8(Dec 22, 2022)

    This release includes:

    • dsub
      • (New) (In Development) Implemented google-batch provider.
        • Note that this is not yet feature parity with the other providers.
        • See Get started with Batch for details on Google Cloud Batch.
      • setup.py: Update dsub dependent libraries to pick up newer versions.
      • Update providers to allow a mount other than /mnt/data.
      • Fix documentation about mounting an existing disk
      • Fix unit test socket.timeout alias issue with Python 3.10
  • v0.4.7(May 18, 2022)

    This release includes:

    • dsub
      • Add support for mounting an existing disk read-only to a pipeline VM
      • setup.py: Update dsub dependent libraries to pick up newer versions
    • Documentation
      • Include documenting the Cloud SDK as a requirement for the local provider
  • v0.4.6(Jan 26, 2022)

    This release includes:

    • dsub
      • Add support for Toronto, Delhi, Melbourne, and Warsaw regions
      • setup.py: Update dsub dependent libraries to pick up newer versions.
  • v0.4.5(Aug 26, 2021)

    This release includes small maintenance updates:

    • dsub
      • Quiet a warning about 'oauth2client' (which dsub no longer uses).
      • Fix one other instance of cache_discovery=True raising ImportError.
      • Add flush method to _Printer.
      • Run pytype on dsub, and fix type errors
      • setup.py: Update dsub dependent libraries to pick up newer versions.
      • Update tenacity version
  • v0.4.4(Feb 18, 2021)

    This release includes:

    • dsub
      • Implement --block-external-network to support network-sandboxed user action.
      • setup.py: Update dsub dependent libraries to pick up newer versions.
      • Check for python or python3 executable in local runner script. Also fix error handling for the first "write_event".
      • Add zones for Seoul, Jakarta, Salt Lake City, and Las Vegas
    • Documentation
      • Rename references to the master branch to the main branch.
  • v0.4.3(Nov 24, 2020)

    This release includes:

    • dsub

      • Update parsing of worker assigned event text to account for changes in Pipelines API.
      • Hide the --nvidia-driver-version flag as it is now ignored by the Pipelines API (since Sept 2020).
    • Documentation

      • Fix link to accelerator API doc in README.
      • Fix documentation that assumed TMPDIR is /tmp.
      • Mention SSH from the browser in troubleshooting docs.
    • Tests

      • Add e2e test that confirms GPU is installed when --accelerator-type and --accelerator-count is used.
      • Removed infrequently used environment variables, CHECK_RESULTS_ONLY and ALLOW_DIRTY_TESTS, from dsub tests.
  • v0.4.2(Oct 15, 2020)

Release 0.4.2 of dsub ends support for Python 3.5, which reached its "end of life" at the end of September 2020.

    The last version of dsub that supports Python 3.5 is 0.4.1. Please use Python 3.6 or greater.

    This release includes:

    • Python code health
      • Remove uses of future from dsub
      • Remove six and its usages from dsub
      • Explicitly support Python 3.6 and up.
    • Feature updates
      • Improvements to dstat output
      • Use "tenacity" library instead of "retrying" library for API retries.
      • Add a get_credentials function that Python clients of dsub, dstat, ddel can override for non-standard runtime environments.
    • google-cls-v2 provider updates:
      • Use batch endpoint in google-cls-v2 provider for job deletion (ddel).
      • google-cls-v2: Use the batch endpoint only for --location us-central1.
  • v0.4.1(Aug 27, 2020)

  • v0.4.0(Aug 26, 2020)

    Release 0.4.0 completes the sunsetting of Python 2 support for dsub. The last version of dsub that supports Python 2 is 0.3.10.

    This release also adds a WARNING when the --image flag is omitted from a call to dsub. The default image is available as a getting started convenience, but for ongoing reproducible workflows, the image should be specified by the caller. The current default is ubuntu:14.04 which reached End Of Life in April 2019. The default image will change in future releases and it is likely to be changed on a semi-regular basis, as popular base Docker linux images change.


    This release includes:

    • dsub
      • Update setup.py in dsub to be Python 3 only.
      • Lint dsub source files as Python3 only. Fix a few lint warnings.
      • Emit warning if default image is used.
      • Print full path of exceptions that are retried.
      • Print retry errors for socket timeout error.
      • Add socket.timeout exceptions to the retry list.
      • Fix markdown formatting in dsub README
  • v0.3.10(Aug 4, 2020)

    This release includes:

    • dsub
      • Update Makefile to use Python3 venv
      • Add documentation around Compute Engine Quotas.
      • Have dsub output job-name and user-id in addition to job-id prior to launching job.
      • Fix for --users '*'
      • Update httplib2 for dsub to 0.18.1
      • Retry transient http error codes when checking GCS
      • Fix for yaml 5.3 where timestamps are already loaded as timezone aware.
      • Improve performance of GCS output file checks for --skip
  • v0.3.9(Jul 6, 2020)

    This release includes:

    • dsub
      • Update version of cloudsdk docker image and revert workaround for gsutil/gcloud auth token bug on GCE, which should now be fixed in the updated image.
      • Update google-auth to 1.18.0 and pin google-api-core to 1.21.0
      • Remove leading characters that are not a letter, number, or underscore when auto-creating a job-name from a command string.
      • Move google_v2 arguments to google_common in the --help text.
      • Add a section in the documentation for Google provider-specific command-line flags.
      • Remove support/tests for legacy local provider job metadata file
    • Testing updates:
      • Re-enable the test e2e_errors.sh for all providers.
      • Add unit test for retrying BrokenPipeError
      • Fix ResourceWarning when running python unit tests.
  • v0.3.8(May 27, 2020)

    This release includes:

    • dsub
      • Remove the google provider and its documentation.
      • Document the existence of the google-cls-v2 provider
      • Document using venv for installation
      • Add a flag --credentials-file to pass service account credentials to the provider.
    • google-v2 provider updates:
      • Add --ssh to dstat.
    • google-cls-v2 provider updates:
      • Add default locations for google-cls-v2 in dstat and ddel.
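
    A minimal sketch of the new flag, assuming a JSON service account key file (the key path, project, and bucket are placeholders):

        dsub \
          --provider google-cls-v2 \
          --project my-cloud-project \
          --regions us-central1 \
          --credentials-file "${HOME}/keys/my-service-account.json" \
          --logging gs://my-bucket/logs/ \
          --command 'echo "running with explicit credentials"' \
          --wait
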
  • v0.3.7(Feb 3, 2020)

    This release includes:

    • dsub
      • New (experimental): implemented a google-cls-v2 provider that passes all tests.
      • Pin all dsub dependencies to a max version
      • Fix broken urls in dsub docs.
      • Add dsub --summary output in wait_and_retry loop.
    • google-v2 provider updates:
      • Enable a shared PID namespace when --ssh is specified.
      • Also retry broken pipe errors
      • Setting --preemptible 0 should not cause an error.
    • Testing
      • Change Travis Python3 version from 3.8 to 3.7.
      • Remove sorting_util_test.
  • v0.3.6(Nov 22, 2019)

    This release includes:

    • dsub
      • Add periodic status update in output (via --summary flag).
      • Update help text to clarify that timeout has a default of 7 days.
      • Replace apiclient with googleapiclient.
      • Emit a message to make it more clear that the dsub process must continue running for retries.
      • Add missing quotes in documentation in example for --mount.
    • google-v2 provider updates:
      • Update gsutil rsync warning message to include command being run.
      • Add workaround for gsutil/gcloud bug that prevents re-authentication.
  • v0.3.5(Oct 23, 2019)

    This release includes:

    • dsub
      • Fix RFC3339 date parsing errors with specific values under Python3
    • google-v2 provider updates:
      • Filter out warning from google-auth.
      • Retry ResponseNotReady error.
      • Add zones for Zurich and Osaka.
      • Move "sleep before retry" messages from INFO in stdout to WARNING in stderr. This ensures that the retry messages bubble up to the stderr output recorded in the pipeline's operation.
  • v0.3.4(Oct 1, 2019)

    This release includes:

    • dsub
      • Explicitly reject launching jobs if blank lines are found in the tasks file, instead of erroring (see the tasks-file sketch after this list).
    • google-v2 provider updates:
      • Expose the Pipelines API operation id in dsub launch stderr output.
      • Replace oauth2lib with google-auth
      • Update cloud sdk image to a GCR hosted one, gcr.io/google.com/cloudsdktool/cloud-sdk:264.0.0-slim. This version was specifically chosen as the next version updates gsutil from 4.42 to 4.43. 4.43 includes undesired changes to gsutil cp. See gsutil's changelogs and the regression for details.
    • Test updates:
      • Use fake time in retrying_test.
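
    For reference, a tasks file is a TSV whose header row names dsub parameters and whose data rows each define one task; the sketch below is illustrative only (sample IDs, project, and bucket are placeholders) and contains no blank lines:

        # Build a simple two-task TSV file (header row of dsub parameters, one row per task).
        printf '%s\t%s\n' '--env SAMPLE_ID' '--output OUTPUT_FILE'   >  tasks.tsv
        printf '%s\t%s\n' 'NA12878' 'gs://my-bucket/out/NA12878.txt'  >> tasks.tsv
        printf '%s\t%s\n' 'NA12891' 'gs://my-bucket/out/NA12891.txt'  >> tasks.tsv

        dsub \
          --provider google-v2 \
          --project my-cloud-project \
          --regions us-central1 \
          --logging gs://my-bucket/logs/ \
          --tasks tasks.tsv \
          --command 'echo "${SAMPLE_ID}" > "${OUTPUT_FILE}"' \
          --wait
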
  • v0.3.3(Sep 3, 2019)

    This release includes:

    • dsub
      • Fix cases where stderr was redirected to stdout instead of vice-versa.
    • google-v2 provider updates:
      • Retry support for starting with preemptible VMs and falling back to non-preemptible VMs (see the sketch after this list)
      • Add a delay between gsutil cp retries
    • Documentation updates:
      • Remove note from README about experimental Python3 support.
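
    A hedged sketch of the preemptible-with-fallback pattern, assuming the numeric form of --preemptible combined with --retries (project and bucket are placeholders):

        # Run the first 2 attempts on preemptible VMs; any remaining retries
        # fall back to standard (non-preemptible) VMs.
        dsub \
          --provider google-v2 \
          --project my-cloud-project \
          --regions us-central1 \
          --logging gs://my-bucket/logs/ \
          --preemptible 2 \
          --retries 3 \
          --command 'echo "hello"' \
          --wait
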
  • v0.3.2(Jun 13, 2019)

    This release includes:

    • dsub
      • Py3 compliance updates.
      • Fix UnicodeEncodeError when task error message contains non-ascii characters
      • Fix datetime.datetime.max construction used in wait loop to be an offset-aware datetime.
      • Silence retrying messages during the wait loop.
    • google-v2 provider updates:
      • Fix failing final_logging actions.
      • Support stackdriver monitoring
    • Documentation updates:
      • Made it clear that multiple --input, --output, --input-recursive, and --output-recursive parameters may be used.
      • Fix 'lookup' typo.
  • v0.3.1(Apr 16, 2019)

    Python 3 is now part of automated testing. All unit and integration tests are now run with Python 2.7 and Python 3.7.

    The google provider is no longer included in automated tests.

    Specific changes in this release include:

    • dstat
      • Separate recursive input/output from input/output fields.
      • Fix YAMLLoadWarning that was being emitted.
    • dsub
      • Experimental: generate 32-character UUID job ids with the --unique-job-ids flag.
    • local provider updates:
      • Ensure runner-log.txt is written as text.
    • google-v2 provider updates:
      • Add timestamps in .log files.
    • Test updates:
      • Remove the google provider from automated tests.
      • Fix the way that the project ID is discovered for Python 3.
      • Increase concurrency and optionally disable Python module tests.
  • v0.3.0(Apr 3, 2019)

  • v0.2.6(Apr 1, 2019)

    Note: This is the last planned release with google as the default provider. The new default provider will be google-v2.

    This release includes:

    • google-v2 provider updates:
      • --disk-type option added.
      • --service-account option added (both options are sketched after this list).
      • Fix dstat to find jobs submitted with user-ids containing non-alphanumeric non-hyphen characters.
      • Fix dstat to better handle operations with no actions.
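
    An illustrative invocation using both new options (the project, bucket, and service account email are placeholders; pd-ssd is one Compute Engine disk type):

        dsub \
          --provider google-v2 \
          --project my-cloud-project \
          --regions us-central1 \
          --logging gs://my-bucket/logs/ \
          --disk-type pd-ssd \
          --disk-size 200 \
          --service-account my-runner@my-cloud-project.iam.gserviceaccount.com \
          --command 'echo "hello"' \
          --wait
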
  • v0.2.5(Feb 7, 2019)

    This release includes:

    • dsub

      • When tasks are retried (based on use of the --retries flag), output messages are now more informative.
      • bash image can now be used with the local and google-v2 providers (updated Docker entrypoints to use /usr/bin/env bash instead of /bin/bash)
    • google-v2 provider updates:
      • --nvidia-driver-version parameter can now be used to specify the NVIDIA driver version to use when attaching an NVIDIA GPU accelerator (see the sketch after this list).
      • Log files are copied locally on the VM before upload to GCS. This avoids ResumableUploadAbortException when copying large logs.
    • dstat

      • script and script-name now appear in dstat --full output.
      • pyyaml updated to the latest version, allowing Python 3.7 support.
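
    An illustrative GPU job combining --nvidia-driver-version with the accelerator flags (the accelerator model, driver version, image, project, and bucket are placeholders):

        dsub \
          --provider google-v2 \
          --project my-cloud-project \
          --zones "us-central1-*" \
          --logging gs://my-bucket/logs/ \
          --accelerator-type nvidia-tesla-k80 \
          --accelerator-count 1 \
          --nvidia-driver-version 390.46 \
          --image nvidia/cuda:9.0-runtime \
          --command 'nvidia-smi' \
          --wait
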
  • v0.2.4(Dec 11, 2018)

  • v0.2.3(Nov 20, 2018)

    This release includes:

    • google:
      • A large deprecation WARNING will now be emitted when using the google provider.
    • google-v2
      • Exit with error if logging fails.
      • Better messaging on failures.
      • Bug fixes for exception handling in ddel jobs
    • local:
      • Better messaging on logging failures.
  • v0.2.2(Nov 2, 2018)

  • v0.2.1(Oct 4, 2018)

    This release includes:

    • Documentation updates noting the deprecation of the google provider.
    • google-v2 provider updates:
      • --log-interval flag to configure the amount of time to sleep between copying log files from the pipeline to the logging path.
      • Adding google-v2 specific elements to dstat output.
      • --ssh flag to start an SSH container in the background, allowing you to inspect the runtime environment of your job's container in real time.
      • Experimental support for gcsfuse through the --mount flag (both flags are sketched after this list).
    • local provider updates:
      • Add support for Docker images with entrypoints
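
    A hedged example combining the two flags (project and bucket names are placeholders); the gcsfuse mount point is exposed to the container through the MY_BUCKET environment variable:

        dsub \
          --provider google-v2 \
          --project my-cloud-project \
          --zones "us-central1-*" \
          --logging gs://my-bucket/logs/ \
          --mount MY_BUCKET=gs://my-reference-bucket \
          --ssh \
          --command 'ls "${MY_BUCKET}"' \
          --wait
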
  • v0.2.0(Sep 20, 2018)

    This release includes:

    • Experimental Python 3 support.
    • Improvements to retry logic on transient HTTP errors.
    • google-v2 provider:
      • dstat fixes:
        • Fix for "ok" event in the wrong order.
        • Decreased operations.list page size to work around quota exception.
        • Fix for 'pulling-image' event discovery.
      • dsub fixes:
        • Fix pipeline hang when localization/delocalization errors occur on multi-core VMs
  • v0.1.10(Aug 27, 2018)

    This release includes:

    • google-v2 provider:

      • Support for --network, --cpu-platform, --timeout parameters
      • Add events to dstat.list for google V2 provider
      • Using --min-cores or --min-ram now directs users to --machine-type
    • Add Finland and Los Angeles regions and new Singapore zone.

    • Various test suite improvements

    Note that this release includes an important change to the way that log file names are formatted for dsub jobs that use --retries. The task-attempt is now automatically included in the log file name such that logs for each attempt do not overwrite the previous attempt.
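
    As an illustration only (the exact names come from your job-id, task-id, and task-attempt, and the bucket is a placeholder), the stdout logs for a task that was retried once might look like:

        gs://my-bucket/logs/<job-id>.<task-id>.1-stdout.log   # first attempt
        gs://my-bucket/logs/<job-id>.<task-id>.2-stdout.log   # retry
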

  • v0.1.9(Jun 28, 2018)

    All unit and integration tests now pass for the google-v2 provider. Users interested in Google's Pipelines v2 are encouraged to give it a try.
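
    A minimal way to try it (project ID and bucket are placeholders), followed by a status check with dstat:

        dsub \
          --provider google-v2 \
          --project my-cloud-project \
          --zones "us-central1-*" \
          --logging gs://my-bucket/logs/ \
          --command 'echo "hello from google-v2"' \
          --wait

        dstat \
          --provider google-v2 \
          --project my-cloud-project \
          --status '*' \
          --full
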

    This release includes:

    • google-v2 provider:

      • dstat fields such as envs, inputs, outputs, and logging now supported.
      • ddel supported
    • General

      • dsub returns task-id even if failure is detected when using --wait.
      • events included in dstat output (local and google providers)
  • v0.1.8(Jun 5, 2018)

    This release includes:

    • Initial support of dsub automatic retries
    • dstat support for the --summary flag, providing a more compact output for --tasks jobs
    • Provider improvements
      • Local provider
        • Use parallel copy for localizing and de-localizing files
        • Deterministic dstat response ordering
      • Google-v2 provider (still in progress)
        • File localization/delocalization
        • v2alpha1 timestamp format support
    • Various test suite fixes
Owner
Data Biosphere
We are creating a vibrant ecosystem of interoperable modules and data environments for the biomedical community.