A DSL for data-driven computational pipelines

Last update: Jan 3, 2023

Related tags

Data Analysis docker groovy hello aws cloud bioinformatics pipeline nextflow hpc reproducible-research workflow-engine slurm pipeline-framework sge singularity reproducible-science dataflow singularity-containers

Overview

"Dataflow variables are spectacularly expressive in concurrent programming"
Henri E. Bal, Jennifer G. Steiner, Andrew S. Tanenbaum

Quick overview

Nextflow is a bioinformatics workflow manager that enables the development of portable and reproducible workflows. It supports deploying workflows on a variety of execution platforms including local, HPC schedulers, AWS Batch, Google Cloud Life Sciences, and Kubernetes. Additionally, it supports managing your workflow dependencies through built-in support for Conda, Docker, Singularity, and Modules.

Rationale
Quick start
Documentation
Tool Management
HPC Schedulers
Cloud Support
Community
Build from source
Contributing
License
Citations
Credits

Rationale

With the rise of big data, techniques to analyse and run experiments on large datasets are increasingly necessary.

Parallelization and distributed computing are the best ways to tackle this problem, but the tools commonly available to the bioinformatics community often lack good support for these techniques, or provide a model that fits badly with the specific requirements in the bioinformatics domain and, most of the time, require the knowledge of complex tools or low-level APIs.

The Nextflow framework is based on the dataflow programming model, which greatly simplifies writing parallel and distributed pipelines without adding unnecessary complexity, letting you concentrate on the flow of data, i.e. the functional logic of the application/algorithm.

It doesn't aim to be yet another pipeline scripting language. Rather, it is built around the idea that the Linux platform is the lingua franca of data science, since it provides many simple command line and scripting tools which are powerful by themselves and, when chained together, facilitate complex data manipulations.

In practice, this means that a Nextflow script is defined by composing many different processes. Each process can execute a given bioinformatics tool or scripting language, and Nextflow adds the ability to coordinate and synchronize process execution by simply specifying the processes' inputs and outputs.

Quick start

Download the package

Nextflow does not require any installation procedure. Just download the distribution package by copying and pasting this command in your terminal:

curl -fsSL https://get.nextflow.io | bash

It creates the nextflow executable file in the current directory. You may want to move it to a folder accessible from your $PATH.
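
One way to do that is sketched below. The helper name install_launcher and the $HOME/bin destination are illustrative assumptions, not part of Nextflow; any directory already listed in $PATH works just as well.

```shell
# Hypothetical helper: make the downloaded launcher executable and move it
# into a directory that is (or will be) on $PATH.
install_launcher() {
  src=$1    # path to the downloaded `nextflow` file
  dest=$2   # target directory, e.g. "$HOME/bin" (an assumption)
  mkdir -p "$dest" &&
  chmod +x "$src" &&
  mv "$src" "$dest/nextflow"
}

# Usage (run from the directory where the launcher was downloaded):
#   install_launcher ./nextflow "$HOME/bin"
#   nextflow -version
```

If the destination directory is not yet on $PATH, it has to be added in your shell profile before the nextflow command is found.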

Download from Conda

Nextflow can also be installed from Bioconda:

conda install -c bioconda nextflow

Documentation

Nextflow documentation is available at http://docs.nextflow.io

HPC Schedulers

Nextflow supports common HPC schedulers, abstracting the submission of jobs from the user.

Currently the following clusters are supported:

For example, to submit the execution to an SGE cluster, create a file named nextflow.config in the directory where the pipeline is going to be launched, with the following content:

process {
  executor='sge'
  queue='<your queue name>'
}

In doing that, processes will be executed by Nextflow as SGE jobs using the qsub command. Your pipeline will behave like any other SGE job script, with the benefit that Nextflow will automatically and transparently manage process synchronisation, file staging/un-staging, etc.
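
As a minimal sketch, the configuration file can also be generated from the shell. The helper write_sge_config and the queue name "long" are hypothetical; substitute a queue that exists at your site.

```shell
# Hypothetical helper: write a minimal nextflow.config selecting the SGE
# executor into the launch directory. The queue name is site-specific.
write_sge_config() {
  launch_dir=$1
  queue=$2
  cat > "$launch_dir/nextflow.config" <<EOF
process {
  executor='sge'
  queue='$queue'
}
EOF
}

# Usage, with "long" standing in for one of your site's queue names:
#   write_sge_config . long
```

Note that the helper overwrites any nextflow.config already present in the target directory.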

Cloud support

Nextflow also supports running workflows across various clouds and cloud technologies. Managed solutions from major cloud providers are also supported through AWS Batch, Azure Batch and Google Cloud compute services. Additionally, Nextflow can run workflows on either on-prem or managed cloud Kubernetes clusters.

Currently supported cloud platforms:

Tool management

Containers

Nextflow has first-class support for containerization. It supports both Docker and Singularity container engines. Additionally, Nextflow can easily switch between container engines, enabling workflow portability.

process samtools {
  container 'biocontainers/samtools:1.3.1'

  """
  samtools --version 
  """

}

Conda environments

Conda environments provide another option for managing software packages in your workflow.

Environment Modules

Environment modules commonly found in HPC environments can also be used to manage the tools used in a Nextflow workflow.

Community

You can post questions, or report problems by using the Nextflow discussion forum or the Nextflow channel on Gitter.

Nextflow also hosts a yearly workshop showcasing researchers' workflows and advancements in the language. Talks from the past workshops are available on the Nextflow YouTube Channel.

The nf-core project is a community effort aggregating high-quality Nextflow workflows which can be used by the community.

Build from source

Required dependencies

Compiler Java 8 or later
Runtime Java 8 or later

Build from source

Nextflow is written in Groovy (a scripting language for the JVM). A pre-compiled, ready-to-run package is available at the GitHub releases page, so it is not necessary to compile it in order to use it.

If you are interested in modifying the source code, or contributing to the project, it is worth knowing that the build process is based on the Gradle build automation system.

You can compile Nextflow by typing the following command in the project home directory on your computer:

make compile

The very first time you run it, it will automatically download all the libraries required by the build process. It may take some minutes to complete.

When complete, execute the program by using the launch.sh script in the project directory.

The self-contained runnable Nextflow packages can be created by using the following command:

make pack

Once compiled, use the script ./launch.sh as a replacement for the usual nextflow command.

The compiled packages can be locally installed using the following command:

make install

A self-contained distribution can be created with the command: make pack. To include support of GA4GH and its dependencies in the binary, use make packGA4GH instead.

IntelliJ IDEA

Nextflow development with IntelliJ IDEA requires the latest version of the IDE (2019.1.2 or later).

If you have it installed on your computer, follow the steps below in order to use it with Nextflow:

Clone the Nextflow repository to a directory on your computer.
Open IntelliJ IDEA and choose "Import project" in the "File" menu bar.
Select the Nextflow project root directory on your computer and click "OK".
Then, choose the "Gradle" item in the "external module" list and click on the "Next" button.
Confirm the default import options and click on "Finish" to finalize the project configuration.
When the import process completes, select the "Project structure" command in the "File" menu bar.
In the dialog that appears, click on the "Project" item in the list on the left, and make sure that the "Project SDK" choice on the right contains Java 8.
Set the code formatting options with the settings provided here.

Contributing

Project contributions are more than welcome. See the CONTRIBUTING file for details.

License

The Nextflow framework is released under the Apache 2.0 license.

Citations

If you use Nextflow in your research, please cite:

P. Di Tommaso, et al. Nextflow enables reproducible computational workflows. Nature Biotechnology 35, 316–319 (2017) doi:10.1038/nbt.3820

Credits

Nextflow is built on two great pieces of open source software, namely Groovy and GPars.

YourKit is kindly supporting this open source project with its full-featured Java Profiler. Read more at http://www.yourkit.com

Comments

Syntax enhancement aka DSL-2
This is a request for comments for the implementation of the modules feature for Nextflow.

This feature allows the definition of NF processes in the main script or a separate library file, that can be invoked, one or multiple times, as any other routine passing the requested input channels as arguments.

Process definition

The syntax for the definition of a process is nearly identical to the usual one. It only requires the use of processDef instead of process and the omission of the from/into declarations. For example:

processDef index {
    tag "$transcriptome_file.simpleName"

    input:
    file transcriptome

    output:
    file 'index'

    script:
    """
    salmon index --threads $task.cpus -t $transcriptome -i index
    """
}

The semantics and supported features remain identical to the current process definition. See a complete example here.

Process invocation

Once a process is defined it can be invoked like any other function in the pipeline script. For example:

transcriptome = file(params.transcriptome)
index(transcriptome)

Since index defines an output channel, its return value can be assigned to a channel variable that can be used as usual, e.g.:

transcriptome = file(params.transcriptome)
index_ch = index(transcriptome)
index_ch.println()

If the process produces two (or more) output channels, the multiple assignment syntax can be used to get a reference to each output channel.

Process composition

The result of a process invocation can be passed to another process like any other function, e.g.:

processDef foo {
    input:
    val alpha

    output:
    val delta
    val gamma

    script:
    delta = alpha
    gamma = 'world'
    "some_command_here"
}

processDef bar {
    input:
    val xx
    val yy

    output:
    stdout()

    script:
    "another_command_here"
}

bar(foo('Hello'))

Process chaining

Processes can also be invoked as custom operators. For example, a process foo taking one input channel can be invoked as:

ch_input1.foo()

when taking two channels as:

ch_input1.foo(ch_input2)

This allows the chaining of built-in operators and processes together, e.g.:

Channel
    .fromFilePairs( params.reads, checkIfExists: true )
    .into { read_pairs_ch; read_pairs2_ch }

index(transcriptome_file)
    .quant(read_pairs_ch)
    .mix(fastqc(read_pairs2_ch))
    .collect()
    .multiqc(multiqc_file)

See the complete script here.

Library file

A library is just an NF script containing one or more processDef declarations. The library can then be imported using the importLibrary statement, e.g.:

importLibrary 'path/to/script.nf'

Relative paths are resolved against the project baseDir variable.

Test it

You can try the current implementation using version 19.0.0.modules-draft2-SNAPSHOT, e.g.:

NXF_VER=19.0.0.modules-draft2-SNAPSHOT nextflow run rnaseq-nf -r modules

Open points

When a process is defined in a library file, should it be possible to access the params values? Currently it's possible, but I think this is not a good idea because it makes the library depend on the script params, which makes it very fragile.

How to pass parameters to a process defined in a library file, for example memory and cpus settings? It could be done using the config file as usual, but I expect there could be a need to parametrise the process definition and specify the parameters at invocation time.

Should a namespace be used when defining the processes in a library? What if two or more processes have the same name in different library files?

One or many processes per library file? Currently any number of processes can be defined. I'm starting to think that it would be better to allow the definition of only one process per file. This would simplify the reuse across different pipelines and the import in tools such as dockstore, and it would make the dependencies of the pipeline more intelligible.

Remote library file? Not sure it's a good idea to be able to import remotely hosted files, e.g. http://somewhere/script.nf. Remote paths tend to change over time.

Should a version number be associated with the process definition? How to use or enforce it?

How to test process components? Ideally it should be possible to include the required container in the process definition and unit test each process independently.

How to chain a process returning multiple channels?

kind/feature lang/dsl2
opened by pditommaso 114
Nextflow parameter description scheme

TL;DR

A naming scheme to enable meta-data annotation for workflow parameters.

Details

Usually, workflow-specific execution parameters for the single processes are defined in the params scope, a DSL feature Nextflow provides to access parameter variables from the workflow script during workflow execution.

Currently, there is no naming scheme / convention / language feature for annotating parameters with description text, mandatory/optional flags or similar.

This could be useful though for upstream applications in order to build graphical user interfaces and configure a workflow correctly before execution in a dynamic, user-friendly way.

I am very happy for any input here and design suggestions :)

Best, Sven
kind/feature pri/moderate

opened by sven1103 80
wr as new Nextflow backend
New feature

I develop wr which is a workflow runner like Nextflow, but can also just be used as a backend scheduler. It can schedule to LSF and OpenStack right now.

The benefits to Nextflow users of going via wr instead of using Nextflow’s existing LSF or Kubernetes support are:

wr makes more efficient use of LSF: it can pick an appropriate queue, use job arrays, and “reuse” job slots. In a simple test I did, Nextflow using wr in LSF mode was 2 times faster than Nextflow using its own LSF scheduler.

wr’s OpenStack support is incredibly easy to use and set up (basically a single command to run), and provides auto scaling up and down. Kubernetes, by comparison, is really quite complex to get working on OpenStack, doesn’t auto scale, and wastes resources with multiple nodes needed even while no workflows are being operated on. I was able to get Nextflow to work with wr in OpenStack mode (but the shared disk requirement for Nextflow’s state remains a concern).

Usage scenario

Users with access to LSF or OpenStack clusters who want to run their Nextflow workflows efficiently and easily.

Suggest implementation

Since I don’t know Java well enough to understand how to implement this “correctly”, I wrote a simple bsub emulator in wr, which is what my tests so far have been based on. I submit the Nextflow command as a job to wr, turning on the bsub emulation, and configure Nextflow to use its existing LSF scheduler. While running under the emulation, Nextflow’s bsub calls actually call wr.

Of course the proper way to do this would be to have Nextflow call wr directly (either the wr command line, or its REST API). The possibly tricky thing with regard to having it work in OpenStack mode is having it tell wr about OpenStack-specific things like what image to use, what hardware flavour to use, details on how to mount S3, etc. (the bsub emulation handles all of this).

Here's what I did for my LSF test...

echo_1000_sleep.nf:

#!/usr/bin/env nextflow

num = Channel.from(1..1000)

process echo_sleep {
  input:
  val x from num

  output:
  stdout result

  "echo $x && sleep 1"
}

result.subscribe { println it }

workflow.onComplete {
  println "Pipeline completed at: $workflow.complete"
  println "Execution status: ${ workflow.success ? 'OK' : 'failed' }"
}

nextflow.config:

process {
  executor='lsf'
  queue='normal'
  memory='100MB'
}

install wr:

wget https://github.com/VertebrateResequencing/wr/releases/download/v0.17.0/wr-linux-x86-64.zip
unzip wr-linux-x86-64.zip
mv wr /to/somewhere/in/my/PATH/wr

run:

wr manager start -s lsf
echo "nextflow run ./echo_1000_sleep.nf" | wr add --bsub -r 0 -i nextflow --cwd_matters --memory 1GB

Here's what I did to get it to work in OpenStack...

nextflow_install.sh:

sudo apt-get update
sudo apt-get install openjdk-8-jre-headless -y
wget -qO- https://get.nextflow.io | bash
sudo mv nextflow /usr/bin/nextflow

put input files in S3:

s3cmd put nextflow.config s3://sb10/nextflow/nextflow.config
s3cmd put echo_1000_sleep.nf s3://sb10/nextflow/echo_1000_sleep.nf

~/.openstack_rc:

[your rc file containing OpenStack environment variables downloaded from Horizon]

run:

source ~/.openstack_rc
wr cloud deploy --os 'Ubuntu Xenial' --username ubuntu
echo "cp echo_1000_sleep.nf /shared/echo_1000_sleep.nf && cp nextflow.config /shared/nextflow.config && cd /shared && nextflow run echo_1000_sleep.nf" | wr add --bsub -r 0 -o 2 -i nextflow --memory 1GB --mounts 'ur:sb10/nextflow' --cloud_script nextflow_install.sh --cloud_shared

The NFS share at /shared created by the --cloud_shared option is slow and limited in size; a better solution would be to set up your own high performance shared filesystem in OpenStack (eg. GlusterFS), then add to nextflow_install.sh to mount this share. Or even better, is there a way to have Nextflow not store state on disk? If it could just query wr for job completion status, that would be better.
kind/feature pri/low
opened by sb10 79
Introduce HTTP POST feature and broadcast workflow process runtime information
Motivation

I thought it would be super cool to have a mechanism to let Nextflow send trace reports from the workflow processes during workflow execution, so one can monitor the workflow status and process on remote target sites (web portals, API web services with database logging, etc.).

This idea was already mentioned here https://github.com/nextflow-io/nextflow/pull/454 by @mes5k, but using websockets instead. Websockets are a lot more complex as they are stateful, and one would need to be very careful with the implementation not to break workflow execution. So I decided to go for simple HTTP POST requests, and it is the task of the user to provide a webserver that consumes the information (JSON).

Mechanism

I introduced a -with-messages option that triggers this functionality. You can specify the URL in a messages scope:

messages {
    // example URL
    url = "http://api.myserver.com/workflow/monitor"
}

The logic is contained in the MessageObserver class analogous to the other observers, implementing the interface TraceObserver. HTTP POST requests will send a JSON object with information during the following Nextflow execution steps:

onFlowStart() - when the workflow starts

onFlowComplete() - when the workflow is completed

onProcessSubmit() - when a process is submitted

onProcessStart() - when a process starts

onProcessComplete() - when a process is completed

and new:

onFlowError() - which is invoked now in all observers when the Nextflow session catches an error.

For the latter, I observed, that the TaskHandler object was always null, even when I changed the error strategy to 'finish', which should call Session.cancel(handler) and not Session.abort(task.error). For that I had to change two lines of code https://github.com/nextflow-io/nextflow/commit/8104f6d5ccde600396dfa260eeda0e73a9e69b87, hope that is OK.

Information sent via HTTP

A JSON object with the following structure:

{
    "runName": "<Nextflow run name>",
    "runID": "<Nextflow run ID>",
    "runStatus": "<started|running|error|completed>",
    "trace": "<Nextflows trace record>"
}

Note: The "trace": "<Nextflows trace record>" entry is NOT present when onFlowStart() and onFlowComplete() are invoked. When present, "trace" contains all the information shown in the Nextflow documentation.
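
To exercise a consuming webserver by hand before pointing Nextflow at it, a payload of the same shape can be posted with curl. This is only a sketch: the helper names build_payload and post_payload and the sample values are hypothetical, the Content-Type header is an assumption (the description above does not say which headers are sent), and no JSON escaping is performed, so the values must not contain quotes or backslashes.

```shell
# Build a payload of the shape described above. The "trace" entry is left
# out, as it is for the flow start and flow complete events.
build_payload() {
  printf '{ "runName": "%s", "runID": "%s", "runStatus": "%s" }\n' "$1" "$2" "$3"
}

# POST a payload to the endpoint under test (header choice is an assumption).
post_payload() {
  url=$1
  shift
  build_payload "$@" |
    curl -sS -X POST -H 'Content-Type: application/json' --data-binary @- "$url"
}

# Usage against the example URL from the messages scope:
#   post_payload http://api.myserver.com/workflow/monitor test_run 1234 started
```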

As this is my first PR here, please review my code critically. I am happy to receive feedback!

Sven
opened by sven1103 66
Workflow report should warn if some task executions were ignored
It would be great if the Workflow Report (and other things) could warn if there were tasks that failed but were ignored. To do this, it would be great to have variables with the counts of different task outcomes. For example:

workflow.task_counts.success
workflow.task_counts.cached
workflow.task_counts.failed
workflow.task_counts.ignored

..or whatever makes sense.

There can be some complication with tasks that fail but are resubmitted and succeed, but I guess if we can just count the ignored ones then we should be fine (we will get a pipeline error if something fails properly).

Thanks!

Phil
opened by ewels 53
Initial version of K8s Jobs

Preview of initial version of K8s Jobs. Feel free to comment what to change and rework. Tests were not updated yet and not all error cases are managed right now.
platform/k8s

opened by xhejtman 52

Queue status command fail on LSF version 8

Bug report

Hi! After upgrading to nextflow 18.10.1 from 0.32.0, I started seeing this message repeatedly in nextflow output:

WARN: [LSF] queue status cannot be fetched > exit status: 255

WARN: [LSF] queue status cannot be fetched > exit status: 255

WARN: [LSF] queue status cannot be fetched > exit status: 255

WARN: [LSF] queue status cannot be fetched > exit status: 255

WARN: [LSF] queue status cannot be fetched > exit status: 255

WARN: [LSF] queue status cannot be fetched > exit status: 255

WARN: [LSF] queue status cannot be fetched > exit status: 255

All cluster jobs, however, seem to be working fine, and the nextflow pipeline is producing all files normally.

This is my nextflow.config file:

process.executor = "lsf"
executor.queueSize = 1000

env.PATH = "/lab/solexa_weng/testtube/trinityrnaseq-Trinity-v2.8.4:/lab/solexa_weng/testtube/TransDecoder-TransDecoder-v5.0.2:/lab/solexa_weng/testtube/transrate-1.0.3-linux-x86_64:/lab/solexa_weng/testtube/signalp-4.1:/lab/solexa_weng/testtube/tmhmm-2.0c/bin:/lab/solexa_weng/testtube/ncbi-blast-2.7.1+/bin:/lab/solexa_weng/testtube/bowtie2-2.3.4.3-linux-x86_64:$PATH"

report.enabled = true

Nextflow version: 18.10.1 build 5003
Java version: 1.8.0_161
Operating system: Linux

kind/enhancement platform/lsf

opened by tomas-pluskal 50

Completed jobs are detected with a big delay
Bug report

After a job's status transitions to SUCCEEDED (maybe also FAILED) within AWS Batch, the status update only becomes available to nextflow as COMPLETED after a long delay (hours).

Expected behavior and actual behavior

nextflow job status update should be very close to the AWS Batch job transition.

Program output

See attached screenshot of the S3 workdir bucket contents.

This is the matching nextflow log output.

$ time ~/nextflow/nextflow log loving_avogadro -f name,submit,start,complete,duration,realtime,task_id,workdir -F "name =~ '/.*pattern.*/ '"
(pattern) 2018-07-10 01:05:19.143 - 2018-07-10 06:26:22.165 5h 21m 3s 31m 24s 121090 s3://test/work/work/e4/cedf1aa5e367f871b12572b0f4be4e

real 0m38.435s
user 0m38.599s
sys 0m1.378s

Comparing the S3 screenshot and nextflow output there is a major delay between job completed and nextflow updating that status.

Steps to reproduce the problem

Submit (maybe hundreds) of jobs to AWS Batch.

Environment

Nextflow version: 0.31.0-SNAPSHOT build 4911

Java version: Java HotSpot(TM) 64-Bit Server VM 1.8.0_171-b11

Operating system: Linux

platform/aws-batch
opened by tbugfinder 47

slow job submission with awsbatch executor

I would like to submit >20000 jobs (one per each file in a S3 bucket) for parallel processing.

On average, nextflow submits only ~1 job per second to awsbatch, which is too slow for this kind of large-scale scenario. Each file is processed within 5-30min. This means that, with horizontal scaling, the overall processing could be completed in ~30min (plus overhead for instance creation and reporting).

nextflow trace:

May-18 14:43:52.559 [Task submitter] TRACE n.executor.AwsBatchFileCopyStrategy - [AWS BATCH] Unstaging file path: [RESULT.gz]

May-18 14:43:52.718 [Task submitter] TRACE n.executor.AwsBatchTaskHandler - [AWS BATCH] new job request > {JobName: myjobname,JobQueue: myjobqueue,JobDefinition: myjobdefinition,ContainerOverrides: {Command: [bash, -o, pipefail, -c, trap "{ ret=$?; aws s3 cp --only-show-errors .command.log s3://testbucket123/work/sum-97eccecc-d7e9-431a-a3bc-20440c5c77e7/b0/f3781d20e330ae13258b7c15157b64/.command.log||true; exit $ret; }" EXIT; aws s3 cp --only-show-errors s3://testbucket123/work/sum-97eccecc-d7e9-431a-a3bc-20440c5c77e7/b0/f3781d20e330ae13258b7c15157b64/.command.run - | bash 2>&1 | tee .command.log],},RetryStrategy: {Attempts: 4},Timeout: {AttemptDurationSeconds: 108000}}

May-18 14:43:52.748 [Task submitter] DEBUG n.executor.AwsBatchTaskHandler - [AWS BATCH] submitted > job=78fd5324-1fee-4503-be6f-676d6606f631; work-dir=s3://testbucket123/work/sum-97eccecc-d7e9-431a-a3bc-20440c5c77e7/b0/f3781d20e330ae13258b7c15157b64

May-18 14:43:52.748 [Task submitter] INFO  nextflow.Session - [b0/f3781d] Submitted process > myjobname

using later version:

  Version: 0.30.0-SNAPSHOT build 4813
  Modified: 18-05-2018 15:51 UTC
  System: Linux 4.9.85-38.58.amzn1.x86_64
  Runtime: Groovy 2.4.15 on OpenJDK 64-Bit Server VM 1.8.0_171-b10
  Encoding: UTF-8 (UTF-8)
  Process: 13655@ip-10-4-13-234 [10.4.13.234]
  CPUs: 8 - Mem: 14.7 GB (8.7 GB) - Swap: 0 (0)

.........

#to be added#

AWS Batch itself doesn't throttle the submission. Test:

i="0"
time (while [ $i -lt 30000 ]
do
date
time  (aws batch submit-job \
                --region eu-west-1 \
                --job-name test_test_test \
                --job-queue spot40-queue \
                --job-definition myjobdefinition:1 \
                --retry-strategy attempts=3 \
                --parameters cmd=/bin/ls,batch_run_id=test > /dev/null )
let i=i+1
done
)

==> finished within 20 minutes.
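
The gap between the two submission rates can be made concrete with a little arithmetic. The sketch below uses only the figures quoted in this report (20000 jobs at ~1 job per second, and 30000 CLI submissions in 20 minutes); the function names are illustrative.

```shell
# Minutes needed to submit a number of jobs at a given rate (jobs per second).
# Integer arithmetic, so results are rounded down.
submission_minutes() {
  n_jobs=$1
  rate=$2
  echo $(( n_jobs / rate / 60 ))
}

# Average rate (jobs per second) when a number of jobs is submitted
# in a given number of minutes.
jobs_per_second() {
  n_jobs=$1
  minutes=$2
  echo $(( n_jobs / (minutes * 60) ))
}

submission_minutes 20000 1   # nextflow at ~1 job/s: 333 minutes (about 5.5 hours)
jobs_per_second 30000 20     # the CLI loop above: 25 jobs/s
```

In other words, submission alone would take roughly ten times longer than the ~30min of actual processing, while the plain CLI loop sustains at least about 25 submissions per second.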

platform/aws-batch

opened by tbugfinder 46

Inconsistency of memory usage values within the report and timeline
Bug report

Expected behavior and actual behavior

expected behavior

Whether we look at the Tasks table in Raw values mode or Human readable mode, we expect to read the exact same values for the memory usage if the decimal system is used. We also expect these values to be the same in the Timeline.

actual behavior

in the Memory usage plot from the report, the memory value reported is 1051574272 (raw value extracted from Plotly) and displayed as 1.052G in the boxplot

in the Tasks table from the report in the Raw values mode, the memory value reported in the field vmem is 1051574272, which is consistent with the previous value in the plot

in the Tasks table from the report in the Human readable mode, the memory value reported in the field vmem is 1.1GB, which seems to be ~ 1051574272 x (1.024 x 1.024). This value should be 0.97935GB = 1051574272/(1024 x 1024 x 1024) if the binary unit is used. The same behavior holds for the peak_vmem and allocated memory fields but not for rss and peak_rss.

in the Timeline, the memory displayed is 1002.9 MB which is ~ 1051574272/(1024 x 1024) = 1002.859375. This is correct if the binary unit is used.

It should be mentioned whether decimal or binary units are used to display the memory usage.
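
The conversions quoted above can be reproduced with a small shell sketch. The function name bytes_to and the MiB/GiB labels for the binary units are naming assumptions made here for clarity; the report itself writes MB and GB in both cases.

```shell
# Convert a byte count to a unit, printed with the given number of decimals.
# Decimal units: MB = 1000^2, GB = 1000^3. Binary units: MiB = 1024^2, GiB = 1024^3.
bytes_to() {
  LC_ALL=C awk -v b="$1" -v unit="$2" -v d="$3" 'BEGIN {
    div["MB"]  = 1000 ^ 2; div["GB"]  = 1000 ^ 3
    div["MiB"] = 1024 ^ 2; div["GiB"] = 1024 ^ 3
    fmt = "%." d "f\n"
    printf fmt, b / div[unit]
  }'
}

bytes_to 1051574272 GB 3    # 1.052  (decimal, as shown in the boxplot)
bytes_to 1051574272 GiB 3   # 0.979  (binary gigabytes)
bytes_to 1051574272 MiB 1   # 1002.9 (binary megabytes, as shown in the Timeline)
```

None of these conversions yields the 1.1GB displayed in the Human readable mode, which is the inconsistency reported here.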

Steps to reproduce the problem

Install the program stress: sudo apt-get install stress or sudo yum install stress

Create the following nextflow.nf script:

#!/usr/bin/env nextflow

process TwoCpus4mn {
  cpus 1
  memory '1 GB'

  """
  /usr/bin/time -v stress -c 2 -t 15 -m 1 --vm-bytes 1000000000
  """
}

Launch:

nextflow nextflow.nf -with-report report.html -with-timeline timeline.html

Environment

Nextflow version: [18.10.1]

Java version: [OpenJDK 64-Bit Server VM (build 10.0.2+13-Ubuntu-1ubuntu0.18.04.4, mixed mode)]

Operating system: [Linux 4.15.0-43-generic x86_64]

kind/bug
opened by phupe 44
Enhance process metrics to avoid usage of ps tool
Hi,

I am using Nextflow with Biocontainers, which are gaining increasing traction (link1, link2). Biocontainers use a minimal BusyBox. That caused issues with coreutils that were previously fixed in Issue #321. I am now noticing another error in the .command.err log for any process started by Nextflow that leverages a Biocontainer:

ps: bad -o argument 'state', supported arguments: user,group,comm,args,pid,ppid,pgid,tty,vsz,stat,rss

It doesn't seem to be critical as processes still complete successfully, but I wonder whether this could still be fixed to get full BusyBox/Biocontainers support?

Thanks in advance.
known issue pri/moderate
opened by rspreafico 43
add support for AWS_SESSION_TOKEN
There's a few issues floating around about this:

https://github.com/nextflow-io/nextflow/issues/1724

https://github.com/nextflow-io/nextflow/issues/2839

https://github.com/nextflow-io/nextflow/pull/1265

Figured I'd take a stab at implementing. Wrote a test, but also ran locally and verified that the session token actually works.

I did combine the AWS_ACCESS_KEY/AWS_SECRET_KEY and AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY creds cases. Sure, technically, it means someone could supply an invalid config and it might work. But also, it unifies the session token logic.

I didn't add this to the config options, because who's putting an ephemeral token in their nextflow config??

Truthfully, I believe we're on the wrong path, and that we should just let the AWS Java SDK resolve credentials. It is opinionated in how it resolves credentials. If we delegated, we wouldn't have to account for every weird possible way of specifying aws creds. Perhaps there's a good reason not to though.

Anyway, let me know what you think. I'd love to have this in.
opened by jchorl 0
Add support for Fusion file system to Slurm and LSF executors
This PR adds support for Fusion file system to Slurm and LSF grid executors.

The PR implements the following changes

Mark the GridTaskHandler as FusionAwareTask

Use the NXF_CHDIR variable as the common pattern to specify the job work directory instead of relying on grid-specific directives

When fusion execution is required, submit the job execution by creating an inline job wrapper piped via stdin special file, instead of creating a temporary launcher file

Production use of this feature requires #3513 or #3514
opened by pditommaso 0
Add support for Fusion file system for Sarus containerised task
Context

The Fusion file system allows using S3-compatible object storage as the task work directory.

This feature requires mounting the Fusion driver in the container process via a Fuse device.

When using Docker or Podman this can be achieved using the following options

--device /dev/fuse --cap-add SYS_ADMIN

See #3337 for details.

Goal

The goal of this feature is to allow the use of Fusion driver in containers run via the Sarus container engine.
opened by pditommaso 1
Add support for Fusion file system for Singularity containerised task
Context

The Fusion file system allows using S3-compatible object storage as the task work directory.

This feature requires mounting the Fusion driver in the container process via a Fuse device.

When using Docker or Podman this can be achieved using the following options

--device /dev/fuse --cap-add SYS_ADMIN

See #3337 for details

Goal

The goal of this feature is to allow the use of Fusion driver in containers run via the Singularity and Apptainer engines.
opened by pditommaso 1
xvfb-run waits forever when used in a nextflow process
Bug report

Expected behavior and actual behavior

We wanted to use xvfb-run igv !{batch_file} to allow igv to create a snapshot of an event on a headless AWS instance. Expected behavior is that xvfb-run will create a virtual framebuffer and igv will use it to paint the snapshot. Actual behavior is that xvfb-run waits forever after creating the Xvfb server.

The reason is that Xvfb communicates back by sending SIGUSR1, which should kill a wait in xvfb-run. But nextflow traps (and ignores) USR1 in .command.run, which causes the wait to last forever.

Steps to reproduce the problem

This process file produces the problem, and it should do so with pretty much any input.

Program output

Under normal circumstances there is NO output to stderr or stdout. If I call xvfb-run via bash -x, the last executed command is the exec command from this block

trap : USR1
(trap '' USR1; exec Xvfb ":$SERVERNUM" $XVFBARGS $LISTENTCP -auth $AUTHFILE >>"$ERRORFILE" 2>&1) &
XVFBPID=$!
wait || :

The wait command never ends.
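
The mechanism described above can be reproduced without Xvfb or Nextflow. The sketch below relies on the POSIX shell rule that a signal ignored on entry to a non-interactive shell cannot be trapped or reset by that shell. The variable and function names are illustrative, and the parent shell here merely stands in for .command.run as characterised in this report.

```shell
# Inner script: try to install a USR1 handler, signal itself, then report.
inner='trap "echo handler-ran" USR1; kill -USR1 $$; echo finished'

# USR1 has its default disposition when the inner shell starts,
# so the handler can be installed and runs.
usr1_default() { sh -c "$inner"; }

# The parent ignores USR1 before starting the inner shell; the inner
# trap then has no effect and the signal is silently dropped.
usr1_ignored() { ( trap '' USR1; sh -c "$inner" ); }

usr1_default   # prints "handler-ran" then "finished"
usr1_ignored   # prints only "finished"
```

In the second case the inner shell behaves like xvfb-run under Nextflow: its own trap on USR1 is never triggered, so a wait that depends on that signal is not interrupted.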

Environment

Nextflow version: 22.04.0

Java version: 1.8.2

Operating system: ubuntu 18.04

Bash version: GNU bash, version 5.1.4(1)-release (x86_64-pc-linux-gnu)
opened by TedBrookings 1

Releases(v22.12.0-edge)

v22.12.0-edge(Dec 13, 2022)
Add fair process directive [60d34cfd]

Add support for singularity registry setting [37c1aeb9]

Add AWS profile config setting [66f4669f]

Add support for AWS profile when resolving region [d8947707]

Add support for Sarus container engine (#3470) [54673f18]

Add support for Fusion ARM64 client [d073c538]

Add allowedLocations option to google batch (#3453) [c619eb81]

Add support for AWS config profile in NF config [37112672]

Add warning on Google Logs failure [bdbcdde9]

Add possible values of status in trace.txt to the documentation (#3451) [2425fcfb]

Add support for AWS Glacier restore [b6110766]

Add support for S3 storageClass to publishDir [066f9203]

Add MathHelper utility class [7eecb266]

Fix Wave layer invalid checksum due to not closed stream [e188bbf9]

Fix Fusion test [2245a1c7]

Fix Run fails when home is a symlink [9ff820f4]

Fix math overflow when copying large AWS S3 files [f32ea0ba]

Fix Quote the logName in the Cloud Logging filter (#3464) [b3975063]

Fix Google Batch cloud logging (#3443) [e2bbcf15]

Fix Tower plugin min nextflow requirement [1713a1cd]

Fix TowerArchiver resolve envar paths relative to baseDir (#3438) [46af18e5]

Error & info messages, code comments language fixes (#3475) [29ae36ca]

Replace egrep with grep -E (#3485) [ac0c3035]

Gradle build optimizations (#3483) [19182a57]

Refactor virtual FS schemes to XPath class [fd59b943]

Update concat operator description (#3426) [e8d8c3b5]

Clarify usage of additional options for path qualifier (#3405) [0b70acb1]

Clarify limitation of -with-docker in the docs (#3408) [79afc85d]

Expose process queue as K8s pod label [4df8c8d2]

Prefix nextflow K8s labels with nextflow.io prefix [9951fcd9]

Remove deprecated code [c0b164f2]

Rewrite fetchIamRole and fetchRegion to use AWS SDK (#3425) [e350f319]

Improve Wave config error reporting [ae502668]

Improve K8s retry on transient failures [d86ddc36]

Remove DSL1 output mode [fa400d5f]

Remove support for DSL1 multi into [f664af45]

Bump [email protected] [ccaab713]

Bump [email protected] [c07dcec2]

Bump [email protected]

Bump [email protected] [652d0880]

Bump fusion version URLs 0.6 [a160a8b1]

Bump AWS sdk version 1.12.351 [4dd82b66]

Source code(tar.gz)
Source code(zip)
nextflow(14.40 KB)
nextflow-22.12.0-edge-all(86.44 MB)
v22.10.4(Dec 9, 2022)
Fix math overflow when copying large AWS S3 files [07f7cb72]

Bump [email protected] [d96ca4c3]

Source code(tar.gz)
Source code(zip)
nextflow(14.37 KB)
nextflow-22.10.4-all(85.79 MB)
v22.11.1-edge(Nov 29, 2022)
Fix TowerArchiver resolve envar paths relative to baseDir (#3438) [53e6348c]

Fix tower plugin min nextflow requirement [103dbf74]

Fix typos in the documentation [ci skip] (#3441) [ae95d90d]

Add support for Java 19 [811e7ca8]

Add support for custom Conda channels (#3435) [ci fast] [0884e80e]

Add time directive to AWS Batch, clean language (#3436) [ci skip] [1ed2640a]

Update err message [ci fast] [ab5bd81b]

Fix Flux executor config (#3432) [ci fast] [68b45c92]

Bump [email protected] [fe669152]

Bump [email protected] [2dbf9906]

Source code(tar.gz)
Source code(zip)
nextflow(14.40 KB)
nextflow-22.11.1-edge-all(85.82 MB)
v22.11.0-edge(Nov 23, 2022)
Add support for Apptainer container engine (#3345) [29f98975]

Add Flux executor to nextflow (#3412) [cc9fc3f0] [3711cef0]

Add support for Wave containerPlatform [10d56ca1]

Add CSI ephemeral volume for K8s (#2988) [f18f6e81]

Add support for disk directive and emptyDir to K8s (#2998) [b548e1c7]

Add Fusion support for custom S3 endpoints [fba9b649]

Add support for Tower refresh token for dataset (#3366) [a19e055a]

Prevent infinite loop while fetching git tags and branches [aa974d44]

Improve file porter logging [626420b6]

Improve script err logging [2714770e]

Extend onFilePublish notification adding source path (#3284) [81acc3ef]

Remove cpu limits from K8s pod spec builder (#3338) [dc7f78bf]

Improve task name logging [5ddb7e3f]

Add tower endpoint to wave [ci fast] [b725ddc4]

Add Azure SAS token validation [e2244b48]

Use cpus-shares for container resources (#3383) [b38c3880]

Report full path scheme on error [4089ba65]

Allow identity based authentication on Azure Batch (#3132) [a08611be]

Fix support for remote file mail attachment (#3384) [6b496bb9]

Fix task cache logging [ed37c4fd]

Fix unexpected error on task resume [1c3f4685]

Fix stripIndent failure with java 17 (#3377) [2b115c50]

Fix -dockerized execution #3137 (#3148) [64a81a58]

Improve default value in cli help of nextflow log -s (#3371) [2141f96e]

upgrade jsoup and snakeyaml version (#3374) [6e2ca454]

Bump Java 17 lang version + Java 11 as target [34f133e2]

Bump [email protected] [e307912e]

Bump [email protected] [07391d96]

Bump [email protected] [4d787561]

Source code(tar.gz)
Source code(zip)
nextflow(14.39 KB)
nextflow-22.11.0-edge-all(85.82 MB)
v22.10.3(Nov 21, 2022)
Prevent infinite loop while fetching git tags/branches [73a59d33]

Bump [email protected] [f9b54ce3]

Improve S3 thread pool config [01541b0a]

Source code(tar.gz)
Source code(zip)
nextflow(14.37 KB)
nextflow-22.10.3-all(85.79 MB)
v22.10.2(Nov 13, 2022)
Fix initialize the plugin once it's defined (#3360) [dd150b92]

Fix tags typo in docs (#3355) [b82df4e0]

Fix unexpected error on task resume [e02e8c27]

Fix template script in trace record [cf828a68]

Fix ip v6 support for K8s executor [53af5a7c]

Fix refresh token for tower served resources [9dec2b66] #3366

Fix full path scheme on error [1399f451]

Add note to some process implicit variables (#3373) [0374f63a]

Add retry policy on plugin download failure [e8dbec3f]

Add examples of when dynamic output filenames are important (#3275) [72a17306]

Update google batch java sdk, add serviceAccountEmail and installGpuDrivers (#3324) [7f7007a8]

Update github actions to v3 (#3376) [d3b4a837]

Update error messages and docs with new report filename behavior [f5725480]

Bump [email protected] [164edf7c]

Bump [email protected] [30cb118d]

Source code(tar.gz)
Source code(zip)
nextflow(14.37 KB)
nextflow-22.10.2-all(85.79 MB)
v22.10.1(Oct 27, 2022)
Fix mount pwd in the container when work dir is a symlink [ca397181] [b5b7d3cd]

Fix secrets command name in the CLI (#3320) [ci fast] [321486df]

Fix ver num rendering [ci fast] #3226 [5312a25e]

Fix K8s config namespace is not applied [b3d33e3b]

Fix log fetching from remote storage [be356939] [3efa1a20]

Update docs about default mail ssl protocol [ci skip] (#3299) [15ffffc1]

Update docs repeated words from documentation (#3311) [ci skip] [d59ea186]

Update docs to clarify the difference between collect and toList (#3276) [7ee2b008]

Update docs [516a7441]

Update docs adding Fusion [11eac707]

Source code(tar.gz)
Source code(zip)
nextflow(14.37 KB)
nextflow-22.10.1-all(86.30 MB)
v22.10.0(Oct 13, 2022)
Fix timestamp encoding [47a3a3c4]

Minor type change in Bridge executor [1f446ee1]

Bump [email protected] [326803ff]

Included in previous RC and edge releases:

22.10.0-RC3 - 7 Oct 2022

Fix K8s context selection [58b354e6]

22.10.0-RC2 - 7 Oct 2022

Improve K8s labels/annotation validation [a569afdf]

Bump fusion final url [80398880]

Bump [email protected] [a2b44c4d]

Update docs

22.10.0-RC1 - 3 Oct 2022

Add module binaries enabling flag + docs [c50e178f]

Add timestamp and fingerprint to wave request [a5a7e138]

Add missing inputs to the incremental task "test" (#1442) [ci fast] [f85d59a6]

Add support for refresh token to Wave [ed9f25f1]

Add pretty option to dump operator [ci fast] [4218b299]

Add support for custom S3 content type [02afa332]

Get rid of file name rolling for report files [a762ed59]

Ignore JGit warning when missing git tool [a94fa9c1]

Remove jobname limit to HyperQueue executor (#3251) [99604ccb]

Rename baseImage to mambaImage [ci fast] [50086028]

Fix failing test [ci fast] [e6790003]

Fix K8s cluster token when using serviceAccount [c3364d0f]

Fix hanging test [44c04874]

Improve docs (#3212) [ci skip] [5d80388c]

Bump fusion snapshot [ci skip] [8e03f655]

Bump wave endpoint [a044cc6a]

Bump [email protected] [7424dc4b]

Bump fusion config v0.5.1 [4dbdf112]

Bump [email protected]

Bump [email protected]

Bump [email protected]

22.09.7-edge - 28 Sep 2022

Fix Issue copying file bigger than 5gb to S3 [18fd9a44]

Fix chmod command to accommodate hidden files in bindir (or empty bindir) (#3247) [a0fcc7b0]

Bump [email protected] [f7f96e6f]

22.09.6-edge - 26 Sep 2022

Add SocketTimeoutException to k8s client request retry [527e0d5d]

Add MaxErrorRetry to K8s config [58be2128]

Add tags propagation to AWS Batch [d64eeffc]

Fix task resume when updating fusion layer [f38fd2db]

Fix Channel merge still deprecated for DSL2 (#3220) [d27384d2]

Apply GCP resourceLabels to the VirtualMachine (#3234) [2275c03c]

Update Google Batch mount point with the requirements [5aec28ac]

Improve wave error reporting [73842215]

Bump fusion 0.4.x [26f1f896]

22.09.5-edge - 21 Sep 2022

Use default wave strategy [abbfa7f4]

Handle errors reported by tower report writer [0e814647]

Fix AWS S3 copy object [b3b90d23]

22.09.4-edge - 19 Sep 2022

Add Fusion display name [f789d457]

Add container cleanup [cd2ae7dc]

Add Wave interactive debug session [ce7fa651]

Add support for wave build and cache repositories [692043ff]

Add shutdown to Google Batch client [8f413cf7]

Add native_id to Google Batch handler [352b4239]

Add java sts library to enable use of IRSA in k8s (#3207) [62df42c3]

Add support for module custom bin dirs [77f55262]

Add support for tower token to wave client [928d5b04]

Update CLI docs (#3200) [8acebee6]

Fix issue with empty report file [9cc4f079]

Do not return resource bundle for root module [775c7ed9]

Improve tower config [ee03c243]

Bump groovy 3.0.13 [4a17e198]

22.09.3-edge - 10 Sep 2022

Add fusion support to K8s executor (#3142) [6bb27b32]

Fix shutdown/cleanup hooks invocation [f4185070]

Fix Use smaller buffer size for S3 stream uploader [8c643074] [9926d15d]

Fix Azure NPE on missing pool opts [d5c0aabd]

Fix handling of targetDir when using Fusion fs [2091b272]

Document aws.batch.retryMode config (#3195) [56f75e0c]

22.09.2-edge - 7 Sep 2022

Fix thread pool race condition on shutdown [8d2b0587]

Fix Intermediate multipart upload requires a minimum size (#3193) [0b66aed6]

Fix fusion enable detection [3ef91512]

Add before-afterScript warning to docs (#3167) [09464590]

Add httpReadTimeout and httpConnectTimeout config to K8s client [064f9bc4]

Add support for Wave build & cache repos [98a275ba]

Finalise secrets feature (#3161) [49021b82]

Update executor retry config docs (#3001) [aed6c234]

Change Azure test pool name [0c724504]

Improve Wave error reporting [b11d0f11]

Remove unneeded launcher file remapping [a255118d]

Update Azure vm types [80f5fbe4]

Update docs logos (#3174) [529bad81]

22.09.1-edge - 1 Sep 2022

Add support for Charliecloud v0.28 (#3116) [84f43a33] <Patrick Hüther>

Add Support for EC-encrypted keys for K8s client [fd759d09]

Add support for Bridge batch scheduler (#3106) [343c17e6]

Add fusion support to local executor [17160bb0] [6cfb51e7]

Add getTags and getBranches to BitBucketServer [53bd89cd]

Add retry strategy to httpfs client [55f9c87b]

Add support for project resources [c2ad6566]

Add mamba build options [987a13cb]

Fix Do not override tower endpoint in the config [41fb1ad0]

Fix Hyperqueue job names must be less than 40 chars #3151 [8e43670b]

Fix typo in ConfigBuilder.groovy (#3143) [659e6108]

Fix Resume for dynamic resolved containers [13483ff2]

Improve fusion env handling [10f35b60]

Improve foreign file(s) cache detection logic [3a9352c8]

Rename ModuleBundle to ResourcesBundle [0e51dc0f]

Use quiet output mode for hyperqueue executor (#3103) [70a91fdf]

Wave improve conda settings [6f087fec]

Improve secrets cmd (#3158) [115b2f3d]

Improve Wave resolution strategy [2eb700c6]

Improve Az Batch err handling and testing [85d31e8d]

Bump google-cloud-batch 0.2.2

Bump spock 2.2

22.08.2-edge - 16 Aug 2022

Fix queueSize setting is not honoured by AWS Batch executor (again) #3117 [1733bb2e]

Add files() method to docs (#3123) [00bb8896]

Refactor wave packing [bc876986]

Improve logging [aa380d5f]

Update dockerfile [e6329282]

22.08.1-edge - 11 Aug 2022

Add support for disabling config include [e0859a12]

Add experimental fusion support [1854f1f2]

Add support for plugin provided function extension (#3079) [16230c2b]

Add support for AWS Batch logs group (#3092) [4ef043ac]

Add share identifier to Aws Batch (#3089) [c0253aba]

Improve Tower cache manager [0091afc5]

Improve S3 copy via xfer manager [02d2beae]

Reports a warning when using NXF vars in the config [009ec256]

Make Wave token cache duration configurable [5f955fc9]

Patch unable to start non-core plugin [a55f58ff]

Increase S3 upload chunk size to 100 MB [9c94a080]

Change Google Batch disk directive to override boot disk size (#3097) [7e1c0686]

Fix queueSize setting is not honoured by AWS Batch executor (#3093) [d07bb52b]

Fix Allow disabling scratch with Google Batch [e8e5c721]

Fix Emit relative path when relative parameter is set (#3072) [39797759]

Bump [email protected] [e46d341d]

Bump [email protected] [cdc2be53]

Bump [email protected] [c39935a5]

Bump [email protected] [ccdf62d0]

22.08.0-edge - 1 Aug 2022

Add warning to env config docs (#3083) [ca933c16]

Add -with-conda CLI option (#3073) [98b2ac80]

Add simple wave plugin cli commands [8888b866]

Add default wave plugin [7793a0ec]

Add boot disk, cpu platform to google batch (#3058) [17a8483d]

Add support for GPU accelerator to Google Batch (#3056) [f34ad7f6]

Add support for archive dir to tower plugin [c234681a]

Add support tower cache backup/restore [bc2f9d13]

Add disk directive to google batch (#3057) [ec6e290c]

Add retry when Azure submit fails with OperationTimedOut [6a3f9742]

Add warning when Google Batch quota is exceeded (#3066) [6b9c52ad]

Allow fully disabling history file [0a45f858]

Allow the support function overloading and default parameters (#3011) [042d3857]

Improve S3 file upload/download via Transfer manager [7e8d2a5a]

Prevent overriding container entrypoint [b3a4bf85]

Update FileTransfer pool settings [503aafce]

Remove deprecated commands [93228b4b]

Prevent nextflow config to break tower launch [e059a724]

Refactor Google Batch executor to use Java API (#3044) [31a6e85c]

Fix Unable to disable scratch attribute with AWS Batch [1770f73a]

Fix unit test setting explicit permissions for test files [1c821139]

Fix Default plugins are overridden by config plugins [46cf3bfa]

Fix S3 transfer download directory [b7bf9fe5]

Fix NPE while setting S3 ObjectMetadata #3031 [d6163431]

Fix Unable to retrieve AWS batch instance type #1658 [3c4d4d3b]

Fix AWS Batch job definition conflict (#3048) [e5084418]

Fix JAVA_TOOL_OPTIONS breaking launch #1716 [0e7b416d]

Fix add ps shared objects to Dockerfile (#3033) [1c23b40a]

Parallelize build integration tests [807800a3]

Bump google-cloud-nio:0.124.8 [dfaa9d19]

Bump groovy 3.0.12 [5c900b91]

Bump Moment.js 2.29.4 [a9ced868]

Bump [email protected] [12f17176]

Bump [email protected]

Bump [email protected]

Bump [email protected]

Bump [email protected]

Bump [email protected]

22.07.1-edge - 13 Jul 2022

Add support for Google Batch API v1 [4c116d58] [e85d87ee]

Add time directive support for K8s executor (#2948) [2b6f70a8]

Add docs aws.client.s3PathStyleAccess config (#3000) [20005500]

Allow to override lsf.conf settings with nextflow config #2862 [dae191a1]

Allow hybrid containers execution [0af1bcb3]

Improve error msg when script file cannot be read [52c2780e]

Improve error reporting for custom function [877c7931]

Improve error message for missing plugin extension [4a43db84]

Improve test #3019 [7c37e0be]

Rename kuberun -pod-image to -head-image [2576ba62]

Externalise sqldb plugin source code [17e80b4f]

Fix escape unstage outputs with double quotes #2912 #2904 #2790 [49ff02a6]

Fix Exception when settings AWS Batch containerOptions #3019 [89312ad8]

Fix Missing query param in http file (#2918) [43cc8511]

Fix Publish copy mode for S3 based path [085f6b2b]

Fix Fail fast uploads to S3 (#2969) [7fd1a6e1]

Fix null script name in launch info [7118849f]

Bump [email protected] [a06b4442]

Bump [email protected] [3331826f]

Bump [email protected] [de62fd3f]

Bump [email protected] [3234ddd5]

22.07.0-edge - [SKIPPED]

22.06.1-edge - 17 Jun 2022

Fix CodeCommit creds handling + [email protected] [70fc0745]

Fix typo in log message [a8f8529d]

Add more scientists to the list of random names [8d5b36a2]

22.06.0-edge - 9 Jun 2022

Add AWS CodeCommit initial support [80fba6e9]

Add support for 307 and 308 HTTP redirection [92382012]

Add DirWatcher v2 [209c82cd]

Add Moriondo in the list of random names [e0abca58]

Add preview CLI option (#2914) [aa8f1aa4]

Fix Git config resolution error [64436697]

Fix StackOverflowError when dump all profiles (#2922) [28cd11a2]

Fix gradle warning message in nf-sqldb (#2921) [b09ceabe]

Fix log for LsfExecutor perTaskReserve attribute [7c3ec874]

Fix external pod deletion for jobs (#2915) [4dd1af7a]

Prevent function overloading in module definition [c0b522ab]

Improve error message of nonsensical include (#2623) [285fe49c]

Mount PWD path only when scratch is used [9b3c6e31]

Strip sensitive data from strings (#2908) [7fa4c86c]

Dump scm content when trace is enabled [c3117ada]

22.05.0-edge 25 May 2022

Add Hyperqueue executor (#2896) [ffa5712e]

Add support for K8s Job resource [c70eb12d]

Add support for time process directive in GLS executor (#2880) [1402e183]

Add support for privileged option for K8s containers [7ffe3a02]

Add DSL1 option to docs (#2836) [d30841a5]

Add support for container options to Azure Batch [3f4f00f9]

Add support for move operation to AWS S3 [8c0ddfd5]

Add K8s execution hostname in the trace file (#2828) [ebaef92a]

Add support for AWS S3 encryption using a custom KMS key [c1e45aa9]

Add support for Micromamba [383e023f]

Add jaxb-api dependency to nf-amazon [c1a09f87]

Add strict mode config setting [ci fast] [696e70b5]

Add -head-prescript option to kuberun (#2830) [9e387055]

Fix missing err message on submit failure [233e67f0] (#2899)

Fix resolve azure devops repositories when projectId is present [2500ff01]

Fix AthenaJdbc into distribution zip [853a1f2a] [4b3579d5] [70ef7ee3]

Fix Inconsistent bool parsing #2881 [40bf2b2a]

Fix Unable to pull pipeline if config file is not in default branch (#2876) [4ee5b04f]

Fix Prevent crash when scratch dir does not exist (#2888) [9ef44ae5]

Fix DSL1 detection to invalid workflow keyword matching [fe0700b0] (#2879)

Fix Aws Batch retry policy on spot reclaim [6e029b79]

Fix 'false' string in config interpreted as true (#2865) [079a18ce]

Improve Git Provider config logging [d7dbca8ec]

Improve K8s task handler [1822b2ca]

Improve missing workflow err message [da101e8f] (#2871)

Include revision in the Azure Repos provider when specified (#2861) [3342c767]

Remove unnecessary change dir echo [372d1f47]

Abort execution when accessing undefined params with strict mode [93836081]

Update docker base image [50cd7956]

Update default SKU for Azure Batch (#2868) [9ea09dba]

Update dependencies [405d9545]

Refactoring to prevent name conflict [aba2671b]

Few DSL syntax to explicit declaration of plugin extensions (#2820) [bfc4a067]

Sanitize k8s label and annotation keys, don't sanitize annotation value (#2843) [5287a984]

Docs improvement (#2835) [09e5bca3]

Bump Jgit 6.1 [7186348c]

Bump Spock 2.1 [51100d16]

Bump capsule 1.1.1 [20ec1697]

Source code(tar.gz)
Source code(zip)
nextflow(14.37 KB)
nextflow-22.10.0-all(86.30 MB)
v22.10.0-RC3(Oct 7, 2022)
Fix K8s context selection [58b354e6]

Source code(tar.gz)
Source code(zip)
nextflow(14.37 KB)
nextflow-22.10.0-RC3-all(86.30 MB)
v22.10.0-RC2(Oct 7, 2022)
Improve K8s labels/annotation validation [a569afdf]

Bump fusion final URL [80398880]

Bump [email protected] [a2b44c4d]

Update docs

Source code(tar.gz)
Source code(zip)
nextflow(14.37 KB)
nextflow-22.10.0-RC2-all(86.30 MB)
v22.10.0-RC1(Oct 3, 2022)
Add module binaries enabling flag + docs [c50e178f]

Add timestamp and fingerprint to wave request [a5a7e138]

Add missing inputs to the incremental task "test" (#1442) [ci fast] [f85d59a6]

Add support for refresh token to Wave [ed9f25f1]

Add pretty option to dump operator [ci fast] [4218b299]

Add support for custom S3 content type [02afa332]

Get rid of file name rolling for report files [a762ed59]

Ignore JGit warning when missing git tool [a94fa9c1]

Remove jobname limit to HyperQueue executor (#3251) [99604ccb]

Rename baseImage to mambaImage [ci fast] [50086028]

Fix failing test [ci fast] [e6790003]

Fix K8s cluster token when using serviceAccount [c3364d0f]

Fix hanging test [44c04874]

Improve docs (#3212) [ci skip] [5d80388c]

Bump fusion snapshot [ci skip] [8e03f655]

Bump wave endpoint [a044cc6a]

Bump [email protected] [7424dc4b]

Bump fusion config v0.5.1 [4dbdf112]

Bump [email protected]

Bump [email protected]

Bump [email protected]

Source code(tar.gz)
Source code(zip)
nextflow(14.37 KB)
nextflow-22.10.0-RC1-all(86.30 MB)
v22.09.7-edge(Sep 28, 2022)
Fix Issue copying file bigger than 5gb to S3 [18fd9a44]

Fix chmod command to accommodate hidden files in bindir (or empty bindir) (#3247) [a0fcc7b0]

Bump [email protected] [f7f96e6f]

Source code(tar.gz)
Source code(zip)
nextflow(14.38 KB)
nextflow-22.09.7-edge-all(86.29 MB)
v22.09.6-edge(Sep 26, 2022)
Add SocketTimeoutException to k8s client request retry [527e0d5d]

Add MaxErrorRetry to K8s config [58be2128]

Add tags propagation to AWS Batch [d64eeffc]

Fix task resume when updating fusion layer [f38fd2db]

Fix Channel merge still deprecated for DSL2 (#3220) [d27384d2]

Apply GCP resourceLabels to the VirtualMachine (#3234) [2275c03c]

Update Google Batch mount point with the requirements [5aec28ac]

Improve wave error reporting [73842215]

Bump fusion 0.4.x [26f1f896]

Source code(tar.gz)
Source code(zip)
nextflow(14.38 KB)
nextflow-22.09.6-edge-all(86.29 MB)
v22.09.5-edge(Sep 21, 2022)
Use default wave strategy [abbfa7f4]

Handle errors reported by tower report writer [0e814647]

Fix AWS S3 copy object [b3b90d23]

Source code(tar.gz)
Source code(zip)
nextflow(14.38 KB)
nextflow-22.09.5-edge-all(86.29 MB)
v22.09.4-edge(Sep 19, 2022)
Add Fusion display name [f789d457]

Add container cleanup [cd2ae7dc]

Add Wave interactive debug session [ce7fa651]

Add support for wave build and cache repositories [692043ff]

Add shutdown to Google Batch client [8f413cf7]

Add native_id to Google Batch handler [352b4239]

Add java sts library to enable use of IRSA in k8s (#3207) [62df42c3]

Add support for module custom bin dirs [77f55262]

Add support for tower token to wave client [928d5b04]

Update CLI docs (#3200) [8acebee6]

Fix issue with empty report file [9cc4f079]

Do not return resource bundle for root module [775c7ed9]

Improve tower config [ee03c243]

Bump groovy 3.0.13 [4a17e198]

Source code(tar.gz)
Source code(zip)
nextflow(14.38 KB)
nextflow-22.09.4-edge-all(86.29 MB)
v22.09.3-edge(Sep 10, 2022)
Add Fusion support to K8s executor (#3142) [6bb27b32]

Fix shutdown/cleanup hooks invocation [f4185070]

Fix Use smaller buffer size for S3 stream uploader [8c643074] [9926d15d]

Fix Azure NPE on missing pool opts [d5c0aabd]

Fix handling of targetDir when using Fusion fs [2091b272]

Document aws.batch.retryMode config (#3195) [56f75e0c]

Source code(tar.gz)
Source code(zip)
nextflow(14.38 KB)
nextflow-22.09.3-edge-all(86.17 MB)
v22.09.2-edge(Sep 7, 2022)
Fix thread pool race condition on shutdown [8d2b0587]

Fix Intermediate multipart upload requires a minimum size (#3193) [0b66aed6]

Fix fusion enable detection [3ef91512]

Add before-afterScript warning to docs (#3167) [09464590]

Add httpReadTimeout and httpConnectTimeout config to K8s client [064f9bc4]

Add support for Wave build & cache repos [98a275ba]

Finalise secrets feature (#3161) [49021b82]

Update executor retry config docs (#3001) [aed6c234]

Change Azure test pool name [0c724504]

Improve Wave error reporting [b11d0f11]

Remove unneeded launcher file remapping [a255118d]

Update Azure vm types [80f5fbe4]

Update docs logos (#3174) [529bad81]

Source code(tar.gz)
Source code(zip)
nextflow(14.38 KB)
nextflow-22.09.2-edge-all(86.16 MB)
v22.09.1-edge(Sep 1, 2022)
Add support for Charliecloud v0.28 (#3116) [84f43a33] <Patrick Hüther>

Add Support for EC-encrypted keys for K8s client [fd759d09]

Add support for Bridge batch scheduler (#3106) [343c17e6]

Add fusion support to local executor [17160bb0] [6cfb51e7]

Add getTags and getBranches to BitBucketServer [53bd89cd]

Add retry strategy to httpfs client [55f9c87b]

Add support for project resources [c2ad6566]

Add mamba build options [987a13cb]

Fix Do not override tower endpoint in the config [41fb1ad0]

Fix Hyperqueue job names must be less than 40 chars #3151 [8e43670b]

Fix typo in ConfigBuilder.groovy (#3143) [659e6108]

Fix Resume for dynamically resolved containers [13483ff2]

Improve fusion env handling [10f35b60]

Improve foreign file(s) cache detection logic [3a9352c8]

Rename ModuleBundle to ResourcesBundle [0e51dc0f]

Use quiet output mode for Hyperqueue executor (#3103) [70a91fdf]

Wave improve Conda settings [6f087fec]

Improve secrets cmd (#3158) [115b2f3d]

Improve Wave resolution strategy [2eb700c6]

Improve Az Batch err handling and testing [85d31e8d]

Bump google-cloud-batch 0.2.2

Bump Spock 2.2

Source code(tar.gz)
Source code(zip)
nextflow(14.38 KB)
nextflow-22.09.1-edge-all(86.16 MB)
v22.08.2-edge(Aug 16, 2022)
Fix queueSize setting is not honoured by AWS Batch executor (again) #3117 [1733bb2e]

Add files() method to docs (#3123) [00bb8896]

Refactor wave packing [bc876986]

Improve logging [aa380d5f]

Update dockerfile [e6329282]

Source code(tar.gz)
Source code(zip)
nextflow(14.38 KB)
nextflow-22.08.2-edge-all(85.71 MB)
v22.08.1-edge(Aug 11, 2022)
Add support for disabling config include [e0859a12]

Add experimental fusion support [1854f1f2]

Add support for plugin provided function extension (#3079) [16230c2b]

Add support for AWS Batch logs group (#3092) [4ef043ac]

Add share identifier to Aws Batch (#3089) [c0253aba]

Improve Tower cache manager [0091afc5]

Improve S3 copy via xfer manager [02d2beae]

Reports a warning when using NXF vars in the config [009ec256]

Make Wave token cache duration configurable [5f955fc9]

Patch unable to start non-core plugin [a55f58ff]

Increase S3 upload chunk size to 100 MB [9c94a080]

Change Google Batch disk directive to override boot disk size (#3097) [7e1c0686]

Fix queueSize setting is not honoured by AWS Batch executor (#3093) [d07bb52b]

Fix Allow disabling scratch with Google Batch [e8e5c721]

Fix Emit relative path when relative parameter is set (#3072) [39797759]

Bump [email protected] [e46d341d]

Bump [email protected] [cdc2be53]

Bump [email protected] [c39935a5]

Bump [email protected] [ccdf62d0]

Source code(tar.gz)
Source code(zip)
nextflow(14.38 KB)
nextflow-22.08.1-edge-all(85.71 MB)
v22.08.0-edge(Aug 1, 2022)
Add warning to env config docs (#3083) [ca933c16]

Add -with-conda CLI option (#3073) [98b2ac80]

Add simple wave plugin cli commands [8888b866]

Add default wave plugin [7793a0ec]

Add boot disk, cpu platform to Google Batch (#3058) [17a8483d]

Add support for GPU accelerator to Google Batch (#3056) [f34ad7f6]

Add support for archive dir to tower plugin [c234681a]

Add support tower cache backup/restore [bc2f9d13]

Add disk directive to google batch (#3057) [ec6e290c]

Add retry when Azure submit fails with OperationTimedOut [6a3f9742]

Add warning when Google Batch quota is exceeded (#3066) [6b9c52ad]

Allow fully disabling history file [0a45f858]

Allow the support function overloading and default parameters (#3011) [042d3857]

Improve S3 file upload/download via Transfer manager [7e8d2a5a]

Prevent overriding container entrypoint [b3a4bf85]

Update FileTransfer pool settings [503aafce]

Remove deprecated commands [93228b4b]

Prevent nextflow config to break tower launch [e059a724]

Refactor Google Batch executor to use Java API (#3044) [31a6e85c]

Fix Unable to disable scratch attribute with AWS Batch [1770f73a]

Fix unit test setting explicit permissions for test files [1c821139]

Fix Default plugins are overridden by config plugins [46cf3bfa]

Fix S3 transfer download directory [b7bf9fe5]

Fix NPE while setting S3 ObjectMetadata #3031 [d6163431]

Fix Unable to retrieve AWS batch instance type #1658 [3c4d4d3b]

Fix AWS Batch job definition conflict (#3048) [e5084418]

Fix JAVA_TOOL_OPTIONS breaking launch #1716 [0e7b416d]

Fix add ps shared objects to Dockerfile (#3033) [1c23b40a]

Parallelize build integration tests [807800a3]

Bump google-cloud-nio:0.124.8 [dfaa9d19]

Bump groovy 3.0.12 [5c900b91]

Bump Moment.js 2.29.4 [a9ced868]

Bump [email protected] [12f17176]

Bump [email protected]

Bump [email protected]

Bump [email protected]

Bump [email protected]

Bump [email protected]

Breaking changes

Nextflow no longer overrides the container entrypoint with /bin/bash when using the Local, Kubernetes and batch scheduler executors. This change was made for consistency with the AWS, Google and Azure Batch executors, which do not set it either. Make sure the containers used in your pipeline use sh or bash as the default entrypoint. To keep the old behaviour, set the variable NXF_CONTAINER_ENTRYPOINT_OVERRIDE=true in the launch environment.
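A minimal sketch of opting back in to the old entrypoint behaviour for one launch is shown below. The script name `main.nf` is a placeholder, and the launch is guarded so the snippet does nothing harmful on a machine without Nextflow or without that file.

```shell
#!/bin/sh
# Sketch: restore the previous entrypoint behaviour for this shell session.
# ASSUMPTION: main.nf is a placeholder pipeline script name.
export NXF_CONTAINER_ENTRYPOINT_OVERRIDE=true

if command -v nextflow >/dev/null 2>&1 && [ -f main.nf ]; then
    # The variable is inherited by the Nextflow launcher from the environment.
    nextflow run main.nf -with-docker
else
    echo "nextflow or main.nf not found; NXF_CONTAINER_ENTRYPOINT_OVERRIDE=$NXF_CONTAINER_ENTRYPOINT_OVERRIDE"
fi
```

Exporting the variable affects every later launch from the same shell; prefixing a single command with the assignment instead would limit it to that one run.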

Conda environments defined in a process via the conda directive must now be enabled explicitly, using the CLI option -with-conda, the config setting conda.enabled=true, or the environment variable NXF_CONDA_ENABLED=true. See https://github.com/nextflow-io/nextflow/pull/3073 for details.
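The three opt-in routes named in this note can be sketched side by side as follows. Any one of them should be enough according to the note; `main.nf` and the temporary config location are placeholders, and the Nextflow commands are printed rather than executed so the snippet runs without Nextflow installed.

```shell
#!/bin/sh
# Sketch: the three ways to enable the conda directive listed in the note.
# ASSUMPTION: main.nf and the temporary config path are placeholders.
workdir="$(mktemp -d)"
config="$workdir/nextflow.config"

# 1. Config setting, written to a throwaway config file.
printf 'conda.enabled = true\n' > "$config"

# 2. Environment variable, inherited by later nextflow launches.
export NXF_CONDA_ENABLED=true

# 3. CLI option. Commands are printed here instead of being run.
echo "nextflow run main.nf -with-conda"
echo "nextflow -c $config run main.nf"
```

The config route suits settings that should travel with the pipeline, while the CLI option and the environment variable are easier to toggle per run.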

Source code(tar.gz)
Source code(zip)
nextflow(14.38 KB)
nextflow-22.08.0-edge-all(85.69 MB)
v22.04.5(Jul 15, 2022)
Allow fully disabling history file [1a36c9bc]

Source code(tar.gz)
Source code(zip)
nextflow(14.28 KB)
nextflow-22.04.5-all(51.42 MB)
v22.07.1-edge(Jul 13, 2022)
Add support for Google Batch API v1 [4c116d58] [e85d87ee]

Add time directive support for K8s executor (#2948) [2b6f70a8]

Add docs aws.client.s3PathStyleAccess config (#3000) [20005500]

Allow to override lsf.conf settings with nextflow config #2862 [dae191a1]

Allow hybrid containers execution [0af1bcb3]

Improve error msg when script file cannot be read [52c2780e]

Improve error reporting for custom function [877c7931]

Improve error message for missing plugin extension [4a43db84]

Improve test #3019 [7c37e0be]

Rename kuberun -pod-image to -head-image [2576ba62]

Externalise sqldb plugin source code [17e80b4f]

Fix escape unstage outputs with double quotes #2912 #2904 #2790 [49ff02a6]

Fix Exception when settings AWS Batch containerOptions #3019 [89312ad8]

Fix Missing query param in http file (#2918) [43cc8511]

Fix Publish copy mode for S3 based path [085f6b2b]

Fix Fail fast uploads to S3 (#2969) [7fd1a6e1]

Fix null script name in launch info [7118849f]

Bump [email protected] [a06b4442]

Bump [email protected] [3331826f]

Bump [email protected] [de62fd3f]

Bump [email protected] [3234ddd5]

Source code(tar.gz)
Source code(zip)
nextflow(14.28 KB)
nextflow-22.07.1-edge-all(83.83 MB)
v22.04.4(Jun 19, 2022)
Fix Publish copy mode for s3 based path [bb510ce6]

Add strict mode config setting [b0567e62]

Update docker base image [b00c1418]

Source code(tar.gz)
Source code(zip)
nextflow(14.28 KB)
nextflow-22.04.4-all(51.42 MB)
v22.06.1-edge(Jun 17, 2022)

Source code(tar.gz)
Source code(zip)
nextflow(14.28 KB)
nextflow-22.06.1-edge-all(83.98 MB)
v22.06.0-edge(Jun 9, 2022)
Add AWS CodeCommit initial support [80fba6e9]

Add support for 307 and 308 HTTP redirection [92382012]

Add DirWatcher v2 [209c82cd]

Add Moriondo in the list of random names [e0abca58]

Add preview CLI option (#2914) [aa8f1aa4]

Fix Git config resolution error [64436697]

Fix StackOverflowError when dumping all profiles (#2922) [28cd11a2]

Fix gradle warning message in nf-sqldb (#2921) [b09ceabe]

Fix log for LsfExecutor perTaskReserve attribute [7c3ec874]

Fix external pod deletion for jobs (#2915) [4dd1af7a]

Prevent function overloading in module definition [c0b522ab]

Improve error message for nonsensical include (#2623) [285fe49c]

Mount PWD path only when scratch is used [9b3c6e31]

Strip sensitive data in strings (#2908) [7fa4c86c]

Dump scm content when trace is enabled [c3117ada]

Source code(tar.gz)
Source code(zip)
nextflow(14.28 KB)
nextflow-22.06.0-edge-all(83.98 MB)
v22.05.0-edge(May 25, 2022)
Add Hyperqueue executor (#2896) [ffa5712e]

Add support for K8s Job resource [c70eb12d]

Add support for time process directive in GLS executor (#2880) [1402e183]

Add support for privileged option for K8s containers [7ffe3a02]

Add DSL1 option to docs (#2836) [d30841a5]

Add support for container options to Azure Batch [3f4f00f9]

Add support for move operation to AWS S3 [8c0ddfd5]

Add K8s execution hostname in the trace file (#2828) [ebaef92a]

Add support for AWS S3 encryption using a custom KMS key [c1e45aa9]

Add support for Micromamba [383e023f]

Add jaxb-api dependency to nf-amazon [c1a09f87]

Add strict mode config setting [ci fast] [696e70b5]

Add -head-prescript option to kuberun (#2830) [9e387055]

Fix missing err message on submit failure [233e67f0] (#2899)

Fix resolving Azure DevOps repositories when projectId is present [2500ff01]

Fix AthenaJdbc into distribution zip [853a1f2a] [4b3579d5] [70ef7ee3]

Fix Inconsistent bool parsing #2881 [40bf2b2a]

Fix Unable to pull pipeline if config file is not in default branch (#2876) [4ee5b04f]

Fix Prevent crash when scratch dir does not exist (#2888) [9ef44ae5]

Fix DSL1 detection to invalid workflow keyword matching [fe0700b0] (#2879)

Fix Aws Batch retry policy on spot reclaim [6e029b79]

Fix 'false' string in config interpreted as true (#2865) [079a18ce]

Improve Git Provider config logging [d7dbca8ec]

Improve K8s task handler [1822b2ca]

Improve missing workflow err message [da101e8f] (#2871)

Include revision in the Azure Repos provider when specified (#2861) [3342c767]

Remove unnecessary change dir echo [372d1f47]

Abort execution when accessing undefined params with strict mode [93836081]

Update docker base image [50cd7956]

Update default SKU for Azure Batch (#2868) [9ea09dba]

Update dependencies [405d9545]

Refactoring to prevent name conflict [aba2671b]

Few DSL syntax to explicit declaration of plugin extensions (#2820) [bfc4a067]

Sanitize k8s label and annotation keys, don't sanitize annotation value (#2843) [5287a984]

Docs improvement (#2835) [09e5bca3]

Bump Jgit 6.1 [7186348c]

Bump Spock 2.1 [51100d16]

Bump capsule 1.1.1 [20ec1697]

Source code(tar.gz)
Source code(zip)
nextflow(14.28 KB)
nextflow-22.05.0-edge-all(49.55 MB)
v22.04.3(May 18, 2022)
Fix dsl1 detection (#2879) [1a7ea0d1]

Source code(tar.gz)
Source code(zip)
nextflow(14.28 KB)
nextflow-22.04.3-all(51.41 MB)
v22.04.2(May 16, 2022)
Fix StackOverflowError when probing DSL [a05fcbea]

Source code(tar.gz)
Source code(zip)
nextflow(14.28 KB)
nextflow-22.04.2-all(51.41 MB)
v22.04.1(May 15, 2022)
Improve dsl detection [739b959f]

Improve missing workflow err message [f3fc081b] (#2871)

Fix Aws Batch retry policy on spot reclaim [d855f0d9]

Update default SKU for Azure Batch (#2868) [be60fc14]

Bump nf-amazon 1.7.2

Bump nf-azure 0.13.2

Source code(tar.gz)
Source code(zip)
nextflow(14.28 KB)
nextflow-22.04.1-all(51.41 MB)