Overview

DeepConsensus

DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific Biosciences (PacBio) Circular Consensus Sequencing (CCS) data.

DeepConsensus overview diagram

Installation

From pip package

pip install deepconsensus==0.1.0

You can ignore errors regarding google-nucleus installation, such as ERROR: Failed building wheel for google-nucleus.
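If the installation succeeded, the command-line entry point should be available. A minimal smoke test, assuming the pip package installs a deepconsensus console script as later releases do:

deepconsensus --help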

From source

git clone https://github.com/google/deepconsensus.git
cd deepconsensus
source install.sh

(Optional) After running source install.sh, you can run all unit tests with:

./run_all_tests.sh

Usage

See the quick start.

Where does DeepConsensus fit into my pipeline?

After a PacBio sequencing run, DeepConsensus is run on the CCS reads and subreads to produce new, corrected reads in FASTQ format that can replace the CCS reads in downstream analyses.

See the quick start for an example of inputs and outputs.
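For orientation, the sketch below shows where DeepConsensus sits relative to PacBio's tools. ccs is PacBio's real CCS caller; the actc alignment step and the deepconsensus run flags mirror later quick starts and are illustrative, so defer to the quick start for the exact interface of your version.

# Hypothetical end-to-end sketch; flag names are illustrative.
ccs subreads.bam ccs.bam                          # draft CCS reads from subreads
actc subreads.bam ccs.bam subreads_to_ccs.bam     # align subreads to draft CCS reads
deepconsensus run \
  --subreads_to_ccs=subreads_to_ccs.bam \
  --ccs_bam=ccs.bam \
  --checkpoint=model/checkpoint \
  --output=output.fastq                           # corrected reads in FASTQ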

NOTE: This initial release of DeepConsensus (v0.1) is not yet optimized for speed and only runs on CPUs. We expect this version to be too slow for many uses. We are now prioritizing speed improvements, which we expect will bring runtimes to acceptable levels.

How to cite

If you are using DeepConsensus in your work, please cite:

DeepConsensus: Gap-Aware Sequence Transformers for Sequence Correction

Disclaimer

This is not an official Google product.

NOTE: the content of this research code repository (i) is not intended to be a medical device; and (ii) is not intended for clinical use of any kind, including but not limited to diagnosis or prognosis.

Comments
  • Error: ValueError: Shapes (2640, 280) and (560, 280) are incompatible

    Hi, I'm getting the error below:

    Total params: 9,525,067
    Trainable params: 9,525,067
    Non-trainable params: 0
    _________________________________________________________________
    Traceback (most recent call last):
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/saving/saveable_object_util.py", line 130, in restore
        assigned_variable = resource_variable_ops.shape_safe_assign_variable_handle(
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 308, in shape_safe_assign_variable_handle
        shape.assert_is_compatible_with(value_tensor.shape)
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/framework/tensor_shape.py", line 1291, in assert_is_compatible_with
        raise ValueError("Shapes %s and %s are incompatible" % (self, other))
    ValueError: Shapes (2640, 280) and (560, 280) are incompatible
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/opt/conda/envs/bio/bin/deepconsensus", line 8, in <module>
        sys.exit(run())
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/cli.py", line 111, in run
        app.run(main, flags_parser=parse_flags)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/cli.py", line 102, in main
        app.run(quick_inference.main, argv=passed)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/inference/quick_inference.py", line 814, in main
        outcome_counter = run()
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/inference/quick_inference.py", line 734, in run
        loaded_model, model_params = initialize_model(
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/inference/quick_inference.py", line 476, in initialize_model
        checkpoint.restore(
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 2537, in restore
        status = self.read(save_path, options=options)
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 2417, in read
        result = self._saver.restore(save_path=save_path, options=options)
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 1468, in restore
        base.CheckpointPosition(
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 295, in restore
        restore_ops = trackable._restore_from_checkpoint_position(self)  # pylint: disable=protected-access
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 1060, in _restore_from_checkpoint_position
        current_position.checkpoint.restore_saveables(tensor_saveables,
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 349, in restore_saveables
        new_restore_ops = functional_saver.MultiDeviceSaver(
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/saving/functional_saver.py", line 415, in restore
        restore_ops = restore_fn()
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/saving/functional_saver.py", line 398, in restore_fn
        restore_ops.update(saver.restore(file_prefix, options))
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/saving/functional_saver.py", line 112, in restore
        restore_ops[saveable.name] = saveable.restore(
      File "/home/user01/.local/lib/python3.8/site-packages/tensorflow/python/training/saving/saveable_object_util.py", line 133, in restore
        raise ValueError(
    ValueError: Received incompatible tensor with shape (560, 280) when attempting to restore variable with shape (2640, 280) and name model/transformer_input_condenser/kernel/.ATTRIBUTES/VARIABLE_VALUE.
    
    opened by gevro 15
  • Running deepconsensus results in "free(): invalid pointer" error

    I installed deepconsensus via pip in a virtualenv like this:

    virtualenv /apps/deepconsensus/1.0.0/python-3.8.2_cpu
    source /apps/deepconsensus/1.0.0/python-3.8.2_cpu/bin/activate
    pip install pyyaml==5.4.1 'deepconsensus[cpu]==1.0.0'
    

    I used pyyaml==5.4.1 since tf-models-official 2.10.0 requires pyyaml<6.0,>=5.1. I'm using Python 3.8.2.

    When I run deepconsensus, it fails even when just printing the help message. Running deepconsensus -h resulted in this error:

    2022-11-10 12:51:17.016037: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
    To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
    *** Error in `/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python': free(): invalid pointer: 0x00007f075c296c80 ***
    ======= Backtrace: =========
    /lib64/libc.so.6(+0x81329)[0x7f078b91d329]
    /lib64/libstdc++.so.6(_ZNSt6locale5_Impl16_M_install_facetEPKNS_2idEPKNS_5facetE+0x142)[0x7f075c000192]
    /lib64/libstdc++.so.6(_ZNSt6locale5_ImplC1Em+0x1e3)[0x7f075c0005e3]
    /lib64/libstdc++.so.6(+0x71555)[0x7f075c001555]
    /lib64/libpthread.so.0(+0x620b)[0x7f078c37920b]
    /lib64/libstdc++.so.6(+0x715a1)[0x7f075c0015a1]
    /lib64/libstdc++.so.6(_ZNSt6localeC2Ev+0x13)[0x7f075c0015e3]
    /lib64/libstdc++.so.6(_ZNSt8ios_base4InitC2Ev+0xbc)[0x7f075bffe43c]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so(+0xb1150)[0x7f075bdd0150]
    /lib64/ld-linux-x86-64.so.2(+0xf9c3)[0x7f078c7d59c3]
    /lib64/ld-linux-x86-64.so.2(+0x1459e)[0x7f078c7da59e]
    /lib64/ld-linux-x86-64.so.2(+0xf7d4)[0x7f078c7d57d4]
    /lib64/ld-linux-x86-64.so.2(+0x13b8b)[0x7f078c7d9b8b]
    /lib64/libdl.so.2(+0xfab)[0x7f078c16ffab]
    /lib64/ld-linux-x86-64.so.2(+0xf7d4)[0x7f078c7d57d4]
    /lib64/libdl.so.2(+0x15ad)[0x7f078c1705ad]
    /lib64/libdl.so.2(dlopen+0x31)[0x7f078c170041]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyImport_FindSharedFuncptr+0x16b)[0x539abb]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyImport_LoadDynamicModuleWithSpec+0x159)[0x503e69]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x501a23]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x46f563]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(PyVectorcall_Call+0x5c)[0x439d8c]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x76d8)[0x42a308]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyFunction_Vectorcall+0x90)[0x438570]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x5f91)[0x428bc1]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x421571]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x1fb5)[0x424be5]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x421571]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x421571]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x421571]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x421571]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x437f74]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyObject_CallMethodIdObjArgs+0xf1)[0x439831]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(PyImport_ImportModuleLevelObject+0x3fd)[0x502c8d]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x5ee426]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x437c24]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x76d8)[0x42a308]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyFunction_Vectorcall+0x90)[0x438570]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyFunction_Vectorcall+0x90)[0x438570]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x437f74]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyObject_CallMethodIdObjArgs+0xf1)[0x439831]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(PyImport_ImportModuleLevelObject+0x4e6)[0x502d76]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x6e78)[0x429aa8]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(PyEval_EvalCode+0x23)[0x4e1b43]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x5efe34]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x46f563]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(PyVectorcall_Call+0x5c)[0x439d8c]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x76d8)[0x42a308]
    /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
    ======= Memory map: ========
    00400000-006f3000 r-xp 00000000 00:31 37327427                           /zapps7/python/3.8.2/gcc-9.2.0/bin/python3.8
    008f2000-008f3000 r--p 002f2000 00:31 37327427                           /zapps7/python/3.8.2/gcc-9.2.0/bin/python3.8
    008f3000-0092b000 rw-p 002f3000 00:31 37327427                           /zapps7/python/3.8.2/gcc-9.2.0/bin/python3.8
    0092b000-0094c000 rw-p 00000000 00:00 0 
    01dd2000-03209000 rw-p 00000000 00:00 0                                  [heap]
    7f0754000000-7f0754021000 rw-p 00000000 00:00 0 
    7f0754021000-7f0758000000 ---p 00000000 00:00 0 
    7f075bd1f000-7f075bdbc000 r--p 00000000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
    7f075bdbc000-7f075bf0b000 r-xp 0009d000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
    7f075bf0b000-7f075bf7f000 r--p 001ec000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
    7f075bf7f000-7f075bf80000 ---p 00260000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
    7f075bf80000-7f075bf85000 r--p 00260000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
    7f075bf85000-7f075bf8f000 rw-p 00265000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
    zsh: abort      deepconsensus -h
    

    Is there something different I should have done when installing?
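
    One hedged workaround, assuming the abort comes from protobuf's C++ extension clashing with the system libstdc++ (the backtrace dies inside _message.cpython-38-x86_64-linux-gnu.so during static initialization): force protobuf's pure-Python implementation, which skips loading that shared object at the cost of slower protobuf parsing.

    # Assumption: the crash originates in protobuf's C++ extension.
    # The pure-Python implementation never loads _message*.so.
    export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
    deepconsensus -h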

    opened by mjg0 14
  • Error (ccs software) - No space left on device (tmp file).

    Dear @pichuan,

    Using the Docker setup, I installed the second version of the software on our cluster and ran the tests successfully. Now, during tests with real data, we have run into some issues with the ccs software. Usually we run the software on the compute nodes, and the output is written to the front-end. The front-end has more than 1 PB of space, while the nodes only have about 60 GB. The tmp files seem to be saved on the node, right? Is it possible to relocate these temp files to another path?

    The error is shown below.

    | 20220123 11:27:44.871 | FATAL | Could not write BAM record to /tmp/13552.1.all.q/thread.7_0.ccs.bam
    | 20220123 11:27:44.982 | FATAL | Caught existing deep IO exception, ignoring thread 13
    | 20220123 11:27:44.985 | FATAL | Previous exception in DraftStage, aborting thread 13
    [the "Previous exception in DraftStage, aborting thread N" line repeats many times across threads 0-29]
    | 20220123 11:27:44.987 | FATAL | Previous exception in Stage DraftPolish. Pumping buffers empty!
    | 20220123 11:27:44.988 | FATAL | Exception thrown in CCSWF
    | 20220123 11:27:52.068 | FATAL | ccs ERROR: [pbbam] BAM writer ERROR: could not write record: file: /tmp/13552.1.all.q/thread.7_0.ccs.bam.tmp reason: No space left on device
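
    A hedged note on relocating the temp files: many tools, ccs included, take their scratch directory from the standard TMPDIR environment variable, so pointing TMPDIR at a filesystem with enough space before launching ccs may keep it off the small node-local /tmp (verify against the pbccs documentation for your version).

    # Sketch, assuming ccs honors the conventional TMPDIR variable;
    # /large/fs/tmp is a hypothetical path on a roomy filesystem.
    export TMPDIR=/large/fs/tmp
    ccs subreads.bam out.ccs.bam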

    Best Regard

    André

    opened by AMMMachado 12
  • Training tutorial?

    Curious if there will be a tutorial (sorry if I missed it somewhere in the repo) for training a custom DeepConsensus model on non-human PacBio HiFi reads? I tried DeepConsensus on bacterial PacBio HiFi/CCS reads and, as expected, it does not perform as well as it does on human data.

    opened by jelber2 11
  • Chunking with deepconsensus

    Hi, Is there a way to do chunking with deepconsensus? For datasets where ccs was previously run and the chunks were then merged, it would help, because deepconsensus could then do some form of chunking itself rather than requiring ccs chunking to be redone on the original subreads. (A workaround sketch follows below.)
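
    Until native chunking exists, one workaround is to shard the run yourself, as sketched below. The ccs --chunk flag is real; the actc step and the deepconsensus run flags mirror later quick starts and are illustrative. Note this sketch does re-run ccs per chunk.

    # Workaround sketch: process N shards independently, then concatenate.
    N=10
    for i in $(seq 1 "$N"); do
      ccs --chunk "${i}/${N}" subreads.bam "ccs.${i}.bam"
      actc subreads.bam "ccs.${i}.bam" "subreads_to_ccs.${i}.bam"
      deepconsensus run \
        --subreads_to_ccs="subreads_to_ccs.${i}.bam" \
        --ccs_bam="ccs.${i}.bam" \
        --checkpoint=model/checkpoint \
        --output="output.${i}.fastq"
    done
    cat output.*.fastq > output.fastq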

    opened by gevro 7
  • CPU installation on HPC, no sudo, no docker/singularity

    Hello! :wave:

    I am trying to install DeepConsensus on an HPC environment (no GPU) without root permissions, and without Docker/Singularity access. I am approaching this by making a deepconsensus venv and trying to install with a modified install script. I have done the following steps:

    1. python3 -m venv $SCRATCH/venvs/deepconsensus_venv_1
    2. source deepconsensus_venv_1/bin/activate
    3. source install_edit.sh, where install_edit.sh skips the apt-get steps and just upgrades pip + installs requirements.txt & intel-tensorflow for the venv

    Here are the contents of install_edit.sh:

    #!/bin/bash
    # Copyright (c) 2021, Google Inc.
    # All rights reserved.
    # 
    # Redistribution and use in source and binary forms, with or without modification,
    # are permitted provided that the following conditions are met:
    # 
    # 1. Redistributions of source code must retain the above copyright notice, this
    #    list of conditions and the following disclaimer.
    # 
    # 2. Redistributions in binary form must reproduce the above copyright notice,
    #    this list of conditions and the following disclaimer in the documentation
    #    and/or other materials provided with the distribution.
    # 
    # 3. Neither the name of Google Inc. nor the names of its contributors
    #    may be used to endorse or promote products derived from this software without
    #    specific prior written permission.
    # 
    # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
    # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
    # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
    # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
    # ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
    # (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
    # LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
    # ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
    # SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
    # Usage:  source install.sh
    #
    # This script installs all the packages required to build DeepConsensus.
    #
    # This script will run as-is on Ubuntu 20.04.
    #
    # We also assume that apt-get is already installed and available.
    
    function note_build_stage {
      echo "========== [$(date)] Stage '${1}' starting"
    }
    
    # Update package list
    ################################################################################
    
    # Install pip
    ################################################################################
    python3 -m pip install --upgrade pip
    
    # Update PATH so that newly installed pip is the one we actually use.
    export PATH="$SCRATCH/venvs/deepconsensus_venv_1/bin:$PATH"
    echo "$(pip --version)"
    
    # Install python packages used by DeepConsensus.
    ################################################################################
    python3 -m pip install -r requirements.txt
    python3 -m pip install "intel-tensorflow>=2.4.0,<=2.7.0"
    

    And here is the output from running that install script:

    (deepconsensus_venv_1) [labueg@login04 /lustre/fs5/vgl/scratch/labueg/deepconsensus]$ source install_edit.sh 
    Collecting pip
      Using cached pip-22.1.2-py3-none-any.whl (2.1 MB)
    Installing collected packages: pip
      Attempting uninstall: pip
        Found existing installation: pip 20.2.3
        Uninstalling pip-20.2.3:
          Successfully uninstalled pip-20.2.3
    Successfully installed pip-22.1.2
    pip 22.1.2 from /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages/pip (python 3.8)
    Collecting numpy>=1.19
      Using cached numpy-1.23.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
    Collecting pandas>=1.1
      Using cached pandas-1.4.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.7 MB)
    Collecting tf-models-official<=2.7.0,>=2.4.0
      Using cached tf_models_official-2.7.0-py2.py3-none-any.whl (1.8 MB)
    Collecting ml_collections>=0.1.0
      Using cached ml_collections-0.1.1.tar.gz (77 kB)
      Preparing metadata (setup.py) ... done
    Collecting absl-py>=0.13.0
      Using cached absl_py-1.1.0-py3-none-any.whl (123 kB)
    Collecting pysam
      Using cached pysam-0.19.1.tar.gz (3.9 MB)
      Preparing metadata (setup.py) ... done
    Collecting python-dateutil>=2.8.1
      Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
    Collecting pytz>=2020.1
      Using cached pytz-2022.1-py2.py3-none-any.whl (503 kB)
    Collecting py-cpuinfo>=3.3.0
      Using cached py-cpuinfo-8.0.0.tar.gz (99 kB)
      Preparing metadata (setup.py) ... done
    Collecting opencv-python-headless
      Using cached opencv_python_headless-4.6.0.66-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (48.3 MB)
    Collecting tensorflow-datasets
      Using cached tensorflow_datasets-4.6.0-py3-none-any.whl (4.3 MB)
    Collecting gin-config
      Using cached gin_config-0.5.0-py3-none-any.whl (61 kB)
    Collecting tensorflow-hub>=0.6.0
      Using cached tensorflow_hub-0.12.0-py2.py3-none-any.whl (108 kB)
    Collecting tensorflow-text>=2.7.0
      Using cached tensorflow_text-2.9.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.6 MB)
    Collecting Cython
      Using cached Cython-0.29.30-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (1.9 MB)
    Collecting oauth2client
      Using cached oauth2client-4.1.3-py2.py3-none-any.whl (98 kB)
    Collecting scipy>=0.19.1
      Using cached scipy-1.8.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (41.6 MB)
    Collecting six
      Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
    Collecting tensorflow-model-optimization>=0.4.1
      Using cached tensorflow_model_optimization-0.7.2-py2.py3-none-any.whl (237 kB)
    Collecting pycocotools
      Using cached pycocotools-2.0.4-cp38-cp38-linux_x86_64.whl
    Collecting kaggle>=1.3.9
      Using cached kaggle-1.5.12.tar.gz (58 kB)
      Preparing metadata (setup.py) ... done
    Collecting tensorflow>=2.7.0
      Using cached tensorflow-2.9.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (511.7 MB)
    Collecting matplotlib
      Using cached matplotlib-3.5.2-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (11.3 MB)
    Collecting seqeval
      Using cached seqeval-1.2.2.tar.gz (43 kB)
      Preparing metadata (setup.py) ... done
    Collecting tensorflow-addons
      Using cached tensorflow_addons-0.17.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
    Collecting pyyaml>=5.1
      Using cached PyYAML-6.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (701 kB)
    Collecting google-api-python-client>=1.6.7
      Using cached google_api_python_client-2.51.0-py2.py3-none-any.whl (8.6 MB)
    Collecting psutil>=5.4.3
      Using cached psutil-5.9.1-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (284 kB)
    Collecting sacrebleu
      Using cached sacrebleu-2.1.0-py3-none-any.whl (92 kB)
    Collecting Pillow
      Using cached Pillow-9.1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.1 MB)
    Collecting sentencepiece
      Using cached sentencepiece-0.1.96-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
    Collecting tf-slim>=1.1.0
      Using cached tf_slim-1.1.0-py2.py3-none-any.whl (352 kB)
    Collecting contextlib2
      Using cached contextlib2-21.6.0-py2.py3-none-any.whl (13 kB)
    Collecting google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5
      Using cached google_api_core-2.8.2-py3-none-any.whl (114 kB)
    Collecting google-auth-httplib2>=0.1.0
      Using cached google_auth_httplib2-0.1.0-py2.py3-none-any.whl (9.3 kB)
    Collecting google-auth<3.0.0dev,>=1.16.0
      Using cached google_auth-2.8.0-py2.py3-none-any.whl (164 kB)
    Collecting httplib2<1dev,>=0.15.0
      Using cached httplib2-0.20.4-py3-none-any.whl (96 kB)
    Collecting uritemplate<5,>=3.0.1
      Using cached uritemplate-4.1.1-py2.py3-none-any.whl (10 kB)
    Collecting certifi
      Using cached certifi-2022.6.15-py3-none-any.whl (160 kB)
    Collecting requests
      Using cached requests-2.28.0-py3-none-any.whl (62 kB)
    Collecting tqdm
      Using cached tqdm-4.64.0-py2.py3-none-any.whl (78 kB)
    Collecting python-slugify
      Using cached python_slugify-6.1.2-py2.py3-none-any.whl (9.4 kB)
    Collecting urllib3
      Using cached urllib3-1.26.9-py2.py3-none-any.whl (138 kB)
    Collecting grpcio<2.0,>=1.24.3
      Using cached grpcio-1.47.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.5 MB)
    Collecting tensorboard<2.10,>=2.9
      Using cached tensorboard-2.9.1-py3-none-any.whl (5.8 MB)
    Collecting tensorflow-io-gcs-filesystem>=0.23.1
      Using cached tensorflow_io_gcs_filesystem-0.26.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (2.4 MB)
    Collecting libclang>=13.0.0
      Using cached libclang-14.0.1-py2.py3-none-manylinux1_x86_64.whl (14.5 MB)
    Collecting protobuf<3.20,>=3.9.2
      Using cached protobuf-3.19.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
    Collecting keras-preprocessing>=1.1.1
      Using cached Keras_Preprocessing-1.1.2-py2.py3-none-any.whl (42 kB)
    Collecting opt-einsum>=2.3.2
      Using cached opt_einsum-3.3.0-py3-none-any.whl (65 kB)
    Collecting google-pasta>=0.1.1
      Using cached google_pasta-0.2.0-py3-none-any.whl (57 kB)
    Collecting flatbuffers<2,>=1.12
      Using cached flatbuffers-1.12-py2.py3-none-any.whl (15 kB)
    Collecting h5py>=2.9.0
      Using cached h5py-3.7.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (4.5 MB)
    Collecting gast<=0.4.0,>=0.2.1
      Using cached gast-0.4.0-py3-none-any.whl (9.8 kB)
    Collecting wrapt>=1.11.0
      Using cached wrapt-1.14.1-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (81 kB)
    Collecting packaging
      Using cached packaging-21.3-py3-none-any.whl (40 kB)
    Collecting tensorflow-estimator<2.10.0,>=2.9.0rc0
      Using cached tensorflow_estimator-2.9.0-py2.py3-none-any.whl (438 kB)
    Requirement already satisfied: setuptools in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorflow>=2.7.0->tf-models-official<=2.7.0,>=2.4.0->-r requirements.txt (line 3)) (49.2.1)
    Collecting keras<2.10.0,>=2.9.0rc0
      Using cached keras-2.9.0-py2.py3-none-any.whl (1.6 MB)
    Collecting astunparse>=1.6.0
      Using cached astunparse-1.6.3-py2.py3-none-any.whl (12 kB)
    Collecting termcolor>=1.1.0
      Using cached termcolor-1.1.0.tar.gz (3.9 kB)
      Preparing metadata (setup.py) ... done
    Collecting typing-extensions>=3.6.6
      Using cached typing_extensions-4.2.0-py3-none-any.whl (24 kB)
    Collecting dm-tree~=0.1.1
      Using cached dm_tree-0.1.7-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (142 kB)
    Collecting kiwisolver>=1.0.1
      Using cached kiwisolver-1.4.3-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.2 MB)
    Collecting pyparsing>=2.2.1
      Using cached pyparsing-3.0.9-py3-none-any.whl (98 kB)
    Collecting cycler>=0.10
      Using cached cycler-0.11.0-py3-none-any.whl (6.4 kB)
    Collecting fonttools>=4.22.0
      Using cached fonttools-4.33.3-py3-none-any.whl (930 kB)
    Collecting pyasn1-modules>=0.0.5
      Using cached pyasn1_modules-0.2.8-py2.py3-none-any.whl (155 kB)
    Collecting pyasn1>=0.1.7
      Using cached pyasn1-0.4.8-py2.py3-none-any.whl (77 kB)
    Collecting rsa>=3.1.4
      Using cached rsa-4.8-py3-none-any.whl (39 kB)
    Collecting colorama
      Using cached colorama-0.4.5-py2.py3-none-any.whl (16 kB)
    Collecting regex
      Using cached regex-2022.6.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (764 kB)
    Collecting portalocker
      Using cached portalocker-2.4.0-py2.py3-none-any.whl (16 kB)
    Collecting tabulate>=0.8.9
      Using cached tabulate-0.8.10-py3-none-any.whl (29 kB)
    Collecting scikit-learn>=0.21.3
      Using cached scikit_learn-1.1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (31.2 MB)
    Collecting typeguard>=2.7
      Using cached typeguard-2.13.3-py3-none-any.whl (17 kB)
    Collecting toml
      Using cached toml-0.10.2-py2.py3-none-any.whl (16 kB)
    Collecting etils[epath]
      Using cached etils-0.6.0-py3-none-any.whl (98 kB)
    Collecting importlib-resources
      Using cached importlib_resources-5.8.0-py3-none-any.whl (28 kB)
    Collecting promise
      Using cached promise-2.3.tar.gz (19 kB)
      Preparing metadata (setup.py) ... done
    Collecting tensorflow-metadata
      Using cached tensorflow_metadata-1.9.0-py3-none-any.whl (51 kB)
    Collecting dill
      Using cached dill-0.3.5.1-py2.py3-none-any.whl (95 kB)
    Collecting wheel<1.0,>=0.23.0
      Using cached wheel-0.37.1-py2.py3-none-any.whl (35 kB)
    Collecting googleapis-common-protos<2.0dev,>=1.56.2
      Using cached googleapis_common_protos-1.56.3-py2.py3-none-any.whl (211 kB)
    Collecting cachetools<6.0,>=2.0.0
      Using cached cachetools-5.2.0-py3-none-any.whl (9.3 kB)
    Collecting idna<4,>=2.5
      Using cached idna-3.3-py3-none-any.whl (61 kB)
    Collecting charset-normalizer~=2.0.0
      Using cached charset_normalizer-2.0.12-py3-none-any.whl (39 kB)
    Collecting joblib>=1.0.0
      Using cached joblib-1.1.0-py2.py3-none-any.whl (306 kB)
    Collecting threadpoolctl>=2.0.0
      Using cached threadpoolctl-3.1.0-py3-none-any.whl (14 kB)
    Collecting werkzeug>=1.0.1
      Using cached Werkzeug-2.1.2-py3-none-any.whl (224 kB)
    Collecting tensorboard-data-server<0.7.0,>=0.6.0
      Using cached tensorboard_data_server-0.6.1-py3-none-manylinux2010_x86_64.whl (4.9 MB)
    Collecting markdown>=2.6.8
      Using cached Markdown-3.3.7-py3-none-any.whl (97 kB)
    Collecting google-auth-oauthlib<0.5,>=0.4.1
      Using cached google_auth_oauthlib-0.4.6-py2.py3-none-any.whl (18 kB)
    Collecting tensorboard-plugin-wit>=1.6.0
      Using cached tensorboard_plugin_wit-1.8.1-py3-none-any.whl (781 kB)
    Collecting zipp
      Using cached zipp-3.8.0-py3-none-any.whl (5.4 kB)
    Collecting text-unidecode>=1.3
      Using cached text_unidecode-1.3-py2.py3-none-any.whl (78 kB)
    Collecting requests-oauthlib>=0.7.0
      Using cached requests_oauthlib-1.3.1-py2.py3-none-any.whl (23 kB)
    Collecting importlib-metadata>=4.4
      Using cached importlib_metadata-4.12.0-py3-none-any.whl (21 kB)
    Collecting oauthlib>=3.0.0
      Using cached oauthlib-3.2.0-py3-none-any.whl (151 kB)
    Using legacy 'setup.py install' for ml_collections, since package 'wheel' is not installed.
    Using legacy 'setup.py install' for pysam, since package 'wheel' is not installed.
    Using legacy 'setup.py install' for kaggle, since package 'wheel' is not installed.
    Using legacy 'setup.py install' for py-cpuinfo, since package 'wheel' is not installed.
    Using legacy 'setup.py install' for seqeval, since package 'wheel' is not installed.
    Using legacy 'setup.py install' for termcolor, since package 'wheel' is not installed.
    Using legacy 'setup.py install' for promise, since package 'wheel' is not installed.
    Installing collected packages: text-unidecode, termcolor, tensorboard-plugin-wit, sentencepiece, pytz, pysam, pyasn1, py-cpuinfo, libclang, keras, gin-config, flatbuffers, dm-tree, zipp, wrapt, wheel, werkzeug, urllib3, uritemplate, typing-extensions, typeguard, tqdm, toml, threadpoolctl, tensorflow-io-gcs-filesystem, tensorflow-estimator, tensorboard-data-server, tabulate, six, rsa, regex, pyyaml, python-slugify, pyparsing, pyasn1-modules, psutil, protobuf, portalocker, Pillow, oauthlib, numpy, kiwisolver, joblib, idna, gast, fonttools, etils, dill, Cython, cycler, contextlib2, colorama, charset-normalizer, certifi, cachetools, absl-py, tf-slim, tensorflow-model-optimization, tensorflow-hub, scipy, sacrebleu, requests, python-dateutil, promise, packaging, opt-einsum, opencv-python-headless, ml_collections, keras-preprocessing, importlib-resources, importlib-metadata, httplib2, h5py, grpcio, googleapis-common-protos, google-pasta, google-auth, astunparse, tensorflow-metadata, tensorflow-addons, scikit-learn, requests-oauthlib, pandas, oauth2client, matplotlib, markdown, kaggle, google-auth-httplib2, google-api-core, tensorflow-datasets, seqeval, pycocotools, google-auth-oauthlib, google-api-python-client, tensorboard, tensorflow, tensorflow-text, tf-models-official
      Running setup.py install for termcolor ... done
      Running setup.py install for pysam ... done
      Running setup.py install for py-cpuinfo ... done
      Running setup.py install for promise ... done
      Running setup.py install for ml_collections ... done
      Running setup.py install for kaggle ... done
      Running setup.py install for seqeval ... done
    Successfully installed Cython-0.29.30 Pillow-9.1.1 absl-py-1.1.0 astunparse-1.6.3 cachetools-5.2.0 certifi-2022.6.15 charset-normalizer-2.0.12 colorama-0.4.5 contextlib2-21.6.0 cycler-0.11.0 dill-0.3.5.1 dm-tree-0.1.7 etils-0.6.0 flatbuffers-1.12 fonttools-4.33.3 gast-0.4.0 gin-config-0.5.0 google-api-core-2.8.2 google-api-python-client-2.51.0 google-auth-2.8.0 google-auth-httplib2-0.1.0 google-auth-oauthlib-0.4.6 google-pasta-0.2.0 googleapis-common-protos-1.56.3 grpcio-1.47.0 h5py-3.7.0 httplib2-0.20.4 idna-3.3 importlib-metadata-4.12.0 importlib-resources-5.8.0 joblib-1.1.0 kaggle-1.5.12 keras-2.9.0 keras-preprocessing-1.1.2 kiwisolver-1.4.3 libclang-14.0.1 markdown-3.3.7 matplotlib-3.5.2 ml_collections-0.1.1 numpy-1.23.0 oauth2client-4.1.3 oauthlib-3.2.0 opencv-python-headless-4.6.0.66 opt-einsum-3.3.0 packaging-21.3 pandas-1.4.3 portalocker-2.4.0 promise-2.3 protobuf-3.19.4 psutil-5.9.1 py-cpuinfo-8.0.0 pyasn1-0.4.8 pyasn1-modules-0.2.8 pycocotools-2.0.4 pyparsing-3.0.9 pysam-0.19.1 python-dateutil-2.8.2 python-slugify-6.1.2 pytz-2022.1 pyyaml-6.0 regex-2022.6.2 requests-2.28.0 requests-oauthlib-1.3.1 rsa-4.8 sacrebleu-2.1.0 scikit-learn-1.1.1 scipy-1.8.1 sentencepiece-0.1.96 seqeval-1.2.2 six-1.16.0 tabulate-0.8.10 tensorboard-2.9.1 tensorboard-data-server-0.6.1 tensorboard-plugin-wit-1.8.1 tensorflow-2.9.1 tensorflow-addons-0.17.1 tensorflow-datasets-4.6.0 tensorflow-estimator-2.9.0 tensorflow-hub-0.12.0 tensorflow-io-gcs-filesystem-0.26.0 tensorflow-metadata-1.9.0 tensorflow-model-optimization-0.7.2 tensorflow-text-2.9.0 termcolor-1.1.0 text-unidecode-1.3 tf-models-official-2.7.0 tf-slim-1.1.0 threadpoolctl-3.1.0 toml-0.10.2 tqdm-4.64.0 typeguard-2.13.3 typing-extensions-4.2.0 uritemplate-4.1.1 urllib3-1.26.9 werkzeug-2.1.2 wheel-0.37.1 wrapt-1.14.1 zipp-3.8.0
    Collecting intel-tensorflow<=2.7.0,>=2.4.0
      Using cached intel_tensorflow-2.7.0-cp38-cp38-manylinux2010_x86_64.whl (186.4 MB)
    Requirement already satisfied: absl-py>=0.4.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.1.0)
    Requirement already satisfied: flatbuffers<3.0,>=1.12 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.12)
    Requirement already satisfied: google-pasta>=0.1.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (0.2.0)
    Requirement already satisfied: numpy>=1.14.5 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.23.0)
    Requirement already satisfied: grpcio<2.0,>=1.24.3 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.47.0)
    Collecting tensorflow-estimator<2.8,~=2.7.0rc0
      Using cached tensorflow_estimator-2.7.0-py2.py3-none-any.whl (463 kB)
    Requirement already satisfied: six>=1.12.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.16.0)
    Requirement already satisfied: protobuf>=3.9.2 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (3.19.4)
    Requirement already satisfied: opt-einsum>=2.3.2 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (3.3.0)
    Requirement already satisfied: tensorboard~=2.6 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (2.9.1)
    Requirement already satisfied: h5py>=2.9.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (3.7.0)
    Requirement already satisfied: wheel<1.0,>=0.32.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (0.37.1)
    Requirement already satisfied: termcolor>=1.1.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.1.0)
    Requirement already satisfied: gast<0.5.0,>=0.2.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (0.4.0)
    Requirement already satisfied: libclang>=9.0.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (14.0.1)
    Requirement already satisfied: wrapt>=1.11.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.14.1)
    Requirement already satisfied: astunparse>=1.6.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.6.3)
    Requirement already satisfied: keras-preprocessing>=1.1.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (1.1.2)
    Collecting keras<2.8,>=2.7.0rc0
      Using cached keras-2.7.0-py2.py3-none-any.whl (1.3 MB)
    Requirement already satisfied: typing-extensions>=3.6.6 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (4.2.0)
    Requirement already satisfied: tensorflow-io-gcs-filesystem>=0.21.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from intel-tensorflow<=2.7.0,>=2.4.0) (0.26.0)
    Requirement already satisfied: markdown>=2.6.8 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (3.3.7)
    Requirement already satisfied: requests<3,>=2.21.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (2.28.0)
    Requirement already satisfied: setuptools>=41.0.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (49.2.1)
    Requirement already satisfied: tensorboard-data-server<0.7.0,>=0.6.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (0.6.1)
    Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (1.8.1)
    Requirement already satisfied: werkzeug>=1.0.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (2.1.2)
    Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (0.4.6)
    Requirement already satisfied: google-auth<3,>=1.6.3 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (2.8.0)
    Requirement already satisfied: cachetools<6.0,>=2.0.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from google-auth<3,>=1.6.3->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (5.2.0)
    Requirement already satisfied: rsa<5,>=3.1.4 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from google-auth<3,>=1.6.3->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (4.8)
    Requirement already satisfied: pyasn1-modules>=0.2.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from google-auth<3,>=1.6.3->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (0.2.8)
    Requirement already satisfied: requests-oauthlib>=0.7.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (1.3.1)
    Requirement already satisfied: importlib-metadata>=4.4 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from markdown>=2.6.8->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (4.12.0)
    Requirement already satisfied: charset-normalizer~=2.0.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from requests<3,>=2.21.0->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (2.0.12)
    Requirement already satisfied: certifi>=2017.4.17 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from requests<3,>=2.21.0->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (2022.6.15)
    Requirement already satisfied: urllib3<1.27,>=1.21.1 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from requests<3,>=2.21.0->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (1.26.9)
    Requirement already satisfied: idna<4,>=2.5 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from requests<3,>=2.21.0->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (3.3)
    Requirement already satisfied: zipp>=0.5 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from importlib-metadata>=4.4->markdown>=2.6.8->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (3.8.0)
    Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (0.4.8)
    Requirement already satisfied: oauthlib>=3.0.0 in /lustre/fs5/vgl/scratch/labueg/venvs/deepconsensus_venv_1/lib/python3.8/site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard~=2.6->intel-tensorflow<=2.7.0,>=2.4.0) (3.2.0)
    Installing collected packages: tensorflow-estimator, keras, intel-tensorflow
      Attempting uninstall: tensorflow-estimator
        Found existing installation: tensorflow-estimator 2.9.0
        Uninstalling tensorflow-estimator-2.9.0:
          Successfully uninstalled tensorflow-estimator-2.9.0
      Attempting uninstall: keras
        Found existing installation: keras 2.9.0
        Uninstalling keras-2.9.0:
          Successfully uninstalled keras-2.9.0
    ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
    tensorflow 2.9.1 requires keras<2.10.0,>=2.9.0rc0, but you have keras 2.7.0 which is incompatible.
    tensorflow 2.9.1 requires tensorflow-estimator<2.10.0,>=2.9.0rc0, but you have tensorflow-estimator 2.7.0 which is incompatible.
    Successfully installed intel-tensorflow-2.7.0 keras-2.7.0 tensorflow-estimator-2.7.0
    

    I then ran run_all_tests.sh on a compute node via slurm and the full output is in this gist: https://gist.github.com/abueg/421ca972563b5c32825cde17525a49bf

    There is this exception in the output, which leads me to think the installation failed:

    Exception ignored in: <function Pool.__del__ at 0x7fe60613b820>
    Traceback (most recent call last):
      File "/vggpfs/fs3/vgl/store/labueg/anaconda3/lib/python3.8/multiprocessing/pool.py", line 268, in __del__
        self._change_notifier.put(None)
      File "/vggpfs/fs3/vgl/store/labueg/anaconda3/lib/python3.8/multiprocessing/queues.py", line 368, in put
        self._writer.send_bytes(obj)
      File "/vggpfs/fs3/vgl/store/labueg/anaconda3/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
        self._send_bytes(m[offset:offset + size])
      File "/vggpfs/fs3/vgl/store/labueg/anaconda3/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
        self._send(header + buf)
      File "/vggpfs/fs3/vgl/store/labueg/anaconda3/lib/python3.8/multiprocessing/connection.py", line 368, in _send
        n = write(self._handle, buf)
    OSError: [Errno 9] Bad file descriptor
    

    Could this error arise from the TensorFlow dependency conflict, and if so, should I try installing tensorflow 2.9.1? Or have I gone wrong somewhere else?

    There is also this error before the exception: [E::idx_find_and_load] Could not retrieve index file for 'deepconsensus/testdata/human_1m/subreads_to_ccs.bam'. There is no index file for that BAM in the deepconsensus/testdata/human_1m/ directory, so should I index it prior to running the tests?
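
    If an index does turn out to be required (an assumption; the htslib message can be a harmless warning), a PacBio index could be generated with pbindex from pbbam before re-running the tests:

    # Hypothetical step, not from this thread: create a .pbi index
    # for the test BAM.
    pbindex deepconsensus/testdata/human_1m/subreads_to_ccs.bam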

    Any help would be appreciated, thank you in advance!

    opened by abueg 7
  • Error: ModuleNotFoundError: No module named 'pandas._libs.interval'

    Error: ModuleNotFoundError: No module named 'pandas._libs.interval'

    Hi, I'm getting this error with the new deepconsensus 1.0.0. I'm running the exact same command that worked with the previous DeepConsensus version, from the Docker image.

    2022-10-11 16:17:35.711032: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
    To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
    Traceback (most recent call last):
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/pandas/__init__.py", line 30, in <module>
        from pandas._libs import hashtable as _hashtable, lib as _lib, tslib as _tslib
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/pandas/_libs/__init__.py", line 13, in <module>
        from pandas._libs.interval import Interval
    ModuleNotFoundError: No module named 'pandas._libs.interval'
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/opt/conda/envs/bio/bin/deepconsensus", line 8, in <module>
        sys.exit(run())
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/cli.py", line 111, in run
        app.run(main, flags_parser=parse_flags)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/cli.py", line 99, in main
        from deepconsensus.inference import quick_inference
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 53, in <module>
        import pandas as pd
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/pandas/__init__.py", line 34, in <module>
        raise ImportError(
    ImportError: C extension: No module named 'pandas._libs.interval' not built. If you want to import pandas from the source directory, you may need to run 'python setup.py build_ext --inplace --force' to build the C extensions first.
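
    One detail worth flagging (an observation, not a confirmed diagnosis): the traceback mixes the container's /opt/conda/envs/bio/lib/python3.9 tree with a host tree under /share/apps/python/3.8.6, which suggests a host PYTHONPATH is leaking into the container and shadowing its bundled pandas. A hedged check would be to clear it for the run:

    # Assumption: the host injects PYTHONPATH into the container; start
    # the container with it emptied so only bundled packages are imported.
    # The image tag is assumed from the v1.0.0 release.
    docker run -e PYTHONPATH= google/deepconsensus:1.0.0 deepconsensus --help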
    
    opened by gevro 6
  • A tutorial for running deepconsensus

    A tutorial for running deepconsensus

    My computer hardware looks like this:

    OS: Ubuntu 20.04.3 LTS (x86_64)
    Python version: Python 3.8.10
    CPUs: i7 10700k(8c16t, SkyLake)
    Memory: 32G
    GPU: 1 NVIDIA RTX A4000 8G
    

    Install the required packages

    Create an environment for deepconsensus using conda

    mamba create -n deepconsensus -c bioconda -c conda-forge python=3.8 pbcore pbbam pbccs pbmm2 parallel jq gcc pycocotools bioconda::seqtk bioconda::unimap bioconda::bedtools bioconda::minimap2 bioconda::extracthifi bioconda::zmwfilter bioconda::pysam bioconda::samtools=1.10 bioconda::pyfastx=0.8.4
    

    Download actc for read mapping

    wget https://github.com/PacificBiosciences/align-clr-to-ccs/releases/download/0.1.0/actc 
    chmod u+x actc
    mv actc PATH/miniconda3/envs/deepconsensus/bin
    

    Install deepconsensus[gpu] using pip

    conda activate deepconsensus
    pip install deepconsensus[gpu]==0.2.0
    

    Prepare all the needed input files for DeepConsensus

    Get the ccs.bam

    ccs --all -j 15 raw.subreads.bam out.ccs.bam
    

    Get the subreads_to_ccs.bam

    Tips

    If you use actc to map the subreads to the CCS reads without chunking, you may encounter this error when running DeepConsensus.

    I0324 19:48:00.776319 140117319313216 quick_inference.py:492] Processed a batch of 100 ZMWs in 62.39794731140137 seconds
    I0324 19:48:00.808807 140117319313216 quick_inference.py:570] Processed 7000 ZMWs in 4584.726703 seconds
    Process ForkPoolWorker-1061:
    Traceback (most recent call last):
      File "/home/wanglab/miniconda3/envs/deepconsensus/lib/python3.8/multiprocessing/pool.py", line 131, in worker
        put((job, i, result))
      File "/home/wanglab/miniconda3/envs/deepconsensus/lib/python3.8/multiprocessing/queues.py", line 368, in put
        self._writer.send_bytes(obj)
      File "/home/wanglab/miniconda3/envs/deepconsensus/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
        self._send_bytes(m[offset:offset + size])
      File "/home/wanglab/miniconda3/envs/deepconsensus/lib/python3.8/multiprocessing/connection.py", line 405, in _send_bytes
        self._send(buf)
      File "/home/wanglab/miniconda3/envs/deepconsensus/lib/python3.8/multiprocessing/connection.py", line 368, in _send
        n = write(self._handle, buf)
    BrokenPipeError: [Errno 32] Broken pipe
    

    This error is caused by the number of stream processes reaching an upper limit as the iterations accumulate. The right way to avoid it is to chunk the data when running actc, as shown below.

    Chunking your subreads.bam

    ### Generate all actc command lines using shell
    for i in {1..1000}; do echo 'actc -j 1 raw.subreads.bam out.ccs.bam subreads_to_ccs.'${i}'.bam --chunk '${i}'/1000' ; done > actc_chunk.job
    
    ### Submit all commands concurrently using GNU parallel
    parallel -j 15 < actc_chunk.job
    
    ### Index all the subreads_to_ccs.${i}.fasta files (actc writes a FASTA alongside each BAM chunk)
    for i in {1..1000}; do echo 'samtools faidx subreads_to_ccs.'${i}'.fasta' ; done > samtools_index.job
    
    parallel -j 15 < samtools_index.job
    

    Get the model for DeepConsensus

    mkdir deepconsensus_model && cd deepconsensus_model
    wget https://storage.googleapis.com/brain-genomics-public/research/deepconsensus/models/v0.2/params.json
    wget https://storage.googleapis.com/brain-genomics-public/research/deepconsensus/models/v0.2/checkpoint-50.index
    wget https://storage.googleapis.com/brain-genomics-public/research/deepconsensus/models/v0.2/checkpoint-50.data-00000-of-00001
    

    Run DeepConsensus

    for i in {1..1000};
    do
    deepconsensus run \
      --subreads_to_ccs=subreads_to_ccs.${i}.bam  \
      --ccs_fasta=subreads_to_ccs.${i}.fasta \
      --checkpoint=deepconsensus_model/checkpoint-50 \
      --output=output.${i}.fastq \
      --batch_zmws=100
    done
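
    If wall-clock time matters, the same loop can be expressed as a job file and driven by GNU parallel, as was done for actc above. This is only a sketch, assuming the machine has memory for several concurrent model instances:

    for i in {1..1000}; do
      echo "deepconsensus run --subreads_to_ccs=subreads_to_ccs.${i}.bam --ccs_fasta=subreads_to_ccs.${i}.fasta --checkpoint=deepconsensus_model/checkpoint-50 --output=output.${i}.fastq --batch_zmws=100"
    done > deepconsensus.job
    parallel -j 4 < deepconsensus.job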
    

    Merge the output

    cat output.*.fastq > total.fastq
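
    As a quick sanity check (assuming plain four-line FASTQ records), the merged read count can be verified with:

    echo $(( $(wc -l < total.fastq) / 4 ))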
    
    opened by shengxinzhuan 6
  • can deepconsensus run on an arm machine?

    can deepconsensus run on an arm machine?

    Hello, DeepConsensus team! Thanks for making such an amazing tool for HiFi sequencing. It works well on x86 CPU or GPU machines. However, when I tried to install it on an ARM HPC, I failed to install the base requirement packages. Do you have plans to port DeepConsensus to ARM machines?

    opened by shengxinzhuan 6
  • Error: StopIteration

    Error: StopIteration

    Hi, I'm getting this error. What is the cause of this? Thanks.

    singularity run -W /data -B /scratch/projects/bin/deepconsensus/model:/model -B `pwd` /scratch/projects/bin/deepconsensus/deepconsensus_0.3.1.sif deepconsensus run --batch_size=1024 --batch_zmws=100 --cpus 1 --max_passes 20 --subreads_to_ccs=blah.subreads_to_ccs.0018.bam --ccs_bam=blah.ccs.0018.bam --checkpoint=/model/checkpoint --output=blah.fastq

    =================================================================
    Total params: 8,942,667
    Trainable params: 8,942,667
    Non-trainable params: 0
    _________________________________________________________________
    I0809 10:21:26.338892 140397309962048 model_utils.py:231] Setting hidden size to transformer_input_size.
    I0809 10:21:26.339057 140397309962048 quick_inference.py:484] Finished initialize_model.
    I0809 10:21:26.339549 140397309962048 quick_inference.py:738] Model setup took 1.790560245513916 seconds.
    Traceback (most recent call last):
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/preprocess/utils.py", line 981, in proc_feeder
        ccs_bam_read = next(ccs_bam_h)
      File "pysam/libcalignmentfile.pyx", line 1874, in pysam.libcalignmentfile.AlignmentFile.__next__
    StopIteration
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/opt/conda/envs/bio/bin/deepconsensus", line 8, in <module>
        sys.exit(run())
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/cli.py", line 111, in run
        app.run(main, flags_parser=parse_flags)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/cli.py", line 102, in main
        app.run(quick_inference.main, argv=passed)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/share/apps/python/3.8.6/intel/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/inference/quick_inference.py", line 814, in main
        outcome_counter = run()
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/inference/quick_inference.py", line 762, in run
        for zmw, subreads, dc_config in input_file_generator:
      File "/opt/conda/envs/bio/lib/python3.8/site-packages/deepconsensus/inference/quick_inference.py", line 428, in stream_bam
        for input_data in proc_feeder():
    RuntimeError: generator raised StopIteration
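
    One thing worth checking first (a hypothesis, since the StopIteration fires on the very first read of the CCS BAM): whether this particular chunk contains any records at all.

    # Counts records in the chunked CCS BAM; a count of 0 would explain
    # the immediate StopIteration (assumption, not a confirmed cause).
    samtools view -c blah.ccs.0018.bam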
    
    opened by gevro 5
  • Docker or Singularity

    Docker or Singularity

    Hi,

    First of all, thank you for this amazing program. I'm working on a cluster running CentOS 7. I have tried to install the software several times, always with errors in the pip/Python dependencies. Could you make a Docker and/or Singularity image available for DeepConsensus?
    
    Best regards,
    Andre
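
    One possible route (a sketch; the image tag is taken from a command shown elsewhere on this page, and official images are noted in the v0.2.0 release below): Singularity can convert an official Docker image directly.

    singularity pull docker://google/deepconsensus:1.1.0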

    opened by AMMMachado 5
  • Error detecting params.json using docker in debian (10) HPC

    Error detecting params.json using docker in debian (10) HPC

    docker run google/deepconsensus:1.1.0 deepconsensus run --subreads_to_ccs=m54274Ue_220814_163631.aligned.subreads.bam --ccs_bam=m54274Ue_220814_163631.hifi_S3_reads.bam --checkpoint=model/checkpoint --output=m54274Ue_220814_163631_deepcon.output.fastq
    
    Traceback (most recent call last):
      File "/opt/conda/envs/bio/bin/deepconsensus", line 8, in <module>
        sys.exit(run())
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/cli.py", line 111, in run
        app.run(main, flags_parser=parse_flags)
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/cli.py", line 102, in main
        app.run(quick_inference.main, argv=passed)
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 842, in main
        outcome_counter = run()
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/inference/quick_inference.py", line 703, in run
        params = model_utils.read_params_from_json(checkpoint_path=FLAGS.checkpoint)
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/deepconsensus/models/model_utils.py", line 405, in read_params_from_json
        json.load(tf.io.gfile.GFile(json_path, 'r')))
      File "/opt/conda/envs/bio/lib/python3.9/json/__init__.py", line 293, in load
        return loads(fp.read(),
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/tensorflow/python/lib/io/file_io.py", line 116, in read
        self._preread_check()
      File "/opt/conda/envs/bio/lib/python3.9/site-packages/tensorflow/python/lib/io/file_io.py", line 77, in _preread_check
        self._read_buf = _pywrap_file_io.BufferedInputStream(
    tensorflow.python.framework.errors_impl.NotFoundError: model/params.json; No such file or directory

    I am getting this error even though I have all the model files (checkpoint.data-00000-of-00001, checkpoint.index, params.json) in the model dir.
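
    A likely explanation (an assumption, not a confirmed answer): the model/ directory exists on the host, but nothing mounts it into the container, so model/params.json is missing from the container's point of view. A hedged sketch of the same command with the working directory bind-mounted:

    # Hypothetical fix: expose the host's current directory (holding the
    # BAMs and the model/ directory) inside the container and run there.
    docker run -v "$(pwd)":/data -w /data google/deepconsensus:1.1.0 \
      deepconsensus run \
        --subreads_to_ccs=m54274Ue_220814_163631.aligned.subreads.bam \
        --ccs_bam=m54274Ue_220814_163631.hifi_S3_reads.bam \
        --checkpoint=model/checkpoint \
        --output=m54274Ue_220814_163631_deepcon.output.fastq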

    opened by ap1438 1
  • Lower number of >Q30 average quality reads for v1.1 compared to v0.3

    Lower number of >Q30 average quality reads for v1.1 compared to v0.3

    Hi all,

    I am assembling the genome of a land snail that has extreme repeat content (~85%) and a large genome size (6.6 Gb). My mean insert size is 8 kb and I have data from six SMRT cells.

    I have run DeepConsensus (CPU only) on my six SMRT cells using v0.3 and on two of the SMRT cells using v1.1. I noticed that I get more reads with >Q20 average quality using v1.1, but fewer reads that are >Q30 compared to v0.3. Histograms of average read quality are attached: v0.3.qchist.txt, v1.1.qchist.txt

    Manual inspection of the same reads from either version confirmed that most reads had longer regions of lower quality in v1.1 than v0.3. First 100 reads for v0.3 and v1.1 attached (.txt extension for github upload). v0.3_smrtcell1_100.fastq.txt v1.1_smrtcell1_100.fastq.txt

    This was surprising to me as my expectation was that the >Q20 yield would remain relatively constant between versions but >Q30 yield would increase.

    Perhaps this is the result of lower insert length or high repeat content of this library? I would appreciate hearing the DeepConsensus team's thoughts on this discrepancy. Any help would be appreciated!

    Thanks! Mason

    opened by mason-linscott 3
  • Missing majority of ZMWs after running lima to search for adapters

    Missing majority of ZMWs after running lima to search for adapters

    Hi, I'm not sure if you can help me with this, but I just want to raise an issue I've encountered after filtering adapters with lima. I'm not entirely sure why lima filtered out the majority of the 'deepconsensus hifi reads'. Below are the commands for DeepConsensus and lima:

    DC

    cmd4="module purge && module load deepconsensus/0.3.1 && deepconsensus run --checkpoint=/cluster/home/dc_model_0.3/checkpoint --ccs_bam=${outDir}/${outFilePrefix}.${SLURM_ARRAY_TASK_ID}.bam --subreads_to_ccs=${outDir}/${outFilePrefix%.ccs}.subreads_to_ccs.${SLURM_ARRAY_TASK_ID}.bam --output=${outDir}/${outFilePrefix%.ccs}.deepconsensus.${SLURM_ARRAY_TASK_ID}.fastq --cpus ${THREAD}"

    Lima

    lima --num-threads 84 --split-bam-named --same --ccs ${ID}.deepconsensus.fastq /cluster/home/lima_pbmarkdup/pb_pcr_adapter.fa ${ID}.deepconsensus.lima.fastq

    Here is the output summary from lima:

    ZMWs input               (A) : 1925775
    ZMWs above all thresholds (B) : 13042 (0.68%)
    ZMWs below any threshold  (C) : 1912733 (99.32%)
    
    ZMW marginals for (C):
    Below min length             : 199 (0.01%)
    Below min score              : 648496 (33.90%)
    Below min end score          : 648496 (33.90%)
    Below min passes             : 0 (0.00%)
    Below min score lead         : 648496 (33.90%)
    Below min ref span           : 1912733 (100.00%)
    Without SMRTbell adapter     : 0 (0.00%)
    
    ZMWs for (B):
    With same pair               : 13042 (100.00%)
    Coefficient of correlation   : 0.00%
    
    ZMWs for (A):
    Allow diff pair              : 1925775 (100.00%)
    Allow same pair              : 1925775 (100.00%)
    
    Reads for (B):
    Above length                 : 13042 (100.00%)
    Below length                 : 0 (0.00%)

    Thank you, I appreciate your help!

    opened by rosspdu 3
Releases(v1.1.0)
  • v1.1.0(Dec 16, 2022)

    • DeepConsensus v1.1 introduces a new model that improves coverage of telomere regions, achieved through improved filtering of the training data with CHM13 high-confidence regions.
    • Improved yield at empirical Q30 from 187.1% in v1.0 to 194.4% in v1.1, relative to ccs baseline of 100%. This was achieved through improvements to the attention layer in the model.
    • Updated the training tutorial for training on TPUs that users can use as a proof-of-concept to develop a training setup.
    • This release evaluates performance using an updated HG002 truth assembly. We have re-evaluated previous releases with this updated dataset and updated Q30 yields accordingly.
    • Thanks to Sergey Koren (@skoren) from NIH, NHGRI and the T2T consortium for invaluable feedback on the coverage of telomeric regions.
    • Thanks to Daniel Liu (@Daniel-Liu-c0deb0t) for incorporating prior knowledge/sparsity in the attention layer of the model, which significantly improved the accuracy and Q30 yield.
    • Thanks to Armin Töpfer (@armintoepfer), Aaron Wenger (@amwenger), and William Rowell (@williamrowell) at PacBio for advice and collaboration.
    Source code(tar.gz)
    Source code(zip)
  • v1.0.0(Oct 11, 2022)

    • DeepConsensus v1.0 introduces a new model that greatly improves the empirical Q30 yield across chemistries and the insert sizes we tested. For example, using our chem2.2_24kb dataset we observe an increase in Q30 yield from 149% to 176%.
    • We reduced the size of our model (using distillation) and the size of the model inputs to lower runtime by approximately 10%, while still improving accuracy over v0.3.
    • DeepConsensus can now output a BAM file. BAM output can be used to examine the effective coverage (ec), number of passes (np), or predicted average read accuracy (rq); see the sketch after this list.
    • v1.0 introduces a training tutorial that users can use as a proof-of-concept to develop a training setup.
    • Models introduced previously (v0.1, v0.2, v0.3) are not compatible with v1.0 and vice versa.
    • --max_passes and --example_width are now defined by the model params.json file. Users do not need to set these flags when running inference. The --padding flag has been removed. Padding is no longer added to model inputs.
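
    A minimal sketch of inspecting those per-read tags (assuming output.bam is the BAM output described above and samtools is on PATH; the awk field scan is illustrative, not part of DeepConsensus):

    samtools view output.bam | awk '{
      for (i = 12; i <= NF; i++)
        if ($i ~ /^(ec|np|rq):/) printf "%s\t", $i
      print ""
    }' | head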

    Acknowledgements

    • Thanks to Armin Töpfer (@armintoepfer), Aaron Wenger (@amwenger), and William Rowell (@williamrowell) at PacBio for advice and collaboration.
    • Thanks to Lucas Brambrink (@lucasbrambrink) for model experiments and analysis.
    • Thanks to Daniel Liu (@Daniel-Liu-c0deb0t) for model experiments, analysis, and advice.
    Source code(tar.gz)
    Source code(zip)
  • v0.3.1(Jul 19, 2022)

  • v0.3.0(Jul 6, 2022)

    Change Log

    • Runtime speedup of 4.9X compared to v0.2.
    • Improved yield at empirical Q30 from 141% in v0.2 to 149%, relative to ccs baseline of 100%. This was achieved through improvements to training data including use of new CHM13 T2T assembly (chm13v2.0_noY) and sequencing.
    • Added a documentation page with yield metrics for 3 SMRT Cells with different read length distributions.
    • Updated recommendation for ccs settings to skip very low-quality reads, saving runtime.
    • Model input condenser layer added, saving runtime.
    • To save significant runtime, added an option to skip running the model on windows that are already likely to be correct, via --skip_windows_above with a predicted CCS quality threshold (default Q45); see the example after this list.
    • Memory profiling with batch option recommendations.
    • Add support for TensorFlow SavedModel for portability.
    • Added base quality calibration tuned for v0.3 model, customizable with --dc_calibration option.
    • The --min-quality flag default was changed from 20 to 0 in this version. This change was reverted in v0.3.1.
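
    A short illustration of the --skip_windows_above option mentioned above (a sketch: the flag comes from these notes, the remaining flags from commands shown elsewhere on this page, and the file paths are placeholders):

    deepconsensus run \
      --subreads_to_ccs=subreads_to_ccs.bam \
      --ccs_bam=ccs.bam \
      --checkpoint=model/checkpoint \
      --skip_windows_above=40 \
      --output=output.fastq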

    Acknowledgements

    • Thanks to Armin Töpfer, Aaron Wenger, and William Rowell at PacBio for advice and collaboration.
    • Thanks to Felipe Llinares for contributing a new alignment training metric.
    • Thanks to Moshe Wagner for adding a multiprocessing speedup to the preprocessing stage.
    • Thanks to Joel Shor for model advice and code reviews.
    Source code(tar.gz)
    Source code(zip)
  • v0.2.0(Jan 18, 2022)

    Change Log

    • Substantial (>10x) speed increase relative to v0.1.
    • DeepConsensus now supports GPU execution. In our tests, using a NVIDIA V100 GPU is ~3.3x faster than CPU alone.
    • Reduced installation complexity by removing Nucleus and Apache Beam dependencies. Added support for newer TensorFlow versions.
    • CPU and GPU pip packages are now available alongside corresponding Docker images.
    • A more user-friendly command-line interface has been added and can be invoked using deepconsensus.
    • A simplified one-step solution for running DeepConsensus has been developed and can be invoked using deepconsensus run.
    • Small improvements to accuracy by better mapping repetitive subreads with actc, increasing Q30 yield by 31.3% relative to pbccs, compared to 30.6% for DeepConsensus v0.1.

    Thanks to Armin Töpfer for actc support and Jeremy Schmutz for evaluations and feedback.

    Source code(tar.gz)
    Source code(zip)
  • v0.1.0(Jan 15, 2022)
