This repository is for the preprint "A generative nonparametric Bayesian model for whole genomes"

Debora Marks Lab

Last update: Sep 18, 2022

BEAR

Overview

This repository contains code associated with the preprint "A generative nonparametric Bayesian model for whole genomes" (2021), which proposes Bayesian embedded autoregressive (BEAR) models. The repository provides example BEAR models as well as tools for implementing new ones. It supports building, training, and evaluating BEAR models on large-scale sequencing datasets, including whole-genome, transcriptomic, and metagenomic data.

Documentation

For instructions on running examples and deploying the BEAR model, consult the documentation at https://bear-model.readthedocs.io/en/latest/.

Authors

This is a project of the Marks Lab in the Systems Biology Department at Harvard Medical School. It was developed by

Alan Amin
Eli Weinstein
Debora Marks

License

This project is available under the MIT license.

Reference

Preprint: A. N. Amin*, E. N. Weinstein*, D. S. Marks, A generative nonparametric Bayesian model for whole genomes, 2021 (* equal contribution). https://www.biorxiv.org/content/10.1101/2021.05.30.446360v1

Comments

Installation on the cluster
Hi,

I'm trying to install the Python package in a Python 3 environment on the cluster. Since I wasn't able to install tensorflow-io-nightly, I changed it to tensorflow-io in the requirements.txt file. When I run pytest to test the installation, I get the following errors:

ERROR bear_model/tests/test_core.py - ValueError: numpy.ndarray size changed, may indicate binary incompati...
ERROR bear_model/tests/test_dataloader.py - ValueError: numpy.ndarray size changed, may indicate binary inc...
ERROR bear_model/tests/test_run.py
ERROR bear_model/tests/test_summarize.py

How can I solve it? Thank you in advance
opened by FarzanehRah 7
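
The "numpy.ndarray size changed" message usually means that a compiled extension was built against a different NumPy version than the one currently installed. A reasonable first diagnostic step is to list the versions present in the environment. The sketch below is not part of the BEAR code base. The helper name installed_versions and the package list are assumptions made for illustration, and it requires Python 3.8 or later for importlib.metadata. It only reports versions and does not fix the mismatch.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Hypothetical list: only the two tensorflow-io variants are named in the
    # report above; numpy and tensorflow are added because the error mentions NumPy.
    to_check = ["numpy", "tensorflow", "tensorflow-io", "tensorflow-io-nightly"]
    for name, version in installed_versions(to_check).items():
        print(f"{name}: {version or 'not installed'}")
```

If the reported NumPy version differs from the one the other packages were built for, reinstalling the affected packages into a fresh environment is a common remedy, though which versions work together here is not stated in the report.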

Tutorial: Error when running summarize.py for the first time without specifying tmp folder

Hi! This error occurs when following the tutorial step by step (after downloading BEAR and installing the required packages).

Start: Stage 1... 2021-09-11 11:12:47.502770
Stage 2... 2021-09-11 11:12:49.422647
Traceback (most recent call last):
  File "summarize.py", line 607, in <module>
    main(args)
  File "summarize.py", line 570, in main
    n_bins = run(args)
  File "summarize.py", line 553, in run
    total_size = stage2(unit2is, args)
  File "summarize.py", line 304, in stage2
    out_size += unit2i.get_size()
  File "summarize.py", line 289, in get_size
    out_size = os.path.getsize(self.out_file)
  File "/home/remita/pyenvs/bear/lib/python3.7/genericpath.py", line 50, in getsize
    return os.stat(filename).st_size
FileNotFoundError: [Errno 2] No such file or directory: 'data/ysd1_kmc_out_0_full_6.tsv'

After investigation, it appears that KMC cannot find the default tmp folder tests/exdata/tmp/ and writes an error to the stderr file data/ysd1_kmc_stderr_0_full_6.txt:

--- kmc ---
Error: Cannot create file in specified working directory: tests/exdata/tmp/
--- kmc dump ---
Error: Cannot open file data/ysd1_kmc_inter_0_full_6.res.kmc_pre

The tutorial should therefore mention that the user must either create the default tmp folder tests/exdata/tmp/ or pass an existing folder to summarize.py with the -t option.

Thank you.

opened by maremita 2
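
The workaround described in this report, creating the tmp folder before running summarize.py, can be sketched as follows. This is a minimal illustration, not code from the repository. The helper name ensure_tmp_dir is hypothetical, and the default path is simply the folder named in the KMC error message above.

```python
import os


def ensure_tmp_dir(path="tests/exdata/tmp/"):
    """Create the working folder (and any missing parents) if it does not exist.

    Returns True when the folder exists afterwards.
    """
    # exist_ok=True makes the call safe to repeat on an existing folder.
    os.makedirs(path, exist_ok=True)
    return os.path.isdir(path)
```

Calling ensure_tmp_dir() from the directory in which summarize.py is run should give KMC a folder it can write to. If a different path is passed, the same path would need to be given to summarize.py through its -t option.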

Python version

I see that this uses Python 3, but it would be helpful to state the exact version(s) known to work, especially for building compatible conda environments.

I also noticed this comment in the setup.py:

# NOTE: This file must remain Python 2 compatible for the foreseeable future,
# to ensure that we error out properly for people with outdated setuptools
# and/or pip.

I'm not sure whether this has any effect on the user's Python version.

opened by csheare 0
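
Until a supported version is documented, a script can guard against an unsuitable interpreter with a check like the one below. This is only a sketch. The function name is_supported is hypothetical, and the default minimum of (3, 7) is an assumption taken from the python3.7 path in the traceback of the previous report, not a version confirmed by the authors.

```python
import sys


def is_supported(version_info, minimum=(3, 7), maximum=None):
    """Return True if the (major, minor) version lies within the given bounds.

    minimum and maximum are inclusive (major, minor) tuples; maximum=None
    means no upper bound. The default minimum is an assumption, not a
    documented requirement of the package.
    """
    current = tuple(version_info[:2])
    if current < minimum:
        return False
    if maximum is not None and current > maximum:
        return False
    return True


if __name__ == "__main__":
    if not is_supported(sys.version_info):
        print("Untested Python version: %d.%d" % tuple(sys.version_info[:2]))
```

As for the quoted comment, it appears to concern only the syntax of setup.py itself, so that the file can still be parsed and fail with a clear message under old tooling. Packages usually declare the versions they support through the python_requires argument of setuptools.setup.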
