69 Repositories
Python fasta-sequences Libraries
SARS-Cov-2 Recombinant Finder for fasta sequences
Sc2rf - SARS-Cov-2 Recombinant Finder Pronounced: Scarf What's this? Sc2rf can search genome sequences of SARS-CoV-2 for potential recombinants - new
A library built upon PyTorch for building embeddings on discrete event sequences using self-supervision
pytorch-lifestream a library built upon PyTorch for building embeddings on discrete event sequences using self-supervision. It can process terabyte-si
Contains the code and data for our #ICSE2022 paper titled as "CodeFill: Multi-token Code Completion by Jointly Learning from Structure and Naming Sequences"
CodeFill This repository contains the code for our paper titled as "CodeFill: Multi-token Code Completion by Jointly Learning from Structure and Namin
Persian Bert For Long-Range Sequences
ParsBigBird: Persian Bert For Long-Range Sequences The Bert and ParsBert algorithms can handle texts with token lengths of up to 512, however, many ta
ProtFeat is protein feature extraction tool that utilizes POSSUM and iFeature.
Description: ProtFeat is designed to extract the protein features by employing POSSUM and iFeature python-based tools. ProtFeat includes a total of 39
allow windows programs to call dssp/mkdssp command from wsl; rework biopython on windows (PDB - dssp - fasta)
dssp-wsl Converting PDB (Protein Data Bank) file format to DSSP file format is required for generating datasets of peptides and their secondary struct
Tuple-sum-filter - Library to play with filtering numeric sequences by sums of their pairs, triplets, etc. With a bonus CLI demo
Tuple Sum Filter A library to play with filtering numeric sequences by sums of t
Neighbor2Seq: Deep Learning on Massive Graphs by Transforming Neighbors to Sequences
Neighbor2Seq: Deep Learning on Massive Graphs by Transforming Neighbors to Sequences This repository is an official PyTorch implementation of Neighbor
Doughskript interpreter for converting simple command sequences into executable Arduino C++ code.
Doughskript interpreter for converting simple command sequences into executable Arduino C++ code.
This repository contains the implementation of the paper Contrastive Instance Association for 4D Panoptic Segmentation using Sequences of 3D LiDAR Scans
Contrastive Instance Association for 4D Panoptic Segmentation using Sequences of 3D LiDAR Scans This repository contains the implementation of the pap
Extract longest transcript or longest CDS transcript from GTF annotation file or gencode transcripts fasta file.
Extract longest transcript or longest CDS transcript from GTF annotation file or gencode transcripts fasta file.
A script written in Python that returns a consensus string and profile matrix of a given DNA string(s) in FASTA format.
A script written in Python that returns a consensus string and profile matrix of a given DNA string(s) in FASTA format.
Ingestinator is my personal VFX pipeline tool for ingesting folders containing frame sequences that have been pulled and downloaded to a local folder
Ingestinator Ingestinator is my personal VFX pipeline tool for ingesting folders containing frame sequences that have been pulled and downloaded to a
Linux GUI app to codon optimize many single-fasta files with coding sequences , using many taxonomy ids
codon_optimize_cds_with_many_taxids_singlefasta Linux GUI app to codon optimize many single-fasta files with coding sequences, using many taxonomy ids
'rl_UK' is an open-source command-line tool in Python for calculating the shortest path between BUS stop sequences in the UK
'rl_UK' is an open-source command-line tool in Python for calculating the shortest path between BUS stop sequences in the UK. As input files, it uses an ATCO-CIF file and 'OS Open Roads' dataset from Ordnance Survey Data Hub.
Deep learning transformer model that generates unique music sequences.
music-ai Deep learning transformer model that generates unique music sequences. Abstract In 2017, a new state-of-the-art was published for natural lan
Implementation of Memory-Compressed Attention, from the paper "Generating Wikipedia By Summarizing Long Sequences"
Memory Compressed Attention Implementation of the Self-Attention layer of the proposed Memory-Compressed Attention, in Pytorch. This repository offers
Continual Learning of Long Topic Sequences in Neural Information Retrieval
ContinualPassageRanking Repository for the paper "Continual Learning of Long Topic Sequences in Neural Information Retrieval". In this repository you
The official code of "SCROLLS: Standardized CompaRison Over Long Language Sequences".
SCROLLS This repository contains the official code of the paper: "SCROLLS: Standardized CompaRison Over Long Language Sequences". Links Official Websi
LSTM built using Keras Python package to predict time series steps and sequences. Includes sin wave and stock market data
LSTM Neural Network for Time Series Prediction LSTM built using the Keras Python package to predict time series steps and sequences. Includes sine wav
Linux GUI app to codon optimize a directory with fasta files using taxonomy ids imported as a 1-column txt file (1 taxonomy id for each file)
codon optimize cds paired with taxids singlefastas gui Linux GUI app to codon optimize a directory with fasta files using taxonomy ids imported as a 1
Official implementation for the paper: Generating Smooth Pose Sequences for Diverse Human Motion Prediction
Generating Smooth Pose Sequences for Diverse Human Motion Prediction This is official implementation for the paper Generating Smooth Pose Sequences fo
Python code snippets for extracting PDB codes from .fasta files
Python_snippets_for_bioinformatics Python code snippets for extracting PDB codes from .fasta files If you have a single .fasta file for all protein se
Retrieve annotated intron sequences and classify them as minor (U12-type) or major (U2-type)
(intron I nterrogator and C lassifier) intronIC is a program that can be used to classify intron sequences as minor (U12-type) or major (U2-type), usi
An easy FASTA object handler, reader, writer and translator for small to medium size projects without dependencies.
miniFASTA An easy FASTA object handler, reader, writer and translator for small to medium size projects without dependencies. Installation Using pip /
Annotates sequences with Eggnog-mapper and hhblits against PDB70
Annotating "hypothetical" proteins with the PDB See config/ for configuration information. This workflow takes as input a set of protein sequences. It
SeqLike - flexible biological sequence objects in Python
SeqLike - flexible biological sequence objects in Python Introduction A single object API that makes working with biological sequences in Python more
Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"
Status: Archive (code is provided as-is, no updates expected) Update August 2020: For an example repository that achieves state-of-the-art modeling pe
GebPy is a Python-based, open source tool for the generation of geological data of minerals, rocks and complete lithological sequences.
GebPy is a Python-based, open source tool for the generation of geological data of minerals, rocks and complete lithological sequences. The data can be generated randomly or with respect to user-defined constraints, for example a specific element concentration within minerals and rocks or the order of units within a complete lithological profile.
A minimal, standalone viewer for 3D animations stored as stop-motion sequences of individual .obj mesh files.
ObjSequenceViewer V0.5 A minimal, standalone viewer for 3D animations stored as stop-motion sequences of individual .obj mesh files. Installation: pip
Codes for TIM2021 paper "Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences"
Codes for TIM2021 paper "Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences"
[NeurIPS 2021] Garment4D: Garment Reconstruction from Point Cloud Sequences
Garment4D [PDF] | [OpenReview] | [Project Page] Overview This is the codebase for our NeurIPS 2021 paper Garment4D: Garment Reconstruction from Point
This is an official PyTorch implementation of "Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Recognition" in AAAI2022.
AimCLR This is an official PyTorch implementation of "Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Reco
[NeurIPS 2021] Garment4D: Garment Reconstruction from Point Cloud Sequences
Garment4D [PDF] | [OpenReview] | [Project Page] Overview This is the codebase for our NeurIPS 2021 paper Garment4D: Garment Reconstruction from Point
[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
COCO-LM This repository contains the scripts for fine-tuning COCO-LM pretrained models on GLUE and SQuAD 2.0 benchmarks. Paper: COCO-LM: Correcting an
Official PyTorch implementation of "Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Recognition" in AAAI2022.
AimCLR This is an official PyTorch implementation of "Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Reco
Using deep learning to predict gene structures of the coding genes in DNA sequences of Arabidopsis thaliana
DeepGeneAnnotator: A tool to annotate the gene in the genome The master thesis of the "Using deep learning to predict gene structures of the coding ge
Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera.
Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera. This project prepares training and testing data for various deep learning projects such as 6D object pose estimation projects singleshotpose, as well as object detection and instance segmentation projects.
Sequence clustering and database creation using mmseqs, from local fasta files
Sequence clustering and database creation using mmseqs, from local fasta files
A Protein-RNA Interface Predictor Based on Semantics of Sequences
PRIP PRIP:A Protein-RNA Interface Predictor Based on Semantics of Sequences installation gensim==3.8.3 matplotlib==3.1.3 xgboost==1.3.3 prettytable==2
ARAE-Tensorflow for Discrete Sequences (Adversarially Regularized Autoencoder)
ARAE Tensorflow Code Code for the paper Adversarially Regularized Autoencoders for Generating Discrete Structures by Zhao, Kim, Zhang, Rush and LeCun
MASS (Mueen's Algorithm for Similarity Search) - a python 2 and 3 compatible library used for searching time series sub-sequences under z-normalized Euclidean distance for similarity.
Introduction MASS allows you to search a time series for a subquery resulting in an array of distances. These array of distances enable you to identif
Addon for adding subtitle files to blender VSE as Text sequences. Using pysub2 python module.
Import Subtitles for Blender VSE Addon for adding subtitle files to blender VSE as Text sequences. Using pysub2 python module. Supported formats by py
Replace sequence_IDs in gff3 based on given genome.fasta
gff-rename Replace the sequence IDs in a gff3 file with a set of provided sequence IDs from a genom.fasta. This is useful when a gff3 file is retrieve
A pipeline that creates consensus sequences from a Nanopore reads. I
A pipeline that creates consensus sequences from a Nanopore reads. It clusters reads that are similar to each other and creates a consensus that is then identified using BLAST.
A tool for batch processing large fasta files and accompanying metadata table to upload to repositories via API
Fasta Uploader A tool for batch processing large fasta files and accompanying metadata table to repositories via API The python fasta_uploader.py scri
Script For Importing Image sequences into scrap mechanic via blueprints
To use dowload and extract "video makes.zip" Python has to be installed https://www.python.org/ (may not work on version lower than 3.9) Has to be run
Implementation of Enformer, Deepmind's attention network for predicting gene expression, in Pytorch
Enformer - Pytorch (wip) Implementation of Enformer, Deepmind's attention network for predicting gene expression, in Pytorch. The original tensorflow
Physicochemical properties and indices for amino-acid sequences (ported from R).
peptides.py Physicochemical properties and indices for amino-acid sequences. 🗺️ Overview peptides.py is a pure-Python package to compute common descr
peptides.py is a pure-Python package to compute common descriptors for protein sequences
peptides.py Physicochemical properties and indices for amino-acid sequences. 🗺️ Overview peptides.py is a pure-Python package to compute common descr
produces PCA on genotypes from fasta files (popPhyl's ID format)
popPhyl_PCA Performs PCA of genotypes. Works in two steps. 1. Input file A single fasta file containing different loci, in different populations/speci
Fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.
Fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.
Beyond Paragraphs: NLP for Long Sequences
Beyond Paragraphs: NLP for Long Sequences
Sign Language is detected in realtime using video sequences. Our approach involves MediaPipe Holistic for keypoints extraction and LSTM Model for prediction.
RealTime Sign Language Detection using Action Recognition Approach Real-Time Sign Language is commonly predicted using models whose architecture consi
Implementation of Neural Distance Embeddings for Biological Sequences (NeuroSEED) in PyTorch
Neural Distance Embeddings for Biological Sequences Official implementation of Neural Distance Embeddings for Biological Sequences (NeuroSEED) in PyTo
Python library for generating sequences with uniform stimulus history
Sampling Euler tours for uniform stimulus history Table of Contents About Examples Experiment 1 Experiment 2 Experiment 3 Experiment 4 Experiment 5 Co
A simple yet powerful TUI framework for your Python (3.7+) applications
A simple yet powerful TUI framework for your Python (3.7+) applications
Model-free Vehicle Tracking and State Estimation in Point Cloud Sequences
Model-free Vehicle Tracking and State Estimation in Point Cloud Sequences 1. Introduction This project is for paper Model-free Vehicle Tracking and St
Implementation of ProteinBERT in Pytorch
ProteinBERT - Pytorch (wip) Implementation of ProteinBERT in Pytorch. Original Repository Install $ pip install protein-bert-pytorch Usage import torc
Explore related sequences in the OEIS
OEIS explorer This is a tool for exploring two different kinds of relationships between sequences in the OEIS: mentions (links) of other sequences on
Implementation of ProteinBERT in Pytorch
ProteinBERT - Pytorch (wip) Implementation of ProteinBERT in Pytorch. Original Repository Install $ pip install protein-bert-pytorch Usage import torc
Official implementation of the network presented in the paper "M4Depth: A motion-based approach for monocular depth estimation on video sequences"
M4Depth This is the reference TensorFlow implementation for training and testing depth estimation models using the method described in M4Depth: A moti
Implementation of the "PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences" paper.
PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences Introduction Point cloud sequences are irregular and unordered in the spatial dimen
Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in Pytorch
COCO LM Pretraining (wip) Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in Pytorch. They were a
Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
TextDistance TextDistance -- python library for comparing distance between two or more sequences by many algorithms. Features: 30+ algorithms Pure pyt
Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
TextDistance TextDistance -- python library for comparing distance between two or more sequences by many algorithms. Features: 30+ algorithms Pure pyt
Big Bird: Transformers for Longer Sequences
BigBird, is a sparse-attention based transformer which extends Transformer based models, such as BERT to much longer sequences. Moreover, BigBird comes along with a theoretical understanding of the capabilities of a complete transformer that the sparse model can handle.
Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
TextDistance TextDistance -- python library for comparing distance between two or more sequences by many algorithms. Features: 30+ algorithms Pure pyt
Yet Another Sequence Encoder - Encode sequences to vector of vector in python !
Yase Yet Another Sequence Encoder - encode sequences to vector of vectors in python ! Why Yase ? Yase enable you to encode any sequence which can be r