Python implementation of MULTIseq barcode alignment using fuzzy string matching and GMM barcode assignment

MT Schmitz

Last update: Feb 11, 2022

Overview

py-multi-seq

Python implementation of MULTIseq barcode alignment using fuzzy string matching and GMM barcode assignment.

The scripts in this repository are roughly analogous to the MULTI-seq R package deMULTIplex, available at https://github.com/chris-mcginnis-ucsf/MULTI-seq. The pipeline loads paired-end reads, fuzzy-matches them against the provided MULTIseq barcode file, and counts the reads mapping to each barcode. Expectation Maximization is then used to fit a Gaussian Mixture Model for each barcode, and each cell is assigned its most likely barcode, no barcode, or doublet barcodes.
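The matching and counting steps can be illustrated with a small standard-library sketch. This is not the code in BarcodeFuzzyMatching.py: it swaps the fuzzywuzzy scoring the repository depends on for a plain Hamming distance, and the function names, the `max_mismatches` threshold, and the toy barcode sequences are all hypothetical.

```python
from collections import Counter

def hamming(a, b):
    """Count mismatching positions (assumes both strings have equal length)."""
    return sum(x != y for x, y in zip(a, b))

def match_barcode(seq, barcodes, max_mismatches=1):
    """Return the name of the unique closest barcode, or None if no barcode
    is within max_mismatches or if the best match is tied."""
    scored = sorted((hamming(seq, bc), name) for name, bc in barcodes.items())
    best_dist, best_name = scored[0]
    if best_dist > max_mismatches:
        return None
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None  # ambiguous: two barcodes are equally close
    return best_name

def count_barcodes(reads, barcodes, max_mismatches=1):
    """Count matched reads per (cell barcode, sample barcode) pair."""
    counts = Counter()
    for cell, sample_seq in reads:
        name = match_barcode(sample_seq, barcodes, max_mismatches)
        if name is not None:
            counts[(cell, name)] += 1
    return counts

# Toy sample barcodes (made up for illustration, not real MULTIseq indices).
barcodes = {"Bar1": "AAAACCCC", "Bar2": "GGGGTTTT"}

# Each read is (cell barcode, sample barcode sequence), already sliced out.
reads = [
    ("CELL1", "AAAACCCC"),  # exact match to Bar1
    ("CELL1", "AAAACCCT"),  # one mismatch, still Bar1
    ("CELL2", "GGGGTTTT"),  # exact match to Bar2
    ("CELL2", "ACGTACGT"),  # too far from both barcodes, dropped
]

counts = count_barcodes(reads, barcodes)
```

Rejecting tied matches keeps a read from being credited to an arbitrary barcode when two are equally close; a real pipeline may choose a different tie-breaking rule.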

Installation

Clone this repository. The scripts depend on Python >= 3.7 and the following packages, which can be installed with:

pip install pandas numpy scipy fuzzywuzzy tqdm sparse_dot_topn scanpy natsort

You will need the cellranger cell barcodes file before running. To use custom barcodes, you can in theory modify MultiseqIndices.txt along with the read length parameters.

Usage example for 10X scRNAseq or Multiome + MULTIseq:

python BarcodeFuzzyMatching.py /path/to/this/repo/MultiseqSamplesExample.txt /path/to/this/repo/MultiseqIndices.txt /path/to/sampleMULTIseq_R1.fastq /path/to/cellranger/outs/filtered_feature_bc_matrix/barcodes.tsv.gz /path/to/output/dir/ 16 8 0

python RunDemuxEM.py /path/to/output/dir/ /path/to/cellranger/outs/filtered_feature_bc_matrix/

Running this pipeline outputs a matrix of barcodes by read counts, as well as a CSV listing cell barcodes and their assigned barcode(s).
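To give a feel for how per-barcode mixture models can turn a count matrix into assignments like these, here is a minimal pure-Python sketch. It is an assumption-laden illustration, not the logic of RunDemuxEM.py: it fits a two-component 1D Gaussian mixture to log-transformed counts for each barcode, calls a cell positive when the higher-mean component is more likely, and labels cells with zero, one, or several positive barcodes as negative, singlet, or doublet. All function names, the initialisation, the variance floor, and the toy counts are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(values, n_iter=50, min_var=1e-3):
    """Fit a two-component 1D Gaussian mixture with EM.
    Returns (weights, means, variances)."""
    lo, hi = min(values), max(values)
    spread = (hi - lo) or 1.0
    weights, means = [0.5, 0.5], [lo, hi]  # start one component at each extreme
    variances = [(spread / 4) ** 2] * 2
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pk / total for pk in p] if total > 0 else [0.5, 0.5])
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # empty component, keep previous parameters
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, min_var)  # floor avoids a collapsing component
    return weights, means, variances

def is_positive(x, weights, means, variances):
    """True if x is more likely under the higher-mean (signal) component."""
    p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    high = 0 if means[0] > means[1] else 1
    if sum(p) == 0:  # numerically far from both: fall back to the nearest mean
        return abs(x - means[high]) < abs(x - means[1 - high])
    return p[high] > p[1 - high]

def assign_cells(counts):
    """counts: {cell: {barcode: read_count}} -> {cell: barcode, 'doublet', or 'negative'}."""
    cells = sorted(counts)
    barcodes = sorted({bc for per_cell in counts.values() for bc in per_cell})
    positives = {cell: [] for cell in cells}
    for bc in barcodes:
        logged = [math.log1p(counts[cell].get(bc, 0)) for cell in cells]
        params = fit_gmm_1d(logged)
        for cell, x in zip(cells, logged):
            if is_positive(x, *params):
                positives[cell].append(bc)
    return {
        cell: hits[0] if len(hits) == 1 else ("doublet" if hits else "negative")
        for cell, hits in positives.items()
    }

# Toy count matrix (made up for illustration).
counts = {
    "c1": {"Bar1": 200, "Bar2": 3},
    "c2": {"Bar1": 180, "Bar2": 2},
    "c3": {"Bar1": 4, "Bar2": 220},
    "c4": {"Bar1": 2, "Bar2": 190},
    "c5": {"Bar1": 210, "Bar2": 205},
    "c6": {"Bar1": 3, "Bar2": 2},
}
assignments = assign_cells(counts)
```

Working on log counts keeps the background and signal populations closer to Gaussian than raw counts would be, which is why the sketch applies `math.log1p` before fitting.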
