GMHI: Gut Microbiome Health Index
Description
Gut Microbiome Health Index (GMHI) is a robust index for evaluating health status based on the species-level taxonomic profile of a stool shotgun metagenome (gut microbiome) sample.
Installation
To avoid dependency conflicts, create an isolated conda environment and install gmhi in it.
- Install via conda
conda create --name gmhi_env -c danielchang2002 python=3.7 gmhi
- Activate environment
conda activate gmhi_env
Usage
usage: gmhi [-h] [-o OUTPUT] --fastq1 FASTQ1 --fastq2 FASTQ2
DESCRIPTION:
GMHI version 1.0
Gut Microbiome Health Index (GMHI) is a robust index for evaluating
health status based on the species-level taxonomic profile of a stool
shotgun metagenome (gut microbiome) sample.
AUTHORS:
Daniel Chang, Vinod Gupta, Jaeyun Sung
USAGE:
GMHI is a pipeline that takes as input two raw fastq files generated
from a paired end sequence, performs quality control, estimates microbial
abundances, and returns as output a health index score.
* Profiling a metagenome from raw reads:
$ gmhi --fastq1 metagenome1.fastq --fastq2 metagenome2.fastq
optional arguments:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
Output file name
required named arguments:
--fastq1 FASTQ1 first input fastq file
--fastq2 FASTQ2 second input fastq file
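For scripted use, the command line documented above can also be assembled and invoked from Python. The following is a minimal sketch and not part of gmhi itself: the helper names build_gmhi_command and run_gmhi are hypothetical, and it assumes that gmhi is on the PATH (that is, the conda environment is active) and that gmhi exits with a non-zero status on failure.

```python
import subprocess
from pathlib import Path


def build_gmhi_command(fastq1, fastq2, output=None):
    """Build the argument list for one gmhi run.

    Only the flags shown in the help text are used:
    --fastq1, --fastq2 and the optional -o.
    """
    cmd = ["gmhi", "--fastq1", str(fastq1), "--fastq2", str(fastq2)]
    if output is not None:
        cmd += ["-o", str(output)]
    return cmd


def run_gmhi(fastq1, fastq2, output=None):
    """Check that both inputs exist, then run gmhi.

    Assumption: gmhi reports failure through a non-zero exit status,
    in which case check=True raises subprocess.CalledProcessError.
    """
    for path in (fastq1, fastq2):
        if not Path(path).is_file():
            raise FileNotFoundError(f"input fastq not found: {path}")
    subprocess.run(build_gmhi_command(fastq1, fastq2, output), check=True)


# Example call (requires gmhi to be installed and both files to exist):
# run_gmhi("metagenome1.fastq", "metagenome2.fastq", "GMHI2.txt")
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.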
Example
Directory structure:
.
├── metagenome1.fastq
└── metagenome2.fastq
Command:
gmhi --fastq1 metagenome1.fastq --fastq2 metagenome2.fastq -o GMHI2.txt
Result:
.
├── GMHI2.txt
├── abundance.txt
├── metagenome1.fastq
└── metagenome2.fastq
Here, GMHI2.txt is a text file with a single line containing the health index score of the metagenome, and abundance.txt is a TSV file containing the estimated microbial abundances.
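The two output files can be loaded for downstream analysis with a few lines of Python. This is a rough sketch under stated assumptions: the function names read_gmhi_score and read_abundance_rows are hypothetical, the score line is assumed to hold a bare number that float() can parse, and no particular column layout is assumed for abundance.txt beyond it being tab-separated.

```python
import csv
from pathlib import Path


def read_gmhi_score(path):
    """Return the score from the first non-empty line of the score file.

    Assumption: the line holds a bare number; float() raises ValueError
    if the line contains anything else.
    """
    for line in Path(path).read_text().splitlines():
        if line.strip():
            return float(line.strip())
    raise ValueError(f"no score found in {path}")


def read_abundance_rows(path):
    """Return the non-empty rows of a tab-separated file as lists of strings.

    Values are left as strings because the column layout of
    abundance.txt is not specified here.
    """
    with open(path, newline="") as handle:
        return [row for row in csv.reader(handle, delimiter="\t") if row]


# Example calls, using the file names from the example above:
# score = read_gmhi_score("GMHI2.txt")
# rows = read_abundance_rows("abundance.txt")
```

If abundance.txt turns out to have a header row, it will appear as the first element of the returned list.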
Runtime
Runtime depends on the size of the input metagenome and on the hardware.
On a 2019 MacBook Pro with a 2.3 GHz 8-core Intel Core i9 processor and 16 GB of RAM, a single run of gmhi on a 4 GB input metagenome takes 29 minutes.
Note: the first run on any machine takes extra time because the required databases must be downloaded and installed before the actual computation starts.