zEpid
zEpid is an epidemiology analysis package that provides easy-to-use tools for epidemiologists coding in Python 3.5+. The purpose of this library is to provide a toolset to make epidemiology e-z. It offers functions for a variety of calculations and plots. For a sample walkthrough of what this library is capable of, please look at the tutorials available at https://github.com/pzivich/Python-for-Epidemiologists
A few highlights: basic epidemiology calculations, functional form assessment plots, effect measure plots, and causal inference tools. Implemented estimators include inverse probability of treatment weights, inverse probability of censoring weights, inverse probability of missing weights, augmented inverse probability of treatment weights, time-fixed g-formula, Monte Carlo g-formula, iterative conditional g-formula, and targeted maximum likelihood estimation (TMLE). Additionally, generalizability/transportability tools are available, including inverse probability of sampling weights, the g-transport formula, and doubly robust generalizability/transportability formulas.
If you have any requests for items to be included, please contact me and I will work on adding them. You can contact me through GitHub (https://github.com/pzivich), email (gmail: zepidpy), or Twitter (@zepidpy).
Installation
Installing:
You can install zEpid with pip by running: pip install zepid
Dependencies:
pandas >= 0.18.0, numpy, statsmodels >= 0.7.0, matplotlib >= 2.0, scipy, tabulate
Module Features
Measures
Calculate measures directly from a pandas DataFrame object or from summary data. Implemented measures include risk ratio, risk difference, odds ratio, incidence rate ratio, incidence rate difference, number needed to treat, sensitivity, specificity, population attributable fraction, and attributable community risk.
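As a rough illustration of what the ratio and difference measures compute from summary data, the sketch below derives a risk ratio and a risk difference, each with a Wald-type 95% confidence interval, from the four cells of a 2x2 table. It is plain Python and does not use zEpid's own API; the function name `risk_measures` and the example counts are hypothetical.

```python
import math


def risk_measures(a, b, c, d, z=1.96):
    """Risk ratio and risk difference from 2x2 counts (illustrative only).

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    n1, n0 = a + b, c + d            # group sizes
    r1, r0 = a / n1, c / n0          # risk in exposed / unexposed
    rr = r1 / r0
    rd = r1 - r0

    # Wald standard error of log(RR), then back-transform the limits
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    rr_ci = (math.exp(math.log(rr) - z * se_log_rr),
             math.exp(math.log(rr) + z * se_log_rr))

    # Wald standard error of the risk difference
    se_rd = math.sqrt(r1 * (1 - r1) / n1 + r0 * (1 - r0) / n0)
    rd_ci = (rd - z * se_rd, rd + z * se_rd)

    return {"risk_ratio": rr, "rr_ci": rr_ci,
            "risk_difference": rd, "rd_ci": rd_ci}


# Made-up counts: 25 of 100 exposed and 10 of 100 unexposed had the outcome
result = risk_measures(25, 75, 10, 90)
print(result)
```

The Wald intervals assume no empty cells; a table with a zero count would need a different interval method.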
Other handy features include splines, a Table 1 generator, interaction contrast, interaction contrast ratio, positive predictive value, negative predictive value, a screening cost analyzer, counternull p-values, and conversions between odds and proportions.
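Two of these helpers are simple enough to sketch directly. The plain-Python functions below (hypothetical names, not zEpid's API) show the odds/proportion conversions and how positive and negative predictive values follow from sensitivity, specificity, and prevalence via Bayes' theorem.

```python
def odds_to_proportion(odds):
    """Convert odds to a proportion: p = odds / (1 + odds)."""
    return odds / (1 + odds)


def proportion_to_odds(p):
    """Convert a proportion to odds: odds = p / (1 - p)."""
    return p / (1 - p)


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Made-up inputs: a test with 90% sensitivity and specificity at 10% prevalence
ppv, npv = predictive_values(0.9, 0.9, 0.1)
print(ppv, npv)
```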
For guided tutorials with Jupyter Notebooks: https://github.com/pzivich/Python-for-Epidemiologists/blob/master/3_Epidemiology_Analysis/a_basics/1_basic_measures.ipynb
Graphics
Uses matplotlib in the background to generate some useful plots. Implemented plots include functional form assessment (with statsmodels output), p-value function plots, spaghetti plots, effect measure plots (forest plots), receiver operating characteristic (ROC) curves, dynamic risk plots, and L'Abbe plots.
For examples see: http://zepid.readthedocs.io/en/latest/Graphics.html
Causal
The causal branch includes various estimators for causal inference with observational data. Details on currently implemented estimators are below:
G-Computation Algorithm
Current implementation includes the time-fixed exposure g-formula, the Monte Carlo g-formula, and the iterative conditional g-formula.
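For intuition, with a time-fixed binary exposure and a single binary confounder, the g-formula reduces to standardization: average the stratum-specific risks under each exposure level over the confounder distribution of the whole population. The sketch below is a nonparametric toy version in plain Python, not zEpid's API and not the regression-based approach a package would typically use. The function names and the data are made up, and it assumes every exposure-confounder stratum is non-empty (positivity).

```python
from collections import Counter


def expand(l, a, n, events):
    """Build n records (confounder, exposure, outcome) with `events` outcomes."""
    return [(l, a, 1)] * events + [(l, a, 0)] * (n - events)


def g_formula_risk_difference(records):
    """Standardized risk difference for records of (L, A, Y), all binary."""
    n = len(records)
    n_l = Counter(l for l, _, _ in records)           # people per confounder level
    n_al = Counter((a, l) for l, a, _ in records)     # people per (A, L) stratum
    events = Counter((a, l) for l, a, y in records if y == 1)

    def standardized_risk(a):
        # sum over l of  P(Y=1 | A=a, L=l) * P(L=l)
        return sum(events[(a, l)] / n_al[(a, l)] * count / n
                   for l, count in n_l.items())

    return standardized_risk(1) - standardized_risk(0)


# Made-up data in which L affects both exposure and outcome
data = (expand(0, 1, 20, 4) + expand(0, 0, 80, 8)
        + expand(1, 1, 60, 30) + expand(1, 0, 40, 12))
print(g_formula_risk_difference(data))
```

In this toy data set the crude risk difference is larger than the standardized one, because the exposed are concentrated in the higher-risk confounder stratum.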
Inverse Probability Weights
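Current implementation includes inverse probability of treatment weights (IPTW), inverse probability of censoring weights (IPCW), and inverse probability of missing weights (IPMW). Diagnostics are also available for IPTW. IPMW supports monotone missing data.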
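The idea behind IPTW can be sketched in a few lines: weight each person by the inverse of the probability of the exposure they actually received given their confounders, then compare weighted risks between exposure groups. The example below is plain Python rather than zEpid's API. It uses unstabilized weights, estimates the exposure probabilities nonparametrically from a single binary confounder, and relies on hypothetical function names and made-up data.

```python
from collections import Counter


def expand(l, a, n, events):
    """Build n records (confounder, exposure, outcome) with `events` outcomes."""
    return [(l, a, 1)] * events + [(l, a, 0)] * (n - events)


def iptw_risk_difference(records):
    """Weighted risk difference for records of (L, A, Y), all binary."""
    n_l = Counter(l for l, _, _ in records)
    n_al = Counter((a, l) for l, a, _ in records)

    weighted_events = {0: 0.0, 1: 0.0}
    weight_total = {0: 0.0, 1: 0.0}
    for l, a, y in records:
        # P(A = a | L = l), estimated as a stratum proportion
        p_received = n_al[(a, l)] / n_l[l]
        w = 1 / p_received                  # unstabilized weight
        weighted_events[a] += w * y
        weight_total[a] += w

    risk_exposed = weighted_events[1] / weight_total[1]
    risk_unexposed = weighted_events[0] / weight_total[0]
    return risk_exposed - risk_unexposed


# Made-up data in which L affects both exposure and outcome
data = (expand(0, 1, 20, 4) + expand(0, 0, 80, 8)
        + expand(1, 1, 60, 30) + expand(1, 0, 40, 12))
print(iptw_risk_difference(data))
```

With a saturated exposure model like this one, the weighted estimate coincides with simple standardization over the confounder. In practice the exposure probabilities usually come from a fitted regression model.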
Augmented Inverse Probability Weights
Current implementation includes the augmented-IPTW estimator described by Funk et al. (2011, American Journal of Epidemiology).
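To show what "augmented" means here, the sketch below implements the standard textbook form of the AIPW estimator for a risk difference, taking as inputs each person's estimated exposure probability and predicted outcome under each exposure level. It is a plain-Python illustration with hypothetical names and made-up data, not zEpid's API and not necessarily the exact formulation used in the cited paper.

```python
def expand(l, a, n, events):
    """Build n records (confounder, exposure, outcome) with `events` outcomes."""
    return [(l, a, 1)] * events + [(l, a, 0)] * (n - events)


def aipw_risk_difference(a, y, pi, m1, m0):
    """AIPW risk difference.

    a, y : observed binary exposure and outcome per person
    pi   : estimated P(A=1 | L) per person
    m1/m0: predicted outcome probability had the person been exposed/unexposed
    """
    n = len(y)
    # Outcome-model prediction plus an inverse-probability-weighted residual
    psi1 = sum(m1[i] + a[i] * (y[i] - m1[i]) / pi[i] for i in range(n)) / n
    psi0 = sum(m0[i] + (1 - a[i]) * (y[i] - m0[i]) / (1 - pi[i])
               for i in range(n)) / n
    return psi1 - psi0


# Made-up data with one binary confounder L
data = (expand(0, 1, 20, 4) + expand(0, 0, 80, 8)
        + expand(1, 1, 60, 30) + expand(1, 0, 40, 12))
L = [l for l, _, _ in data]
A = [a for _, a, _ in data]
Y = [y for _, _, y in data]

# Nuisance estimates read off the stratum proportions of the toy data
pi = [{0: 0.2, 1: 0.6}[l] for l in L]
m1 = [{0: 0.2, 1: 0.5}[l] for l in L]
m0 = [{0: 0.1, 1: 0.3}[l] for l in L]
print(aipw_risk_difference(A, Y, pi, m1, m0))

# Deliberately wrong (constant) outcome model, correct exposure model
flat = [0.3] * len(Y)
print(aipw_risk_difference(A, Y, pi, flat, flat))
```

Both calls give the same risk difference on this toy data, which illustrates the double robustness property: the estimate remains consistent if either the exposure model or the outcome model is correct.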
Targeted Maximum Likelihood Estimator
TMLE can be estimated using standard logistic regression models or user-supplied functions. Alternatively, users can supply machine learning algorithms to estimate the probabilities. Supported machine learning libraries include sklearn (scikit-learn).
Generalizability / Transportability
Several estimators are available for generalizing results or transporting them to a different target population. These include inverse probability of sampling weights, the g-transport formula, and doubly robust formulas.
Tutorials for the usage of these estimators are available at: https://github.com/pzivich/Python-for-Epidemiologists/tree/master/3_Epidemiology_Analysis/c_causal_inference
G-estimation of Structural Nested Mean Models
Single time-point g-estimation of structural nested mean models is supported.
Sensitivity Analyses
Includes a trapezoidal distribution generator and a corrected risk ratio.
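Trapezoidal distributions are commonly used to draw bias parameters in probabilistic sensitivity analyses: values between the two modes are equally likely, and the density tapers linearly to zero at the minimum and maximum. The sketch below shows one way to draw from such a distribution in plain Python. It is not zEpid's implementation; the function name, parameter names, and example bounds are hypothetical.

```python
import math
import random


def sample_trapezoid(minimum, mode1, mode2, maximum, rng):
    """Draw one value from a trapezoidal distribution.

    The density rises linearly on [minimum, mode1], is flat on
    [mode1, mode2], and falls linearly on [mode2, maximum].
    """
    # Relative areas of the three pieces (the plateau height cancels out)
    left = (mode1 - minimum) / 2
    middle = mode2 - mode1
    right = (maximum - mode2) / 2

    u = rng.random() * (left + middle + right)
    if u < left:
        # Rising triangle: inverse CDF of a linearly increasing density
        return minimum + (mode1 - minimum) * math.sqrt(rng.random())
    if u < left + middle:
        # Plateau: uniform between the two modes
        return rng.uniform(mode1, mode2)
    # Falling triangle: mirror image of the rising case
    return maximum - (maximum - mode2) * math.sqrt(rng.random())


# Made-up bounds for a bias parameter, with a seeded generator for repeatability
rng = random.Random(0)
draws = [sample_trapezoid(0.8, 1.5, 3.0, 3.5, rng) for _ in range(10000)]
print(min(draws), max(draws))
```

Each draw could then feed a bias-correction formula, with the spread of the corrected estimates summarizing the uncertainty about the bias parameter.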
Tutorials are available at: https://github.com/pzivich/Python-for-Epidemiologists/tree/master/3_Epidemiology_Analysis/d_sensitivity_analyses