Single-cell current best practices tutorial case study for the paper: Luecken and Theis, "Current best practices in single-cell RNA-seq analysis: a tutorial"

Overview

Scripts for "Current best practices in single-cell RNA-seq analysis: a tutorial"


This repository is complementary to the publication:

M.D. Luecken, F.J. Theis, "Current best practices in single-cell RNA-seq analysis: a tutorial", Molecular Systems Biology 15(6) (2019): e8746

The paper was recommended on F1000Prime as being of special significance in the field.

Access the recommendation on F1000Prime

The repository contains:

  • scripts to generate the paper figures
  • a case study which complements the manuscript
  • the code for the marker gene detection study from the supplementary material

The main part of this repository is a case study in which the best practices established in the manuscript are applied to a mouse intestinal epithelium regions dataset from Haber et al., Nature 551 (2017), available from GEO under accession GSE92332. This case study can be found in different versions in the latest_notebook/ and old_releases/ directories.

The scripts in the plotting_scripts/ folder reproduce the figures that are shown in the manuscript and the supplementary materials. These scripts contain comments to explain each step. Each figure that does not have a corresponding script in the plotting_scripts/ folder was taken from the case study or the marker gene study.

In case of questions or issues, please get in touch by posting an issue in this repository.

If the materials in this repo are of use to you, please consider citing the above publication.

Environment set up

A Docker container with a working sc-tutorial environment is now available here, thanks to Leander Dony. If you would like to set up the environment via conda or manually outside of the Docker container, please follow the instructions below.

To run the tutorial case study, several packages must be installed. As both R and Python packages are required, we recommend using a conda environment. To facilitate the setup of such an environment, we have provided the sc_tutorial_environment.yml file, which contains all conda- and pip-installable dependencies. R dependencies that are not available as conda packages must be installed into the environment itself.

To set up the conda environment, follow the steps below.

  1. Set up the conda environment from the sc_tutorial_environment.yml file.

    conda env create -f sc_tutorial_environment.yml
    
  2. Ensure that the environment can find the GSL libraries that the R packages need. This is done by setting the CFLAGS and LDFLAGS environment variables (see https://bit.ly/2CjJsgn). Here we set them so that they are set correctly every time the environment is activated.

    cd YOUR_CONDA_ENV_DIRECTORY
    mkdir -p ./etc/conda/activate.d
    mkdir -p ./etc/conda/deactivate.d
    touch ./etc/conda/activate.d/env_vars.sh
    touch ./etc/conda/deactivate.d/env_vars.sh
    

    Where YOUR_CONDA_ENV_DIRECTORY can be found by running conda info --envs and using the directory that corresponds to your conda environment name (default: sc-tutorial).

    WHILE NOT IN THE ENVIRONMENT(!!!!) open the env_vars.sh file at ./etc/conda/activate.d/env_vars.sh and enter the following into the file:

    #!/bin/sh
    
    CFLAGS_OLD=$CFLAGS
    export CFLAGS_OLD
    export CFLAGS="`gsl-config --cflags` ${CFLAGS_OLD}"
     
    LDFLAGS_OLD=$LDFLAGS
    export LDFLAGS_OLD
    export LDFLAGS="`gsl-config --libs` ${LDFLAGS_OLD}"
    

    Also change the ./etc/conda/deactivate.d/env_vars.sh file to:

    #!/bin/sh
     
    CFLAGS=$CFLAGS_OLD
    export CFLAGS
    unset CFLAGS_OLD
     
    LDFLAGS=$LDFLAGS_OLD
    export LDFLAGS
    unset LDFLAGS_OLD
    

    Note again that these files should be written WHILE NOT IN THE ENVIRONMENT. Otherwise you may overwrite the CFLAGS and LDFLAGS environment variables in the base environment!

  3. Enter the environment with conda activate sc-tutorial, or conda activate ENV_NAME if you changed the environment name in the sc_tutorial_environment.yml file.

  4. Open R and install the dependencies via the commands:

    install.packages(c('devtools', 'gam', 'RColorBrewer', 'BiocManager'))
    update.packages(ask=F)
    BiocManager::install(c("scran","MAST","monocle","ComplexHeatmap","slingshot"), version = "3.8")
    

These steps should set up an environment to perform single-cell analysis with the tutorial workflow on a Linux system. Please note that we have encountered issues with conda environments on macOS. On macOS we recommend installing the packages without conda, using separately installed Python and R versions. Alternatively, you can try using the base conda environment and installing all packages as described in the conda_env_instructions_for_mac.txt file. In the base environment, R should be able to find the relevant GSL libraries, so LDFLAGS and CFLAGS should not need to be set.

Also note that conda and pip don't always play nicely together. The conda developers suggest installing all conda packages first and then adding pip packages on top, where conda packages are not available. Consequently, installing further conda packages into the environment at a later point may cause issues. Instead, start a new environment and reinstall all conda packages first.

If you prefer to set up an environment manually, a list of all package requirements is given at the end of this document.

Downloading the data

As mentioned above, the data for the case study come from GSE92332. To run the case study as shown, you must download this data and place it in the correct folder. Unpacking the data requires tar and gunzip, which should already be available on most systems. If you have cloned the GitHub repository and the case study script is in the latest_notebook/ folder, this can be done from the location of the case study .ipynb file via the following commands:

cd ../  #To get to the main github repo folder
mkdir -p data/Haber-et-al_mouse-intestinal-epithelium/
cd data/Haber-et-al_mouse-intestinal-epithelium/
wget ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE92nnn/GSE92332/suppl/GSE92332_RAW.tar
mkdir GSE92332_RAW
tar -C GSE92332_RAW -xvf GSE92332_RAW.tar
gunzip GSE92332_RAW/*_Regional_*

The annotated dataset, with which we briefly compare the results at the end of the notebook, is available from the same GEO accession (GSE92332). It can be obtained using the following commands:

cd data/Haber-et-al_mouse-intestinal-epithelium/
wget ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE92nnn/GSE92332/suppl/GSE92332_Regional_UMIcounts.txt.gz
gunzip GSE92332_Regional_UMIcounts.txt.gz

Case study notes

We have noticed that visualization, dimensionality reduction, and clustering (and hence all downstream results as well) can differ slightly between systems. This is due to the numerical libraries used in the backend. Thus, we cannot guarantee that a rerun of the notebook will generate exactly the same clusters.

While all results are qualitatively similar, the assignment of cells to clusters, especially for stem cells, TA cells, and enterocyte progenitors, can differ between runs across systems. To show the diversity that can be expected, we have uploaded shortened case study notebooks to the alternative_clustering_results/ folder.

Note that running sc.pp.pca() with the parameter svd_solver='arpack' drastically reduces the variability between systems; however, the output is still not exactly the same.
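
For reference, a minimal sketch of how this could be called on the case study object, assuming adata is the AnnData object from the notebook and with an illustrative choice of n_comps:

    import scanpy as sc

    # The ARPACK solver computes a deterministic PCA, which makes the
    # neighbourhood graph, UMAP, and clustering far more reproducible
    # across systems than the default randomized solver.
    sc.pp.pca(adata, n_comps=50, svd_solver='arpack')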

Adapting the pipeline for other datasets:

The pipeline was designed to be easily adaptable to new datasets. However, there are several limitations to the general applicability of the current workflow. When adapting the pipeline to your own dataset, please take the following into account:

  1. Sparse data formats are not supported by rpy2 and therefore do not work with any of the integrated R commands. Datasets can be converted to a dense format using adata.X = adata.X.toarray() (see the sketch after this list).


  2. The case study assumes that the input data are count data obtained from a single-cell protocol with UMIs. If the input data are full-length read data, consider replacing the normalization method with one that includes gene length normalization (e.g., TPM); see the sketch after this list.
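
To make both points concrete, here is a minimal sketch. It assumes adata is the AnnData object from the notebook; the gene_length column used for the TPM-style scaling is a hypothetical annotation that you would have to provide yourself and is not part of the case study data:

    import scipy.sparse as sp

    # Point 1: densify the count matrix before running any of the rpy2-based
    # cells, since the integrated R commands expect a dense array.
    if sp.issparse(adata.X):
        adata.X = adata.X.toarray()

    # Point 2 (full-length data only): a TPM-style normalization in place of
    # the UMI-based size-factor normalization, using a hypothetical
    # adata.var['gene_length'] column given in bases.
    reads_per_kb = adata.X / (adata.var['gene_length'].values / 1e3)
    adata.X = reads_per_kb / reads_per_kb.sum(axis=1, keepdims=True) * 1e6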

Manual installation of package requirements

The following packages are required to run the first version of the case study notebook. For further versions see the README.md in the latest_notebook/ and old_releases/ folders.

General:

  • Jupyter notebook
  • IRKernel
  • rpy2
  • R >= 3.4.3
  • Python >= 3.5

Python:

  • scanpy
  • numpy
  • scipy
  • pandas
  • seaborn
  • louvain>=0.6
  • python-igraph
  • gprofiler-official (from Case study notebook 1906 version)
  • python-gprofiler from Valentine Svensson's github (vals/python-gprofiler)
    • only needed for notebooks before version 1906
  • ComBat python implementation from Maren Buettner's github (mbuttner/maren_codes/combat.py)
    • only needed for scanpy versions before 1.3.8 which don't include sc.pp.combat()
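
For scanpy >= 1.3.8, the external ComBat script is no longer needed; below is a minimal sketch of the built-in call, where the batch column name 'sample' is an assumption about how the batch labels are stored in adata.obs:

    import scanpy as sc

    # Run ComBat batch correction directly in scanpy; 'sample' is assumed to be
    # the adata.obs column that holds the batch labels.
    sc.pp.combat(adata, key='sample')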

R:

  • scater
  • scran
  • MAST
  • gam
  • slingshot (change DESCRIPTION file for R version 3.4.3)
  • monocle 2
  • limma
  • ComplexHeatmap
  • RColorBrewer
  • clusterExperiment
  • ggplot2
  • IRkernel

Possible sources of error in the manual installation:

For R 3.4.3:

When using Slingshot with R 3.4.3, you must pull a local copy of slingshot from its GitHub repository and change the DESCRIPTION file to say R>=3.4.3 instead of R>=3.5.0.

For R >= 3.5 and bioconductor >= 3.7:

The clusterExperiment version that comes with Bioconductor 3.7 has a slightly changed naming convention: clusterExperiment() is now called ClusterExperiment(). The latest version of the notebook includes this change, but when using the original notebook, please note that this may throw an error.

For rpy2 < 3.0.0:

Pandas 0.24.0 is not compatible with rpy2 < 3.0.0. When using old versions of rpy2, please downgrade pandas to 0.23.X. Please also note that Pandas 0.24.0 requires anndata version 0.6.18 and scanpy version > 1.37.0.
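
A quick way to check which versions are active in your environment (a minimal sketch; the thresholds simply mirror the note above):

    import pandas
    import rpy2

    # rpy2 releases below 3.0.0 are incompatible with pandas 0.24.0,
    # so print both versions to check the combination in use.
    print('rpy2:', rpy2.__version__)
    print('pandas:', pandas.__version__)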

For enrichment analysis with g:profiler:

Ensure that the correct g:profiler package is used for the notebook. Notebooks up to version 1904 use python-gprofiler from Valentine Svensson's GitHub, and notebooks from version 1906 onwards use the gprofiler-official package from the g:profiler team.
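
For orientation, a minimal sketch of an enrichment query with the gprofiler-official package; the organism and the gene list are placeholders and are not taken from the notebook:

    from gprofiler import GProfiler

    # gprofiler-official exposes a GProfiler class; return_dataframe=True
    # returns the enrichment results as a pandas DataFrame.
    gp = GProfiler(return_dataframe=True)
    results = gp.profile(organism='mmusculus',
                         query=['Lgr5', 'Olfm4', 'Ascl2'])  # placeholder gene list
    print(results.head())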

If no R packages can be found:

Ensure that IRkernel has linked the correct version of R to your Jupyter notebook. Check the instructions at https://github.com/IRkernel/IRkernel.

Comments
  • Concatenate files error


    I'm following the notebook but I'm getting an error in concatenating the files:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-16-b1c46f8469f7> in <module>
         28 
         29     # Concatenate to main adata object
    ---> 30     adata = adata.concatenate(adata_tmp)
         31     adata.var['gene_id'] = adata.var['gene_id-1']
         32     adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)
    
    ~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/anndata/base.py in concatenate(self, join, batch_key, batch_categories, index_unique, *adatas)
       2018             # var
       2019             for c in ad.var.columns:
    -> 2020                 new_c = c + (index_unique if index_unique is not None else '-') + categories[i]
       2021                 var.loc[vars_intersect, new_c] = ad.var.loc[vars_intersect, c]
       2022 
    
    TypeError: unsupported operand type(s) for +: 'int' and 'str'
    

    I'm using these parameters:

    #Data files
    sample_strings = ['bystander', 'uninfected', 'infected']
    sample_id_strings = ['1', '2', '3']
    file_base = '/home/ec2-user/LSHIV/LSHIV8'
    exp_string = '_Regional_'
    data_file_end = '_matrix.mtx'
    barcode_file_end = '_barcodes.tsv'
    gene_file_end = '_genes.tsv'
    cc_genes_file = '~/pipeline/Macosko_cell_cycle_genes.txt'
    
    opened by davidepisu 22
  • R libraries not accessible


    When using R in Anaconda, a warning occurs that the libraries cannot be installed; these libraries can only be added locally. To overcome this, it helps to change the R_LIB path in the activate script.

    opened by mumichae 17
  • Get errors when performing sc.pp.highly_variable_genes!


    Hi, I am following this excellent workflow to analyze my single-cell sequencing data sets. I have calculated the size factors using the scran package and did not perform the batch correction step, as I have only one sample. Then I intended to extract highly variable genes using the function sc.pp.highly_variable_genes. Unfortunately, I got an error:

    'LinAlgError: Last 2 dimensions of the array must be square'

    LinAlgError                               Traceback (most recent call last)
    <ipython-input> in <module>
    ----> 1 sc.pp.highly_variable_genes(adata)

    ~/miniconda3/lib/python3.6/site-packages/scanpy/preprocessing/highly_variable_genes.py in highly_variable_genes(adata, min_disp, max_disp, min_mean, max_mean, n_top_genes, n_bins, flavor, subset, inplace)
         94     X = np.expm1(adata.X) if flavor == 'seurat' else adata.X
         95
    ---> 96     mean, var = materialize_as_ndarray(_get_mean_var(X))
         97     # now actually compute the dispersion
         98     mean[mean == 0] = 1e-12  # set entries equal to zero to small value

    ~/miniconda3/lib/python3.6/site-packages/scanpy/preprocessing/utils.py in _get_mean_var(X)
         16     mean_sq = np.multiply(X, X).mean(axis=0)
         17     # enforece R convention (unbiased estimator) for variance
    ---> 18     var = (mean_sq - mean**2) * (X.shape[0]/(X.shape[0]-1))
         19 else:
         20     from sklearn.preprocessing import StandardScaler

    ~/miniconda3/lib/python3.6/site-packages/numpy/matrixlib/defmatrix.py in __pow__(self, other)
        226
        227     def __pow__(self, other):
    --> 228         return matrix_power(self, other)
        229
        230     def __ipow__(self, other):

    ~/miniconda3/lib/python3.6/site-packages/numpy/linalg/linalg.py in matrix_power(a, n)
        600     a = asanyarray(a)
        601     _assertRankAtLeast2(a)
    --> 602     _assertNdSquareness(a)
        603
        604     try:

    ~/miniconda3/lib/python3.6/site-packages/numpy/linalg/linalg.py in _assertNdSquareness(*arrays)
        213     m, n = a.shape[-2:]
        214     if m != n:
    --> 215         raise LinAlgError('Last 2 dimensions of the array must be square')
        216
        217 def _assertFinite(*arrays):

    This is what my adata.X looks like right now:

    matrix([[0.   , 0.   , 0.   , ..., 0.   , 0.   , 0.   ],
            [0.   , 0.   , 1.203, ..., 0.   , 0.   , 0.   ],
            [0.   , 1.096, 0.   , ..., 0.   , 0.   , 0.   ],
            ...,
            [0.   , 0.   , 2.042, ..., 0.   , 0.   , 0.   ],
            [0.   , 0.   , 0.   , ..., 0.926, 0.   , 0.   ],
            [0.   , 0.   , 2.951, ..., 0.   , 0.   , 0.   ]])

    also versions of my modules: scanpy==1.3.7 anndata==0.6.17 numpy==1.15.4 scipy==1.2.0 pandas==0.24.0 scikit-learn==0.20.2 statsmodels==0.9.0 python-igraph==0.7.1 louvain==0.6.1

    Looking forward to your response! Thank you!

    opened by jipeifeng 17
  • computeSumFactors(data_mat, clusters=input_groups, min.mean=0.1) gives error


    I can run everything without problem until these lines:

    %%R -i data_mat -i input_groups -o size_factors
    size_factors = computeSumFactors(data_mat, clusters=input_groups, min.mean=0.1)
    

    where I get this error:

    Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘assay’ for signature ‘"matrix", "character"’

    Any idea what's going on here?

    I'm using the original code except for these lines that I commented out: #adata.var['gene_id'] = adata.var['gene_id-1'] #adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)

    opened by ariberar 16
  • Exception: Data must be 1-dimensional when plotting new marker genes in jupyter notebook


    I am following the notebook to understand the steps of single-cell RNA seq.

    I had the same issue #21 with the version of scanpy, so I followed what is said in the answer. It worked perfectly besides some problems that I solved, and I want to write them here so that everyone who has the same issues can resolve them:

    1. adata = adata.concatenate(adata_tmp, batch_key='sample_id') and adata.obs.drop(columns=['sample_id'], inplace=True) also generated errors, so I commented out those lines as well

    2. I had errors when a graph was plotted using sc.pl.palettes.godsnot_64 or sc.pl.palettes.default_64, so I used sc.pl.palettes.vega_20_scanpy instead

    3. In the step Marker genes & cluster annotation I replaced:

    adata.rename_categories('louvain_r0.5', ['TA', 'EP (early)', 'Stem', 'Goblet', 'EP (stress)', 'Enterocyte', 'Paneth', 'Enteroendocrine', 'Tuft']) 
    

    with

    adata.rename_categories('louvain_r0.5', ['TA', 'EP (early)', 'Stem', 'Goblet', 'EP (stress)', 'Enterocyte', 'Paneth'])
    

    because the number of old and new categories doesn't match; that way it works.

    4. When I run
    #Plot the new marker genes
    sc.pl.rank_genes_groups(adata, key='rank_genes_r0.5_entero_sub', groups=['Enterocyte,0','Enterocyte,1','Enterocyte,2'], fontsize=12)
    

    I get an error that there is no field named Enterocyte,2, so I commented out that group, and with the other two it works.

    I now get to my problem. I am running the notebook with the case study data and with scanpy==1.4.5.1 anndata==0.7.1 umap==0.3.10 numpy==1.18.1 scipy==1.4.1 pandas==0.25.3 scikit-learn==0.22.1 statsmodels==0.11.1 python-igraph==0.8.0 louvain==0.6.1. (Note: I use pandas 0.25.3 because when I previously tried to run it with 1.0.1 there were incompatibility problems.)

    Now I have a problem in the steps of subclustering, when I try to run:

    entero_clusts = [clust for clust in adata.obs['louvain_r0.5_entero_sub'].cat.categories if clust.startswith('Enterocyte')]
     for clust in entero_clusts:
        sc.pl.rank_genes_groups_violin(adata, use_raw=True, key='rank_genes_r0.5_entero_sub', groups=[clust], gene_names=adata.uns['rank_genes_r0.5_entero_sub']['names'][clust][90:100]) 
    
    

    the error I get is

    
     ---------------------------------------------------------------------------
    Exception                                 Traceback (most recent call last)
    <ipython-input-58-4453b5691a91> in <module>
          2 
          3 for clust in entero_clusts:
    ----> 4     sc.pl.rank_genes_groups_violin(adata, use_raw=True, key='rank_genes_r0.5_entero_sub', groups=[clust], gene_names=adata.uns['rank_genes_r0.5_entero_sub']['names'][clust][90:100])
          5 
    
    ~/anaconda3/envs/sc-tutorial/lib/python3.8/site-packages/scanpy/plotting/_tools/__init__.py in rank_genes_groups_violin(adata, groups, n_genes, gene_names, gene_symbols, use_raw, key, split, scale, strip, jitter, size, ax, show, save)
        727             if issparse(X_col): X_col = X_col.toarray().flatten()
        728             new_gene_names.append(g)
    --> 729             df[g] = X_col
        730         df['hue'] = adata.obs[groups_key].astype(str).values
        731         if reference == 'rest':
    
    ~/anaconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/frame.py in __setitem__(self, key, value)
       3485         else:
       3486             # set column
    -> 3487             self._set_item(key, value)
       3488 
       3489     def _setitem_slice(self, key, value):
    
    ~/anaconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/frame.py in _set_item(self, key, value)
       3561         """
       3562 
    -> 3563         self._ensure_valid_index(value)
       3564         value = self._sanitize_column(key, value)
       3565         NDFrame._set_item(self, key, value)
    
    ~/anaconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/frame.py in _ensure_valid_index(self, value)
       3538         if not len(self.index) and is_list_like(value):
       3539             try:
    -> 3540                 value = Series(value)
       3541             except (ValueError, NotImplementedError, TypeError):
       3542                 raise ValueError(
    
    ~/anaconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
        312                     data = data.copy()
        313             else:
    --> 314                 data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
        315 
        316                 data = SingleBlockManager(data, index, fastpath=True)
    
    ~/anaconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/internals/construction.py in sanitize_array(data, index, dtype, copy, raise_cast_failure)
        727     elif subarr.ndim > 1:
        728         if isinstance(data, np.ndarray):
    --> 729             raise Exception("Data must be 1-dimensional")
        730         else:
        731             subarr = com.asarray_tuplesafe(data, dtype=dtype)
    
    Exception: Data must be 1-dimensional
    

    I thought it was an issue with the cache, but I cleaned it, restarted the kernel, and cleared all outputs, and the problem remains. How can I fix this?

    opened by federicaress 15
  • ComBat error

    ComBat error "TypeError: data type not understood"

    Issue report for the issue posted in #1: ComBat gives the following error: TypeError: data type not understood.

    @jphe Could you clarify whether you are still using a sparse data matrix? The current ComBat implementation does not work with the sparse matrix format.

    The ComBat function from www.github.com/mbuttner/maren_codes/ was designed to take pandas Dataframes as input, so the pandas dataframe is not the problem. The code does have issues when your data has 0 variance in the expression values of a gene. So you should filter out genes with constant gene expression values (usually genes with 0 expression).

    It would also be good to know the output of type(data.T).
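
    For context, one way to drop such zero-variance genes before running ComBat is scanpy's built-in gene filter; this is a minimal sketch assuming the counts are still held in an AnnData object called adata, and it is not taken from the notebook itself:

    import scanpy as sc

    # Remove genes that are not detected in any cell; these have zero variance
    # and can break the ComBat implementation discussed above.
    sc.pp.filter_genes(adata, min_cells=1)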

    opened by LuckyMD 14
  • Problem with adata concatenation


    Hello, I am working with the Jupyter notebook on macOS and followed the Environment set up instructions. I am aware that the conda build may be problematic on macOS, but as far as I can tell that was not an issue for me. To confirm, interactive cell 2 returns

    scanpy==1.5.1 anndata==0.7.3 umap==0.4.3 numpy==1.18.4 scipy==1.4.1 pandas==1.0.3 scikit-learn==0.23.1 statsmodels==0.11.1 python-igraph==0.8.2 louvain==0.6.1

    When running the notebook (inside the conda environment) I encounter the following error message apparently triggered by the adata method concatenate:

     ... reading from cache file cache/..-data-Haber-et-al_mouse-intestinal-epithelium-GSE92332_RAW-GSM2836574_Regional_Duo_M2_matrix.h5ad
    
    ---------------------------------------------------------------------------
    InvalidIndexError                         Traceback (most recent call last)
    <ipython-input-6-01aaadebece1> in <module>
         30 
         31     # Concatenate to main adata object
    ---> 32     adata = adata.concatenate(adata_tmp, batch_key='sample_id')
         33     adata.var['gene_id'] = adata.var['gene_id-1']
         34     adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)
    
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/anndata.py in concatenate(self, join, batch_key, batch_categories, uns_merge, index_unique, fill_value, *adatas)
       1696         all_adatas = (self,) + tuple(adatas)
       1697 
    -> 1698         out = concat(
       1699             all_adatas,
       1700             join=join,
    
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/merge.py in concat(adatas, join, batch_key, batch_categories, uns_merge, index_unique, fill_value)
        454 
        455     var_names = resolve_index([a.var_names for a in adatas], join=join)
    --> 456     reindexers = [
        457         gen_reindexer(var_names, a.var_names, fill_value=fill_value) for a in adatas
        458     ]
    
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/merge.py in <listcomp>(.0)
        455     var_names = resolve_index([a.var_names for a in adatas], join=join)
        456     reindexers = [
    --> 457         gen_reindexer(var_names, a.var_names, fill_value=fill_value) for a in adatas
        458     ]
        459 
    
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/merge.py in gen_reindexer(new_var, cur_var, fill_value)
        255     new_size = len(new_var)
        256     old_size = len(cur_var)
    --> 257     new_pts = new_var.get_indexer(cur_var)
        258     cur_pts = np.arange(len(new_pts))
        259 
    
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_indexer(self, target, method, limit, tolerance)
       2731 
       2732         if not self.is_unique:
    -> 2733             raise InvalidIndexError(
       2734                 "Reindexing only valid with uniquely valued Index objects"
       2735             )
    
    InvalidIndexError: Reindexing only valid with uniquely valued Index objects
    
    

    By title, this seems potentially related to #25 that in turn took me to #21 and #28, where it is stated that commenting out adata = adata.concatenate(adata_tmp, batch_key='sample_id') and adata.obs.drop(columns=['sample_id'], inplace=True) may be required. However, this in turn generated the error message

    ... reading from cache file cache/..-data-Haber-et-al_mouse-intestinal-epithelium-GSE92332_RAW-GSM2836574_Regional_Duo_M2_matrix.h5ad
    
    ---------------------------------------------------------------------------
    KeyError                                  Traceback (most recent call last)
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
       2645             try:
    -> 2646                 return self._engine.get_loc(key)
       2647             except KeyError:
    
    pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
    
    pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
    
    pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
    
    pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
    
    KeyError: 'gene_id-1'
    
    During handling of the above exception, another exception occurred:
    
    KeyError                                  Traceback (most recent call last)
    <ipython-input-7-6ee374501f9d> in <module>
         31     # Concatenate to main adata object
         32     #adata = adata.concatenate(adata_tmp, batch_key='sample_id')
    ---> 33     adata.var['gene_id'] = adata.var['gene_id-1']
         34     adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)
         35     #adata.obs.drop(columns=['sample_id'], inplace=True)
    
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/frame.py in __getitem__(self, key)
       2798             if self.columns.nlevels > 1:
       2799                 return self._getitem_multilevel(key)
    -> 2800             indexer = self.columns.get_loc(key)
       2801             if is_integer(indexer):
       2802                 indexer = [indexer]
    
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
       2646                 return self._engine.get_loc(key)
       2647             except KeyError:
    -> 2648                 return self._engine.get_loc(self._maybe_cast_indexer(key))
       2649         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
       2650         if indexer.ndim > 1 or indexer.size > 1:
    
    pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
    
    pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
    
    pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
    
    pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
    
    KeyError: 'gene_id-1'
    
    

    that brought me back to my comment on #21. I have been asked to open this as a separate issue.

    I think even if there is an already established solution, it may be confusing chasing through the issues to find it - sorry!

    opened by davidtourigny 12
  • Best practices for using regress_out?


    I am adapting the current best practices workflow (epithelial cells) from @LuckyMD with my own data set, and am running into an issue/question. I am subsetting my data to include a few clusters of interest. Once I have those clusters isolated, I am selecting highly variable genes, regressing out effects of cell cycle, ribo genes and mito genes, scaling the data, and embedding a new UMAP, all in preparation for some downstream trajectory analysis (I opted not to regress out anything in my main data set, but want to regress out some confounding factors in my subset specifically for trajectory analysis). Is it better to first run pp.highly_variable_genes and then use pp.regress_out, or is better to run pp.regress_out followed by pp.highly_variable_genes? My code currently looks like the following:

    # Subset to highly variable genes
    sc.pp.highly_variable_genes(adata_sub, flavor='cell_ranger', n_top_genes=4000, subset=True)

    # Regress out effects of cell cycle, mito genes, and ribo genes
    sc.pp.regress_out(adata_sub, ['S_score', 'G2M_score', 'percent_mt', 'percent_ribo'])

    # Scale
    sc.pp.scale(adata_sub, max_value=10)

    # Calculate the visualization
    sc.pp.pca(adata_sub, n_comps=50, use_highly_variable=True, svd_solver='arpack')
    sc.pp.neighbors(adata_sub)
    sc.tl.umap(adata_sub)
    

    If I run pp.regress_out before pp.highly_variable_genes, I have to include the line pp.filter_genes(adata_sub, min_counts=1) or else I get ValueError: The first guess on the deviance function returned a nan. This could be a boundary problem and should be reported. However, after doing some trial and error runs, I believe that including pp.filter_genes(adata_sub, min_counts=1) is excluding some genes of interest from my downstream trajectory analysis. I am able to recover these genes by reverting to running pp.highly_variable_genes before pp.regress_out and excluding pp.filter_genes.

    Intuitively I feel like it makes more sense to run pp.regress_out before pp.highly_variable_genes, but considering I am having issues using that order for downstream analysis, is it OK to run pp.regress_out after pp.highly_variable_genes? What is the best practice?

    opened by oligomyeggo 11
  • Unable to deploy the .yml - Docker enhancement request


    Creating the environment with the provided yml file generates an error:

    WARNING: The conda.compat module is deprecated and will be removed in a future release. WARNING: The conda.compat module is deprecated and will be removed in a future release.

    >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<

    Traceback (most recent call last):
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda/exceptions.py", line 1003, in __call__
        return func(*args, **kwargs)
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda_env/cli/main.py", line 73, in do_call
        exit_code = getattr(module, func_name)(args, parser)
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda_env/cli/main_create.py", line 77, in execute
        directory=os.getcwd())
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda_env/specs/__init__.py", line 40, in detect
        if spec.can_handle():
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda_env/specs/yaml_file.py", line 18, in can_handle
        self._environment = env.from_file(self.filename)
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda_env/env.py", line 143, in from_file
        return from_yaml(yamlstr, filename=filename)
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda_env/env.py", line 128, in from_yaml
        data = yaml_load_standard(yamlstr)
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda/common/serialize.py", line 76, in yaml_load_standard
        return yaml.load(string, Loader=yaml.Loader, version="1.2")
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/main.py", line 640, in load
        return loader._constructor.get_single_data()  # type: ignore
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/constructor.py", line 102, in get_single_data
        node = self.composer.get_single_node()
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/composer.py", line 75, in get_single_node
        document = self.compose_document()
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/composer.py", line 99, in compose_document
        self.parser.get_event()
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/parser.py", line 166, in get_event
        self.current_event = self.state()
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/parser.py", line 244, in parse_document_end
        token = self.scanner.peek_token()
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/scanner.py", line 173, in peek_token
        self.fetch_more_tokens()
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/scanner.py", line 273, in fetch_more_tokens
        return self.fetch_value()
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/scanner.py", line 626, in fetch_value
        self.reader.get_mark())
    ruamel_yaml.scanner.ScannerError: mapping values are not allowed here
      in "<unicode string>", line 32, column 187:
         ...  in single-cell RNA-seq analysis: a tutorial&quot;  - theislab/s ... 
                                             ^ (line: 32)
    

    $ /home/xma/anaconda3/bin/conda-env create -f /home/xma/Downloads/sc_tutorial_environment.yml

    environment variables:
        CIO_TEST=
        CONDA_AUTO_UPDATE_CONDA=false
        CONDA_DEFAULT_ENV=base
        CONDA_EXE=/home/xma/anaconda3/bin/conda
        CONDA_PREFIX=/home/xma/anaconda3
        CONDA_PROMPT_MODIFIER=(base)
        CONDA_ROOT=/home/xma/anaconda3
        CONDA_SHLVL=1
        PATH=/home/xma/anaconda3/bin:/home/xma/anaconda3/condabin:/home/xma/anaconda3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
        REQUESTS_CA_BUNDLE=
        SSL_CERT_FILE=
        WINDOWPATH=2

     active environment : base
    active env location : /home/xma/anaconda3
            shell level : 1
       user config file : /home/xma/.condarc
    

    populated config files : /home/xma/.condarc
             conda version : 4.6.11
       conda-build version : 3.17.8
            python version : 3.7.3.final.0
          base environment : /home/xma/anaconda3  (read only)
              channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                             https://repo.anaconda.com/pkgs/main/noarch
                             https://repo.anaconda.com/pkgs/free/linux-64
                             https://repo.anaconda.com/pkgs/free/noarch
                             https://repo.anaconda.com/pkgs/r/linux-64
                             https://repo.anaconda.com/pkgs/r/noarch
              package cache : /home/xma/anaconda3/pkgs
                             /home/xma/.conda/pkgs
           envs directories : /home/xma/.conda/envs
                             /home/xma/anaconda3/envs
                   platform : linux-64
                 user-agent : conda/4.6.11 requests/2.21.0 CPython/3.7.3 Linux/4.18.0-20-generic ubuntu/18.04.2 glibc/2.27
                    UID:GID : 1000:1000
                 netrc file : None
               offline mode : False

    An unexpected error has occurred. Conda has prepared the above report.

    If submitted, this report will be used by core maintainers to improve future releases of conda. Would you like conda to send this report to the core maintainers?

    enhancement 
    opened by marshelma 11
  • ValueError: cannot reindex from a duplicate axis


    Hi,

    I followed the instruction step by step, and have successfully set up the environment and load the data. But when I ran Case-study_Mouse-intestinal-epithelium_1904.ipynb in linux, I got errors:

    ValueError: cannot reindex from a duplicate axis

    The full error message is like this:
    ==============================================================================
    Traceback (most recent call last):
      File "/home/lwang/miniconda3/bin/jupyter-nbconvert", line 11, in <module>
        sys.exit(main())
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/jupyter_core/application.py", line 254, in launch_instance
        return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/traitlets/config/application.py", line 845, in launch_instance
        app.start()
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/nbconvertapp.py", line 350, in start
        self.convert_notebooks()
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/nbconvertapp.py", line 524, in convert_notebooks
        self.convert_single_notebook(notebook_filename)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/nbconvertapp.py", line 489, in convert_single_notebook
        output, resources = self.export_single_notebook(notebook_filename, resources, input_buffer=input_buffer)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/nbconvertapp.py", line 418, in export_single_notebook
        output, resources = self.exporter.from_filename(notebook_filename, resources=resources)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/exporters/exporter.py", line 181, in from_filename
        return self.from_file(f, resources=resources, **kw)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/exporters/exporter.py", line 199, in from_file
        return self.from_notebook_node(nbformat.read(file_stream, as_version=4), resources=resources, **kw)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/exporters/notebook.py", line 32, in from_notebook_node
        nb_copy, resources = super().from_notebook_node(nb, resources, **kw)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/exporters/exporter.py", line 143, in from_notebook_node
        nb_copy, resources = self._preprocess(nb_copy, resources)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/exporters/exporter.py", line 318, in _preprocess
        nbc, resc = preprocessor(nbc, resc)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/preprocessors/base.py", line 47, in __call__
        return self.preprocess(nb, resources)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/preprocessors/execute.py", line 79, in preprocess
        self.execute()
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbclient/util.py", line 74, in wrapped
        return just_run(coro(*args, **kwargs))
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbclient/util.py", line 53, in just_run
        return loop.run_until_complete(coro)
      File "/home/lwang/miniconda3/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
        return future.result()
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbclient/client.py", line 540, in async_execute
        await self.async_execute_cell(
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/preprocessors/execute.py", line 123, in async_execute_cell
        cell, resources = self.preprocess_cell(cell, self.resources, cell_index)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/preprocessors/execute.py", line 146, in preprocess_cell
        cell = run_sync(NotebookClient.async_execute_cell)(self, cell, index, store_history=self.store_history)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbclient/util.py", line 74, in wrapped
        return just_run(coro(*args, **kwargs))
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbclient/util.py", line 53, in just_run
        return loop.run_until_complete(coro)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nest_asyncio.py", line 98, in run_until_complete
        return f.result()
      File "/home/lwang/miniconda3/lib/python3.8/asyncio/futures.py", line 178, in result
        raise self._exception
      File "/home/lwang/miniconda3/lib/python3.8/asyncio/tasks.py", line 280, in __step
        result = coro.send(None)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbclient/client.py", line 832, in async_execute_cell
        self._check_raise_for_error(cell, exec_reply)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbclient/client.py", line 740, in _check_raise_for_error
        raise CellExecutionError.from_cell_and_msg(cell, exec_reply['content'])
    nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
    ------------------
    # Loop to load rest of data sets
    for i in range(len(sample_strings)):
        #Parse Filenames
        sample = sample_strings[i]
        sample_id = sample_id_strings[i]
        data_file = file_base+sample_id+exp_string+sample+data_file_end
        barcode_file = file_base+sample_id+exp_string+sample+barcode_file_end
        gene_file = file_base+sample_id+exp_string+sample+gene_file_end
        
        #Load data
        adata_tmp = sc.read(data_file, cache=True)
        adata_tmp = adata_tmp.transpose()
        adata_tmp.X = adata_tmp.X.toarray()
    
        barcodes_tmp = pd.read_csv(barcode_file, header=None, sep='\t')
        genes_tmp = pd.read_csv(gene_file, header=None, sep='\t')
        
        #Annotate data
        barcodes_tmp.rename(columns={0:'barcode'}, inplace=True)
        barcodes_tmp.set_index('barcode', inplace=True)
        adata_tmp.obs = barcodes_tmp
        adata_tmp.obs['sample'] = [sample]*adata_tmp.n_obs
        adata_tmp.obs['region'] = [sample.split("_")[0]]*adata_tmp.n_obs
        adata_tmp.obs['donor'] = [sample.split("_")[1]]*adata_tmp.n_obs
        
        genes_tmp.rename(columns={0:'gene_id', 1:'gene_symbol'}, inplace=True)
        genes_tmp.set_index('gene_symbol', inplace=True)
        adata_tmp.var = genes_tmp
        adata_tmp.var_names_make_unique()
    
        # Concatenate to main adata object
        adata = adata.concatenate(adata_tmp, batch_key='sample_id')
        adata.var['gene_id'] = adata.var['gene_id-1']
        adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)
        adata.obs.drop(columns=['sample_id'], inplace=True)
        adata.obs_names = [c.split("-")[0] for c in adata.obs_names]
        adata.obs_names_make_unique(join='_')
    ------------------
    
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-1-01aaadebece1> in <module>
         30 
         31     # Concatenate to main adata object
    ---> 32     adata = adata.concatenate(adata_tmp, batch_key='sample_id')
         33     adata.var['gene_id'] = adata.var['gene_id-1']
         34     adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)
    
    ~/miniconda3/lib/python3.8/site-packages/anndata/_core/anndata.py in concatenate(self, join, batch_key, batch_categories, uns_merge, index_unique, fill_value, *adatas)
       1694         all_adatas = (self,) + tuple(adatas)
       1695 
    -> 1696         out = concat(
       1697             all_adatas,
       1698             axis=0,
    
    ~/miniconda3/lib/python3.8/site-packages/anndata/_core/merge.py in concat(adatas, axis, join, merge, uns_merge, label, keys, index_unique, fill_value, pairwise)
        812 
        813     # Annotation for other axis
    --> 814     alt_annot = merge_dataframes(
        815         [getattr(a, alt_dim) for a in adatas], alt_indices, merge
        816     )
    
    ~/miniconda3/lib/python3.8/site-packages/anndata/_core/merge.py in merge_dataframes(dfs, new_index, merge_strategy)
        524 
        525 def merge_dataframes(dfs, new_index, merge_strategy=merge_unique):
    --> 526     dfs = [df.reindex(index=new_index) for df in dfs]
        527     # New dataframe with all shared data
        528     new_df = pd.DataFrame(merge_strategy(dfs), index=new_index)
    
    ~/miniconda3/lib/python3.8/site-packages/anndata/_core/merge.py in <listcomp>(.0)
        524 
        525 def merge_dataframes(dfs, new_index, merge_strategy=merge_unique):
    --> 526     dfs = [df.reindex(index=new_index) for df in dfs]
        527     # New dataframe with all shared data
        528     new_df = pd.DataFrame(merge_strategy(dfs), index=new_index)
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
        310         @wraps(func)
        311         def wrapper(*args, **kwargs) -> Callable[..., Any]:
    --> 312             return func(*args, **kwargs)
        313 
        314         kind = inspect.Parameter.POSITIONAL_OR_KEYWORD
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/core/frame.py in reindex(self, *args, **kwargs)
       4171         kwargs.pop("axis", None)
       4172         kwargs.pop("labels", None)
    -> 4173         return super().reindex(**kwargs)
       4174 
       4175     def drop(
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/core/generic.py in reindex(self, *args, **kwargs)
       4804 
       4805         # perform the reindex on the axes
    -> 4806         return self._reindex_axes(
       4807             axes, level, limit, tolerance, method, fill_value, copy
       4808         ).__finalize__(self, method="reindex")
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/core/frame.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
       4017         index = axes["index"]
       4018         if index is not None:
    -> 4019             frame = frame._reindex_index(
       4020                 index, method, copy, level, fill_value, limit, tolerance
       4021             )
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/core/frame.py in _reindex_index(self, new_index, method, copy, level, fill_value, limit, tolerance)
       4036             new_index, method=method, level=level, limit=limit, tolerance=tolerance
       4037         )
    -> 4038         return self._reindex_with_indexers(
       4039             {0: [new_index, indexer]},
       4040             copy=copy,
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/core/generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
       4870 
       4871             # TODO: speed up on homogeneous DataFrame objects
    -> 4872             new_data = new_data.reindex_indexer(
       4873                 index,
       4874                 indexer,
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/core/internals/managers.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy, consolidate, only_slice)
       1299         # some axes don't allow reindexing with dups
       1300         if not allow_dups:
    -> 1301             self.axes[axis]._can_reindex(indexer)
       1302 
       1303         if axis >= self.ndim:
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py in _can_reindex(self, indexer)
       3474         # trying to reindex on an axis with duplicates
       3475         if not self._index_as_unique and len(indexer):
    -> 3476             raise ValueError("cannot reindex from a duplicate axis")
       3477 
       3478     def reindex(self, target, method=None, level=None, limit=None, tolerance=None):
    
    ValueError: cannot reindex from a duplicate axis
    ValueError: cannot reindex from a duplicate axis
    

    ===============================================================================

    I didn't change anything (the file names, the file directory, and the code all stayed the same), so I'm not sure why it didn't work. Am I supposed to modify the code or anything? I'm very new to scRNA analysis, so I have a hard time figuring out where to start debugging. Could you please give me some advice on this?

    Thanks very very much! Leran

    opened by Leran10 10
  • Package versions in .yml file


    It is rather a question than an issue:

    in the .yml file, some of the packages are listed with versions older than the most recent ones:

    • python>=3.5, <3.7
    • cmake>=3.9, <3.11

    the most recent and installed versions are:

    • python=3.9.7
    • cmake=3.21.2

    there should not be a problem, right?

    nice day,

    Iliya

    opened by lefterov 9
  • environment installation problem


    Hello,

    I am trying to install the environment by

    conda env create -f sc_tutorial_environment.yml

    but get the following error:

    Pip subprocess error:
      Running command git clone -q https://github.com/flying-sheep/anndata2ri /tmp/pip-req-build-_sxdlckc
      ERROR: Command errored out with exit status 1:
       command: /PHShome/ys738/tutorial/single-cell-tutorial/test/env/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-o92j89t3/rpy2_1bf8d8eb188d4e75931a32c7ea456056/setup.py'"'"'; __file__='"'"'/tmp/pip-install-o92j89t3/rpy2_1bf8d8eb188d4e75931a32c7ea456056/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-46eoz0zx
           cwd: /tmp/pip-install-o92j89t3/rpy2_1bf8d8eb188d4e75931a32c7ea456056/
      Complete output (1 lines):
      rpy2 is no longer supporting Python < 3.7.Consider using an older rpy2 release when using an older Python release.
      ----------------------------------------
    WARNING: Discarding https://files.pythonhosted.org/packages/a9/11/5f175fc3d2313b53cb3c921db9e8bba58b67d739f5a637146b45f2e0e80c/rpy2-3.5.5.tar.gz#sha256=a252c40e21cf4f23ac6e13bffdcb82b5900b49c3043ed8fd31da5c61fb58d037 (from https://pypi.org/simple/rpy2/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
      ERROR: Command errored out with exit status 1:
       command: /PHShome/ys738/tutorial/single-cell-tutorial/test/env/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-o92j89t3/rpy2_0de3b36ef510453fbd549602b7f057d5/setup.py'"'"'; __file__='"'"'/tmp/pip-install-o92j89t3/rpy2_0de3b36ef510453fbd549602b7f057d5/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-za7i_5lx
           cwd: /tmp/pip-install-o92j89t3/rpy2_0de3b36ef510453fbd549602b7f057d5/
      Complete output (1 lines):
      rpy2 is no longer supporting Python < 3.7.Consider using an older rpy2 release when using an older Python release.
      ----------------------------------------
    WARNING: Discarding https://files.pythonhosted.org/packages/40/09/a754484c80f8c58f077b1b9b2249787c689c8dd1559e3eb91cd5b8690dc2/rpy2-3.5.4.tar.gz#sha256=ba0a877b2b96e27d2091383d4652b82aa2271cff4a505243d45da430b712aaf5 (from https://pypi.org/simple/rpy2/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
      ERROR: Command errored out with exit status 1:
       command: /PHShome/ys738/tutorial/single-cell-tutorial/test/env/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-o92j89t3/rpy2_4baddf3e0e7b4f5a8be15c4fcb6ce495/setup.py'"'"'; __file__='"'"'/tmp/pip-install-o92j89t3/rpy2_4baddf3e0e7b4f5a8be15c4fcb6ce495/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-ruvwrq4l
           cwd: /tmp/pip-install-o92j89t3/rpy2_4baddf3e0e7b4f5a8be15c4fcb6ce495/
      Complete output (1 lines):
      rpy2 is no longer supporting Python < 3.7.Consider using an older rpy2 release when using an older Python release.
      ----------------------------------------
    WARNING: Discarding https://files.pythonhosted.org/packages/9b/5d/44d001cb386d009e228afbc9327ee07dc9ade108a908dc84a6801c093255/rpy2-3.5.3.tar.gz#sha256=53a092d48b44f46428fb30cb3155664d6d2f7af08ebc4c45df98df4c45a42ccb (from https://pypi.org/simple/rpy2/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
      rpy2 is no longer supporting Python < 3.7.Consider using an older rpy2 release when using an older Python release.
      [pip tries and discards rpy2 3.5.2, 3.5.1 and 3.5.0 with the same message]
    WARNING: Discarding https://files.pythonhosted.org/packages/75/13/76f3fa526a5e39ee26752c6e78fca509821c57699999e68086094a6ff9cb/scanpy-1.3.6.tar.gz#sha256=ebc7cd0a9726a4a9088a8d0eafb8eb59802f8acb85bc28a2bdf8dbf0144f87c8 (from https://pypi.org/simple/scanpy/) (requires-python:>=3.5). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
      error in scanpy setup command: "values of 'package_data' dict" must be a list of strings (got '*.txt')
      [pip tries and discards scanpy 1.3.5 down to 1.0.3.post1 with the same message, except scanpy 1.3.1, which fails with:]
      error in scanpy setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers; Parse error at "'[doc]'": Expected W:(abcd...)
    ERROR: Package 'anndata2ri' requires a different Python: 3.6.15 not in '>=3.7'

    failed

    CondaEnvException: Pip failed
    

    The error seems to be related to the Python version. Could you tell me how to solve this problem? (A quick way to check which interpreter the environment resolved is sketched after this issue.)

    Thank you

    opened by Sunyp-IM 1
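
    For context, the failure above comes down to a Python-version conflict: pip rejects every rpy2 and scanpy source release it tries and finally reports that anndata2ri needs Python >= 3.7, while the environment resolved Python 3.6.15. A minimal, illustrative check (not part of the tutorial) to confirm which interpreter an activated environment is actually using could look like this:

    # Illustrative sanity check: run inside the activated conda environment.
    # anndata2ri requires Python >= 3.7; the log above shows this environment
    # ended up with Python 3.6.15 instead.
    import sys

    print(sys.version)
    assert sys.version_info >= (3, 7), (
        "Python %d.%d was resolved; recreate the environment with Python >= 3.7 "
        "so that pip can install anndata2ri and a recent rpy2." % sys.version_info[:2]
    )
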
  • Environment install failed

    Dear authors,

    I tried to run the tutorial but got stuck at the environment installation step. I ran the following command as instructed:

    conda env create -f sc_tutorial_environment.yml

    While this process was running, my computer's memory was gradually used up until the process was killed. My machine has 16 GB of RAM, which I thought would be enough for most general computational tasks. What kind of computer configuration is needed to run this tutorial, and is there something wrong with my setup?

    Best regards

    opened by Sunyp-IM 1
  • Sparse matrix dimensions

    Hello,

    Thank you for providing this guide; it is very helpful. While going through parts of the code, I noticed that the following line in Case-study_Mouse-intestinal-epithelium_1906.ipynb raises an error:

    adata.obs['mt_frac'] = adata.X[:, mt_gene_mask].sum(1)/adata.obs['n_counts']

    When I investigated a little, I realized that adata.X is a 2-D sparse matrix, so the result of the sum can't be divided by a 1-D series. Applying flatten or reshape directly doesn't work either, because the sum returns a numpy matrix, which can't be flattened (or at least I couldn't). This might be a version issue, but my solution was to replace the line with the version below; a toy reproduction of the behaviour is sketched after this issue.

    adata.obs['mt_frac'] = np.array(adata.X[:, mt_gene_mask].sum(1)).flatten()/adata.obs['n_counts']

    Best

    opened by acihanckr 1
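
    For anyone hitting the same error, the behaviour described above can be reproduced with a toy sparse matrix; the sketch below is illustrative only and uses made-up data in place of the tutorial's AnnData object.

    import numpy as np
    from scipy import sparse

    X = sparse.random(5, 4, density=0.5, format='csr')    # toy cells x genes matrix
    mt_gene_mask = np.array([True, False, True, False])   # stand-in mitochondrial-gene mask
    n_counts = np.asarray(X.sum(axis=1)).ravel()           # per-cell totals as a 1-D array

    # Summing a sparse matrix returns a numpy.matrix of shape (5, 1), not a 1-D
    # array, which is the shape mismatch described above.
    mt_sums = X[:, mt_gene_mask].sum(axis=1)

    # Converting to a flat array first, as in the fix above, divides cleanly:
    mt_frac = np.asarray(mt_sums).ravel() / n_counts
    print(mt_frac.shape)                                    # (5,)
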
  • R commands throw UnicodeDecodeError

    Hello dear authors,

    I have a very persistent problem that I can't find a solution for. Almost every command in a cell that uses the %%R magic raises a similar UnicodeDecodeError (screenshots: "cell probalmes utf8", "utf-8 decoding"). A small diagnostic snippet is sketched after this issue.

    The one cell that did work is shown in the screenshot "worked and workaround"; for the imports I had to use a workaround, as you can see, since the usual import threw an error.

    How can I resolve this or what may be the cause? My R version is 4.2.0 and I am on Windows 10 x64.

    Thank you a lot,

    Mariam

    opened by Mari123i 3
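
    As a purely diagnostic aid (not a fix, and only an illustrative snippet), it can help to check which encodings Python itself expects on the affected Windows machine before digging further into the rpy2/%%R UnicodeDecodeError:

    # Illustrative diagnostic for the UnicodeDecodeError above: on Windows the
    # locale's preferred encoding is often cp1252 rather than utf-8, which is a
    # common source of decode errors when mixing R output with Python.
    import locale
    import sys

    print(sys.getdefaultencoding())        # usually 'utf-8'
    print(locale.getpreferredencoding())   # often 'cp1252' on Windows
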
  • Normalizing subsetted data

    Hi @LuckyMD, I've been reprocessing some old data using your single-cell tutorial workflow and have a best-practices question (I am not sure if this is the correct place for this, or if it should be moved to the new scverse discourse group?). I have an adata object that is scran-normalized, and I want to take a subset of clusters from it and create a new adata_sub object with its own dimensionality reduction to investigate a subpopulation of interest.

    My understanding is that, had I opted for a basic log normalization, I would not need to re-normalize adata_sub, as log normalization is done on a per-cell basis. However, because scran normalization uses a coarse clustering of the cells present in the object, I would want to re-normalize adata_sub if adata had been normalized via scran, correct? What would be the best way to subset and re-normalize adata_sub? (One possible adjustment is sketched after this issue.)

    I am not sure which steps of the original scran normalization need to be repeated and which can be omitted. For instance, I would want to perform a new clustering of the subsetted data for scran normalization and get new size factors, but I wouldn't need to set adata_sub.layers['counts'] = adata_sub.X.copy(), as adata_sub already contains a subsetted counts layer from adata (correct?). Would I need to restore adata_sub.raw = adata_sub in this scenario?:

    # assumes the tutorial's imports (scanpy as sc, scipy as sp) and that the %%R
    # block further down runs as its own notebook cell with rpy2.ipython loaded
    # subset adata to clusters of interest
    adata_sub = adata[adata.obs['leiden_r1.0'].isin(['1', '3', '5'])].copy()
    
    # perform clustering for scran normalization
    adata_sub_pp = adata_sub.copy()
    #sc.pp.normalize_per_cell(adata_sub_pp, counts_per_cell_after = 1e6) - can we omit this since we did it for adata?
    #sc.pp.log1p(adata_sub_pp) - can we omit this since we did it for adata?
    sc.pp.pca(adata_sub_pp, n_comps = 15)
    sc.pp.neighbors(adata_sub_pp)
    sc.tl.leiden(adata_sub_pp, key_added = 'groups', resolution = 0.5)
    
    # preprocess variables for scran normalization
    input_groups = adata_sub_pp.obs['groups']
    data_mat = adata_sub.X.T
    
    %%R -i data_mat -i input_groups -o size_factors
    
    size_factors = sizeFactors(computeSumFactors(SingleCellExperiment(list(counts = data_mat)), 
                                                 clusters = input_groups, 
                                                 min.mean = 0.1))
    
    del adata_sub_pp
    
    adata_sub.obs['size_factors'] = size_factors # overwrites existing ['size_factors'] from adata
    
    # adata_sub.layers['counts'] = adata_sub.X.copy() - this can be omitted?
    
    # Normalize adata_sub
    adata_sub.X /= adata_sub.obs['size_factors'].values[:,None]
    sc.pp.log1p(adata_sub) # should this be omitted?
    adata_sub.X = sp.sparse.csr_matrix(adata_sub.X)
    adata_sub.raw = adata_sub
    

    Thank you for any help and advice!

    opened by oligomyeggo 1
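
    One detail worth flagging in the snippet above: data_mat = adata_sub.X.T hands the already scran-normalized adata_sub.X to computeSumFactors, which is designed to run on raw counts. A minimal sketch of one possible adjustment (toy data, illustrative only, not an official recommendation) would be to reset the subset to its stored counts layer before re-estimating size factors:

    import numpy as np
    import scipy.sparse as sp_sparse
    import anndata as ad

    # Toy stand-in for the adata object in the question: raw counts are kept in
    # .layers['counts'], while .X holds already-normalized values.
    counts = sp_sparse.random(100, 50, density=0.1, format='csr') * 10
    adata = ad.AnnData(X=counts.toarray() / 2.0, layers={'counts': counts.copy()})
    adata.obs['leiden_r1.0'] = np.random.choice(['1', '2', '3', '4', '5'], size=adata.n_obs)

    # Subset to the clusters of interest, then reset X to the raw counts before
    # re-clustering and re-estimating size factors, since computeSumFactors is
    # meant to see counts rather than normalized values.
    adata_sub = adata[adata.obs['leiden_r1.0'].isin(['1', '3', '5'])].copy()
    adata_sub.X = adata_sub.layers['counts'].copy()
    data_mat = adata_sub.X.T   # this is what would then be passed to the %%R cell

    The remaining steps from the question (the coarse clustering on adata_sub_pp, the %%R size-factor call, the division by size factors and the final log1p/raw handling) would then proceed as written above.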