Single-cell current best practices tutorial case study for the paper: Luecken and Theis, "Current best practices in single-cell RNA-seq analysis: a tutorial"

Overview

Scripts for "Current best practices in single-cell RNA-seq analysis: a tutorial"


This repository is complementary to the publication:

M.D. Luecken, F.J. Theis, "Current best practices in single-cell RNA-seq analysis: a tutorial", Molecular Systems Biology 15(6) (2019): e8746

The paper was recommended on F1000Prime as being of special significance in the field.

Access the recommendation on F1000Prime

The repository contains:

  • scripts to generate the paper figures
  • a case study which complements the manuscript
  • the code for the marker gene detection study from the supplementary material

The main part of this repository is a case study in which the best practices established in the manuscript are applied to a mouse intestinal epithelium regions dataset from Haber et al., Nature 551 (2017), available from GEO under accession GSE92332. This case study can be found in different versions in the latest_notebook/ and old_releases/ directories.

The scripts in the plotting_scripts/ folder reproduce the figures that are shown in the manuscript and the supplementary materials. These scripts contain comments to explain each step. Each figure that does not have a corresponding script in the plotting_scripts/ folder was taken from the case study or the marker gene study.

In case of questions or issues, please get in touch by posting an issue in this repository.

If the materials in this repo are of use to you, please consider citing the above publication.

Environment set up

A Docker container with a working sc-tutorial environment is now available here, thanks to Leander Dony. If you would like to set up the environment via conda or manually outside of the Docker container, please follow the instructions below.

To run the tutorial case study, several packages must be installed. As both R and Python packages are required, we recommend using a conda environment. To facilitate the setup of such an environment, we have provided the sc_tutorial_environment.yml file, which contains all conda- and pip-installable dependencies. R dependencies that are not available as conda packages must be installed into the environment itself.

To set up the conda environment, follow the steps below.

  1. Set up the conda environment from the sc_tutorial_environment.yml file.

    conda env create -f sc_tutorial_environment.yml
    
  2. Ensure that the environment can find the GSL libraries that the R packages need. This is done by setting the CFLAGS and LDFLAGS environment variables (see https://bit.ly/2CjJsgn). Here we set them so that they are set correctly every time the environment is activated.

    cd YOUR_CONDA_ENV_DIRECTORY
    mkdir -p ./etc/conda/activate.d
    mkdir -p ./etc/conda/deactivate.d
    touch ./etc/conda/activate.d/env_vars.sh
    touch ./etc/conda/deactivate.d/env_vars.sh
    

    Where YOUR_CONDA_ENV_DIRECTORY can be found by running conda info --envs and using the directory that corresponds to your conda environment name (default: sc-tutorial).

    WHILE NOT IN THE ENVIRONMENT(!!!!) open the env_vars.sh file at ./etc/conda/activate.d/env_vars.sh and enter the following into the file:

    #!/bin/sh
    
    CFLAGS_OLD=$CFLAGS
    export CFLAGS_OLD
    export CFLAGS="`gsl-config --cflags` ${CFLAGS_OLD}"
     
    LDFLAGS_OLD=$LDFLAGS
    export LDFLAGS_OLD
    export LDFLAGS="`gsl-config --libs` ${LDFLAGS_OLD}"
    

    Also change the ./etc/conda/deactivate.d/env_vars.sh file to:

    #!/bin/sh
     
    CFLAGS=$CFLAGS_OLD
    export CFLAGS
    unset CFLAGS_OLD
     
    LDFLAGS=$LDFLAGS_OLD
    export LDFLAGS
    unset LDFLAGS_OLD
    

    Note again that these files should be written WHILE NOT IN THE ENVIRONMENT. Otherwise you may overwrite the CFLAGS and LDFLAGS environment variables in the base environment!

  3. Enter the environment with conda activate sc-tutorial, or conda activate ENV_NAME if you changed the environment name in the sc_tutorial_environment.yml file.

  4. Open R and install the dependencies via the commands:

    install.packages(c('devtools', 'gam', 'RColorBrewer', 'BiocManager'))
    update.packages(ask=F)
    BiocManager::install(c("scran","MAST","monocle","ComplexHeatmap","slingshot"), version = "3.8")
    

These steps should set up an environment to perform single-cell analysis with the tutorial workflow on a Linux system. Please note that we have encountered issues with conda environments on macOS. On macOS we recommend installing the packages without conda, using separately installed Python and R versions. Alternatively, you can try using the base conda environment and installing all packages as described in the conda_env_instructions_for_mac.txt file. In the base environment, R should be able to find the relevant GSL libraries, so LDFLAGS and CFLAGS should not need to be set.

Also note that conda and pip don't always play nicely together. The conda developers suggest installing all conda packages first and then adding pip packages on top, where conda packages are not available. Consequently, installing further conda packages into the environment at a later point may cause issues. Instead, start a new environment and reinstall all conda packages first.

If you prefer to set up an environment manually, a list of all package requirements is given at the end of this document.

Downloading the data

As mentioned above, the data for the case study come from GSE92332. To run the case study as shown, you must download this data and place it in the correct folder. Unpacking the data requires tar and gunzip, which should already be available on most systems. If you have cloned the GitHub repository and the case study script is in the latest_notebook/ folder, this can be done from the location of the case study .ipynb file via the following commands:

cd ../  #To get to the main github repo folder
mkdir -p data/Haber-et-al_mouse-intestinal-epithelium/
cd data/Haber-et-al_mouse-intestinal-epithelium/
wget ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE92nnn/GSE92332/suppl/GSE92332_RAW.tar
mkdir GSE92332_RAW
tar -C GSE92332_RAW -xvf GSE92332_RAW.tar
gunzip GSE92332_RAW/*_Regional_*

The annotated dataset, with which we briefly compare the results at the end of the notebook, is available from the same GEO accession (GSE92332). It can be obtained using the following commands:

cd data/Haber-et-al_mouse-intestinal-epithelium/
wget ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE92nnn/GSE92332/suppl/GSE92332_Regional_UMIcounts.txt.gz
gunzip GSE92332_Regional_UMIcounts.txt.gz

Case study notes

We have noticed that visualization, dimensionality reduction, and clustering (and hence all downstream results as well) can differ slightly between systems. This is due to the numerical libraries used in the backend. Thus, we cannot guarantee that a rerun of the notebook will generate exactly the same clusters.

While all results are qualitatively similar, the assignment of cells to clusters, especially for stem cells, TA cells, and enterocyte progenitors, can differ between runs across systems. To show the diversity that can be expected, we have uploaded shortened case study notebooks to the alternative_clustering_results/ folder.

Note that running sc.pp.pca() with the parameter svd_solver='arpack' drastically reduces the variability between systems; however, the output is still not exactly the same.
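
For reference, a minimal sketch of how this could be called on the case study object, assuming adata is the AnnData object from the notebook and with an illustrative choice of n_comps:

    import scanpy as sc

    # The ARPACK solver computes a deterministic PCA, which makes the
    # neighbourhood graph, UMAP, and clustering far more reproducible
    # across systems than the default randomized solver.
    sc.pp.pca(adata, n_comps=50, svd_solver='arpack')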

Adapting the pipeline for other datasets:

The pipeline was designed to be easily adaptable to new datasets. However, there are several limitations to the general applicability of the current workflow. When adapting the pipeline to your own dataset, please take the following into account:

  1. Sparse data formats are not supported by rpy2 and therefore do not work with any of the integrated R commands. Datasets can be converted to a dense format using adata.X = adata.X.toarray() (see the sketch after this list).


  2. The case study assumes that the input data are count data obtained from a single-cell protocol with UMIs. If the input data are full-length read data, consider replacing the normalization method with one that includes gene length normalization (e.g., TPM); see the sketch after this list.
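
To make both points concrete, here is a minimal sketch. It assumes adata is the AnnData object from the notebook; the gene_length column used for the TPM-style scaling is a hypothetical annotation that you would have to provide yourself and is not part of the case study data:

    import scipy.sparse as sp

    # Point 1: densify the count matrix before running any of the rpy2-based
    # cells, since the integrated R commands expect a dense array.
    if sp.issparse(adata.X):
        adata.X = adata.X.toarray()

    # Point 2 (full-length data only): a TPM-style normalization in place of
    # the UMI-based size-factor normalization, using a hypothetical
    # adata.var['gene_length'] column given in bases.
    reads_per_kb = adata.X / (adata.var['gene_length'].values / 1e3)
    adata.X = reads_per_kb / reads_per_kb.sum(axis=1, keepdims=True) * 1e6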

Manual installation of package requirements

The following packages are required to run the first version of the case study notebook. For further versions see the README.md in the latest_notebook/ and old_releases/ folders.

General:

  • Jupyter notebook
  • IRKernel
  • rpy2
  • R >= 3.4.3
  • Python >= 3.5

Python:

  • scanpy
  • numpy
  • scipy
  • pandas
  • seaborn
  • louvain>=0.6
  • python-igraph
  • gprofiler-official (from Case study notebook 1906 version)
  • python-gprofiler from Valentine Svensson's github (vals/python-gprofiler)
    • only needed for notebooks before version 1906
  • ComBat python implementation from Maren Buettner's github (mbuttner/maren_codes/combat.py)
    • only needed for scanpy versions before 1.3.8 which don't include sc.pp.combat()
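
For scanpy >= 1.3.8, the external ComBat script is no longer needed; below is a minimal sketch of the built-in call, where the batch column name 'sample' is an assumption about how the batch labels are stored in adata.obs:

    import scanpy as sc

    # Run ComBat batch correction directly in scanpy; 'sample' is assumed to be
    # the adata.obs column that holds the batch labels.
    sc.pp.combat(adata, key='sample')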

R:

  • scater
  • scran
  • MAST
  • gam
  • slingshot (change DESCRIPTION file for R version 3.4.3)
  • monocle 2
  • limma
  • ComplexHeatmap
  • RColorBrewer
  • clusterExperiment
  • ggplot2
  • IRkernel

Possible sources of error in the manual installation:

For R 3.4.3:

When using Slingshot with R 3.4.3, you must pull a local copy of slingshot from its GitHub repository and change the DESCRIPTION file to say R>=3.4.3 instead of R>=3.5.0.

For R >= 3.5 and bioconductor >= 3.7:

The clusterExperiment version that comes with Bioconductor 3.7 has a slightly changed naming convention: clusterExperiment() is now called ClusterExperiment(). The latest version of the notebook includes this change, but when using the original notebook, please note that this may throw an error.

For rpy2 < 3.0.0:

Pandas 0.24.0 is not compatible with rpy2 < 3.0.0. When using old versions of rpy2, please downgrade pandas to 0.23.X. Please also note that Pandas 0.24.0 requires anndata version 0.6.18 and scanpy version > 1.37.0.
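
A quick way to check which versions are active in your environment (a minimal sketch; the thresholds simply mirror the note above):

    import pandas
    import rpy2

    # rpy2 releases below 3.0.0 are incompatible with pandas 0.24.0,
    # so print both versions to check the combination in use.
    print('rpy2:', rpy2.__version__)
    print('pandas:', pandas.__version__)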

For enrichment analysis with g:profiler:

Ensure that the correct g:profiler package is used for the notebook. Notebooks up to version 1904 use python-gprofiler from Valentine Svensson's GitHub, and notebooks from version 1906 onwards use the gprofiler-official package from the g:profiler team.
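
For orientation, a minimal sketch of an enrichment query with the gprofiler-official package; the organism and the gene list are placeholders and are not taken from the notebook:

    from gprofiler import GProfiler

    # gprofiler-official exposes a GProfiler class; return_dataframe=True
    # returns the enrichment results as a pandas DataFrame.
    gp = GProfiler(return_dataframe=True)
    results = gp.profile(organism='mmusculus',
                         query=['Lgr5', 'Olfm4', 'Ascl2'])  # placeholder gene list
    print(results.head())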

If no R packages can be found:

Ensure that IRkernel has linked the correct version of R to your Jupyter notebook. Check the instructions at https://github.com/IRkernel/IRkernel.

Comments
  • Concatenate files error


    I'm following the notebook but I'm getting an error in concatenating the files:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-16-b1c46f8469f7> in <module>
         28 
         29     # Concatenate to main adata object
    ---> 30     adata = adata.concatenate(adata_tmp)
         31     adata.var['gene_id'] = adata.var['gene_id-1']
         32     adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)
    
    ~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/anndata/base.py in concatenate(self, join, batch_key, batch_categories, index_unique, *adatas)
       2018             # var
       2019             for c in ad.var.columns:
    -> 2020                 new_c = c + (index_unique if index_unique is not None else '-') + categories[i]
       2021                 var.loc[vars_intersect, new_c] = ad.var.loc[vars_intersect, c]
       2022 
    
    TypeError: unsupported operand type(s) for +: 'int' and 'str'
    

    I'm using these parameters:

    #Data files
    sample_strings = ['bystander', 'uninfected', 'infected']
    sample_id_strings = ['1', '2', '3']
    file_base = '/home/ec2-user/LSHIV/LSHIV8'
    exp_string = '_Regional_'
    data_file_end = '_matrix.mtx'
    barcode_file_end = '_barcodes.tsv'
    gene_file_end = '_genes.tsv'
    cc_genes_file = '~/pipeline/Macosko_cell_cycle_genes.txt'
    
    opened by davidepisu 22
  • R libraries not accessible


    When using R in Anaconda, a warning occurs that the libraries cannot be installed; these libraries can only be added locally. To overcome this, it helps to change the R_LIB path in the activate script.

    opened by mumichae 17
  • Get errors when performing sc.pp.highly_variable_genes!


    Hi, I am following this excellent workflow to analyze my single-cell sequencing data sets. I have calculated the size factors using the scran package and did not perform the batch correction step, as I have only one sample. Then I intended to extract highly variable genes using the function sc.pp.highly_variable_genes. Unfortunately, I got an error:

    'LinAlgError: Last 2 dimensions of the array must be square'

    LinAlgError                               Traceback (most recent call last)
    <ipython-input> in <module>
    ----> 1 sc.pp.highly_variable_genes(adata)

    ~/miniconda3/lib/python3.6/site-packages/scanpy/preprocessing/highly_variable_genes.py in highly_variable_genes(adata, min_disp, max_disp, min_mean, max_mean, n_top_genes, n_bins, flavor, subset, inplace)
         94     X = np.expm1(adata.X) if flavor == 'seurat' else adata.X
         95
    ---> 96     mean, var = materialize_as_ndarray(_get_mean_var(X))
         97     # now actually compute the dispersion
         98     mean[mean == 0] = 1e-12  # set entries equal to zero to small value

    ~/miniconda3/lib/python3.6/site-packages/scanpy/preprocessing/utils.py in _get_mean_var(X)
         16     mean_sq = np.multiply(X, X).mean(axis=0)
         17     # enforece R convention (unbiased estimator) for variance
    ---> 18     var = (mean_sq - mean**2) * (X.shape[0]/(X.shape[0]-1))
         19 else:
         20     from sklearn.preprocessing import StandardScaler

    ~/miniconda3/lib/python3.6/site-packages/numpy/matrixlib/defmatrix.py in __pow__(self, other)
        226
        227     def __pow__(self, other):
    --> 228         return matrix_power(self, other)
        229
        230     def __ipow__(self, other):

    ~/miniconda3/lib/python3.6/site-packages/numpy/linalg/linalg.py in matrix_power(a, n)
        600     a = asanyarray(a)
        601     _assertRankAtLeast2(a)
    --> 602     _assertNdSquareness(a)
        603
        604     try:

    ~/miniconda3/lib/python3.6/site-packages/numpy/linalg/linalg.py in _assertNdSquareness(*arrays)
        213     m, n = a.shape[-2:]
        214     if m != n:
    --> 215         raise LinAlgError('Last 2 dimensions of the array must be square')
        216
        217 def _assertFinite(*arrays):

    This is what my adata.X looks like right now:

    matrix([[0.   , 0.   , 0.   , ..., 0.   , 0.   , 0.   ],
            [0.   , 0.   , 1.203, ..., 0.   , 0.   , 0.   ],
            [0.   , 1.096, 0.   , ..., 0.   , 0.   , 0.   ],
            ...,
            [0.   , 0.   , 2.042, ..., 0.   , 0.   , 0.   ],
            [0.   , 0.   , 0.   , ..., 0.926, 0.   , 0.   ],
            [0.   , 0.   , 2.951, ..., 0.   , 0.   , 0.   ]])

    also versions of my modules: scanpy==1.3.7 anndata==0.6.17 numpy==1.15.4 scipy==1.2.0 pandas==0.24.0 scikit-learn==0.20.2 statsmodels==0.9.0 python-igraph==0.7.1 louvain==0.6.1

    Looking forward to your response! Thank you!

    opened by jipeifeng 17
  • computeSumFactors(data_mat, clusters=input_groups, min.mean=0.1) gives error


    I can run everything without problem until these lines:

    %%R -i data_mat -i input_groups -o size_factors
    size_factors = computeSumFactors(data_mat, clusters=input_groups, min.mean=0.1)
    

    where I get this error:

    Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘assay’ for signature ‘"matrix", "character"’

    Any idea what's going on here?

    I'm using the original code except for these lines that I commented out: #adata.var['gene_id'] = adata.var['gene_id-1'] #adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)

    opened by ariberar 16
  • Exception: Data must be 1-dimensional when plotting new marker genes in jupyter notebook


    I am following the notebook to understand the steps of single-cell RNA seq.

    I had the same issue #21 with the version of scanpy, so I followed what is said in the answer. It worked perfectly besides some problems that I solved, and I want to write them here so that everyone who has the same issues can resolve them:

    1. adata = adata.concatenate(adata_tmp, batch_key='sample_id') and adata.obs.drop(columns=['sample_id'], inplace=True) also generated errors, so I commented out those lines as well

    2. I had errors when a graph was plotted using sc.pl.palettes.godsnot_64 or sc.pl.palettes.default_64, so I used sc.pl.palettes.vega_20_scanpy instead

    3. In the step Marker genes & cluster annotation I replaced:

    adata.rename_categories('louvain_r0.5', ['TA', 'EP (early)', 'Stem', 'Goblet', 'EP (stress)', 'Enterocyte', 'Paneth', 'Enteroendocrine', 'Tuft']) 
    

    with

    adata.rename_categories('louvain_r0.5', ['TA', 'EP (early)', 'Stem', 'Goblet', 'EP (stress)', 'Enterocyte', 'Paneth'])
    

    because the number of old and new categories doesn't match; that way it works.

    4. When I run
    #Plot the new marker genes
    sc.pl.rank_genes_groups(adata, key='rank_genes_r0.5_entero_sub', groups=['Enterocyte,0','Enterocyte,1','Enterocyte,2'], fontsize=12)
    

    I get an error that there is no field named Enterocyte,2, so I commented out that group, and with the other two it works.

    I now get to my problem. I am running the notebook with the case study data and with scanpy==1.4.5.1 anndata==0.7.1 umap==0.3.10 numpy==1.18.1 scipy==1.4.1 pandas==0.25.3 scikit-learn==0.22.1 statsmodels==0.11.1 python-igraph==0.8.0 louvain==0.6.1. (Note: I use pandas 0.25.3 because when I previously tried to run it with 1.0.1 there were incompatibility problems.)

    Now I have a problem in the steps of subclustering, when I try to run:

    entero_clusts = [clust for clust in adata.obs['louvain_r0.5_entero_sub'].cat.categories if clust.startswith('Enterocyte')]
     for clust in entero_clusts:
        sc.pl.rank_genes_groups_violin(adata, use_raw=True, key='rank_genes_r0.5_entero_sub', groups=[clust], gene_names=adata.uns['rank_genes_r0.5_entero_sub']['names'][clust][90:100]) 
    
    

    the error I get is

    
     ---------------------------------------------------------------------------
    Exception                                 Traceback (most recent call last)
    <ipython-input-58-4453b5691a91> in <module>
          2 
          3 for clust in entero_clusts:
    ----> 4     sc.pl.rank_genes_groups_violin(adata, use_raw=True, key='rank_genes_r0.5_entero_sub', groups=[clust], gene_names=adata.uns['rank_genes_r0.5_entero_sub']['names'][clust][90:100])
          5 
    
    ~/anaconda3/envs/sc-tutorial/lib/python3.8/site-packages/scanpy/plotting/_tools/__init__.py in rank_genes_groups_violin(adata, groups, n_genes, gene_names, gene_symbols, use_raw, key, split, scale, strip, jitter, size, ax, show, save)
        727             if issparse(X_col): X_col = X_col.toarray().flatten()
        728             new_gene_names.append(g)
    --> 729             df[g] = X_col
        730         df['hue'] = adata.obs[groups_key].astype(str).values
        731         if reference == 'rest':
    
    ~/anaconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/frame.py in __setitem__(self, key, value)
       3485         else:
       3486             # set column
    -> 3487             self._set_item(key, value)
       3488 
       3489     def _setitem_slice(self, key, value):
    
    ~/anaconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/frame.py in _set_item(self, key, value)
       3561         """
       3562 
    -> 3563         self._ensure_valid_index(value)
       3564         value = self._sanitize_column(key, value)
       3565         NDFrame._set_item(self, key, value)
    
    ~/anaconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/frame.py in _ensure_valid_index(self, value)
       3538         if not len(self.index) and is_list_like(value):
       3539             try:
    -> 3540                 value = Series(value)
       3541             except (ValueError, NotImplementedError, TypeError):
       3542                 raise ValueError(
    
    ~/anaconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
        312                     data = data.copy()
        313             else:
    --> 314                 data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
        315 
        316                 data = SingleBlockManager(data, index, fastpath=True)
    
    ~/anaconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/internals/construction.py in sanitize_array(data, index, dtype, copy, raise_cast_failure)
        727     elif subarr.ndim > 1:
        728         if isinstance(data, np.ndarray):
    --> 729             raise Exception("Data must be 1-dimensional")
        730         else:
        731             subarr = com.asarray_tuplesafe(data, dtype=dtype)
    
    Exception: Data must be 1-dimensional
    

    I thought it was an issue with the cache, but I cleaned it, restarted the kernel, and cleared all outputs, and the problem remains. How can I fix this?

    opened by federicaress 15
  • ComBat error

    ComBat error "TypeError: data type not understood"

    Issue report for the issue posted in #1: ComBat gives the following error: TypeError: data type not understood.

    @jphe Could you clarify whether you are still using a sparse data matrix? The current ComBat implementation does not work with the sparse matrix format.

    The ComBat function from www.github.com/mbuttner/maren_codes/ was designed to take pandas Dataframes as input, so the pandas dataframe is not the problem. The code does have issues when your data has 0 variance in the expression values of a gene. So you should filter out genes with constant gene expression values (usually genes with 0 expression).

    It would also be good to know the output of type(data.T).
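
    For context, one way to drop such zero-variance genes before running ComBat is scanpy's built-in gene filter; this is a minimal sketch assuming the counts are still held in an AnnData object called adata, and it is not taken from the notebook itself:

    import scanpy as sc

    # Remove genes that are not detected in any cell; these have zero variance
    # and can break the ComBat implementation discussed above.
    sc.pp.filter_genes(adata, min_cells=1)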

    opened by LuckyMD 14
  • Problem with adata concatenation


    Hello, I am working with the Jupyter notebook on macOS and followed the Environment set up instructions. I am aware that the conda build may be problematic on macOS, but as far as I can tell that was not an issue for me. To confirm, interactive cell 2 returns

    scanpy==1.5.1 anndata==0.7.3 umap==0.4.3 numpy==1.18.4 scipy==1.4.1 pandas==1.0.3 scikit-learn==0.23.1 statsmodels==0.11.1 python-igraph==0.8.2 louvain==0.6.1

    When running the notebook (inside the conda environment) I encounter the following error message apparently triggered by the adata method concatenate:

     ... reading from cache file cache/..-data-Haber-et-al_mouse-intestinal-epithelium-GSE92332_RAW-GSM2836574_Regional_Duo_M2_matrix.h5ad
    
    ---------------------------------------------------------------------------
    InvalidIndexError                         Traceback (most recent call last)
    <ipython-input-6-01aaadebece1> in <module>
         30 
         31     # Concatenate to main adata object
    ---> 32     adata = adata.concatenate(adata_tmp, batch_key='sample_id')
         33     adata.var['gene_id'] = adata.var['gene_id-1']
         34     adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)
    
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/anndata.py in concatenate(self, join, batch_key, batch_categories, uns_merge, index_unique, fill_value, *adatas)
       1696         all_adatas = (self,) + tuple(adatas)
       1697 
    -> 1698         out = concat(
       1699             all_adatas,
       1700             join=join,
    
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/merge.py in concat(adatas, join, batch_key, batch_categories, uns_merge, index_unique, fill_value)
        454 
        455     var_names = resolve_index([a.var_names for a in adatas], join=join)
    --> 456     reindexers = [
        457         gen_reindexer(var_names, a.var_names, fill_value=fill_value) for a in adatas
        458     ]
    
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/merge.py in <listcomp>(.0)
        455     var_names = resolve_index([a.var_names for a in adatas], join=join)
        456     reindexers = [
    --> 457         gen_reindexer(var_names, a.var_names, fill_value=fill_value) for a in adatas
        458     ]
        459 
    
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/anndata/_core/merge.py in gen_reindexer(new_var, cur_var, fill_value)
        255     new_size = len(new_var)
        256     old_size = len(cur_var)
    --> 257     new_pts = new_var.get_indexer(cur_var)
        258     cur_pts = np.arange(len(new_pts))
        259 
    
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_indexer(self, target, method, limit, tolerance)
       2731 
       2732         if not self.is_unique:
    -> 2733             raise InvalidIndexError(
       2734                 "Reindexing only valid with uniquely valued Index objects"
       2735             )
    
    InvalidIndexError: Reindexing only valid with uniquely valued Index objects
    
    

    By title, this seems potentially related to #25 that in turn took me to #21 and #28, where it is stated that commenting out adata = adata.concatenate(adata_tmp, batch_key='sample_id') and adata.obs.drop(columns=['sample_id'], inplace=True) may be required. However, this in turn generated the error message

    ... reading from cache file cache/..-data-Haber-et-al_mouse-intestinal-epithelium-GSE92332_RAW-GSM2836574_Regional_Duo_M2_matrix.h5ad
    
    ---------------------------------------------------------------------------
    KeyError                                  Traceback (most recent call last)
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
       2645             try:
    -> 2646                 return self._engine.get_loc(key)
       2647             except KeyError:
    
    pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
    
    pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
    
    pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
    
    pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
    
    KeyError: 'gene_id-1'
    
    During handling of the above exception, another exception occurred:
    
    KeyError                                  Traceback (most recent call last)
    <ipython-input-7-6ee374501f9d> in <module>
         31     # Concatenate to main adata object
         32     #adata = adata.concatenate(adata_tmp, batch_key='sample_id')
    ---> 33     adata.var['gene_id'] = adata.var['gene_id-1']
         34     adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)
         35     #adata.obs.drop(columns=['sample_id'], inplace=True)
    
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/frame.py in __getitem__(self, key)
       2798             if self.columns.nlevels > 1:
       2799                 return self._getitem_multilevel(key)
    -> 2800             indexer = self.columns.get_loc(key)
       2801             if is_integer(indexer):
       2802                 indexer = [indexer]
    
    ~/opt/miniconda3/envs/sc-tutorial/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
       2646                 return self._engine.get_loc(key)
       2647             except KeyError:
    -> 2648                 return self._engine.get_loc(self._maybe_cast_indexer(key))
       2649         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
       2650         if indexer.ndim > 1 or indexer.size > 1:
    
    pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
    
    pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
    
    pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
    
    pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
    
    KeyError: 'gene_id-1'
    
    

    that brought me back to my comment on #21. I have been asked to open this as a separate issue.

    I think even if there is an already established solution, it may be confusing chasing through the issues to find it - sorry!

    opened by davidtourigny 12
  • Best practices for using regress_out?


    I am adapting the current best practices workflow (epithelial cells) from @LuckyMD with my own data set, and am running into an issue/question. I am subsetting my data to include a few clusters of interest. Once I have those clusters isolated, I am selecting highly variable genes, regressing out effects of cell cycle, ribo genes and mito genes, scaling the data, and embedding a new UMAP, all in preparation for some downstream trajectory analysis (I opted not to regress out anything in my main data set, but want to regress out some confounding factors in my subset specifically for trajectory analysis). Is it better to first run pp.highly_variable_genes and then use pp.regress_out, or is better to run pp.regress_out followed by pp.highly_variable_genes? My code currently looks like the following:

    # Subset to highly variable genes
    sc.pp.highly_variable_genes(adata_sub, flavor='cell_ranger', n_top_genes=4000, subset=True)

    # Regress out effects of cell cycle, mito genes, and ribo genes
    sc.pp.regress_out(adata_sub, ['S_score', 'G2M_score', 'percent_mt', 'percent_ribo'])

    # Scale
    sc.pp.scale(adata_sub, max_value=10)

    # Calculate the visualization
    sc.pp.pca(adata_sub, n_comps=50, use_highly_variable=True, svd_solver='arpack')
    sc.pp.neighbors(adata_sub)
    sc.tl.umap(adata_sub)
    

    If I run pp.regress_out before pp.highly_variable_genes, I have to include the line pp.filter_genes(adata_sub, min_counts=1) or else I get ValueError: The first guess on the deviance function returned a nan. This could be a boundary problem and should be reported. However, after doing some trial and error runs, I believe that including pp.filter_genes(adata_sub, min_counts=1) is excluding some genes of interest from my downstream trajectory analysis. I am able to recover these genes by reverting to running pp.highly_variable_genes before pp.regress_out and excluding pp.filter_genes.

    Intuitively I feel like it makes more sense to run pp.regress_out before pp.highly_variable_genes, but considering I am having issues using that order for downstream analysis, is it OK to run pp.regress_out after pp.highly_variable_genes? What is the best practice?

    opened by oligomyeggo 11
  • Unable to deploy the .yml - Docker enhancement request


    Creating the environment with the provided yml file generates an error:

    WARNING: The conda.compat module is deprecated and will be removed in a future release. WARNING: The conda.compat module is deprecated and will be removed in a future release.

    >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<

    Traceback (most recent call last):
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda/exceptions.py", line 1003, in __call__
        return func(*args, **kwargs)
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda_env/cli/main.py", line 73, in do_call
        exit_code = getattr(module, func_name)(args, parser)
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda_env/cli/main_create.py", line 77, in execute
        directory=os.getcwd())
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda_env/specs/__init__.py", line 40, in detect
        if spec.can_handle():
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda_env/specs/yaml_file.py", line 18, in can_handle
        self._environment = env.from_file(self.filename)
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda_env/env.py", line 143, in from_file
        return from_yaml(yamlstr, filename=filename)
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda_env/env.py", line 128, in from_yaml
        data = yaml_load_standard(yamlstr)
      File "/home/xma/anaconda3/lib/python3.7/site-packages/conda/common/serialize.py", line 76, in yaml_load_standard
        return yaml.load(string, Loader=yaml.Loader, version="1.2")
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/main.py", line 640, in load
        return loader._constructor.get_single_data()  # type: ignore
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/constructor.py", line 102, in get_single_data
        node = self.composer.get_single_node()
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/composer.py", line 75, in get_single_node
        document = self.compose_document()
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/composer.py", line 99, in compose_document
        self.parser.get_event()
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/parser.py", line 166, in get_event
        self.current_event = self.state()
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/parser.py", line 244, in parse_document_end
        token = self.scanner.peek_token()
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/scanner.py", line 173, in peek_token
        self.fetch_more_tokens()
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/scanner.py", line 273, in fetch_more_tokens
        return self.fetch_value()
      File "/home/xma/anaconda3/lib/python3.7/site-packages/ruamel_yaml/scanner.py", line 626, in fetch_value
        self.reader.get_mark())
    ruamel_yaml.scanner.ScannerError: mapping values are not allowed here
      in "<unicode string>", line 32, column 187:
         ...  in single-cell RNA-seq analysis: a tutorial&quot;  - theislab/s ... 
                                             ^ (line: 32)
    

    $ /home/xma/anaconda3/bin/conda-env create -f /home/xma/Downloads/sc_tutorial_environment.yml

    environment variables:
        CIO_TEST=
        CONDA_AUTO_UPDATE_CONDA=false
        CONDA_DEFAULT_ENV=base
        CONDA_EXE=/home/xma/anaconda3/bin/conda
        CONDA_PREFIX=/home/xma/anaconda3
        CONDA_PROMPT_MODIFIER=(base)
        CONDA_ROOT=/home/xma/anaconda3
        CONDA_SHLVL=1
        PATH=/home/xma/anaconda3/bin:/home/xma/anaconda3/condabin:/home/xma/anaconda3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
        REQUESTS_CA_BUNDLE=
        SSL_CERT_FILE=
        WINDOWPATH=2

     active environment : base
    active env location : /home/xma/anaconda3
            shell level : 1
       user config file : /home/xma/.condarc
    

    populated config files : /home/xma/.condarc
             conda version : 4.6.11
       conda-build version : 3.17.8
            python version : 3.7.3.final.0
          base environment : /home/xma/anaconda3  (read only)
              channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                             https://repo.anaconda.com/pkgs/main/noarch
                             https://repo.anaconda.com/pkgs/free/linux-64
                             https://repo.anaconda.com/pkgs/free/noarch
                             https://repo.anaconda.com/pkgs/r/linux-64
                             https://repo.anaconda.com/pkgs/r/noarch
              package cache : /home/xma/anaconda3/pkgs
                             /home/xma/.conda/pkgs
           envs directories : /home/xma/.conda/envs
                             /home/xma/anaconda3/envs
                   platform : linux-64
                 user-agent : conda/4.6.11 requests/2.21.0 CPython/3.7.3 Linux/4.18.0-20-generic ubuntu/18.04.2 glibc/2.27
                    UID:GID : 1000:1000
                 netrc file : None
               offline mode : False

    An unexpected error has occurred. Conda has prepared the above report.

    If submitted, this report will be used by core maintainers to improve future releases of conda. Would you like conda to send this report to the core maintainers?

    enhancement 
    opened by marshelma 11
  • ValueError: cannot reindex from a duplicate axis


    Hi,

    I followed the instruction step by step, and have successfully set up the environment and load the data. But when I ran Case-study_Mouse-intestinal-epithelium_1904.ipynb in linux, I got errors:

    ValueError: cannot reindex from a duplicate axis

    The full error message is like this:
    ==============================================================================
    Traceback (most recent call last):
      File "/home/lwang/miniconda3/bin/jupyter-nbconvert", line 11, in <module>
        sys.exit(main())
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/jupyter_core/application.py", line 254, in launch_instance
        return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/traitlets/config/application.py", line 845, in launch_instance
        app.start()
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/nbconvertapp.py", line 350, in start
        self.convert_notebooks()
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/nbconvertapp.py", line 524, in convert_notebooks
        self.convert_single_notebook(notebook_filename)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/nbconvertapp.py", line 489, in convert_single_notebook
        output, resources = self.export_single_notebook(notebook_filename, resources, input_buffer=input_buffer)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/nbconvertapp.py", line 418, in export_single_notebook
        output, resources = self.exporter.from_filename(notebook_filename, resources=resources)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/exporters/exporter.py", line 181, in from_filename
        return self.from_file(f, resources=resources, **kw)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/exporters/exporter.py", line 199, in from_file
        return self.from_notebook_node(nbformat.read(file_stream, as_version=4), resources=resources, **kw)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/exporters/notebook.py", line 32, in from_notebook_node
        nb_copy, resources = super().from_notebook_node(nb, resources, **kw)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/exporters/exporter.py", line 143, in from_notebook_node
        nb_copy, resources = self._preprocess(nb_copy, resources)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/exporters/exporter.py", line 318, in _preprocess
        nbc, resc = preprocessor(nbc, resc)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/preprocessors/base.py", line 47, in __call__
        return self.preprocess(nb, resources)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/preprocessors/execute.py", line 79, in preprocess
        self.execute()
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbclient/util.py", line 74, in wrapped
        return just_run(coro(*args, **kwargs))
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbclient/util.py", line 53, in just_run
        return loop.run_until_complete(coro)
      File "/home/lwang/miniconda3/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
        return future.result()
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbclient/client.py", line 540, in async_execute
        await self.async_execute_cell(
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/preprocessors/execute.py", line 123, in async_execute_cell
        cell, resources = self.preprocess_cell(cell, self.resources, cell_index)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbconvert/preprocessors/execute.py", line 146, in preprocess_cell
        cell = run_sync(NotebookClient.async_execute_cell)(self, cell, index, store_history=self.store_history)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbclient/util.py", line 74, in wrapped
        return just_run(coro(*args, **kwargs))
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbclient/util.py", line 53, in just_run
        return loop.run_until_complete(coro)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nest_asyncio.py", line 98, in run_until_complete
        return f.result()
      File "/home/lwang/miniconda3/lib/python3.8/asyncio/futures.py", line 178, in result
        raise self._exception
      File "/home/lwang/miniconda3/lib/python3.8/asyncio/tasks.py", line 280, in __step
        result = coro.send(None)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbclient/client.py", line 832, in async_execute_cell
        self._check_raise_for_error(cell, exec_reply)
      File "/home/lwang/miniconda3/lib/python3.8/site-packages/nbclient/client.py", line 740, in _check_raise_for_error
        raise CellExecutionError.from_cell_and_msg(cell, exec_reply['content'])
    nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
    ------------------
    # Loop to load rest of data sets
    for i in range(len(sample_strings)):
        #Parse Filenames
        sample = sample_strings[i]
        sample_id = sample_id_strings[i]
        data_file = file_base+sample_id+exp_string+sample+data_file_end
        barcode_file = file_base+sample_id+exp_string+sample+barcode_file_end
        gene_file = file_base+sample_id+exp_string+sample+gene_file_end
        
        #Load data
        adata_tmp = sc.read(data_file, cache=True)
        adata_tmp = adata_tmp.transpose()
        adata_tmp.X = adata_tmp.X.toarray()
    
        barcodes_tmp = pd.read_csv(barcode_file, header=None, sep='\t')
        genes_tmp = pd.read_csv(gene_file, header=None, sep='\t')
        
        #Annotate data
        barcodes_tmp.rename(columns={0:'barcode'}, inplace=True)
        barcodes_tmp.set_index('barcode', inplace=True)
        adata_tmp.obs = barcodes_tmp
        adata_tmp.obs['sample'] = [sample]*adata_tmp.n_obs
        adata_tmp.obs['region'] = [sample.split("_")[0]]*adata_tmp.n_obs
        adata_tmp.obs['donor'] = [sample.split("_")[1]]*adata_tmp.n_obs
        
        genes_tmp.rename(columns={0:'gene_id', 1:'gene_symbol'}, inplace=True)
        genes_tmp.set_index('gene_symbol', inplace=True)
        adata_tmp.var = genes_tmp
        adata_tmp.var_names_make_unique()
    
        # Concatenate to main adata object
        adata = adata.concatenate(adata_tmp, batch_key='sample_id')
        adata.var['gene_id'] = adata.var['gene_id-1']
        adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)
        adata.obs.drop(columns=['sample_id'], inplace=True)
        adata.obs_names = [c.split("-")[0] for c in adata.obs_names]
        adata.obs_names_make_unique(join='_')
    ------------------
    
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-1-01aaadebece1> in <module>
         30 
         31     # Concatenate to main adata object
    ---> 32     adata = adata.concatenate(adata_tmp, batch_key='sample_id')
         33     adata.var['gene_id'] = adata.var['gene_id-1']
         34     adata.var.drop(columns = ['gene_id-1', 'gene_id-0'], inplace=True)
    
    ~/miniconda3/lib/python3.8/site-packages/anndata/_core/anndata.py in concatenate(self, join, batch_key, batch_categories, uns_merge, index_unique, fill_value, *adatas)
       1694         all_adatas = (self,) + tuple(adatas)
       1695 
    -> 1696         out = concat(
       1697             all_adatas,
       1698             axis=0,
    
    ~/miniconda3/lib/python3.8/site-packages/anndata/_core/merge.py in concat(adatas, axis, join, merge, uns_merge, label, keys, index_unique, fill_value, pairwise)
        812 
        813     # Annotation for other axis
    --> 814     alt_annot = merge_dataframes(
        815         [getattr(a, alt_dim) for a in adatas], alt_indices, merge
        816     )
    
    ~/miniconda3/lib/python3.8/site-packages/anndata/_core/merge.py in merge_dataframes(dfs, new_index, merge_strategy)
        524 
        525 def merge_dataframes(dfs, new_index, merge_strategy=merge_unique):
    --> 526     dfs = [df.reindex(index=new_index) for df in dfs]
        527     # New dataframe with all shared data
        528     new_df = pd.DataFrame(merge_strategy(dfs), index=new_index)
    
    ~/miniconda3/lib/python3.8/site-packages/anndata/_core/merge.py in <listcomp>(.0)
        524 
        525 def merge_dataframes(dfs, new_index, merge_strategy=merge_unique):
    --> 526     dfs = [df.reindex(index=new_index) for df in dfs]
        527     # New dataframe with all shared data
        528     new_df = pd.DataFrame(merge_strategy(dfs), index=new_index)
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
        310         @wraps(func)
        311         def wrapper(*args, **kwargs) -> Callable[..., Any]:
    --> 312             return func(*args, **kwargs)
        313 
        314         kind = inspect.Parameter.POSITIONAL_OR_KEYWORD
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/core/frame.py in reindex(self, *args, **kwargs)
       4171         kwargs.pop("axis", None)
       4172         kwargs.pop("labels", None)
    -> 4173         return super().reindex(**kwargs)
       4174 
       4175     def drop(
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/core/generic.py in reindex(self, *args, **kwargs)
       4804 
       4805         # perform the reindex on the axes
    -> 4806         return self._reindex_axes(
       4807             axes, level, limit, tolerance, method, fill_value, copy
       4808         ).__finalize__(self, method="reindex")
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/core/frame.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
       4017         index = axes["index"]
       4018         if index is not None:
    -> 4019             frame = frame._reindex_index(
       4020                 index, method, copy, level, fill_value, limit, tolerance
       4021             )
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/core/frame.py in _reindex_index(self, new_index, method, copy, level, fill_value, limit, tolerance)
       4036             new_index, method=method, level=level, limit=limit, tolerance=tolerance
       4037         )
    -> 4038         return self._reindex_with_indexers(
       4039             {0: [new_index, indexer]},
       4040             copy=copy,
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/core/generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
       4870 
       4871             # TODO: speed up on homogeneous DataFrame objects
    -> 4872             new_data = new_data.reindex_indexer(
       4873                 index,
       4874                 indexer,
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/core/internals/managers.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy, consolidate, only_slice)
       1299         # some axes don't allow reindexing with dups
       1300         if not allow_dups:
    -> 1301             self.axes[axis]._can_reindex(indexer)
       1302 
       1303         if axis >= self.ndim:
    
    ~/miniconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py in _can_reindex(self, indexer)
       3474         # trying to reindex on an axis with duplicates
       3475         if not self._index_as_unique and len(indexer):
    -> 3476             raise ValueError("cannot reindex from a duplicate axis")
       3477 
       3478     def reindex(self, target, method=None, level=None, limit=None, tolerance=None):
    
    ValueError: cannot reindex from a duplicate axis
    ValueError: cannot reindex from a duplicate axis
    

    ===============================================================================

    I didn't change anything (the file names, the file directory, and the code all stayed the same), so I'm not sure why it didn't work. Am I supposed to modify the code or anything? I'm very new to scRNA analysis, so I have a hard time figuring out where to start debugging. Could you please give me some advice on this?

    Thanks very very much! Leran

    opened by Leran10 10
  • Package versions in .yml file


    It is rather a question than an issue:

    in the .yml file, some of the packages are listed with versions older than the most recent ones:

    • python>=3.5, <3.7
    • cmake>=3.9, <3.11

    the most recent and installed versions are:

    • python=3.9.7
    • cmake=3.21.2

    there should not be a problem, right?

    nice day,

    Iliya

    opened by lefterov 9
  • environment installation problem


    Hello,

    I am trying to install the environment by

    conda env create -f sc_tutorial_environment.yml

    but get the following error:

    Pip subprocess error:
      Running command git clone -q https://github.com/flying-sheep/anndata2ri /tmp/pip-req-build-_sxdlckc
      ERROR: Command errored out with exit status 1:
       command: /PHShome/ys738/tutorial/single-cell-tutorial/test/env/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-o92j89t3/rpy2_1bf8d8eb188d4e75931a32c7ea456056/setup.py'"'"'; __file__='"'"'/tmp/pip-install-o92j89t3/rpy2_1bf8d8eb188d4e75931a32c7ea456056/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-46eoz0zx
           cwd: /tmp/pip-install-o92j89t3/rpy2_1bf8d8eb188d4e75931a32c7ea456056/
      Complete output (1 lines):
      rpy2 is no longer supporting Python < 3.7.Consider using an older rpy2 release when using an older Python release.
      ----------------------------------------
    WARNING: Discarding https://files.pythonhosted.org/packages/a9/11/5f175fc3d2313b53cb3c921db9e8bba58b67d739f5a637146b45f2e0e80c/rpy2-3.5.5.tar.gz#sha256=a252c40e21cf4f23ac6e13bffdcb82b5900b49c3043ed8fd31da5c61fb58d037 (from https://pypi.org/simple/rpy2/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
      ERROR: Command errored out with exit status 1:
       command: /PHShome/ys738/tutorial/single-cell-tutorial/test/env/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-o92j89t3/rpy2_0de3b36ef510453fbd549602b7f057d5/setup.py'"'"'; __file__='"'"'/tmp/pip-install-o92j89t3/rpy2_0de3b36ef510453fbd549602b7f057d5/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-za7i_5lx
           cwd: /tmp/pip-install-o92j89t3/rpy2_0de3b36ef510453fbd549602b7f057d5/
      Complete output (1 lines):
      rpy2 is no longer supporting Python < 3.7.Consider using an older rpy2 release when using an older Python release.
      ----------------------------------------
    WARNING: Discarding https://files.pythonhosted.org/packages/40/09/a754484c80f8c58f077b1b9b2249787c689c8dd1559e3eb91cd5b8690dc2/rpy2-3.5.4.tar.gz#sha256=ba0a877b2b96e27d2091383d4652b82aa2271cff4a505243d45da430b712aaf5 (from https://pypi.org/simple/rpy2/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
      ERROR: Command errored out with exit status 1:
       command: /PHShome/ys738/tutorial/single-cell-tutorial/test/env/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-o92j89t3/rpy2_4baddf3e0e7b4f5a8be15c4fcb6ce495/setup.py'"'"'; __file__='"'"'/tmp/pip-install-o92j89t3/rpy2_4baddf3e0e7b4f5a8be15c4fcb6ce495/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-ruvwrq4l
           cwd: /tmp/pip-install-o92j89t3/rpy2_4baddf3e0e7b4f5a8be15c4fcb6ce495/
      Complete output (1 lines):
      rpy2 is no longer supporting Python < 3.7.Consider using an older rpy2 release when using an older Python release.
      ----------------------------------------
    WARNING: Discarding https://files.pythonhosted.org/packages/9b/5d/44d001cb386d009e228afbc9327ee07dc9ade108a908dc84a6801c093255/rpy2-3.5.3.tar.gz#sha256=53a092d48b44f46428fb30cb3155664d6d2f7af08ebc4c45df98df4c45a42ccb (from https://pypi.org/simple/rpy2/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
      rpy2 is no longer supporting Python < 3.7.Consider using an older rpy2 release when using an older Python release.
      [pip tries and discards rpy2 3.5.2, 3.5.1 and 3.5.0 with the same message]
    WARNING: Discarding https://files.pythonhosted.org/packages/75/13/76f3fa526a5e39ee26752c6e78fca509821c57699999e68086094a6ff9cb/scanpy-1.3.6.tar.gz#sha256=ebc7cd0a9726a4a9088a8d0eafb8eb59802f8acb85bc28a2bdf8dbf0144f87c8 (from https://pypi.org/simple/scanpy/) (requires-python:>=3.5). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
      error in scanpy setup command: "values of 'package_data' dict" must be a list of strings (got '*.txt')
      [pip tries and discards scanpy 1.3.5 down to 1.0.3.post1 with the same message, except scanpy 1.3.1, which fails with:]
      error in scanpy setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers; Parse error at "'[doc]'": Expected W:(abcd...)
    ERROR: Package 'anndata2ri' requires a different Python: 3.6.15 not in '>=3.7'

    failed

    CondaEnvException: Pip failed
    

    The error seems to be related to the Python version. Could you tell me how to solve this problem? (A quick way to check which interpreter the environment resolved is sketched after this issue.)

    Thank you

    opened by Sunyp-IM 1
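
    For context, the failure above comes down to a Python-version conflict: pip rejects every rpy2 and scanpy source release it tries and finally reports that anndata2ri needs Python >= 3.7, while the environment resolved Python 3.6.15. A minimal, illustrative check (not part of the tutorial) to confirm which interpreter an activated environment is actually using could look like this:

    # Illustrative sanity check: run inside the activated conda environment.
    # anndata2ri requires Python >= 3.7; the log above shows this environment
    # ended up with Python 3.6.15 instead.
    import sys

    print(sys.version)
    assert sys.version_info >= (3, 7), (
        "Python %d.%d was resolved; recreate the environment with Python >= 3.7 "
        "so that pip can install anndata2ri and a recent rpy2." % sys.version_info[:2]
    )
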
  • Environment install failed

    Dear authors,

    I tried to run the tutorial but got stuck at the environment installation step. I ran the following command as instructed:

    conda env create -f sc_tutorial_environment.yml

    While this process was running, my computer's memory was gradually used up until the process was killed. My machine has 16 GB of RAM, which I thought would be enough for most general computational tasks. What kind of computer configuration is needed to run this tutorial, and is there something wrong with my setup?

    Best regards

    opened by Sunyp-IM 1
  • Sparse matrix dimensions

    Hello,

    Thank you for providing this guide; it is very helpful. While going through parts of the code, I noticed that the following line in Case-study_Mouse-intestinal-epithelium_1906.ipynb raises an error:

    adata.obs['mt_frac'] = adata.X[:, mt_gene_mask].sum(1)/adata.obs['n_counts']

    When I investigated a little, I realized that adata.X is a 2-D sparse matrix, so the result of the sum can't be divided by a 1-D series. Applying flatten or reshape directly doesn't work either, because the sum returns a numpy matrix, which can't be flattened (or at least I couldn't). This might be a version issue, but my solution was to replace the line with the version below; a toy reproduction of the behaviour is sketched after this issue.

    adata.obs['mt_frac'] = np.array(adata.X[:, mt_gene_mask].sum(1)).flatten()/adata.obs['n_counts']

    Best

    opened by acihanckr 1
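
    For anyone hitting the same error, the behaviour described above can be reproduced with a toy sparse matrix; the sketch below is illustrative only and uses made-up data in place of the tutorial's AnnData object.

    import numpy as np
    from scipy import sparse

    X = sparse.random(5, 4, density=0.5, format='csr')    # toy cells x genes matrix
    mt_gene_mask = np.array([True, False, True, False])   # stand-in mitochondrial-gene mask
    n_counts = np.asarray(X.sum(axis=1)).ravel()           # per-cell totals as a 1-D array

    # Summing a sparse matrix returns a numpy.matrix of shape (5, 1), not a 1-D
    # array, which is the shape mismatch described above.
    mt_sums = X[:, mt_gene_mask].sum(axis=1)

    # Converting to a flat array first, as in the fix above, divides cleanly:
    mt_frac = np.asarray(mt_sums).ravel() / n_counts
    print(mt_frac.shape)                                    # (5,)
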
  • R commands throw UnicodeDecodeError

    Hello dear authors,

    I have a very persistent problem that I can't find a solution for. Almost every command in a cell that uses the %%R magic raises a similar UnicodeDecodeError (screenshots: "cell probalmes utf8", "utf-8 decoding"). A small diagnostic snippet is sketched after this issue.

    The one cell that did work is shown in the screenshot "worked and workaround"; for the imports I had to use a workaround, as you can see, since the usual import threw an error.

    How can I resolve this or what may be the cause? My R version is 4.2.0 and I am on Windows 10 x64.

    Thank you a lot,

    Mariam

    opened by Mari123i 3
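
    As a purely diagnostic aid (not a fix, and only an illustrative snippet), it can help to check which encodings Python itself expects on the affected Windows machine before digging further into the rpy2/%%R UnicodeDecodeError:

    # Illustrative diagnostic for the UnicodeDecodeError above: on Windows the
    # locale's preferred encoding is often cp1252 rather than utf-8, which is a
    # common source of decode errors when mixing R output with Python.
    import locale
    import sys

    print(sys.getdefaultencoding())        # usually 'utf-8'
    print(locale.getpreferredencoding())   # often 'cp1252' on Windows
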
  • Normalizing subsetted data

    Hi @LuckyMD, I've been reprocessing some old data using your single-cell tutorial workflow and have a best-practices question (I am not sure if this is the correct place for this, or if it should be moved to the new scverse discourse group?). I have an adata object that is scran-normalized, and I want to take a subset of clusters from it and create a new adata_sub object with its own dimensionality reduction to investigate a subpopulation of interest.

    My understanding is that, had I opted for a basic log normalization, I would not need to re-normalize adata_sub, as log normalization is done on a per-cell basis. However, because scran normalization uses a coarse clustering of the cells present in the object, I would want to re-normalize adata_sub if adata had been normalized via scran, correct? What would be the best way to subset and re-normalize adata_sub? (One possible adjustment is sketched after this issue.)

    I am not sure which steps of the original scran normalization need to be repeated and which can be omitted. For instance, I would want to perform a new clustering of the subsetted data for scran normalization and get new size factors, but I wouldn't need to set adata_sub.layers['counts'] = adata_sub.X.copy(), as adata_sub already contains a subsetted counts layer from adata (correct?). Would I need to restore adata_sub.raw = adata_sub in this scenario?:

    # assumes the tutorial's imports (scanpy as sc, scipy as sp) and that the %%R
    # block further down runs as its own notebook cell with rpy2.ipython loaded
    # subset adata to clusters of interest
    adata_sub = adata[adata.obs['leiden_r1.0'].isin(['1', '3', '5'])].copy()
    
    # perform clustering for scran normalization
    adata_sub_pp = adata_sub.copy()
    #sc.pp.normalize_per_cell(adata_sub_pp, counts_per_cell_after = 1e6) - can we omit this since we did it for adata?
    #sc.pp.log1p(adata_sub_pp) - can we omit this since we did it for adata?
    sc.pp.pca(adata_sub_pp, n_comps = 15)
    sc.pp.neighbors(adata_sub_pp)
    sc.tl.leiden(adata_sub_pp, key_added = 'groups', resolution = 0.5)
    
    # preprocess variables for scran normalization
    input_groups = adata_sub_pp.obs['groups']
    data_mat = adata_sub.X.T
    
    %%R -i data_mat -i input_groups -o size_factors
    
    size_factors = sizeFactors(computeSumFactors(SingleCellExperiment(list(counts = data_mat)), 
                                                 clusters = input_groups, 
                                                 min.mean = 0.1))
    
    del adata_sub_pp
    
    adata_sub.obs['size_factors'] = size_factors # overwrites existing ['size_factors'] from adata
    
    # adata_sub.layers['counts'] = adata_sub.X.copy() - this can be omitted?
    
    # Normalize adata_sub
    adata_sub.X /= adata_sub.obs['size_factors'].values[:,None]
    sc.pp.log1p(adata_sub) # should this be omitted?
    adata_sub.X = sp.sparse.csr_matrix(adata_sub.X)
    adata_sub.raw = adata_sub
    

    Thank you for any help and advice!

    opened by oligomyeggo 1
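
    One detail worth flagging in the snippet above: data_mat = adata_sub.X.T hands the already scran-normalized adata_sub.X to computeSumFactors, which is designed to run on raw counts. A minimal sketch of one possible adjustment (toy data, illustrative only, not an official recommendation) would be to reset the subset to its stored counts layer before re-estimating size factors:

    import numpy as np
    import scipy.sparse as sp_sparse
    import anndata as ad

    # Toy stand-in for the adata object in the question: raw counts are kept in
    # .layers['counts'], while .X holds already-normalized values.
    counts = sp_sparse.random(100, 50, density=0.1, format='csr') * 10
    adata = ad.AnnData(X=counts.toarray() / 2.0, layers={'counts': counts.copy()})
    adata.obs['leiden_r1.0'] = np.random.choice(['1', '2', '3', '4', '5'], size=adata.n_obs)

    # Subset to the clusters of interest, then reset X to the raw counts before
    # re-clustering and re-estimating size factors, since computeSumFactors is
    # meant to see counts rather than normalized values.
    adata_sub = adata[adata.obs['leiden_r1.0'].isin(['1', '3', '5'])].copy()
    adata_sub.X = adata_sub.layers['counts'].copy()
    data_mat = adata_sub.X.T   # this is what would then be passed to the %%R cell

    The remaining steps from the question (the coarse clustering on adata_sub_pp, the %%R size-factor call, the division by size factors and the final log1p/raw handling) would then proceed as written above.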