Mini bioinformatics project: PCA on genotypes
This repo contains the code from this video: https://youtu.be/-PCKK_nwFdA
If you somehow end up publishing a paper based on this type of project, you should cite this original paper it was inspired by: https://www.nature.com/articles/nature07331
URLs to download files
curl -O ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/ALL.chr22.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz
curl -O ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/ALL.chr22.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz.tbi
curl -O ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/phase1_integrated_calls.20101123.ALL.panel
# Faster downloads through AWS:
curl -O https://1000genomes.s3.amazonaws.com/release/20110521/ALL.chr22.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz
curl -O https://1000genomes.s3.amazonaws.com/release/20110521/ALL.chr22.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz.tbi
curl -O https://1000genomes.s3.amazonaws.com/release/20110521/phase1_integrated_calls.20101123.ALL.panel
# or use "wget" instead of "curl -O" if you don't have curl.
Colab notebook:
https://colab.research.google.com/drive/1o24ZfqDoBEwvRSx_Br1u7edPPXLLx4w7?usp=sharing