Incubator for useful bioinformatics code, primarily in Python and R

Overview

Collection of useful code related to biological analysis. Much of this is discussed with examples at the Blue Collar Bioinformatics blog.

All code, images and documents in this repository are freely available for all uses. Code is available under the MIT license; images, documentation and talks are under the Creative Commons No Rights Reserved (CC0) license.

Some projects which may be especially interesting:

  • CloudBioLinux -- An automated environment to install useful biological software and libraries. It is used to bootstrap blank machines, such as those from cloud providers like Amazon, into ready-to-go analysis workstations. See the CloudBioLinux effort for more details. This project moved to its own repository at https://github.com/chapmanb/cloudbiolinux.
  • gff -- A GFF parsing library in Python, aimed at inclusion in Biopython.
  • nextgen -- A Python toolkit providing best-practice pipelines for fully automated high-throughput sequencing analysis. This project has moved to its own repository: https://github.com/chapmanb/bcbio-nextgen
  • distblast -- A distributed BLAST analysis for identifying best hits across a wide variety of organisms for downstream phylogenetic analyses. The code is generalized to run on local multi-processor machines and on distributed Hadoop clusters.
Comments
  • biopython->numpy interactive (y/n) while deploying pipeline

    Even putting numpy >=1.6.1 before biopython in setup.py's install_requires, the following message pops up:

    Numerical Python (NumPy) is not installed.
    
    This package is required for many Biopython features.  Please install
    it before you install Biopython. You can install Biopython anyway, but
    anything dependent on NumPy will not work. If you do this, and later
    install NumPy, you should then re-install Biopython.
    
    You can find NumPy at http://numpy.scipy.org
    
    Do you want to continue this installation? (y/N):
    

    Apparently install_requires packages are not installed in order, so no dependency order can be defined that way... are you aware of any "pre_install_requires" or similar in setuptools? I couldn't find one after quickly checking the docs :-/
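
    Since install_requires gives no ordering guarantee, one common workaround is to fail fast in setup.py itself when NumPy is missing. A minimal sketch (the project name is hypothetical; this is not this repository's actual setup.py):

    import sys

    from setuptools import setup

    try:
        import numpy  # noqa -- only checking that NumPy is importable
    except ImportError:
        sys.exit("NumPy must be installed before Biopython; run "
                 "'pip install numpy' first, then retry this install.")

    setup(
        name="my-pipeline",  # hypothetical name for illustration
        install_requires=["numpy >= 1.6.1", "biopython"],
    )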

    opened by brainstorm 14
  • barcode_sort_trim.py

    I'm not sure if it's a problem in the latest version of barcode_sort_trim.py.

    After updating to the pipeline with FastQC, I've got an extra 'A' base at the 3' end of read 1.

    opened by tanglingfung 13
  • FastQC vs SolexaQA

    Brad, it's not really an issue, but I want to know, from your experience, how much time you would save by switching from SolexaQA to FastQC?

    Thanks, Paul

    opened by tanglingfung 13
  • doing bcl->qseq->fastq->analysis->galaxy in one machine

    Hi,

    We have a different setting here where the drive with the bcl files is mounted on the analysis machine and we would do everything there. Do you recommend we keep the messaging system in the pipeline? Just want to get some advice.

    Thanks, Paul

    opened by tanglingfung 13
  • picard_sam_to_bam.py

    Hi Brad,

    It seems that it keeps finding CreateSequenceDictionary in /usr/share/java/picard even though I have specified another path in my config file. I tried running the setup again after modifying the config files, but it still didn't use the path I specified.

    Also, I don't seem to have specified the path to hg19.fa for GATK anywhere?

    Thanks, Paul

    opened by tanglingfung 12
  • Convert GFF file to Sequin TBL file

    Submitting to GenBank requires converting a GFF file to a Sequin TBL file, which is then converted to ASN.1 using tbl2asn. I have searched, and I have not found a good (or any, really) converter from GFF to Sequin TBL. Would you be interested in adding such a tool? Here's the hacky script that I cobbled together for this purpose: gff3-to-tbl. It's not general-purpose, but it could be a useful starting point.
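
    For readers wanting the general shape of such a converter, here is a minimal sketch (not the gff3-to-tbl script itself; filenames are hypothetical, and sub-feature and qualifier handling is mostly omitted):

    from BCBio import GFF

    with open("annotation.gff") as gff_handle, open("annotation.tbl", "w") as tbl:
        for rec in GFF.parse(gff_handle):
            tbl.write(">Feature %s\n" % rec.id)
            for feat in rec.features:
                start = int(feat.location.start) + 1  # TBL coordinates are 1-based
                end = int(feat.location.end)
                if feat.location.strand == -1:        # minus strand: swap ends
                    start, end = end, start
                tbl.write("%d\t%d\t%s\n" % (start, end, feat.type))
                for product in feat.qualifiers.get("product", []):
                    tbl.write("\t\t\tproduct\t%s\n" % product)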

    opened by sjackman 11
  • merging of demuxed fastq files and project-based analyses

    Hi Brad,

    more of a question than an issue. I noticed you've added code (bcbio.pipeline.sample.merge_sample) to merge samples across lanes. I've been using save_diskspace=true in order to remove sam files, but I noticed this also removes the demultiplexed files, right? I just want to make sure, because it affects our data delivery routines, as outlined below.

    In our setup, we have situations where we run several projects on one lane, which we distinguish with an extra "description" tag in run_info, so in principle each barcode could have a description with a different project name. We then partition fastq files in a lane based on the description tag when delivering data to customers.

    On a similar note, when I do analyses for customers, I've been doing it on a project-by-project basis (it makes more sense to me), and have therefore written helper scripts (project_*, see EDIT: https://github.com/percyfal/bcbb/tree/develop/nextgen/scripts) for this purpose. project_analysis_pipeline.sh is almost a copy of automated_initial_analysis.py, but starts off with demultiplexed files. Have you had this functionality in mind (or is it even already there)?

    Cheers,

    Per

    opened by percyfal 11
  • Trailing Illumina 'A' and demultiplexing

    Hi Brad,

    We are seeing some issues with unexpectedly many reads ending up in the 'unmatched' category after demultiplexing. After digging around a little, we think that this may be related to the trailing 'A' that the Illumina machines add after the barcode.

    More specifically, we allow one mismatch and no indels for the demuxing. It seems that the reads unexpectedly classified as unmatched have one mismatch in the actual 6-nucleotide barcode and, in addition, have the trailing 'A' nucleotide miscalled.

    Reading the code, it does indeed seem that for Illumina reads, the last 7 nucleotides of each read, including the trailing 'A', are matched when demultiplexing. Can you confirm that this is the case?

    Our preference is to match just the 6-mer index sequence, excluding the last nucleotide in the read, and it would be nice to have this done by default for Illumina reads, or at least to be able to influence this behavior with a configuration option. What do you think?
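
    A sketch of the matching behavior proposed above, assuming a 6 nt index followed by Illumina's trailing 'A' (a hypothetical helper, not the pipeline's actual demultiplexing code):

    def matches_barcode(read_end, barcode, allowed_mismatches=1):
        # compare only the barcode-length prefix; ignore the trailing 'A'
        observed = read_end[:len(barcode)]
        mismatches = sum(1 for seen, expected in zip(observed, barcode)
                         if seen != expected)
        return mismatches <= allowed_mismatches

    # last 7 nt of a read with one barcode mismatch and a miscalled trailing 'A':
    print(matches_barcode("ACAGTGC", "ACTGTG"))  # True with the 6-mer match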

    Thanks /Pontus

    opened by b97pla 11
  • GFFExaminer() displaying empty dict for UCSC GTF

    I tried following http://biopython.org/wiki/GFF_Parsing to parse a UCSC-generated GTF file.

    After executing

    pprint.pprint(examiner.parent_child_map(handle))
    

    the output was

    {}
    

    Similarly,

    examiner.available_limits(handle)
    

    produced

    3: {'gff_id': {}, 'gff_source': {}, 'gff_source_type': {}, 'gff_type': {}}
    

    Trying to parse that same file with

    from BCBio import GFF
    for rec in GFF.parse(handle):
        print rec
    

    produced

    ID: chr1
    Name: <unknown name>
    Description: <unknown description>
    Number of features: 2
    UnknownSeq(14409, alphabet = Alphabet(), character = '?')
    

    Here are the first 10 lines from the GTF in question

    chr1 hg19_knownGene exon 11874 12227 0.000000 + . gene_id "uc001aaa.3"; transcript_id "uc001aaa.3";
    chr1 hg19_knownGene exon 12613 12721 0.000000 + . gene_id "uc001aaa.3"; transcript_id "uc001aaa.3";
    chr1 hg19_knownGene exon 13221 14409 0.000000 + . gene_id "uc001aaa.3"; transcript_id "uc001aaa.3";
    chr1 hg19_knownGene start_codon 12190 12192 0.000000 + . gene_id "uc010nxq.1"; transcript_id "uc010nxq.1";
    chr1 hg19_knownGene CDS 12190 12227 0.000000 + 0 gene_id "uc010nxq.1"; transcript_id "uc010nxq.1";
    chr1 hg19_knownGene exon 11874 12227 0.000000 + . gene_id "uc010nxq.1"; transcript_id "uc010nxq.1";
    chr1 hg19_knownGene CDS 12595 12721 0.000000 + 1 gene_id "uc010nxq.1"; transcript_id "uc010nxq.1";
    chr1 hg19_knownGene exon 12595 12721 0.000000 + . gene_id "uc010nxq.1"; transcript_id "uc010nxq.1";
    chr1 hg19_knownGene CDS 13403 13636 0.000000 + 0 gene_id "uc010nxq.1"; transcript_id "uc010nxq.1";
    chr1 hg19_knownGene stop_codon 13637 13639 0.000000 + . gene_id "uc010nxq.1"; transcript_id "uc010nxq.1";
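
    Two things are worth checking here: GTF files encode relationships with gene_id/transcript_id attributes rather than GFF3's ID/Parent tags, so an empty parent_child_map is not unexpected for a GTF; and each examiner/parse pass consumes the file handle, so it has to be reopened (or handle.seek(0) called) between passes. A sketch of both, with a hypothetical filename:

    from BCBio import GFF
    from BCBio.GFF import GFFExaminer

    gtf = "hg19_knownGene.gtf"  # hypothetical filename
    examiner = GFFExaminer()

    with open(gtf) as handle:
        print(examiner.available_limits(handle))

    with open(gtf) as handle:  # reopen: the examiner consumed the handle
        for rec in GFF.parse(handle):
            for feature in rec.features:
                print(feature.type, feature.location)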

    opened by spock 10
  • num_cores: messages; socket.timeout: timed out

    Hey Brad, we are trying to use the distributed version of the pipeline.

    We have a couple of test sets that we use to quickly check whether the pipeline is working: one that takes the normal pipeline about 3 hours to finish, and another, much smaller, that takes about 7 minutes (this is with 8 cores).

    When running the small test set on the messaging variant, all files get generated as they should and the program exits properly. Note that this small set consists of fastq files which are only 12 lines each, and I'm guessing much of the analysis gets skipped due to a lack of data.

    When we run the messaging version of the pipeline on the larger set, the programs work for a while (time varies, but say between 45 minutes and 1 hour 30 minutes), but then one of the jobs crashes with a socket.timeout error (this specific job, I believe, is a master that coordinates what the other jobs should be doing).

    I'll include the output of that job here:

    [2012-02-25 02:55:26,856] Found YAML samplesheet, using /proj/a2010002/nobackup/illumina/pipeline_test/archive/000101_SN001_001_AABCD99XX/run_info.yaml instead of Galaxy API
    Traceback (most recent call last):
      File "/bubo/home/h10/vale/.virtualenvs/devel/bin/automated_initial_analysis.py", line 7, in <module>
        execfile(__file__)
      File "/bubo/home/h10/vale/bcbb/nextgen/scripts/automated_initial_analysis.py", line 117, in <module>
        main(*args, **kwargs)
      File "/bubo/home/h10/vale/bcbb/nextgen/scripts/automated_initial_analysis.py", line 48, in main
        run_main(config, config_file, fc_dir, work_dir, run_info_yaml)
      File "/bubo/home/h10/vale/bcbb/nextgen/scripts/automated_initial_analysis.py", line 65, in run_main
        lane_items = run_parallel("process_lane", lanes)
      File "/bubo/home/h10/vale/bcbb/nextgen/bcbio/distributed/messaging.py", line 28, in run_parallel
        return runner_fn(fn_name, items)
      File "/bubo/home/h10/vale/bcbb/nextgen/bcbio/distributed/messaging.py", line 67, in _run
        while not result.ready():
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/celery/result.py", line 306, in ready
        return all(result.ready() for result in self.results)
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/celery/result.py", line 306, in <genexpr>
        return all(result.ready() for result in self.results)
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/celery/result.py", line 108, in ready
        return self.status in self.backend.READY_STATES
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/celery/result.py", line 196, in status
        return self.state
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/celery/result.py", line 191, in state
        return self.backend.get_status(self.task_id)
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/celery/backends/base.py", line 237, in get_status
        return self.get_task_meta(task_id)["status"]
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/celery/backends/amqp.py", line 128, in get_task_meta
        return self.poll(task_id)
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/celery/backends/amqp.py", line 153, in poll
        with self.app.pool.acquire_channel(block=True) as (_, channel):
      File "/sw/comp/python/2.7.1_kalkyl/lib/python2.7/contextlib.py", line 17, in __enter__
        return self.gen.next()
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/kombu/connection.py", line 789, in acquire_channel
        yield connection, connection.default_channel
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/kombu/connection.py", line 593, in default_channel
        self.connection
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/kombu/connection.py", line 586, in connection
        self._connection = self._establish_connection()
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/kombu/connection.py", line 546, in _establish_connection
        conn = self.transport.establish_connection()
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/kombu/transport/amqplib.py", line 252, in establish_connection
        connect_timeout=conninfo.connect_timeout)
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/kombu/transport/amqplib.py", line 62, in __init__
        super(Connection, self).__init__(*args, **kwargs)
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/amqplib/client_0_8/connection.py", line 129, in __init__
        self.transport = create_transport(host, connect_timeout, ssl)
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/amqplib/client_0_8/transport.py", line 281, in create_transport
        return TCPTransport(host, connect_timeout)
      File "/bubo/home/h10/vale/.virtualenvs/devel/lib/python2.7/site-packages/amqplib/client_0_8/transport.py", line 85, in __init__
        raise socket.error, msg
    socket.timeout: timed out
    [INFO/MainProcess] process shutting down
    [DEBUG/MainProcess] running all "atexit" finalizers with priority >= 0
    [DEBUG/MainProcess] running the remaining "atexit" finalizers
    

    Have you encountered any issues with socket.timeout? Any ideas what we might be doing wrong?
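
    One setting that may be worth checking is the broker connection timeout in celeryconfig.py. The Celery 2.x-era setting names below are real, but whether a longer timeout fixes this particular crash is an assumption; the broker host is hypothetical:

    # celeryconfig.py (sketch)
    BROKER_HOST = "amqp-server.example.org"  # hypothetical broker host
    BROKER_CONNECTION_TIMEOUT = 30   # seconds; the default is 4
    BROKER_CONNECTION_RETRY = True
    BROKER_CONNECTION_MAX_RETRIES = 10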

    opened by vals 9
  • GFF parsing fails with most recent version of BioPython

    Overview

    After upgrading to Biopython 1.68, GFF.parse() is now failing where it had no issues before.

    To Reproduce

    In a new virtualenv environment, run:

    pip install numpy
    pip install biopython
    pip install bcbio-gff
    
    wget http://tritrypdb.org/common/downloads/release-27/TcruziCLBrenerEsmeraldo-like/gff/data/TriTrypDB-27_TcruziCLBrenerEsmeraldo-like.gff
    

    Next, launch python and run:

    >>> from BCBio import GFF
    >>> gff = 'TriTrypDB-27_TcruziCLBrenerEsmeraldo-like.gff'
    >>> x=list(GFF.parse(gff))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/keith/.virtualenvs/gff/lib/python3.5/site-packages/BCBio/GFF/GFFParser.py", line 737, in parse
        target_lines):
      File "/home/keith/.virtualenvs/gff/lib/python3.5/site-packages/BCBio/GFF/GFFParser.py", line 327, in parse_in_parts
        cur_dict = self._results_to_features(cur_dict, results)
      File "/home/keith/.virtualenvs/gff/lib/python3.5/site-packages/BCBio/GFF/GFFParser.py", line 367, in _results_to_features
        results.get('child', []))
      File "/home/keith/.virtualenvs/gff/lib/python3.5/site-packages/BCBio/GFF/GFFParser.py", line 428, in _add_parent_child_features
        children)
      File "/home/keith/.virtualenvs/gff/lib/python3.5/site-packages/BCBio/GFF/GFFParser.py", line 471, in _add_children_to_parent
        cur_child, _ = self._add_children_to_parent(cur_child, children)
      File "/home/keith/.virtualenvs/gff/lib/python3.5/site-packages/BCBio/GFF/GFFParser.py", line 477, in _add_children_to_parent
        cur_parent.sub_features.append(cur_child)
    AttributeError: 'SeqFeature' object has no attribute 'sub_features'
    >>> import Bio
    >>> Bio.__version__
    '1.68'
    

    The same code worked with Biopython 1.67, so it seems likely to be an issue resulting from changes made in the 1.68 release.
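
    Until a fixed bcbio-gff release is available, one can guard against the incompatibility explicitly. A minimal sketch (the version check is illustrative, not part of bcbio-gff):

    import Bio

    # bcbio-gff releases predating the fix rely on SeqFeature.sub_features,
    # which Biopython removed in 1.68 -- hence the AttributeError above.
    major, minor = (int(part) for part in Bio.__version__.split(".")[:2])
    if (major, minor) >= (1, 68):
        raise RuntimeError("Pin biopython==1.67 or upgrade bcbio-gff to a "
                           "release that no longer uses sub_features.")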

    opened by khughitt 8
  • docs: Fix a few typos

    There are small typos in:

    • posts/conferences/bosc2018_day1a.md
    • posts/seminars/tumor_heterogeneity_carter.md

    Fixes:

    • Should read suppressors rather than supressors.
    • Should read service rather than serivce.

    Semi-automated pull request generated by https://github.com/timgates42/meticulous/blob/master/docs/NOTE.md

    opened by timgates42 0
  • glimmergff_to_proteins.py / Alternative Codon Table

    Hi,

    I'd like to add this issue for those who'd like to use the script with an alternative codon table.

    Example: if you want to translate the sequence with codon table 6, you need to change the script as follows: protein_seq = gene_seq.translate(6)
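
    A minimal illustration of that change, using Biopython's translate with NCBI table 6 (the Ciliate nuclear code, where TAA/TAG encode glutamine):

    from Bio.Seq import Seq

    gene_seq = Seq("ATGTAAGGCTGA")       # toy sequence for illustration
    print(gene_seq.translate())          # standard table 1: M*G*
    print(gene_seq.translate(table=6))   # table 6: MQG*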

    Regards, Zeynep

    opened by zeynepkurtw 0
  • glimmergff_to_proteins.py / Reordering Fasta file

    Hi,

    Thank you for this very useful script!!

    I was wondering if it's possible to create the protein multi-FASTA file in the order of the ref_file (contigs) instead of the glimmer_file (gff)?

    I've re-ordered my assembly from large contigs to smaller ones. However, when I run this script, I get my protein multi-FASTA file in the order of glimmer.gff instead of the contigs.
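
    One way to reorder after the fact is to sort the protein records by the contig order of the assembly. A sketch, assuming (hypothetically) that each protein id is prefixed with its contig name and that the filenames match yours:

    from Bio import SeqIO

    # rank each contig by its position in the re-ordered assembly
    contig_order = [rec.id for rec in SeqIO.parse("assembly.fa", "fasta")]
    rank = {contig: i for i, contig in enumerate(contig_order)}

    proteins = list(SeqIO.parse("proteins.fa", "fasta"))
    proteins.sort(key=lambda rec: rank.get(rec.id.split("_")[0], len(rank)))
    SeqIO.write(proteins, "proteins_reordered.fa", "fasta")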

    Regards, Zeynep

    opened by zeynepkurtw 0
  • Any chance a new release will be made sometime soon?

    I was just wondering if it would be possible for a new release of the gff package to be made sometime soon. The fix from #126 would be really nice to have in a released version.

    opened by DavyCats 0
  • IndexError with NCBI gff

    Hi!

    I annotated a bacterium (Acidipropionibacterium acidipropionici - strain FAM19036) with NCBI PGAP.

    I wanted to create SeqIO objects from the gff file, but it failed:

    import pprint
    from BCBio import GFF
    from BCBio.GFF import GFFExaminer
    examiner = GFFExaminer()
    with open('data/FAM19036/annot.gff') as in_handle:
        pprint.pprint(examiner.available_limits(in_handle))
    print("------------------------------------------------------------")
    with open('data/FAM19036/annot.gff') as in_handle:
        for rec in GFF.parse(in_handle):
            print(rec)
    
    {'gff_id': {('CP040634.1',): 6772},
     'gff_source': {('.',): 3361,
                    ('GeneMarkS-2+',): 360,
                    ('Local',): 1,
                    ('Protein Homology',): 2916,
                    ('cmsearch',): 24,
                    ('tRNAscan-SE',): 110},
     'gff_source_type': {('.', 'exon'): 8,
                         ('.', 'gene'): 3208,
                         ('.', 'pseudogene'): 137,
                         ('.', 'rRNA'): 8,
                         ('GeneMarkS-2+', 'CDS'): 360,
                         ('Local', 'region'): 1,
                         ('Protein Homology', 'CDS'): 2916,
                         ('cmsearch', 'RNase_P_RNA'): 1,
                         ('cmsearch', 'SRP_RNA'): 1,
                         ('cmsearch', 'exon'): 7,
                         ('cmsearch', 'rRNA'): 4,
                         ('cmsearch', 'riboswitch'): 10,
                         ('cmsearch', 'tmRNA'): 1,
                         ('tRNAscan-SE', 'exon'): 55,
                         ('tRNAscan-SE', 'tRNA'): 55},
     'gff_type': {('CDS',): 3276,
                  ('RNase_P_RNA',): 1,
                  ('SRP_RNA',): 1,
                  ('exon',): 70,
                  ('gene',): 3208,
                  ('pseudogene',): 137,
                  ('rRNA',): 12,
                  ('region',): 1,
                  ('riboswitch',): 10,
                  ('tRNA',): 55,
                  ('tmRNA',): 1}}
    ------------------------------------------------------------
    
    Error
    Traceback (most recent call last):
      File "/usr/lib64/python3.7/unittest/case.py", line 59, in testPartExecutor
        yield
      File "/usr/lib64/python3.7/unittest/case.py", line 628, in run
        testMethod()
      File "/project/gene_loci_comparison/test_gene_loci_comparison.py", line 129, in test_recreate_gff_bug
        for rec in GFF.parse(in_handle):
      File "/project/venvs/gene_loci_comparison/lib64/python3.7/site-packages/BCBio/GFF/GFFParser.py", line 746, in parse
        target_lines):
      File "/project/venvs/gene_loci_comparison/lib64/python3.7/site-packages/BCBio/GFF/GFFParser.py", line 327, in parse_in_parts
        cur_dict = self._results_to_features(cur_dict, results)
      File "/project/venvs/gene_loci_comparison/lib64/python3.7/site-packages/BCBio/GFF/GFFParser.py", line 369, in _results_to_features
        base = self._add_directives(base, results.get('directive', []))
      File "/project/venvs/gene_loci_comparison/lib64/python3.7/site-packages/BCBio/GFF/GFFParser.py", line 388, in _add_directives
        val = (val[0], int(val[1]) - 1, int(val[2]))
    IndexError: tuple index out of range
    

    To recreate the bug, here is the relevant gff file.

    Thanks in advance.

    Edit: bcbio-gff version 0.6.6
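
    If the crash does come from a ##sequence-region directive that is missing its start/end fields (an assumption based on the IndexError raised in _add_directives), filtering such lines before parsing may work around it:

    from io import StringIO

    from BCBio import GFF

    with open("annot.gff") as in_handle:
        # drop any ##sequence-region directive lacking start/end coordinates
        cleaned = "".join(
            line for line in in_handle
            if not (line.startswith("##sequence-region")
                    and len(line.split()) < 4)
        )

    for rec in GFF.parse(StringIO(cleaned)):
        print(rec.id, len(rec.features))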

    opened by MrTomRod 0
  • Did not find remapped ID location:

    I'm trying to parse a gff file downloaded from NCBI (GCA_001536265), and when I iterate over the parser it gives me this error: Did not find remapped ID location: gene670, [[42143, 44074], [44736, 45087], [45979, 46332], [47064, 47369]], [42143, 47369]

    Inspecting the GFF with GFFExaminer gives no error at all.

    opened by fbeghini 2
Owner
Brad Chapman
Biologist and programmer