This program analyzes a DNA sequence and outputs snippets of DNA that are likely to be protein-coding genes.

Overview

The Gene_finder program

This program is designed to manipulate data from DNA and find the genes that are encoded by the DNA sequence. It contains 11 independent functions.These functions are :

  • get_complement(nucleotide)

    • Returns the complementary nucleotide nucleotide: a nucleotide (A, C, G, or T) represented as a string returns: the complementary nucleotide get_complement('A') 'T'
  • get_reverse_complement(dna)

    • Computes the reverse complementary sequence of DNA for the specfied DNA sequence
  • rest_of_ORF(dna)

    • Takes a DNA sequence that is assumed to begin with a start codon and returns the sequence up to but not including the first in frame stop codon. If there is no in frame stop codon, returns the whole string.
  • find_all_ORFs_oneframe(dna)

    • Finds all non-nested open reading frames in the given DNA sequence and returns them as a list. This function should only find ORFs that are in the default frame of the sequence (i.e. they start on indices that are multiples of 3). By non-nested we mean that if an ORF occurs entirely within another ORF, it should not be included in the returned list of ORFs.
  • find_all_ORFs(dna)

    • Finds all non-nested open reading frames in the given DNA sequence in all 3 possible frames and returns them as a list. By non-nested we mean that if an ORF occurs entirely within another ORF and they are both in the same frame, it should not be included in the returned list of ORFs.
  • longest_ORF(dna)

    • Finds the longest ORF on both strands of the specified DNA and returns it as a string
  • find_all_ORFs_both_strands(dna)

    • Finds all non-nested open reading frames in the given DNA sequence on both strands
  • shuffle_string(s)

    • Shuffles the characters in the input string
  • longest_ORF_noncoding(dna, num_trials)

    • Computes the maximum length of the longest ORF over num_trials shuffles of the specfied DNA sequence
  • coding_strand_to_AA(dna)

    • Computes the Protein encoded by a sequence of DNA. This function does not check for start and stop codons (it assumes that the input DNA sequence represents an protein coding region)
  • gene_finder(dna)

    • Returns the amino acid sequences that are likely coded by the specified dna

To start the program import the MainMenu() function from the menu module:

disclaimer!!

In the menu module, MainMenu function ... you need to ensure the path matches your path. The path given here matches the data given

You might also like...
Python Library for learning (Structure and Parameter) and inference (Statistical and Causal) in Bayesian Networks.

pgmpy pgmpy is a python library for working with Probabilistic Graphical Models. Documentation and list of algorithms supported is at our official sit

 ๐Ÿงช Panel-Chemistry - exploratory data analysis and build powerful data and viz tools within the domain of Chemistry using Python and HoloViz Panel.
๐Ÿงช Panel-Chemistry - exploratory data analysis and build powerful data and viz tools within the domain of Chemistry using Python and HoloViz Panel.

๐Ÿงช๐Ÿ“ˆ ๐Ÿ. The purpose of the panel-chemistry project is to make it really easy for you to do DATA ANALYSIS and build powerful DATA AND VIZ APPLICATIONS within the domain of Chemistry using using Python and HoloViz Panel.

Created covid data pipeline using PySpark and MySQL that collected data stream from API and do some processing and store it into MYSQL database.
Created covid data pipeline using PySpark and MySQL that collected data stream from API and do some processing and store it into MYSQL database.

Created covid data pipeline using PySpark and MySQL that collected data stream from API and do some processing and store it into MYSQL database.

Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

AWS Data Wrangler Pandas on AWS Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretMana

First and foremost, we want dbt documentation to retain a DRY principle. Every time we repeat ourselves, we waste our time. Second, we want to understand column level lineage and automate impact analysis.

dbt-osmosis First and foremost, we want dbt documentation to retain a DRY principle. Every time we repeat ourselves, we waste our time. Second, we wan

Statistical Analysis ๐Ÿ“ˆ focused on statistical analysis and exploration used on various data sets for personal and professional projects.
Statistical Analysis ๐Ÿ“ˆ focused on statistical analysis and exploration used on various data sets for personal and professional projects.

Statistical Analysis ๐Ÿ“ˆ This repository focuses on statistical analysis and the exploration used on various data sets for personal and professional pr

Working Time Statistics of working hours and working conditions by industry and company

Working Time Statistics of working hours and working conditions by industry and company

A python package which can be pip installed to perform statistics and visualize binomial and gaussian distributions of the dataset

GBiStat package A python package to assist programmers with data analysis. This package could be used to plot : Binomial Distribution of the dataset p

ToeholdTools is a Python package and desktop app designed to facilitate analyzing and designing toehold switches, created as part of the 2021 iGEM competition.

ToeholdTools Category Status Repository Package Build Quality A library for the analysis of toehold switch riboregulators created by the iGEM team Cit

Owner
null
AptaMat is a simple script which aims to measure differences between DNA or RNA secondary structures.

AptaMAT Purpose AptaMat is a simple script which aims to measure differences between DNA or RNA secondary structures. The method is based on the compa

GEC UTC 3 Nov 3, 2022
peptides.py is a pure-Python package to compute common descriptors for protein sequences

peptides.py Physicochemical properties and indices for amino-acid sequences. ??๏ธ Overview peptides.py is a pure-Python package to compute common descr

Martin Larralde 32 Dec 31, 2022
A program that uses an API and a AI model to get info of sotcks

Stock-Market-AI-Analysis I dont mind anyone using this code but please give me credit A program that uses an API and a AI model to get info of stocks

null 1 Dec 17, 2021
a tool that compiles a csv of all h1 program stats

h1stats - h1 Program Stats Scraper This python3 script will call out to HackerOne's graphql API and scrape all currently active programs for informati

Evan 40 Oct 27, 2022
Display the behaviour of a realtime program with a scope or logic analyser.

1. A monitor for realtime MicroPython code This library provides a means of examining the behaviour of a running system. It was initially designed to

Peter Hinch 17 Dec 5, 2022
My first Python project is a simple Mad Libs program.

Python CLI Mad Libs Game My first Python project is a simple Mad Libs program. Mad Libs is a phrasal template word game created by Leonard Stern and R

Carson Johnson 1 Dec 10, 2021
DefAP is a program developed to facilitate the exploration of a material's defect chemistry

DefAP is a program developed to facilitate the exploration of a material's defect chemistry. A large number of features are provided and rapid exploration is supported through the use of autoplotting with carefully considered automatic data labelling and simplification options enabling production of publication quality plots.

null 6 Oct 25, 2022
OpenARB is an open source program aiming to emulate a free market while encouraging players to participate in arbitrage in order to increase working capital.

Overview OpenARB is an open source program aiming to emulate a free market while encouraging players to participate in arbitrage in order to increase

Tom 3 Feb 12, 2022
A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using the tools and APIs you know and love from the PyData stack (such as numpy, pandas, and scikit-learn).

This tutorial's purpose is to introduce Pythonistas to methods for scaling their data science and machine learning work to larger datasets and larger models, using the tools and APIs they know and love from the PyData stack (such as numpy, pandas, and scikit-learn).

Coiled 102 Nov 10, 2022