1190 Repositories
Python analysis-pipeline Libraries
Hatchet is a Python-based library that allows Pandas dataframes to be indexed by structured tree and graph data.
Hatchet Hatchet is a Python-based library that allows Pandas dataframes to be indexed by structured tree and graph data. It is intended for analyzing
A scrapy pipeline that provides an easy way to store files and images using various folder structures.
scrapy-folder-tree This is a scrapy pipeline that provides an easy way to store files and images using various folder structures. Supported folder str
PCAfold is an open-source Python library for generating, analyzing and improving low-dimensional manifolds obtained via Principal Component Analysis (PCA).
PCAfold is an open-source Python library for generating, analyzing and improving low-dimensional manifolds obtained via Principal Component Analysis (PCA).
NFCDS Workshop Beginners Guide Bioinformatics Data Analysis
Genomics Workshop FIXME: overview of workshop Code of Conduct All participants s
Malware-analysis-writeups - Some of my Malware Analysis writeups
About This repo contains some malware analysis writeups i've created over time m
Docov - Light-weight, recursive docstring coverage analysis for python modules
docov Light-weight, recursive docstring coverage analysis for python modules. Ov
Reverse engineering the dengue virus (under development construction)
Reverse engineering the dengue virus (under development 🚧 ) What is dengue? Dengue is a viral infection transmitted to humans through the bite of inf
Proyecto - Desgaste y rendimiento de empleados de IBM HR Analytics
Acceder al código desde Google Colab para poder ver de manera adecuada todas las visualizaciones y poder interactuar con ellas. Links de acceso: Noteb
Create a machine learning model which will predict if the mortgage will be approved or not based on 5 variables
Mortgage-Application-Analysis Create a machine learning model which will predict if the mortgage will be approved or not based on 5 variables: age, in
LSTM model - IMDB review sentiment analysis
NLP - Movie review sentiment analysis The colab notebook contains the code for building a LSTM Recurrent Neural Network that gives 87-88% accuracy on
This GitHub Repository contains Data Analysis projects that I have completed so far! While most of th project are focused on Data Analysis, some of them are also put here to show off other skills that I have learned.
Welcome to my Data Analysis projects page! This GitHub Repository contains Data Analysis projects that I have completed so far! While most of th proje
Senator Trades Monitor
Senator Trades Monitor This monitor will grab the most recent trades by senators and send them as a webhook to discord. Installation To use the monito
Edison AT is software Depression Assistant personal.
Edison AT Edison AT is software / program Depression Assistant personal. Feature: Analyze emotional real-time from face. Audio Edison(Comingsoon relea
spaCy-wrap: For Wrapping fine-tuned transformers in spaCy pipelines
spaCy-wrap: For Wrapping fine-tuned transformers in spaCy pipelines spaCy-wrap is minimal library intended for wrapping fine-tuned transformers from t
Transform a Raspberry Pi into a network diagnostic machine.
EtherView Last updated jan 30, 2022. Welcome to the EtherView project! This is a project to transform a RaspberryPi into a portable network diagnostic
PySpark Structured Streaming ROS Kafka ApacheSpark Cassandra
PySpark-Structured-Streaming-ROS-Kafka-ApacheSpark-Cassandra The purpose of this project is to demonstrate a structured streaming pipeline with Apache
Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python
Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python 📊
The mitosheet package, trymito.io, and other public Mito code.
Mito Monorepo Mito is a spreadsheet that lives inside your JupyterLab notebooks. It allows you to edit Pandas dataframes like an Excel file, and gener
An introduction to satellite image analysis using Python + OpenCV and JavaScript + Google Earth Engine
A Gentle Introduction to Satellite Image Processing Welcome to this introductory course on Satellite Image Analysis! Satellite imagery has become a pr
Analyzing the most strategic words to guess on Wordle, based on letter frequency distributions
wordle-analysis Evaluating different heuristics to determine the most effective solving strategy and building an AI-powered assistant tool to help you
Implemented Exploratory Data Analysis (EDA) using Python.Built a dashboard in Tableau and found that 45.87% of People suffer from heart disease.
Heart_Disease_Diagnostic_Analysis Objective 🎯 The aim of this project is to use the given data and perform ETL and data analysis to infer key metrics
clustering moroccan stocks time series data using k-means with dtw (dynamic time warping)
Moroccan Stocks Clustering Context Hey! we don't always have to forecast time series am I right ? We use k-means to cluster about 70 moroccan stock pr
A web app builds using streamlit API with python backend to analyze and pick insides from multiple data formats.
Data-Analysis-Web-App Data Analysis Web App can analysis data in multiple formates(csv, txt, xls, xlsx, ods, odt) and gives shows you the analysis in
Pytorch Implementation of Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations
NANSY: Unofficial Pytorch Implementation of Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations Notice Papers' D
Heart Arrhythmia Classification
This program takes and input of an ECG in European Data Format (EDF) and outputs the classification for heartbeats into normal vs different types of arrhythmia . It uses a deep learning model for classification purposes.
Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis
Readme File for "Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis" by Ham, Imai, and Janson. (2022) All scripts were written and
A deep learning framework for historical document image analysis
DIVA-DAF Description A deep learning framework for historical document image analysis. How to run Install dependencies # clone project git clone https
Do Smart Glasses Dream of Sentimental Visions? Deep Emotionship Analysis for Eyewear Devices
EMOShip This repository contains the EMO-Film dataset described in the paper "Do Smart Glasses Dream of Sentimental Visions? Deep Emotionship Analysis
PyTorch implementation of our paper How robust are discriminatively trained zero-shot learning models?
How robust are discriminatively trained zero-shot learning models? This repository contains the PyTorch implementation of our paper How robust are dis
Hyperparameters tuning and features selection are two common steps in every machine learning pipeline.
shap-hypetune A python package for simultaneous Hyperparameters Tuning and Features Selection for Gradient Boosting Models. Overview Hyperparameters t
Detailed analysis on fraud claims in insurance companies, gives you information as to why huge loss take place in insurance companies
Insurance-Fraud-Claims Detailed analysis on fraud claims in insurance companies, gives you information as to why huge loss take place in insurance com
Statistical & Probabilistic Analysis of Store Sales, University Survey, & Manufacturing data
Statistical_Modelling Statistical & Probabilistic Analysis of Store Sales, University Survey, & Manufacturing data Statistical Methods for Decision Ma
Election Exit Poll Prediction and U.S.A Presidential Speech Analysis using Machine Learning
Machine_Learning Election Exit Poll Prediction and U.S.A Presidential Speech Analysis using Machine Learning This project is based on 2 case-studies:
Final Project for Practical Python Programming and Algorithms for Data Analysis
Final Project for Practical Python Programming and Algorithms for Data Analysis (PHW2781L, Summer 2020) Redlining, Race-Exclusive Deed Restriction Lan
Annotating the Tweebank Corpus on Named Entity Recognition and Building NLP Models for Social Media Analysis
TweebankNLP This repo contains the new Tweebank-NER dataset and off-the-shelf Twitter-Stanza pipeline for state-of-the-art Tweet NLP, as described in
Data-sets from the survey and analysis
bachelor-thesis "Umfragewerte.xlsx" contains the orginal survey results. "umfrage_alle.csv" contains the survey results but one participant is cancele
Compute execution plan: A DAG representation of work that you want to get done. Individual nodes of the DAG could be simple python or shell tasks or complex deeply nested parallel branches or embedded DAGs themselves.
Hello from magnus Magnus provides four capabilities for data teams: Compute execution plan: A DAG representation of work that you want to get done. In
E2EDNA2 - An automated pipeline for simulation of DNA aptamers complexed with small molecules and short peptides
E2EDNA2 - An automated pipeline for simulation of DNA aptamers complexed with small molecules and short peptides
Graph Analysis From Scratch
Graph Analysis From Scratch Goal In this notebook we wanted to implement some functionalities to analyze a weighted graph only by using algorithms imp
Data analysis and visualisation projects from a range of individual projects and applications
Python-Data-Analysis-and-Visualisation-Projects Data analysis and visualisation projects from a range of individual projects and applications. Python
Python scripts for semi-automated morphometric analysis of atolls from Landsat satellite Imagery.
AtollGeoMorph Python scripts for semi-automated morphometric analysis of atolls from Landsat satellite Imagery. The python scripts included allow user
FairLens is an open source Python library for automatically discovering bias and measuring fairness in data
FairLens FairLens is an open source Python library for automatically discovering bias and measuring fairness in data. The package can be used to quick
Awesome Transformers in Medical Imaging
This repo supplements our Survey on Transformers in Medical Imaging Fahad Shamshad, Salman Khan, Syed Waqas Zamir, Muhammad Haris Khan, Munawar Hayat,
This code is for our paper "VTGAN: Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers"
ICCV Workshop 2021 VTGAN This code is for our paper "VTGAN: Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers"
Segmentation Training Pipeline
Segmentation Training Pipeline This package is a part of Musket ML framework. Reasons to use Segmentation Pipeline Segmentation Pipeline was developed
Built a deep neural network (DNN) that functions as an end-to-end machine translation pipeline
Built a deep neural network (DNN) that functions as an end-to-end machine translation pipeline. The pipeline accepts english text as input and returns the French translation.
Equipped customers with insights about their EVs Hourly energy consumption and helped predict future charging behavior using LSTM model
Equipped customers with insights about their EVs Hourly energy consumption and helped predict future charging behavior using LSTM model. Designed sample dashboard with insights and recommendation for customers.
Nobel Data Analysis
Nobel_Data_Analysis This project is for analyzing a set of data about people who have won the Nobel Prize in different fields and different countries
Analyze the Gravitational wave data stored at LIGO/VIRGO observatories
Gravitational-Wave-Analysis This project showcases how to analyze the Gravitational wave data stored at LIGO/VIRGO observatories, using Python program
Ingestinator is my personal VFX pipeline tool for ingesting folders containing frame sequences that have been pulled and downloaded to a local folder
Ingestinator Ingestinator is my personal VFX pipeline tool for ingesting folders containing frame sequences that have been pulled and downloaded to a
Stanford CoreNLP provides a set of natural language analysis tools written in Java
Stanford CoreNLP Stanford CoreNLP provides a set of natural language analysis tools written in Java. It can take raw human language text input and giv
Synthetic data need to preserve the statistical properties of real data in terms of their individual behavior and (inter-)dependences
Synthetic data need to preserve the statistical properties of real data in terms of their individual behavior and (inter-)dependences. Copula and functional Principle Component Analysis (fPCA) are statistical models that allow these properties to be simulated (Joe 2014). As such, copula generated data have shown potential to improve the generalization of machine learning (ML) emulators (Meyer et al. 2021) or anonymize real-data datasets (Patki et al. 2016).
visualize_ML is a python package made to visualize some of the steps involved while dealing with a Machine Learning problem
visualize_ML visualize_ML is a python package made to visualize some of the steps involved while dealing with a Machine Learning problem. It is build
An curated collection of awesome resources about networking in cybersecurity
An ongoing curated collection of awesome software, libraries, frameworks, talks & videos, best practices, learning tutorials and important practical resources about networking in cybersecurity
Feature engineering and machine learning: together at last
Feature engineering and machine learning: together at last! Lambdo is a workflow engine which significantly simplifies data analysis by unifying featu
Perform sentiment analysis on textual data that people generally post on websites like social networks and movie review sites.
Sentiment Analyzer The goal of this project is to perform sentiment analysis on textual data that people generally post on websites like social networ
Sentiment Classification using WSD, Maximum Entropy & Naive Bayes Classifiers
Sentiment Classification using WSD, Maximum Entropy & Naive Bayes Classifiers
Python for Data Analysis, 2nd Edition
Python for Data Analysis, 2nd Edition Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media Buy
INF42 - Topological Data Analysis
TDA INF421(Conception et analyse d'algorithmes) Projet : Topological Data Analysis SphereMin Etant donné un nuage des points, ce programme contient de
This is an open solution to the Home Credit Default Risk challenge 🏡
Home Credit Default Risk: Open Solution This is an open solution to the Home Credit Default Risk challenge 🏡 . More competitions 🎇 Check collection
Google AI Open Images - Object Detection Track: Open Solution
Google AI Open Images - Object Detection Track: Open Solution This is an open solution to the Google AI Open Images - Object Detection Track 😃 More c
TGS Salt Identification Challenge
TGS Salt Identification Challenge This is an open solution to the TGS Salt Identification Challenge. Note Unfortunately, we can no longer provide supp
This is an analysis and prediction project for house prices in King County, USA based on certain features of the house
This is a project for analysis and estimation of House Prices in King County USA The .csv file contains the data of the house and the .ipynb file con
PROTEIN EXPRESSION ANALYSIS FOR DOWN SYNDROME
PROTEIN-EXPRESSION-ANALYSIS-FOR-DOWN-SYNDROME Down syndrome (DS) is a chromosomal disorder where organisms have an extra chromosome 21, sometimes know
Python Sreamlit Duplicate Records Finder Remover
Python-Sreamlit-Duplicate-Records-Finder-Remover Streamlit is an open-source Python library that makes it easy to create and share beautiful, custom w
Open solution to the Toxic Comment Classification Challenge
Starter code: Kaggle Toxic Comment Classification Challenge More competitions 🎇 Check collection of public projects 🎁 , where you can find multiple
Colab notebook and additional materials for Python-driven analysis of redlining data in Philadelphia
RedliningExploration The Google Colaboratory file contained in this repository contains work inspired by a project on educational inequality in the Ph
Tsar-Bot - Crypto auto trade bot that use sentiment analysis from twitter
Tsar Bot - Crypto Sentiment Bot Tsar Bot is a Twitter Crypto Sentiment Bot that
Twitter Analysis of MIUUL CEO
Twitter Analysis of MIUUL CEO Business Problem I got last @mvahitkeskin 184 twee
TweebankNLP - Pre-trained Tweet NLP Pipeline (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Models + Tweebank-NER
TweebankNLP This repo contains the new Tweebank-NER dataset and Twitter-Stanza p
PrimaryBid - Transform application Lifecycle Data and Design and ETL pipeline architecture for ingesting data from multiple sources to redshift
Transform application Lifecycle Data and Design and ETL pipeline architecture for ingesting data from multiple sources to redshift This project is composed of two parts: Part1 and Part2
Pipeline for employing a Lightweight deep learning models for LOW-power systems
PL-LOW A high-performance deep learning model lightweight pipeline that gradually lightens deep neural networks in order to utilize high-performance d
Me and @nathanmargni did a small analysis on what are the best strategies to win more games of League of Legends.
Me and @nathanmargni did a small analysis on what are the best strategies to win more games of League of Legends.
Python Computer Vision from Scratch
This repository explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos.
In this project, we compared Spanish BERT and Multilingual BERT in the Sentiment Analysis task.
Applying BERT Fine Tuning to Sentiment Classification on Amazon Reviews Abstract Sentiment analysis has made great progress in recent years, due to th
Twitter-NLP-Analysis - Twitter Natural Language Processing Analysis
Twitter-NLP-Analysis Business Problem I got last @turk_politika 3000 tweets with
Random Number Generator Analysis With Python
Random-Number-Generator-Analysis Governor's Honors Program Project to determine
Implement backup and recovery with AWS Backup across your AWS Organizations using a CI/CD pipeline (AWS CodePipeline).
Backup and Recovery with AWS Backup This repository provides you with a management and deployment solution for implementing Backup and Recovery with A
This is a place where I'm playing around with pandas to analyze data in a csv/excel file.
pandas-csv-excel-analysis This is a place where I'm playing around with pandas to analyze data in a csv/excel file. 0-start A very simple cheat sheet
Pipeline code for Sequential-GAM(Genome Architecture Mapping).
Sequential-GAM Pipeline code for Sequential-GAM(Genome Architecture Mapping). mapping whole_preprocess.sh include the whole processing of mapping. usa
⛤Keylogger Generator for Windows written in Python⛤
⛤Keylogger Generator for Windows written in Python⛤
Auxiliator is telegram bot for basic web-application analysis
Auxiliator Auxiliator is telegram bot for basic web-application analysis What for? Sometimes there is no access to your main PC, where you can scan we
Latent Network Models to Account for Noisy, Multiply-Reported Social Network Data
VIMuRe Latent Network Models to Account for Noisy, Multiply-Reported Social Network Data. If you use this code please cite this article (preprint). De
Exploring the Top ML and DL GitHub Repositories
This repository contains my work related to my project where I scraped data on the most popular machine learning and deep learning GitHub repositories in order to further visualize and analyze it.
Yesitsme - Simple OSINT script to find Instagram profiles by name and e-mail/phone
Simple OSINT script to find Instagram profiles by name and e-mail/phone
A Unified Framework and Analysis for Structured Knowledge Grounding
UnifiedSKG 📚 : Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models Code for paper UnifiedSKG: Unifying and Mu
[AI6122] Text Data Management & Processing
[AI6122] Text Data Management & Processing is an elective course of MSAI, SCSE, NTU, Singapore. The repository corresponds to the AI6122 of Semester 1, AY2021-2022, starting from 08/2021. The instructor of this course is Prof. Sun Aixin.
Python data processing, analysis, visualization, and data operations
Python This is a Python data processing, analysis, visualization and data operations of the source code warehouse, book ISBN: 9787115527592 Descriptio
Sentiment analysis translations of the Bhagavad Gita
Sentiment and Semantic Analysis of Bhagavad Gita Translations It is well known that translations of songs and poems not only breaks rhythm and rhyming
Fully Adaptive Bayesian Algorithm for Data Analysis (FABADA) is a new approach of noise reduction methods. In this repository is shown the package developed for this new method based on \citepaper.
Fully Adaptive Bayesian Algorithm for Data Analysis FABADA FABADA is a novel non-parametric noise reduction technique which arise from the point of vi
Performance Analysis of Multi-user NOMA Wireless-Powered mMTC Networks: A Stochastic Geometry Approach
Performance Analysis of Multi-user NOMA Wireless-Powered mMTC Networks: A Stochastic Geometry Approach Thanh Luan Nguyen, Tri Nhu Do, Georges Kaddoum
This project consists of data analysis and data visualization (done using python)of all IPL seasons from 2008 to 2019 and answering the most asked questions about the IPL.
IPL-data-analysis This project consists of data analysis and data visualization of all IPL seasons from 2008 to 2019 and answering the most asked ques
Pizza Orders Data Pipeline Usecase Solved by SQL, Sqoop, HDFS, Hive, Airflow.
PizzaOrders_DataPipeline There is a Tony who is owning a New Pizza shop. He knew that pizza alone was not going to help him get seed funding to expand
Project for the discipline of Visual Data Analysis at EMAp FGV.
Analysis of the dissemination of fake news about COVID-19 on Twitter This project was the final work for the discipline of Visual Data Analysis of the
The final project of "Applying AI to EHR Data" of "AI for Healthcare" nanodegree - Udacity.
Patient Selection for Diabetes Drug Testing Project Overview EHR data is becoming a key source of real-world evidence (RWE) for the pharmaceutical ind
Use graph-based analysis to re-classify stocks and to improve Markowitz portfolio optimization
Dynamic Stock Industrial Classification Use graph-based analysis to re-classify stocks and experiment different re-classification methodologies to imp
Self-supervised spatio-spectro-temporal represenation learning for EEG analysis
EEG-Oriented Self-Supervised Learning and Cluster-Aware Adaptation This repository provides a tensorflow implementation of a submitted paper: EEG-Orie
Madanalysis5 - A package for event file analysis and recasting of LHC results
Welcome to MadAnalysis 5 Outline What is MadAnalysis 5? Requirements Downloading
Aircraft design optimization made fast through modern automatic differentiation
Aircraft design optimization made fast through modern automatic differentiation. Plug-and-play analysis tools for aerodynamics, propulsion, structures, trajectory design, and much more.
Stock-Prediction - prediction of stock market movements using sentiment analysis and deep learning.
Stock-Prediction- In this project, we aim to enhance the prediction of stock market movements using sentiment analysis and deep learning. We divide th