2908 Repositories
Python data-studio-google Libraries
Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning
Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning This repository provides an implementation of the paper Beta S
Used Logistic Regression, Random Forest, and XGBoost to predict the outcome of Search & Destroy games from the Call of Duty World League for the 2018 and 2019 seasons.
Call of Duty World League: Search & Destroy Outcome Predictions Growing up as an avid Call of Duty player, I was always curious about what factors led
converts nominal survey data into a numerical value based on a dictionary lookup.
SWAP RATE Converts nominal survey data into a numerical values based on a dictionary lookup. It allows the user to switch nominal scale data from text
This is a place where I'm playing around with pandas to analyze data in a csv/excel file.
pandas-csv-excel-analysis This is a place where I'm playing around with pandas to analyze data in a csv/excel file. 0-start A very simple cheat sheet
SynapseML - an open source library to simplify the creation of scalable machine learning pipelines
Synapse Machine Learning SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines. Sy
Convert any binary data to a PNG image file and vice versa.
What is PngBin? The name PngBin comes from an image format file extension PNG (Portable Network Graphics) and the word Binary. An image produced by Pn
PyGRANSO: A PyTorch-enabled port of GRANSO with auto-differentiation
PyGRANSO PyGRANSO: A PyTorch-enabled port of GRANSO with auto-differentiation Please check https://ncvx.org/PyGRANSO for detailed instructions (introd
This project is for finding a solution to use Security Onion Elastic data with Jupyter Notebooks.
This project is for finding a solution to use Security Onion Elastic data with Jupyter Notebooks. The goal is to successfully use this notebook project below with Security Onion for beacon detection capabilities.
Latent Network Models to Account for Noisy, Multiply-Reported Social Network Data
VIMuRe Latent Network Models to Account for Noisy, Multiply-Reported Social Network Data. If you use this code please cite this article (preprint). De
Magic: The Gathering Arena draft tool that utilizes 17Lands data
MTGA_Draft_17Lands Magic: The Gathering Arena draft tool that utilizes 17Lands data. Steps for Windows Step 1: Download and unzip the MTGA_Draft_17Lan
Exploring the Top ML and DL GitHub Repositories
This repository contains my work related to my project where I scraped data on the most popular machine learning and deep learning GitHub repositories in order to further visualize and analyze it.
[ACM MM 2021] Multiview Detection with Shadow Transformer (and View-Coherent Data Augmentation)
Multiview Detection with Shadow Transformer (and View-Coherent Data Augmentation) [arXiv] [paper] @inproceedings{hou2021multiview, title={Multiview
MSDorkDump is a Google Dork File Finder that queries a specified domain name and variety of file extensions
MSDorkDump is a Google Dork File Finder that queries a specified domain name and variety of file extensions (pdf, doc, docx, etc), and downloads them.
Binance harvester - A Python 3 script to harvest data from the Binance socket stream and calculate popular TA indicators and produce lists of top trending coins
Binance harvester - A Python 3 script to harvest data from the Binance socket stream and calculate popular TA indicators and produce lists of top trending coins
Dex-scrapper - Hobby project for scrapping dex data on VeChain
Folders /zumo_abis # abi extracted from zumo repo /zumo_pools # runtime e
A Unified Framework and Analysis for Structured Knowledge Grounding
UnifiedSKG ๐ : Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models Code for paper UnifiedSKG: Unifying and Mu
[AI6122] Text Data Management & Processing
[AI6122] Text Data Management & Processing is an elective course of MSAI, SCSE, NTU, Singapore. The repository corresponds to the AI6122 of Semester 1, AY2021-2022, starting from 08/2021. The instructor of this course is Prof. Sun Aixin.
Data collection, enhancement, and metrics calculation.
l3_data_collection Data collection, enhancement, and metrics calculation. Summary Repository containing code for QuantDAO's JDT data collection task.
Data App Performance Tests
Data App Performance Tests My hypothesis is that The different architectures of
Implementation of Axial attention - attending to multi-dimensional data efficiently
Axial Attention Implementation of Axial attention in Pytorch. A simple but powerful technique to attend to multi-dimensional data efficiently. It has
learn python in 100 days, a simple step could be follow from beginner to master of every aspect of python programming and project also include side project which you can use as demo project for your personal portfolio
learn python in 100 days, a simple step could be follow from beginner to master of every aspect of python programming and project also include side project which you can use as demo project for your personal portfolio
Completed task 1 and task 2 at LetsGrowMore as a data science intern.
LetsGrowMore-Internship Completed task 1 and task 2 at LetsGrowMore as a data science intern. Task 1- Task 2- Creating a Decision Tree classifier and
100 Days of Code The Complete Python Pro Bootcamp for 2022
100-Day-With-Python 100 Days of Code - The Complete Python Pro Bootcamp for 2022. In this course, I spend with python language over 100 days, and I up
Python data processing, analysis, visualization, and data operations
Python This is a Python data processing, analysis, visualization and data operations of the source code warehouse, book ISBN: 9787115527592 Descriptio
Mercury: easily convert Python notebook to web app and share with others
Mercury Share your Python notebooks with others Easily convert your Python notebooks into interactive web apps by adding parameters in YAML. Simply ad
NW 2022 Hackathon Project by Angelique Clara Hanzel, Aryan Sonik, Damien Fung, Ramit Brata Biswas
Spiral-Data-Visualizer NW 2022 Hackathon Project by Angelique Clara Hanzell, Aryan Sonik, Damien Fung, Ramit Brata Biswas Description This project vis
Dimension Reduced Turbulent Flow Data From Deep Vector Quantizers
Dimension Reduced Turbulent Flow Data From Deep Vector Quantizers This is an implementation of A Physics-Informed Vector Quantized Autoencoder for Dat
This repository gives an example on how to preprocess the data of the HECKTOR challenge
HECKTOR 2021 challenge This repository gives an example on how to preprocess the data of the HECKTOR challenge. Any other preprocessing is welcomed an
Code and data (Incidents Dataset) for ECCV 2020 Paper "Detecting natural disasters, damage, and incidents in the wild".
Incidents Dataset See the following pages for more details: Project page: IncidentsDataset.csail.mit.edu. ECCV 2020 Paper "Detecting natural disasters
This repo generates the training data and the model for Morpheus-Deblend
Morpheus-Deblend This repo generates the training data and the model for Morpheus-Deblend. This is the active development repo for the project and as
Deeplearning project at The Technological University of Denmark (DTU) about Neural ODEs for finding dynamics in ordinary differential equations and real world time series data
Authors Marcus Lenler Garsdal, [email protected] Valdemar Sรธgaard, [email protected] Simon Moe Sรธrensen, [email protected] Introduction This repo contains the
Accurate identification of bacteriophages from metagenomic data using Transformer
PhaMer is a python library for identifying bacteriophages from metagenomic data. PhaMer is based on a Transorfer model and rely on protein-based vocab
Fully Adaptive Bayesian Algorithm for Data Analysis (FABADA) is a new approach of noise reduction methods. In this repository is shown the package developed for this new method based on \citepaper.
Fully Adaptive Bayesian Algorithm for Data Analysis FABADA FABADA is a novel non-parametric noise reduction technique which arise from the point of vi
Data-driven reduced order modeling for nonlinear dynamical systems
SSMLearn Data-driven Reduced Order Models for Nonlinear Dynamical Systems This package perform data-driven identification of reduced order model based
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Awesome production machine learning This repository contains a curated list of awesome open source libraries that will help you deploy, monitor, versi
A curated list of awesome game datasets, and tools to artificial intelligence in games
๐ฎ Awesome Game Datasets In computer science, Artificial Intelligence (AI) is intelligence demonstrated by machines. Its definition, AI research as th
An awesome Data Science repository to learn and apply for real world problems.
AWESOME DATA SCIENCE An open source Data Science repository to learn and apply towards solving real world problems. This is a shortcut path to start s
๐ Papers & tech blogs by companies sharing their work on data science & machine learning in production.
applied-ml Curated papers, articles, and blogs on data science & machine learning in production. โ๏ธ Figuring out how to implement your ML project? Lea
A curated list of awesome DataOps tools
Awesome DataOps A curated list of awesome DataOps tools. Awesome DataOps Data Catalog Data Exploration Data Ingestion Data Lake Data Processing Data Q
Simple helper library to convert a collection of numpy data to tfrecord, and build a tensorflow dataset from the tfrecord.
numpy2tfrecord Simple helper library to convert a collection of numpy data to tfrecord, and build a tensorflow dataset from the tfrecord. Installation
DeepAmandine is an artificial intelligence that allows you to talk to it for hours, you won't know the difference.
DeepAmandine This is an artificial intelligence based on GPT-3 that you can chat with, it is very nice and makes a lot of jokes. We wish you a good ex
Drobo Status is a python program that will connect to your Drobo and return JSON data regarding your Drobo
This is a simple python script that will run a docker container to pull data from Drobo. It will give information like (Name, serial, firmware, disk-total, disk-used, disk-free and individual disk status). I have an older Drobo FS 8 disk array. Please let me know if anyone can test. The results are outputted via JSON
This project consists of data analysis and data visualization (done using python)of all IPL seasons from 2008 to 2019 and answering the most asked questions about the IPL.
IPL-data-analysis This project consists of data analysis and data visualization of all IPL seasons from 2008 to 2019 and answering the most asked ques
Astrostatistics class for the MSc degree in Astrophysics at the University of Milan-Bicocca (Italy)
Astrostatistics Davide Gerosa - [email protected] University of Milano-Bicocca, 2022. Schedule Introduction Probability and Statistics I Probabi
An open-source outlier detection package by Getcontact Data Team
pyfbad The pyfbad library supports anomaly detection projects. An end-to-end anomaly detection application can be written using the source codes of th
Pizza Orders Data Pipeline Usecase Solved by SQL, Sqoop, HDFS, Hive, Airflow.
PizzaOrders_DataPipeline There is a Tony who is owning a New Pizza shop. He knew that pizza alone was not going to help him get seed funding to expand
A cpp project template that uses CMake to build and Google Test / Github Actions to provide a CI
A cpp project template that uses CMake to build and Google Test / Github Actions to provide a CI
The final project of "Applying AI to 2D Medical Imaging Data" of "AI for Healthcare" nanodegree - Udacity.
Pneumonia Detection from X-Rays Project Overview In this project, you will apply the skills that you have acquired in this 2D medical imaging course t
Credit EDA Case Study Using Python
This case study aims to identify patterns which indicate if a client has difficulty paying their installments which may be used for taking actions such as denying the loan, reducing the amount of loan, lending (to risky applicants) at a higher interest rate, etc
The final project for "Applying AI to Wearable Device Data" course from "AI for Healthcare" - Udacity.
Motion Compensated Pulse Rate Estimation Overview This project has 2 main parts. Develop a Pulse Rate Algorithm on the given training data. Then Test
Project for the discipline of Visual Data Analysis at EMAp FGV.
Analysis of the dissemination of fake news about COVID-19 on Twitter This project was the final work for the discipline of Visual Data Analysis of the
The final project of "Applying AI to EHR Data" of "AI for Healthcare" nanodegree - Udacity.
Patient Selection for Diabetes Drug Testing Project Overview EHR data is becoming a key source of real-world evidence (RWE) for the pharmaceutical ind
The final project of "Applying AI to 3D Medical Imaging Data" from "AI for Healthcare" nanodegree - Udacity.
Quantifying Hippocampus Volume for Alzheimer's Progression Background Alzheimer's disease (AD) is a progressive neurodegenerative disorder that result
Vigia-youtube - The YouTube Watch bot is able to monitor channels on Google's video platform
Vigia do YouTube O bot Vigia do YouTube รฉ capaz de monitorar canais na plataform
Using Global fishing watch's data to build a machine learning model that can identify illegal fishing and poaching activities through satellite and geo-location data.
Using Global fishing watch's data to build a machine learning model that can identify illegal fishing and poaching activities through satellite and geo-location data.
A list of Python Bots used to extract data from several websites
A list of Python Bots used to extract data from several websites. Data extraction is for products on e-commerce (ecommerce) websites. Data fetched i
GEGVL: Google Earth Based Geoscience Video Library
Google Earth Based Geoscience Video Library is transforming to Server Based. The
MOOSE (Multi-organ objective segmentation) a data-centric AI solution that generates multilabel organ segmentations to facilitate systemic TB whole-person research
MOOSE (Multi-organ objective segmentation) a data-centric AI solution that generates multilabel organ segmentations to facilitate systemic TB whole-person research.The pipeline is based on nn-UNet and has the capability to segment 120 unique tissue classes from a whole-body 18F-FDG PET/CT image.
A simple baseline for the 2022 IEEE GRSS Data Fusion Contest (DFC2022)
DFC2022 Baseline A simple baseline for the 2022 IEEE GRSS Data Fusion Contest (DFC2022) This repository uses TorchGeo, PyTorch Lightning, and Segmenta
Calendar heatmaps from Pandas time series data
Note: See MarvinT/calmap for the maintained version of the project. That is also the version that gets published to PyPI and it has received several f
Unauthenticated Sqlinjection that leads to dump data base but this one impersonated Admin and drops a interactive shell
Unauthenticated Sqlinjection that leads to dump database but this one impersonated Admin and drops a interactive shell
Med to csv - A simple way to parse MedAssociate output file in tidy data
MedAssociates to CSV file A simple way to parse MedAssociate output file in tidy
This project is related to a No-SQL database, whose data are referred to autoctone botanic species
This project is related to a No-SQL database, whose data are referred to autoctone botanic species. The final goal is creating a function that performs the estimation of the ornamental value, given the specific characteristics of a single species.
CellRank's reproducibility repository.
CellRank's reproducibility repository We believe that reproducibility is key and have made it as simple as possible to reproduce our results. Please e
Data-Uncertainty Guided Multi-Phase Learning for Semi-supervised Object Detection
An official implementation of paper Data-Uncertainty Guided Multi-Phase Learning for Semi-supervised Object Detection
ALSPAC data analysis studying links between screen-usage and mental health issues in children. Provided data has been synthesised.
ADSMH - Mental Health and Screen Time Group coursework for Applied Data Science at the University of Bristol. Overview The data set that you have was
The repository is my code for various types of data visualization cases based on the Matplotlib library.
ScienceGallery The repository is my code for various types of data visualization cases based on the Matplotlib library. It summarizes the code and cas
STARCH compuets regional extreme storm physical characteristics and moisture balance based on spatiotemporal precipitation data from reanalysis or climate model data.
STARCH (Storm Tracking And Regional CHaracterization) STARCH computes regional extreme storm physical and moisture balance characteristics based on sp
Data aggregated from the reports found at the MCPS COVID Dashboard into a set of visualizations.
Montgomery County Public Schools COVID-19 Visualizer Contents About this project Data Support this project About this project Data All data we use can
This is the course project of AI3602: Data Mining of SJTU
This is the course project of AI3602: Data Mining of SJTU. Group Members include Jinghao Feng, Mingyang Jiang and Wenzhong Zheng.
Automatically pick a winner who Retweeted, Commented, and Followed your Twitter account!
AutomaticTwitterGiveaways automates selecting winners for "Retweet, Comment, Follow" type Twitter giveaways.
Fully Automated YouTube Channel โถ๏ธwith Added Extra Features.
Fully Automated Youtube Channel โโโโโ โโโโ โโโโโ โโโโโ โโโโ โโโโ โโโ โโโโ โโโโโ โโโโ โโโโโ โโโโโ โโโโ โโโโ โโโ โโโโ โโโโโ โโโโ โโโโโ โโโโโ โโโโ โโโโ
K Closest Points and Maximum Clique Pruning for Efficient and Effective 3D Laser Scan Matching (To appear in RA-L 2022)
KCP The official implementation of KCP: k Closest Points and Maximum Clique Pruning for Efficient and Effective 3D Laser Scan Matching, accepted for p
Python data loader for Solar Orbiter's (SolO) Energetic Particle Detector (EPD).
Data loader (and downloader) for Solar Orbiter/EPD energetic charged particle sensors EPT, HET, and STEP. Supports level 2 and low latency data provided by ESA's Solar Orbiter Archive.
Data Analytics on Genomes and Genetics
Data Analytics performed on On genomes and Genetics dataset to predict genetic disorder and disorder subclass. DONE by TEAM SIGMA!
Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages.
Video Games Web Scraper Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages. This
Predict the Site EUI, given the characteristics of the building and the weather data for the location of the building.
wids_datathon_2022 Description: Contains a data pipeline used to predict energy EUI Goals: Dataset exploration Automating the parameter fitting, gener
In this project , I play with the YouTube data API and extract trending videos in Nigeria on a particular day
YouTubeTrendingVideosAnalysis In this project , I played with the YouTube data API and extracted trending videos in Nigeria on a particular day. This
This repository structures data in title, summary, tags, sentiment given a fragment of a conversation
Understand-conversation-AI This repository structures data in title, summary, tags, sentiment given a fragment of a conversation How to install: pip i
Kinetics-Data-Preprocessing
Kinetics-Data-Preprocessing Kinetics-400 and Kinetics-600 are common video recognition datasets used by popular video understanding projects like Slow
Python Machine Learning Jupyter Notebooks (ML website)
Python Machine Learning Jupyter Notebooks (ML website) Dr. Tirthajyoti Sarkar, Fremont, California (Please feel free to connect on LinkedIn here) Also
Semi-Automated Data Processing
Perform semi automated exploratory data analysis, feature engineering and feature selection on provided dataset by visualizing every possibilities on each step and assisting the user to make a meaningful decision to achieve a low-bias and low-variance model.
The bidirectional mapping library for Python.
bidict The bidirectional mapping library for Python. Status bidict: has been used for many years by several teams at Google, Venmo, CERN, Bank of Amer
Python Library to get fast extensive Dummy Data for testing
Dumda Python Library to get fast extensive Dummy Data for testing https://pypi.org/project/dumda/ Installation pip install dumda Usage: Cities from d
Displaying plot of death rates from past years in Poland. Data source from these years is in readme
Average-Death-Rate Displaying plot of death rates from past years in Poland The goal collect the data from a CSV file count the ADR (Average Death Rat
Advanced raster and geometry manipulations
buzzard In a nutshell, the buzzard library provides powerful abstractions to manipulate together images and geometries that come from different kind o
Download and process GOES-16 and GOES-17 data from NOAA's archive on AWS using Python.
Download and display GOES-East and GOES-West data GOES-East and GOES-West satellite data are made available on Amazon Web Services through NOAA's Big
Shelf DB is a tiny document database for Python to stores documents or JSON-like data
Shelf DB Introduction Shelf DB is a tiny document database for Python to stores documents or JSON-like data. Get it $ pip install shelfdb shelfquery S
Exploratory Data Analysis of the 2019 Indian General Elections using a dataset from Kaggle.
2019-indian-election-eda Exploratory Data Analysis of the 2019 Indian General Elections using a dataset from Kaggle. This project is a part of the Cou
DCM is a set of tools that helps you to keep your data in your Django Models consistent.
Django Consistency Model DCM is a set of tools that helps you to keep your data in your Django Models consistent. Motivation You have a lot of legacy
To design and implement the Identification of Iris Flower species using machine learning using Python and the tool Scikit-Learn.
To design and implement the Identification of Iris Flower species using machine learning using Python and the tool Scikit-Learn.
The RAP community of practice includes all analysts and data scientists who are interested in adopting the working practices included in reproducible analytical pipelines (RAP) at NHS Digital.
The RAP community of practice includes all analysts and data scientists who are interested in adopting the working practices included in reproducible analytical pipelines (RAP) at NHS Digital.
Data Science Course at Dept. of Computer Engineering, Chula 2022
2110446 Data Science Course at Chula 2022 Short links for exercises: Week1: Intro to Numpy, Pandas Numpy: https://colab.research.google.com/github/kao
Udacity - Data Analyst Nanodegree - Project 4 - Wrangle and Analyze Data
WeRateDogs Twitter Data from 2015 to 2017 Udacity - Data Analyst Nanodegree - Project 4 - Wrangle and Analyze Data Table of Contents Introduction Proj
A simple notebook to stream torrent files directly to Google Drive using Google Colab.
Colab-Torrent-to-Drive Originally by FKLC, this is a simple notebook to stream torrent files directly to Google Drive using Google Colab. You can eith
BSDotPy, A module to get a bombsquad player's account data.
BSDotPy BSDotPy, A module to get a bombsquad player's account data from bombsquad's servers. Badges Provided By: shields.io Acknowledgements Issues Pu
Using knowledge-informed machine learning on the PRONOSTIA (FEMTO) and IMS bearing data sets. Predict remaining-useful-life (RUL).
Knowledge Informed Machine Learning using a Weibull-based Loss Function Exploring the concept of knowledge-informed machine learning with the use of a
Orange Chicken: Data-driven Model Generalizability in Crosslinguistic Low-resource Morphological Segmentation
Orange Chicken: Data-driven Model Generalizability in Crosslinguistic Low-resource Morphological Segmentation This repository contains code and data f
Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks
Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks arXiv preprint: https://arxiv.org/abs/2201.02143. Architec