3218 Repositories
Python open-data Libraries
Synthetic data need to preserve the statistical properties of real data in terms of their individual behavior and (inter-)dependences
Synthetic data need to preserve the statistical properties of real data in terms of their individual behavior and (inter-)dependences. Copula and functional Principle Component Analysis (fPCA) are statistical models that allow these properties to be simulated (Joe 2014). As such, copula generated data have shown potential to improve the generalization of machine learning (ML) emulators (Meyer et al. 2021) or anonymize real-data datasets (Patki et al. 2016).
An easy-to-use feature store
A feature store is a data storage system for data science and machine-learning. It can store raw data and also transformed features, which can be fed straight into an ML model or training script.
This repository contains FEDOT - an open-source framework for automated modeling and machine learning (AutoML)
package tests docs license stats support This repository contains FEDOT - an open-source framework for automated modeling and machine learning (AutoML
Learn Basic to advanced level Data visualisation techniques from this Repository
Data visualisation Hey, You can learn Basic to advanced level Data visualisation techniques from this Repository. Data visualization is the graphic re
Create charts with Python in a very similar way to creating charts using Chart.js
Create charts with Python in a very similar way to creating charts using Chart.js. The charts created are fully configurable, interactive and modular and are displayed directly in the output of the the cells of your jupyter notebook environment.
visualize_ML is a python package made to visualize some of the steps involved while dealing with a Machine Learning problem
visualize_ML visualize_ML is a python package made to visualize some of the steps involved while dealing with a Machine Learning problem. It is build
Equibles Stocks API for Python
Equibles Stocks API for Python Requirements. Python 2.7 and 3.4+ Installation & Usage pip install If the python package is hosted on Github, you can i
A fairly common feature in web applications to have links that open a popover when hovered
Add Popovers to Links in Flask App It is a fairly common feature in web applications to have links that open a popover when hovered. Twitter does this
WallAlley.bot is an open source and free to use financial discord bot originaly build for WallAlley server's community
WallAlley.bot About WallAlley.bot is an open source and free to use financial discord bot originaly build for WallAlley server's community. All data a
A server and client for passing data between computercraft computers/turtles across dimensions or even servers.
ccserver A server and client for passing data between computercraft computers/turtles across dimensions or even servers. pastebin get zUnE5N0v client
Feature engineering and machine learning: together at last
Feature engineering and machine learning: together at last! Lambdo is a workflow engine which significantly simplifies data analysis by unifying featu
INFO-H515 - Big Data Scalable Analytics
INFO-H515 - Big Data Scalable Analytics Jacopo De Stefani, Giovanni Buroni, Théo Verhelst and Gianluca Bontempi - Machine Learning Group Exercise clas
This module is used to create Convolutional AutoEncoders for Variational Data Assimilation
VarDACAE This module is used to create Convolutional AutoEncoders for Variational Data Assimilation. A user can define, create and train an AE for Dat
Tutorials, examples, collections, and everything else that falls into the categories: pattern classification, machine learning, and data mining
**Tutorials, examples, collections, and everything else that falls into the categories: pattern classification, machine learning, and data mining.** S
Perform sentiment analysis on textual data that people generally post on websites like social networks and movie review sites.
Sentiment Analyzer The goal of this project is to perform sentiment analysis on textual data that people generally post on websites like social networ
Dive into Machine Learning
Dive into Machine Learning Hi there! You might find this guide helpful if: You know Python or you're learning it 🐍 You're new to Machine Learning You
TensorDebugger (TDB) is a visual debugger for deep learning. It extends TensorFlow with breakpoints + real-time visualization of the data flowing through the computational graph
TensorDebugger (TDB) is a visual debugger for deep learning. It extends TensorFlow (Google's Deep Learning framework) with breakpoints + real-time visualization of the data flowing through the computational graph.
Python for Data Analysis, 2nd Edition
Python for Data Analysis, 2nd Edition Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media Buy
INF42 - Topological Data Analysis
TDA INF421(Conception et analyse d'algorithmes) Projet : Topological Data Analysis SphereMin Etant donné un nuage des points, ce programme contient de
Demonstrate the breadth and depth of your data science skills by earning all of the Databricks Data Scientist credentials
Data Scientist Learning Plan Demonstrate the breadth and depth of your data science skills by earning all of the Databricks Data Scientist credentials
sequitur is a library that lets you create and train an autoencoder for sequential data in just two lines of code
sequitur sequitur is a library that lets you create and train an autoencoder for sequential data in just two lines of code. It implements three differ
This is an open solution to the Home Credit Default Risk challenge 🏡
Home Credit Default Risk: Open Solution This is an open solution to the Home Credit Default Risk challenge 🏡 . More competitions 🎇 Check collection
Google AI Open Images - Object Detection Track: Open Solution
Google AI Open Images - Object Detection Track: Open Solution This is an open solution to the Google AI Open Images - Object Detection Track 😃 More c
TGS Salt Identification Challenge
TGS Salt Identification Challenge This is an open solution to the TGS Salt Identification Challenge. Note Unfortunately, we can no longer provide supp
Airbus Ship Detection Challenge
Airbus Ship Detection Challenge This is an open solution to the Airbus Ship Detection Challenge. Our goals We are building entirely open solution to t
This is an analysis and prediction project for house prices in King County, USA based on certain features of the house
This is a project for analysis and estimation of House Prices in King County USA The .csv file contains the data of the house and the .ipynb file con
Python Sreamlit Duplicate Records Finder Remover
Python-Sreamlit-Duplicate-Records-Finder-Remover Streamlit is an open-source Python library that makes it easy to create and share beautiful, custom w
Program that predicts the NBA mvp based on data from previous years.
NBA MVP Predictor A machine learning model using RandomForest Regression that predicts NBA MVP's using player data. Explore the docs » View Demo · Rep
Simple Translator in Python
Simple Translator in Python Project Description: In this project, we'll be making a very simple translator in Python using some libraries. Requirement
Open solution to the Toxic Comment Classification Challenge
Starter code: Kaggle Toxic Comment Classification Challenge More competitions 🎇 Check collection of public projects 🎁 , where you can find multiple
dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
dbd: database prototyping tool dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL d
Data-driven Computer Science UoB
COMS20011_2021 Data-driven Computer Science UoB Staff Laurence Aitchison [[email protected]] (unit director) Majid Mirmehdi [m.mirmehdi
Colab notebook and additional materials for Python-driven analysis of redlining data in Philadelphia
RedliningExploration The Google Colaboratory file contained in this repository contains work inspired by a project on educational inequality in the Ph
An interactive App to play with Spotify data, both from the Spotify Web API and from CSV datasets.
An interactive App to play with Spotify data, both from the Spotify Web API and from CSV datasets.
Airborne magnetic data of the Osborne Mine and Lightning Creek sill complex, Australia
Osborne Mine, Australia - Airborne total-field magnetic anomaly This is a section of a survey acquired in 1990 by the Queensland Government, Australia
This a classic fintech problem that introduces real life difficulties such as data imbalance. Check out the notebook to find out more!
Credit Card Fraud Detection Introduction Online transactions have become a crucial part of any business over the years. Many of those transactions use
FFCV: Fast Forward Computer Vision (and other ML workloads!)
Fast Forward Computer Vision: train models at a fraction of the cost with accele
'rl_UK' is an open-source command-line tool in Python for calculating the shortest path between BUS stop sequences in the UK
'rl_UK' is an open-source command-line tool in Python for calculating the shortest path between BUS stop sequences in the UK. As input files, it uses an ATCO-CIF file and 'OS Open Roads' dataset from Ordnance Survey Data Hub.
ServiceX Transformer that converts flat ROOT ntuples into columnwise data
ServiceX_Uproot_Transformer ServiceX Transformer that converts flat ROOT ntuples into columnwise data Usage You can invoke the transformer from the co
This repo is dedicated to the data extraction and manipulation of the World Bank's database called STEP.
Overview Welcome to the Step-X repository. This repo is dedicated to the data extraction and manipulation of the World Bank's database called STEP. Be
ECLARE: Extreme Classification with Label Graph Correlations
ECLARE ECLARE: Extreme Classification with Label Graph Correlations @InProceedings{Mittal21b, author = "Mittal, A. and Sachdeva, N. and Agrawal
Circuit Training: An open-source framework for generating chip floor plans with distributed deep reinforcement learning
Circuit Training: An open-source framework for generating chip floor plans with distributed deep reinforcement learning. Circuit Training is an open-s
Holehe OSINT - Email to Registered Accounts
holehe allows you to check if the mail is used on different sites like twitter, instagram and will retrieve information on sites with the forgotten password function.
CO2Ampel - This RaspberryPi project uses weather data to estimate the share of renewable energy in the power grid
CO2Ampel This RaspberryPi project uses weather data to estimate the share of ren
The scope of this project will be to build a data ware house on Google Cloud Platform that will help answer common business questions as well as powering dashboards
The scope of this project will be to build a data ware house on Google Cloud Platform that will help answer common business questions as well as powering dashboards.
RedlineSpam - Python tool to spam Redline Infostealer panels with legit looking data
RedlineSpam Python tool to spam Redline Infostealer panels with legit looking da
PrimaryBid - Transform application Lifecycle Data and Design and ETL pipeline architecture for ingesting data from multiple sources to redshift
Transform application Lifecycle Data and Design and ETL pipeline architecture for ingesting data from multiple sources to redshift This project is composed of two parts: Part1 and Part2
Conditional Generative Adversarial Networks (CGAN) for Mobility Data Fusion
This code implements the paper, Kim et al. (2021). Imputing Qualitative Attributes for Trip Chains Extracted from Smart Card Data Using a Conditional Generative Adversarial Network. Transportation Research Part C. Under Review.
Import, connect and transform data into Excel
xlwings_query Import, connect and transform data into Excel. Description The concept is to apply data transformations to a main query object. When the
Customizable and open-sourced bot for a few private servers
MarlBot A private bot for controlling monkeys and turtles. Why does this bot exist? The bot exists as a general-purpose community bot for a select few
Code for Private Recommender Systems: How Can Users Build Their Own Fair Recommender Systems without Log Data? (SDM 2022)
Private Recommender Systems: How Can Users Build Their Own Fair Recommender Systems without Log Data? (SDM 2022) We consider how a user of a web servi
A combination between python-flask, that fetch and send data from league client during champion select thanks to LCU
A combination between python-flask, that fetch data and send from league client during champion select thanks to LCU and compare picked champs to the gamesDataBase that we need to collect using my other python script and then send the games result to localhost:5000/members that will be read by electron-reactJS script to present the results as a GUI on browser (localhost:5000)
RedisJSON - a JSON data type for Redis
RedisJSON is a Redis module that implements ECMA-404 The JSON Data Interchange Standard as a native data type. It allows storing, updating and fetching JSON values from Redis keys (documents).
open source online judge based on Vue, Django and Docker
An onlinejudge system based on Python and Vue
Qcover is an open source effort to help exploring combinatorial optimization problems in Noisy Intermediate-scale Quantum(NISQ) processor.
Qcover is an open source effort to help exploring combinatorial optimization problems in Noisy Intermediate-scale Quantum(NISQ) processor. It is devel
100 Days of Code Learning program to keep a habit of coding daily and learn things at your own pace with help from our remote community.
100 Days of Code Learning program to keep a habit of coding daily and learn things at your own pace with help from our remote community.
Elkeid HUB - A rule/event processing engine maintained by the Elkeid Team that supports streaming/offline data processing
Elkeid HUB - A rule/event processing engine maintained by the Elkeid Team that supports streaming/offline data processing
x2 - a miniminalistic, open-source language created by iiPython
x2 is a miniminalistic, open-source language created by iiPython, inspired by x86 assembly and batch. It is a high-level programming language with low-level, easy-to-remember syntaxes, similar to x86 assembly.
Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning
Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning This repository provides an implementation of the paper Beta S
Used Logistic Regression, Random Forest, and XGBoost to predict the outcome of Search & Destroy games from the Call of Duty World League for the 2018 and 2019 seasons.
Call of Duty World League: Search & Destroy Outcome Predictions Growing up as an avid Call of Duty player, I was always curious about what factors led
converts nominal survey data into a numerical value based on a dictionary lookup.
SWAP RATE Converts nominal survey data into a numerical values based on a dictionary lookup. It allows the user to switch nominal scale data from text
This is a place where I'm playing around with pandas to analyze data in a csv/excel file.
pandas-csv-excel-analysis This is a place where I'm playing around with pandas to analyze data in a csv/excel file. 0-start A very simple cheat sheet
SynapseML - an open source library to simplify the creation of scalable machine learning pipelines
Synapse Machine Learning SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines. Sy
Convert any binary data to a PNG image file and vice versa.
What is PngBin? The name PngBin comes from an image format file extension PNG (Portable Network Graphics) and the word Binary. An image produced by Pn
PyGRANSO: A PyTorch-enabled port of GRANSO with auto-differentiation
PyGRANSO PyGRANSO: A PyTorch-enabled port of GRANSO with auto-differentiation Please check https://ncvx.org/PyGRANSO for detailed instructions (introd
This project is for finding a solution to use Security Onion Elastic data with Jupyter Notebooks.
This project is for finding a solution to use Security Onion Elastic data with Jupyter Notebooks. The goal is to successfully use this notebook project below with Security Onion for beacon detection capabilities.
Latent Network Models to Account for Noisy, Multiply-Reported Social Network Data
VIMuRe Latent Network Models to Account for Noisy, Multiply-Reported Social Network Data. If you use this code please cite this article (preprint). De
An open source two key macro-pad modeled to look like a cartoony melting popsicle
macropopsicle An open source two key macro-pad modeled to look like a cartoony melting popsicle. Build instructions Parts List -1x Top case half (3D p
Magic: The Gathering Arena draft tool that utilizes 17Lands data
MTGA_Draft_17Lands Magic: The Gathering Arena draft tool that utilizes 17Lands data. Steps for Windows Step 1: Download and unzip the MTGA_Draft_17Lan
Exploring the Top ML and DL GitHub Repositories
This repository contains my work related to my project where I scraped data on the most popular machine learning and deep learning GitHub repositories in order to further visualize and analyze it.
Yesitsme - Simple OSINT script to find Instagram profiles by name and e-mail/phone
Simple OSINT script to find Instagram profiles by name and e-mail/phone
[ACM MM 2021] Multiview Detection with Shadow Transformer (and View-Coherent Data Augmentation)
Multiview Detection with Shadow Transformer (and View-Coherent Data Augmentation) [arXiv] [paper] @inproceedings{hou2021multiview, title={Multiview
Binance harvester - A Python 3 script to harvest data from the Binance socket stream and calculate popular TA indicators and produce lists of top trending coins
Binance harvester - A Python 3 script to harvest data from the Binance socket stream and calculate popular TA indicators and produce lists of top trending coins
Dex-scrapper - Hobby project for scrapping dex data on VeChain
Folders /zumo_abis # abi extracted from zumo repo /zumo_pools # runtime e
A Unified Framework and Analysis for Structured Knowledge Grounding
UnifiedSKG 📚 : Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models Code for paper UnifiedSKG: Unifying and Mu
[AI6122] Text Data Management & Processing
[AI6122] Text Data Management & Processing is an elective course of MSAI, SCSE, NTU, Singapore. The repository corresponds to the AI6122 of Semester 1, AY2021-2022, starting from 08/2021. The instructor of this course is Prof. Sun Aixin.
Data collection, enhancement, and metrics calculation.
l3_data_collection Data collection, enhancement, and metrics calculation. Summary Repository containing code for QuantDAO's JDT data collection task.
Data App Performance Tests
Data App Performance Tests My hypothesis is that The different architectures of
An implementation of the efficient attention module.
Efficient Attention An implementation of the efficient attention module. Description Efficient attention is an attention mechanism that substantially
Implementation of Axial attention - attending to multi-dimensional data efficiently
Axial Attention Implementation of Axial attention in Pytorch. A simple but powerful technique to attend to multi-dimensional data efficiently. It has
PyElastica is the Python implementation of Elastica, an open-source software for the simulation of assemblies of slender, one-dimensional structures using Cosserat Rod theory.
PyElastica PyElastica is the python implementation of Elastica: an open-source project for simulating assemblies of slender, one-dimensional structure
learn python in 100 days, a simple step could be follow from beginner to master of every aspect of python programming and project also include side project which you can use as demo project for your personal portfolio
learn python in 100 days, a simple step could be follow from beginner to master of every aspect of python programming and project also include side project which you can use as demo project for your personal portfolio
Completed task 1 and task 2 at LetsGrowMore as a data science intern.
LetsGrowMore-Internship Completed task 1 and task 2 at LetsGrowMore as a data science intern. Task 1- Task 2- Creating a Decision Tree classifier and
100 Days of Code The Complete Python Pro Bootcamp for 2022
100-Day-With-Python 100 Days of Code - The Complete Python Pro Bootcamp for 2022. In this course, I spend with python language over 100 days, and I up
Python data processing, analysis, visualization, and data operations
Python This is a Python data processing, analysis, visualization and data operations of the source code warehouse, book ISBN: 9787115527592 Descriptio
Open Source API and interchange format for editorial timeline information.
OpenTimelineIO is currently in Public Beta. That means that it may be missing some essential features and there are large changes planned. During this phase we actively encourage you to provide feedback, requests, comments, and/or contributions.
Mercury: easily convert Python notebook to web app and share with others
Mercury Share your Python notebooks with others Easily convert your Python notebooks into interactive web apps by adding parameters in YAML. Simply ad
NW 2022 Hackathon Project by Angelique Clara Hanzel, Aryan Sonik, Damien Fung, Ramit Brata Biswas
Spiral-Data-Visualizer NW 2022 Hackathon Project by Angelique Clara Hanzell, Aryan Sonik, Damien Fung, Ramit Brata Biswas Description This project vis
EasyModerationKit is an open-source framework designed to moderate and filter inappropriate content.
EasyModerationKit is a public transparency statement. It declares any repositories and legalities used in the EasyModeration system. It allows for implementing EasyModeration into an advanced character/word/phrase detection system.
Dimension Reduced Turbulent Flow Data From Deep Vector Quantizers
Dimension Reduced Turbulent Flow Data From Deep Vector Quantizers This is an implementation of A Physics-Informed Vector Quantized Autoencoder for Dat
StableSims is an open-source project aimed at simulating MakerDAO's Dai stablecoin system
StableSims is an open-source project aimed at simulating MakerDAO's Dai stablecoin system, initially used for researching optimal incentive parameters for Liquidations 2.0.
This repository gives an example on how to preprocess the data of the HECKTOR challenge
HECKTOR 2021 challenge This repository gives an example on how to preprocess the data of the HECKTOR challenge. Any other preprocessing is welcomed an
Code and data (Incidents Dataset) for ECCV 2020 Paper "Detecting natural disasters, damage, and incidents in the wild".
Incidents Dataset See the following pages for more details: Project page: IncidentsDataset.csail.mit.edu. ECCV 2020 Paper "Detecting natural disasters
This repo generates the training data and the model for Morpheus-Deblend
Morpheus-Deblend This repo generates the training data and the model for Morpheus-Deblend. This is the active development repo for the project and as
Deeplearning project at The Technological University of Denmark (DTU) about Neural ODEs for finding dynamics in ordinary differential equations and real world time series data
Authors Marcus Lenler Garsdal, [email protected] Valdemar Søgaard, [email protected] Simon Moe Sørensen, [email protected] Introduction This repo contains the
Accurate identification of bacteriophages from metagenomic data using Transformer
PhaMer is a python library for identifying bacteriophages from metagenomic data. PhaMer is based on a Transorfer model and rely on protein-based vocab
Fully Adaptive Bayesian Algorithm for Data Analysis (FABADA) is a new approach of noise reduction methods. In this repository is shown the package developed for this new method based on \citepaper.
Fully Adaptive Bayesian Algorithm for Data Analysis FABADA FABADA is a novel non-parametric noise reduction technique which arise from the point of vi
Data-driven reduced order modeling for nonlinear dynamical systems
SSMLearn Data-driven Reduced Order Models for Nonlinear Dynamical Systems This package perform data-driven identification of reduced order model based
WorldsCollide - Final Fantasy VI Randomizer
FFVI Worlds Collide Worlds Collide is an open worlds randomizer for Final Fantas
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Awesome production machine learning This repository contains a curated list of awesome open source libraries that will help you deploy, monitor, versi