3123 Repositories
Python how-to-move-data-to-any-cloud Libraries
Dag-bakery - Dag Bakery enables the capability to define Airflow DAGs via YAML.
DAG Bakery - WIP 🔧 dag-bakery aims to simplify our DAG development by removing all the boilerplate and duplicated code when defining multiple DAG cro
Autoscaling volumes for Kubernetes (with the help of Prometheus)
Kubernetes Volume Autoscaler (with Prometheus) This repository contains a service that automatically increases the size of a Persistent Volume Claim i
PyTorch Implementation of the paper Single Image Texture Translation for Data Augmentation
SITT The repo contains official PyTorch Implementation of the paper Single Image Texture Translation for Data Augmentation. Authors: Boyi Li Yin Cui T
[CVPR 2021] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
Scan2Cap: Context-aware Dense Captioning in RGB-D Scans Introduction We introduce the task of dense captioning in 3D scans from commodity RGB-D sensor
ECCV2020 paper: Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code and Data.
This repo contains some of the codes for the following paper Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code
Streaming Finance Data with AWS Lambda
A data pipeline consisting of an AWS lambda function reading data from yfinance API, an AWS Kinesis stream to receive & store data in S3 buckets and AWS Glue crawler & Athena to run SQL queries.
Building a Robust IOT device which is customizable, encrypted, secure and user friendly
Building a Robust IOT device which is customizable, encrypted, secure and user friendly, which uses a single GPIO pin to extract multiple sensor values
Google scholar share - Simple python script to pull Google Scholar data from an author's profile
google_scholar_share Simple python script to pull Google Scholar data from an au
Splore - a simple graphical interface for scrolling through and exploring data sets of molecules
Scroll through and exPLORE molecule sets The splore framework aims to offer a si
Autoencoder - Reducing the Dimensionality of Data with Neural Network
autoencoder Implementation of the Reducing the Dimensionality of Data with Neural Network – G. E. Hinton and R. R. Salakhutdinov paper. Notes Aim to m
Preprossing-loan-data-with-NumPy - In this project, I have cleaned and pre-processed the loan data that belongs to an affiliate bank based in the United States.
Preprossing-loan-data-with-NumPy In this project, I have cleaned and pre-processed the loan data that belongs to an affiliate bank based in the United
Imports VZD (Latvian State Land Service) open data into postgis enabled database
Python script main.py downloads and imports Latvian addresses into PostgreSQL database. Data contains parishes, counties, cities, towns, and streets.
Covid-ml-predictors - COVID predictions using AI.
COVID Predictions This repo contains ML models to be trained on COVID-19 data from the UK, sourced off of Kaggle here. This uses many different ML mod
OMDB-and-TasteDive-Mashup - Mashing up data from two different APIs to make movie recommendations.
OMDB-and-TasteDive-Mashup This hadns-on project is in the Python 3 Programming Specialization offered by University of Michigan via Coursera. Mashing
SAT Project - The first project I had done at General Assembly, performed EDA, data cleaning and created data visualizations
Project 1: Standardized Test Analysis by Adam Klesc Overview This project covers: Basic statistics and probability Many Python programming concepts Pr
UnpNet - Rethinking 3-D LiDAR Point Cloud Segmentation(IEEE TNNLS)
UnpNet Citation Please cite the following paper if you use this repository in your reseach. @article {PMID:34914599, Title = {Rethinking 3-D LiDAR Po
Churn-Prediction-Project - In this project, a churn prediction model is developed for a private bank as a term project for Data Mining class.
Churn-Prediction-Project In this project, a churn prediction model is developed for a private bank as a term project for Data Mining class. Project in
Both social media sentiment and stock market data are crucial for stock price prediction
Relating-Social-Media-to-Stock-Movement-Public - We explore the application of Machine Learning for predicting the return of the stock by using the information of stock returns. A trading strategy based on this analysis leads to increased trading profits up to three times compared with a simple buy and hold strategy.
Data Analytics: Modeling and Studying data relating to climate change and adoption of electric vehicles
Correlation-Study-Climate-Change-EV-Adoption Data Analytics: Modeling and Studying data relating to climate change and adoption of electric vehicles I
Python library to prevent XSS(cross site scripting attach) by removing harmful content from data.
A tool for removing malicious content from input data before saving data into database. It takes input containing HTML with XSS scripts and returns va
Markov bot - A Writing bot based on Markov Chain for Data Structure Lab
基于马尔可夫链的写作机器人 前端 用html/css完成 Demo展示(已给出文本的相应展示) 用户提供相关的语料库后训练的成果 后端 要完成的几个接口 解析文
Get-countries-info - A python code that fetches data of any country
Country-info A python code getting countries information including country's map
A python script to extract answers to any question on Quora (Quora+ included)
quora-plus-bypass A python script to extract answers to any question on Quora (Quora+ included) Requirements Python 3.x
IP Rover - An Excellent OSINT tool to get information of any ip address
IP Rover - An Excellent OSINT tool to get information of any ip address. All details are explained in below screenshot
Peloton Stats to Google Sheets with Data Visualization through Seaborn and Plotly
Peloton Stats to Google Sheets with Data Visualization through Seaborn and Plotly Problem: 2 peloton users were looking for a way to track their metri
Acc-Data-Gen - Allows you to generate a password, e-mail & token for your Minecraft Account
Acc-Data-Gen Allows you to generate a password, e-mail & token for your Minecraft Account How to use the generator: Move all the files in a single dir
Easyeda2kicad.py - Convert any LCSC components (including EasyEDA) to KiCad library
easyeda2kicad.py A Python script that convert any electronic components from LCSC or EasyEDA to a Kicad library Installation git clone https://github.
Awesome-google-colab - Google Colaboratory Notebooks and Repositories
Unofficial Google Colaboratory Notebook and Repository Gallery Please contact me to take over and revamp this repo (it gets around 30k views and 200k
Practical-statistics-for-data-scientists - Code repository for O'Reilly book
Code repository Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python by Peter Bruce, Andrew Bruce, and Peter Gedeck Pub
Awesome Artificial Intelligence, Machine Learning and Deep Learning as we learn it
Awesome Artificial Intelligence, Machine Learning and Deep Learning as we learn it. Study notes and a curated list of awesome resources of such topics.
Osmnx-examples - Usage examples, demos, and tutorials for OSMnx.
OSMnx Examples OSMnx is a Python package to work with street networks and other spatial data from OpenStreetMap: retrieve, model, analyze, and visuali
Pytorch-3dunet - 3D U-Net model for volumetric semantic segmentation written in pytorch
pytorch-3dunet PyTorch implementation 3D U-Net and its variants: Standard 3D U-Net based on 3D U-Net: Learning Dense Volumetric Segmentation from Spar
Data-science-on-gcp - Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
data-science-on-gcp Source code accompanying book: Data Science on the Google Cloud Platform, 2nd Edition Valliappa Lakshmanan O'Reilly, Jan 2022 Bran
Awesome-AI-books - Some awesome AI related books and pdfs for learning and downloading
Awesome AI books Some awesome AI related books and pdfs for downloading and learning. Preface This repo only used for learning, do not use in business
🕵 Artificial Intelligence for social control of public administration
Non-tech crash course into Operação Serenata de Amor Tech crash course into Operação Serenata de Amor Contributing with code and tech skills Supportin
Dbt-core - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Dbt-core - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
AutoGluon: AutoML for Text, Image, and Tabular Data
AutoML for Text, Image, and Tabular Data AutoGluon automates machine learning tasks enabling you to easily achieve strong predictive performance in yo
When traveling in the backcountry during winter time, updating yourself on current and recent weather data is important to understand likely avalanche danger.
Weather Data When traveling in the backcountry during winter time, updating yourself on current and recent weather data is important to understand lik
Doing bayesian data analysis - Python/PyMC3 versions of the programs described in Doing bayesian data analysis by John K. Kruschke
Doing_bayesian_data_analysis This repository contains the Python version of the R programs described in the great book Doing bayesian data analysis (f
Coursera - Quiz & Assignment of Coursera
Coursera Assignments This repository is aimed to help Coursera learners who have difficulties in their learning process. The quiz and programming home
Hitchhikers-guide - The Hitchhiker's Guide to Data Science for Social Good
Welcome to the Hitchhiker's Guide to Data Science for Social Good. What is the Data Science for Social Good Fellowship? The Data Science for Social Go
Speech-Emotion-Analyzer - The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
Speech Emotion Analyzer The idea behind creating this project was to build a machine learning model that could detect emotions from the speech we have
Sigma coding youtube - This is a collection of all the code that can be found on my YouTube channel Sigma Coding.
Sigma Coding Tutorials & Resources YouTube • Facebook Support Sigma Coding Patreon • GitHub Sponsor • Shop Amazon Table of Contents Overview Topics Re
Aws-machine-learning-university-accelerated-tab - Machine Learning University: Accelerated Tabular Data Class
Machine Learning University: Accelerated Tabular Data Class This repository contains slides, notebooks, and datasets for the Machine Learning Universi
Fastquant - Backtest and optimize your trading strategies with only 3 lines of code!
fastquant 🤓 Bringing backtesting to the mainstream fastquant allows you to easily backtest investment strategies with as few as 3 lines of python cod
Spark-movie-lens - An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
A scalable on-line movie recommender using Spark and Flask This Apache Spark tutorial will guide you step-by-step into how to use the MovieLens datase
DAT4 - General Assembly's Data Science course in Washington, DC
DAT4 Course Repository Course materials for General Assembly's Data Science course in Washington, DC (12/15/14 - 3/16/15). Instructors: Sinan Ozdemir
Books, Presentations, Workshops, Notebook Labs, and Model Zoo for Software Engineers and Data Scientists wanting to learn the TF.Keras Machine Learning framework
Books, Presentations, Workshops, Notebook Labs, and Model Zoo for Software Engineers and Data Scientists wanting to learn the TF.Keras Machine Learning framework
Lolviz - A simple Python data-structure visualization tool for lists of lists, lists, dictionaries; primarily for use in Jupyter notebooks / presentations
lolviz By Terence Parr. See Explained.ai for more stuff. A very nice looking javascript lolviz port with improvements by Adnan M.Sagar. A simple Pytho
Kaggle-titanic - A tutorial for Kaggle's Titanic: Machine Learning from Disaster competition. Demonstrates basic data munging, analysis, and visualization techniques. Shows examples of supervised machine learning techniques.
Kaggle-titanic This is a tutorial in an IPython Notebook for the Kaggle competition, Titanic Machine Learning From Disaster. The goal of this reposito
Statistical-Rethinking-with-Python-and-PyMC3 - Python/PyMC3 port of the examples in " Statistical Rethinking A Bayesian Course with Examples in R and Stan" by Richard McElreath
Statistical Rethinking with Python and PyMC3 This repository has been deprecated in favour of this one, please check that repository for updates, for
Orbivator AI - To Determine which features of data (measurements) are most important for diagnosing breast cancer and find out if breast cancer occurs or not.
Orbivator_AI Breast Cancer Wisconsin (Diagnostic) GOAL To Determine which features of data (measurements) are most important for diagnosing breast can
Islam - This is a simple python script.In this script I have written all the suras of Al Quran. As a result, by using this script, you can know the number of any sura at the moment.
Introduction: If you want to know sura number of al quran by just typing the name of sura than you can use this script. Usage in termux: $ pkg install
X-news - Pipeline data use scrapy, kafka, spark streaming, spark ML and elasticsearch, Kibana
X-news - Pipeline data use scrapy, kafka, spark streaming, spark ML and elasticsearch, Kibana
Numerical-computing-is-fun - Learning numerical computing with notebooks for all ages.
As much as this series is to educate aspiring computer programmers and data scientists of all ages and all backgrounds, it is also a reminder to mysel
Emissary - open source Kubernetes-native API gateway for microservices built on the Envoy Proxy
Emissary-ingress Emissary-Ingress is an open-source Kubernetes-native API Gateway + Layer 7 load balancer + Kubernetes Ingress built on Envoy Proxy. E
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation Created by Charles R. Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas from Sta
Autosub - Command-line utility for auto-generating subtitles for any video file
Auto-generated subtitles for any video Autosub is a utility for automatic speech recognition and subtitle generation. It takes a video or an a
50-days-of-Statistics-for-Data-Science - This repository consist of a 50-day program
50-days-of-Statistics-for-Data-Science - This repository consist of a 50-day program. All the statistics required for the complete understanding of data science will be uploaded in this repository.
The Master's in Data Science Program run by the Faculty of Mathematics and Information Science
The Master's in Data Science Program run by the Faculty of Mathematics and Information Science is among the first European programs in Data Science and is fully focused on data engineering and data analytics.
UltraGraphQL - a GraphQL interface for querying and modifying RDF data on the Web.
UltraGraphQL - cloned from https://git.rwth-aachen.de/i5/ultragraphql Updated or extended files: build.gradle: updated maven to use maven {url "https:
Linescanning - Package for (pre)processing of anatomical and (linescanning) fMRI data
line scanning repository This repository contains all of the tools used during the acquisition and postprocessing of line scanning data at the Spinoza
Lavrigon - A Python Webservice to check the status of any given local service via a REST call
lavrigon A Python Webservice to check the status of any given local service via
Mortgage-loan-prediction - Show how to perform advanced Analytics and Machine Learning in Python using a full complement of PyData utilities
Mortgage-loan-prediction - Show how to perform advanced Analytics and Machine Learning in Python using a full complement of PyData utilities. This is aimed at those looking to get into the field of Data Science or those who are already in the field and looking to solve a real-world project with python.
Using SQLite within Python to create database and analyze Starcraft 2 units data (Pandas also used)
SQLite python Starcraft 2 English This project shows the usage of SQLite with python. To create, modify and communicate with the SQLite database from
Advanced_Data_Visualization_Tools - The present hands-on lab mainly uses Immigration to Canada dataset and employs advanced visualization tools such as word cloud, and waffle plot to display relations between features within the dataset.
Hands-on Practice Learning Lab for Data Science Overview This hands on practice lab is a part of Data Visualization with Python course offered by Cour
Codeflare - Scale complex AI/ML pipelines anywhere
Scale complex AI/ML pipelines anywhere CodeFlare is a framework to simplify the integration, scaling and acceleration of complex multi-step analytics
A project consists in a set of assignements corresponding to a BI process: data integration, construction of an OLAP cube, qurying of a OPLAP cube and reporting.
TennisBusinessIntelligenceProject - A project consists in a set of assignements corresponding to a BI process: data integration, construction of an OLAP cube, qurying of a OPLAP cube and reporting.
2019 Data Science Bowl
Kaggle-2019-Data-Science-Bowl-Solution - Here i present my solution to kaggle 2019 data science bowl and how i improved it to win a silver medal in that competition.
Team nan solution repository for FPT data-centric competition. Data augmentation, Albumentation, Mosaic, Visualization, KNN application
FPT_data_centric_competition - Team nan solution repository for FPT data-centric competition. Data augmentation, Albumentation, Mosaic, Visualization, KNN application
TwitterDataStreaming - Twitter data streaming using APIs
Twitter_Data_Streaming Twitter data streaming using APIs Use Case 1: Streaming r
Storage-optimizer - Identify potintial optimizations on the cloud storage accounts
Storage Optimizer Identify potintial optimizations on the cloud storage accounts
SpecAugmentPyTorch - A Pytorch (support batch and channel) implementation of GoogleBrain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
SpecAugment An implementation of SpecAugment for Pytorch How to use Install pytorch, version=1.9.0 (new feature (torch.Tensor.take_along_dim) is used
Resmed_myair_sensors - This is a Home Assistant custom component to pull daily CPAP data from ResMed's myAir service using an undocumented API
resmed_myair This component will set up the following platforms. Platform Description sensor Show info from the myAir API. Installation Using the tool
Get-Phone-Number-Details-using-Python - To get the details of any number, we can use an amazing Python module known as phonenumbers.
Get-Phone-Number-Details-using-Python To get the details of any number, we can use an amazing Python module known as phonenumbers. We can use the amaz
Genshin-assets - 👧 Public documentation & static assets for Genshin Impact data.
genshin-assets This repo provides easy access to the Genshin Impact assets, primarily for use on static sites. Sources Genshin Optimizer - An Artifact
Xmas-Tree-GIF-Tool - Convert any given animated gif file into an animation in GIFT CSV format
This repo is made to participate in Matt Parker's XmasTree 2021 event. Convert a
Class-imbalanced / Long-tailed ensemble learning in Python. Modular, flexible, and extensible
IMBENS: Class-imbalanced Ensemble Learning in Python Language: English | Chinese/中文 Links: Documentation | Gallery | PyPI | Changelog | Source | Downl
The codebase for Data-driven general-purpose voice activity detection.
Data driven GPVAD Repository for the work in TASLP 2021 Voice activity detection in the wild: A data-driven approach using teacher-student training. S
WaveFake: A Data Set to Facilitate Audio DeepFake Detection
WaveFake: A Data Set to Facilitate Audio DeepFake Detection This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper
If Google News had a Python library
pygooglenews If Google News had a Python library Created by Artem from newscatcherapi.com but you do not need anything from us or from anyone else to
Pytorch implementation of paper Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data
Pytorch implementation of paper Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data
Code used for the results in the paper "ClassMix: Segmentation-Based Data Augmentation for Semi-Supervised Learning"
Code used for the results in the paper "ClassMix: Segmentation-Based Data Augmentation for Semi-Supervised Learning" Getting started Prerequisites CUD
Reduce end to end training time from days to hours (or hours to minutes), and energy requirements/costs by an order of magnitude using coresets and data selection.
COResets and Data Subset selection Reduce end to end training time from days to hours (or hours to minutes), and energy requirements/costs by an order
[CVPR'2020] DeepDeform: Learning Non-rigid RGB-D Reconstruction with Semi-supervised Data
DeepDeform (CVPR'2020) DeepDeform is an RGB-D video dataset containing over 390,000 RGB-D frames in 400 videos, with 5,533 optical and scene flow imag
Code that accompanies the paper Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance
Semi-supervised Deep Kernel Learning This is the code that accompanies the paper Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data
CCCL: Contrastive Cascade Graph Learning.
CCGL: Contrastive Cascade Graph Learning This repo provides a reference implementation of Contrastive Cascade Graph Learning (CCGL) framework as descr
Get-web-images - A python code that get images from any site
image retrieval This is a python code to retrieve an image from the internet, a
This is a scalable system that reads messages from public Telegram channels using Telethon and stores the data in a PostgreSQL database.
This is a scalable system that reads messages from public Telegram channels using Telethon and stores the data in a PostgreSQL database. Its original intention is to monitor cryptocurrency related channels, but it can be configured to read any Telegram data that is accessible through the API.
This is a realtime voice translator program which gets input from user at any language and converts it to the desired language that the user asks
This is a realtime voice translator program which gets input from user at any language and converts it to the desired language that the user asks ...
Serverless function for replicating weather underground data to an influxDB database
Weather Underground → Influx DB 🌤 Serverless function for replicating Weather U
Generate and Visualize Data Lineage from query history
Tokern Lineage Engine Tokern Lineage Engine is fast and easy to use application to collect, visualize and analyze column-level data lineage in databas
Analysing poker data from home games with friends
Poker Game Analysis Analysing poker data from home games with friends. Not a lot of data is collected, so this project is primarily focussed on descri
eXPeditious Data Transfer
xpdt: eXPeditious Data Transfer About xpdt is (yet another) language for defining data-types and generating code for serializing and deserializing the
Numerical Methods with Python, Numpy and Matplotlib
Numerical Bric-a-Brac Collections of numerical techniques with Python and standard computational packages (Numpy, SciPy, Numba, Matplotlib ...). Diffe
This Jupyter notebook shows one way to implement a simple first-order low-pass filter on sampled data in discrete time.
How to Implement a First-Order Low-Pass Filter in Discrete Time We often teach or learn about filters in continuous time, but then need to implement t
A module to get data about anime characters, news, info, lyrics and more.
Animec A module to get data about anime characters, news, info, lyrics and more. The module scrapes myanimelist to parse requested data. If you wish t
Python interface to IEX and IEX cloud APIs
Python interface to IEX Cloud Referral Please subscribe to IEX Cloud using this referral code. Getting Started Install Install from pip pip install py
A library to generate synthetic time series data by easy-to-use factors and generator
timeseries-generator This repository consists of a python packages that generates synthetic time series dataset in a generic way (under /timeseries_ge
Data reduction pipeline for KOALA on the AAT.
KOALA KOALA, the Kilofibre Optical AAT Lenslet Array, is a wide-field, high efficiency, integral field unit used by the AAOmega spectrograph on the 3.