596 Repositories
Python science Libraries
Open-Source CI/CD platform for ML teams. Deliver ML products, better & faster. ⚡️🧑‍🔧
Deliver ML products, better & faster Giskard is an Open-Source CI/CD platform for ML teams. Inspect ML models visually from your Python notebook 📗 Re
Course materials for: Geospatial Data Science
Course materials for: Geospatial Data Science These course materials cover the lectures for the course held for the first time in spring 2022 at IT Un
Forecasting for knowable future events using Bayesian informative priors (forecasting with judgmental-adjustment).
What is judgyprophet? judgyprophet is a Bayesian forecasting algorithm based on Prophet, that enables forecasting while using information known by the
Display your data in an attractive way in your notebook!
Bloxs Bloxs is a simple python package that helps you display information in an attractive way (formed in blocks). Perfect for building dashboards, re
This repository contains the best free, hand-picked Data Science resources to equip you with industry-driven skills and an interview preparation kit.
Best Data Science Resources Hey, Data Enthusiasts out there! After lots of requests from the community, I finally came up with the best free D
Sample code and notebooks for Vertex AI, the end-to-end machine learning platform on Google Cloud
Google Cloud Vertex AI Samples Welcome to the Google Cloud Vertex AI sample repository. Overview The repository contains notebooks and community conte
Write python locally, execute SQL in your data warehouse
RasgoQL Write python locally, execute SQL in your data warehouse RasgoQL is a Python package that enables you to ea
This repository contains all the data analytics projects that I've worked on in python.
93_Python_Data_Analytics_Projects This repository contains all the data analytics projects that I've worked on in python. No. Name 01 001_Cervical_Can
This repository contains implementations of all Machine Learning Algorithms from scratch in Python. Mathematics required for ML and many projects have also been included.
👏 Prerequisites to Machine Learning
My Solutions to 120 commonly asked data science interview questions.
Data_Science_Interview_Questions Introduction 👋 Here are the answers to 120 Data Science Interview Questions The above answer some is modified based
Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition:
Multi-Type-TD-TSR Source Code of our Paper: Multi-Type-TD-TSR Extracting Tables from Document Images using a Multi-stage Pipeline for
Implementation of RITA (Real Intelligence Threat Analytics) in Jupyter Notebook with improved scoring algorithm.
RITA (Real Intelligence Threat Analytics) in Jupyter Notebook RITA is an open source framework for network traffic analysis sponsored by Active Counte
Repository for the #alurachallengedatascience1
1st Data Challenge - Alura Alura Voz is a telecommunications company that hired us to work as data scientists on the sales team.
Detecting silent model failure. NannyML estimates performance with an algorithm called Confidence-based Performance estimation (CBPE), developed by core contributors. It is the only open-source algorithm capable of fully capturing the impact of data drift on performance.
💡 What is NannyML? NannyML is an open-source python library that allows you to estimate post-deployment model perfor
A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way 🌰
Squirrel Core Share, load, and transform data in a collaborative, flexible, and efficient way What is Squirrel? Squirrel is a Python library that enab
TigerLily: Finding drug interactions in silico with the Graph.
Drug Interaction Prediction with Tigerlily Tigerlily is a TigerGraph based system de
Ana's Portfolio
Ana's Portfolio ✌️ Welcome to my Portfolio! You will find here different Projects I have worked on (from scratch) 💪 Projects 💻 1️⃣ Hangman game (Mad
DLO8012: Natural Language Processing & CSL804: Computational Lab - II Semester VIII
NATURAL-LANGUAGE-PROCESSING-AND-COMPUTATIONAL-LAB-II DLO8012: NLP & CSL804: CL-II [SEMESTER VIII] Syllabus NLP - Reference Books THE WALL MEGA SATISH
Resources complementing the Machine Learning course taught at the Faculty of Mathematics and Informatics, Sofia University.
Machine Learning and Data Mining, Summer 2021-2022 How to learn data science and machine learning? Programming. Learn Python. Basic Statistics. Take a
This repository contains helper functions which can help you generate additional data points depending on your NLP task.
NLP Albumentations For Data Augmentation This repository contains helper functions which can help you generate additional data points depending on you
Machine learning beginner to Kaggle competitor in 30 days. Non-coders welcome. The program starts Monday, August 2, and lasts four weeks. It's designed for people who want to learn machine learning.
30-Days-of-ML-Kaggle 🔥 About the Hands On Program 💻 Machine learning beginner → Kaggle competitor in 30 days. Non-coders welcome The program starts
Tutorial repo for an end-to-end Data Science project
End-to-end Data Science project This is the repo with the notebooks, code, and additional material used in the ITI's workshop. The goal of the session
AI Summer's complete catalog of articles
Learn Deep Learning with AI Summer A collection of all articles (almost 100) written for the AI Summer blog organized by topic. Deep Learning Theory M
1000+ ready code templates to kickstart your next AI experiment
AI Seed Projects Start with ready code for your next AI experiment. Choose from 1000+ code templates, across a wide variety of use cases. All examples
This is a repo of basic Machine Learning!
Basic Machine Learning This repository contains a topic-wise curated list of Machine Learning and Deep Learning tutorials, articles and other resource
Hacktoberfest 2021 contribution repository✨
🎃 HacktoberFest-2021 🎃 Repository for Hacktoberfest Note: Although we are actively focusing on Machine Learning, Data Science and Tricky Python pro
Just as the BGMI Loot Crate comes with many resources for gamers, this ML Crate will be a hub of ML projects that serve as resources for ML enthusiasts. Open Source Program: SWOC 2021 and JWOC 2022.
Machine Learning Loot Crate 💻 🧰 🔴 Welcome contributors! As we all know the BGMI Loot Crate comes with so many resources for the gamers, this ML Cra
Learn Data Science with focus on adding value with the most efficient tech stack.
DataScienceWithPython Get started with Data Science with Python An engaging journey to become a Data Scientist with Python TL;DR Download all Jupyter
Data Inspector is an open-source Python library that brings 15+ different types of functions to make EDA and data cleaning easier.
Data Inspector Data Inspector is an open-source Python library that brings 15 different types of functions to make EDA and data cleaning easier. Author:
A Flask Sentiment Analysis API, with visual implementation
The Sentiment Analysis API was created using the Python Flask module. It lets users pass a text or sentence through the (?text) argument and then view the sentiment analysis of that sentence. It can be integrated into a web application.
DataAnalysis: Some data analysis projects in charles_pikachu
DataAnalysis DataAnalysis: Some data analysis projects in charles_pikachu You can star this repository to keep track of the project if it's helpful fo
Udacity's CS101: Intro to Computer Science - Building a Search Engine
Udacity's CS101: Intro to Computer Science - Building a Search Engine All soluti
Lightweight mmm - Lightweight (Bayesian) Media Mix Model
Lightweight (Bayesian) Media Mix Model This is not an official Google product. L
LotteryBuyPredictionWebApp - Lottery Purchase Prediction Model
Lottery Purchase Prediction Model Objective and Goal Predict the lottery type th
Explore-bikeshare-data - GitHub project as part of the Programming for Data Science with Python Nanodegree from Udacity
Date created February 10, 2022 Project Title Explore US Bikeshare Data Descripti
The LaTeX and Python code for generating the paper, experiments' results and visualizations reported in each paper is available (whenever possible) in the paper's directory
This repository contains the software implementation of most algorithms used or developed in my research. The LaTeX and Python code for generating the
Skforecast is a python library that eases using scikit-learn regressors as multi-step forecasters
Skforecast is a python library that eases using scikit-learn regressors as multi-step forecasters. It also works with any regressor compatible with the scikit-learn API (pipelines, CatBoost, LightGBM, XGBoost, Ranger...).
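To make the idea of a "multi-step forecaster" concrete, here is a minimal sketch of recursive forecasting in plain Python. It does not use Skforecast's API: `MeanOfLagsRegressor`, `make_lag_matrix`, and `recursive_forecast` are hypothetical names, and the toy regressor only stands in for any estimator that exposes scikit-learn-style `fit` and `predict` methods.

```python
from typing import List, Sequence, Tuple


class MeanOfLagsRegressor:
    """Toy stand-in for a scikit-learn-style regressor (hypothetical)."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[float]) -> "MeanOfLagsRegressor":
        # Learn one additive offset: the average gap between each target
        # and the mean of its lagged values.
        gaps = [target - sum(row) / len(row) for row, target in zip(X, y)]
        self.offset_ = sum(gaps) / len(gaps)
        return self

    def predict(self, X: Sequence[Sequence[float]]) -> List[float]:
        return [sum(row) / len(row) + self.offset_ for row in X]


def make_lag_matrix(series: Sequence[float], lags: int) -> Tuple[List[List[float]], List[float]]:
    """Turn a series into (lagged windows, next value) training pairs."""
    X = [list(series[i - lags:i]) for i in range(lags, len(series))]
    y = list(series[lags:])
    return X, y


def recursive_forecast(regressor, series: Sequence[float], lags: int, steps: int) -> List[float]:
    """Fit a one-step model, then feed each prediction back in as a new lag."""
    X, y = make_lag_matrix(series, lags)
    regressor.fit(X, y)
    window = list(series[-lags:])
    forecast = []
    for _ in range(steps):
        next_value = regressor.predict([window])[0]
        forecast.append(next_value)
        window = window[1:] + [next_value]  # slide the window forward
    return forecast


history = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
predictions = recursive_forecast(MeanOfLagsRegressor(), history, lags=2, steps=3)
print(predictions)
```

Because the loop only relies on `fit` and `predict`, the toy regressor could in principle be swapped for a real estimator that follows the same interface.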
Data science project for exploratory analysis on the kcse grades dataset (Kamilimu Data Science Track)
Kcse-Data-Analysis Data science project for exploratory analysis on the kcse grades dataset (Kamilimu Data Science Track) Findings The performance of
Vaex library for Big Data Analytics of an Airline dataset
Vaex-Big-Data-Analytics-for-Airline-data A Python notebook (ipynb) created in Jupyter Notebook, which utilizes the Vaex library for Big Data Analytics
Data science/Analysis Health Care Portfolio
Health-Care-DS-Projects Data Science/Analysis Health Care Portfolio Consists Of 3 Projects: Mexico Covid-19 project, analyze the patient medical histo
LightGBM + Optuna: no brainer
AutoLGBM LightGBM + Optuna: no brainer auto train lightgbm directly from CSV files auto tune lightgbm using optuna auto serve best lightgbm model usin
Image-to-image regression with uncertainty quantification in PyTorch
Image-to-image regression with uncertainty quantification in PyTorch. Take any dataset and train a model to regress images to images with rigorous, distribution-free uncertainty quantification.
Crowd-Kit is a powerful Python library that implements commonly-used aggregation methods for crowdsourced annotation and offers the relevant metrics and datasets
Crowd-Kit: Computational Quality Control for Crowdsourcing Documentation Crowd-Kit is a powerful Python library that implements commonly-used aggregat
Course materials for Fall 2021 "CIS6930 Topics in Computing for Data Science" at New College of Florida
Fall 2021 CIS6930 Topics in Computing for Data Science This repository hosts course materials used for a 13-week course "CIS6930 Topics in Computing f
Geospatial data-science analysis on reasons behind delay in Grab ride-share services
Grab x Pulis Detailed analysis done to investigate possible reasons for delay in Grab services for NUS Data Analytics Competition 2022, to be found in
metedraw is a project mainly for data visualization projects of Atmospheric Science, Marine Science, Environmental Science or other majors
It is mainly for data visualization projects of Atmospheric Science, Marine Science, Environmental Science or other majors.
Implementation of SOMs (Self-Organizing Maps) with neighborhood-based map topologies.
py-self-organizing-maps Simple implementation of self-organizing maps (SOMs) A SOM is an unsupervised method for learning a mapping from a discrete ne
Using Data Science with Machine Learning techniques (ETL pipeline and ML pipeline) to classify received messages after disasters.
Project: Netflix Data Analysis and Visualization with Python
Project: Netflix Data Analysis and Visualization with Python Table of Contents General Info Installation Demo Usage and Main Functionalities Contribut
Python package for concise, transparent, and accurate predictive modeling
Python package for concise, transparent, and accurate predictive modeling. All sklearn-compatible and easy to use. Modern
Definitive Guide to Creating a SQL Database on Cloud with AWS and Python
Definitive Guide to Creating a SQL Database on Cloud with AWS and Python An easy-to-follow comprehensive guide on integrating Amazon RDS, MySQL Workbe
Conducted ANOVA and logistic regression analyses, using Matplotlib to visualize the results.
Intro-to-Data-Science Conducted ANOVA and Logistic regression analysis. Project ANOVA The main aim of this project is to perform One-Way ANOVA analysi
Analysis of a dataset of 10000 passwords to find common trends and mistakes people generally make while setting up a password.
Convert monolithic Jupyter notebooks into Ploomber pipelines.
Soorgeon Convert monolithic Jupyter notebooks into Ploomber pipelines. soorgeo
Our product, DrLeaf, not only makes the work easier but also reduces the effort and expense for farmers to identify a plant disease and its treatment methods.
Our product, DrLeaf, not only makes the work easier but also reduces the effort and expense for farmers to identify a plant disease and its treatment methods. Users upload an image of an affected plant’s leaf through our website, and our plant disease prediction model predicts and returns the disease name. Along with the disease name, we also provide the most suitable methods to cure the disease.
Learning -- Numpy January 2022 - winter'22
Numerical-Python Numpy NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along
To-Be is a machine learning challenge on CodaLab Platform about Mortality Prediction
To-Be is a machine learning challenge on CodaLab Platform about Mortality Prediction. The challenge aims to address the problems of imbalanced medical data classification.
Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python
Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python 📊
Repository for the Demo of using DVC with PyCaret & MLOps (DVC Office Hours - 20th Jan, 2022)
Using DVC with PyCaret & FastAPI (Demo) This repo contains all the resources for my demo explaining how to use DVC along with other interesting tools
The mitosheet package, trymito.io, and other public Mito code.
Mito Monorepo Mito is a spreadsheet that lives inside your JupyterLab notebooks. It allows you to edit Pandas dataframes like an Excel file, and gener
CMSC320 - Introduction to Data Science - Fall 2021
CMSC320 - Introduction to Data Science - Fall 2021 Instructors: Elias Jonatan Gonzalez and José Manuel Calderón Trilla Lectures: MW 3:30-4:45 & 5:00-6
TIANCHI Purchase Redemption Forecast Challenge
Machine Learning and Data Science with Python
Machine Learning and Data Science with Python Files for the Data Science and Machine Learning with Python course on Udemy; click here to access it. O prin
repro_eval is a collection of measures to evaluate the reproducibility/replicability of system-oriented IR experiments
repro_eval repro_eval is a collection of measures to evaluate the reproducibility/replicability of system-oriented IR experiments. The measures were d
Source code related to the article submitted to the International Conference on Computational Science ICCS 2022 in London
POTHER: Patch-Voted Deep Learning-based Chest X-ray Bias Analysis for COVID-19 Detection Source code related to the article submitted to the Internati
CSPML (crystal structure prediction with machine learning-based element substitution)
CSPML (crystal structure prediction with machine learning-based element substitution) CSPML is a unique methodology for the crystal structure predicti
Melanoma Skin Cancer Detection using Convolutional Neural Networks and Transfer Learning 🕵🏻‍♂️
This is a Kaggle competition in which we have to identify whether a given lesion image shows malignant melanoma, a type of skin cancer.
Storing, versioning, and downloading files from S3 made as easy as using open() in Python. Caching included.
open(LARGE) Storing, versioning, and downloading files from S3 made as easy as using open() in Python. Caching included. Motivation Oftentimes, especi
This library provides an abstraction to perform Model Versioning using Weights & Biases.
Description This library provides an abstraction to perform Model Versioning using Weights & Biases. Features Version a new trained model Promote a mod
Aalto-cs-msc-theses - Listing of M.Sc. Theses of the Department of Computer Science at Aalto University
Aalto-CS-MSc-Theses Listing of M.Sc. Theses of the Department of Computer Scienc
Compute execution plan: A DAG representation of work that you want to get done. Individual nodes of the DAG could be simple python or shell tasks or complex deeply nested parallel branches or embedded DAGs themselves.
Hello from magnus Magnus provides four capabilities for data teams: Compute execution plan: A DAG representation of work that you want to get done. In
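As an illustration of what "a DAG representation of work" can look like, here is a minimal sketch that runs tasks in dependency order using Python's standard-library `graphlib` (Python 3.9+). It is not Magnus's API: the `dag` mapping, the task names, and the `run_task` and `execute` helpers are all hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical pipeline: each key is a task, each value is the set of
# tasks that must finish before it can start.
dag: Dict[str, Set[str]] = {
    "extract": set(),
    "clean": {"extract"},
    "features": {"clean"},
    "train": {"features"},
    "report": {"clean", "train"},
}


def run_task(name: str, log: List[str]) -> None:
    """Placeholder for real work (a Python function, a shell command, ...)."""
    log.append(name)


def execute(graph: Dict[str, Set[str]]) -> List[str]:
    """Run every task after all of its dependencies and return the order."""
    log: List[str] = []
    for task in TopologicalSorter(graph).static_order():
        run_task(task, log)
    return log


execution_order = execute(dag)
print(execution_order)
```

`TopologicalSorter` raises `graphlib.CycleError` if the graph contains a cycle, which is a convenient early check that the plan really is a DAG.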
FairLens is an open source Python library for automatically discovering bias and measuring fairness in data
FairLens FairLens is an open source Python library for automatically discovering bias and measuring fairness in data. The package can be used to quick
Contains modeling practice materials and homework for the Computational Neuroscience course at Okinawa Institute of Science and Technology
A310 Computational Neuroscience - Okinawa Institute of Science and Technology, 2022 This repository contains modeling practice materials and homework
Synthetic data need to preserve the statistical properties of real data in terms of their individual behavior and (inter-)dependences
Synthetic data need to preserve the statistical properties of real data in terms of their individual behavior and (inter-)dependences. Copula and functional Principal Component Analysis (fPCA) are statistical models that allow these properties to be simulated (Joe 2014). As such, copula generated data have shown potential to improve the generalization of machine learning (ML) emulators (Meyer et al. 2021) or anonymize real-data datasets (Patki et al. 2016).
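To show how a copula can preserve both the individual behavior of each variable and the dependence between them, here is a minimal sketch of a bivariate Gaussian copula using only the standard library. It is not the repository's implementation and does not cover fPCA: all function names and the toy data are assumptions, and the marginals are modeled with simple empirical quantiles.

```python
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()


def normal_scores(values):
    """Map each value to a standard-normal score based on its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, index in enumerate(order, start=1):
        scores[index] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def correlation(a, b):
    """Pearson correlation of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def empirical_quantile(sorted_values, u):
    """Inverse empirical CDF: pick the observed value at probability u."""
    index = min(int(u * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[index]


def sample_gaussian_copula(x, y, n_samples, seed=0):
    """Draw synthetic (x, y) pairs that keep the marginals and the dependence."""
    rng = random.Random(seed)
    rho = correlation(normal_scores(x), normal_scores(y))
    rho = max(-1.0, min(1.0, rho))  # guard against rounding error
    sorted_x, sorted_y = sorted(x), sorted(y)
    samples = []
    for _ in range(n_samples):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + max(0.0, 1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        u1, u2 = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)
        samples.append((empirical_quantile(sorted_x, u1), empirical_quantile(sorted_y, u2)))
    return samples


# Toy "real" data (made up for illustration only).
real_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
real_y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
synthetic = sample_gaussian_copula(real_x, real_y, n_samples=50, seed=42)
```

With empirical quantiles, every synthetic value is one that was observed in the real data; a smoother marginal model would be needed to generate unseen values.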
An easy-to-use feature store
A feature store is a data storage system for data science and machine-learning. It can store raw data and also transformed features, which can be fed straight into an ML model or training script.
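The distinction between raw data and transformed features can be sketched with a tiny in-memory store. This is only an illustration of the concept, not this project's API: `InMemoryFeatureStore`, its methods, and the example customer record are hypothetical.

```python
from typing import Callable, Dict, List


class InMemoryFeatureStore:
    """Hypothetical, minimal feature store kept entirely in memory."""

    def __init__(self) -> None:
        self._raw: Dict[str, Dict[str, float]] = {}
        self._transforms: Dict[str, Callable[[Dict[str, float]], float]] = {}

    def put_raw(self, entity_id: str, record: Dict[str, float]) -> None:
        """Store the raw record for one entity (e.g. one customer)."""
        self._raw[entity_id] = dict(record)

    def register_feature(self, name: str, transform: Callable[[Dict[str, float]], float]) -> None:
        """Register a named transformation from a raw record to a feature value."""
        self._transforms[name] = transform

    def get_features(self, entity_id: str, names: List[str]) -> List[float]:
        """Return a model-ready feature vector in the requested order."""
        record = self._raw[entity_id]
        return [self._transforms[name](record) for name in names]


store = InMemoryFeatureStore()
store.put_raw("customer_1", {"orders": 4, "total_spend": 200.0})
store.register_feature("avg_order_value", lambda r: r["total_spend"] / r["orders"])
store.register_feature("is_repeat_buyer", lambda r: float(r["orders"] > 1))

feature_vector = store.get_features("customer_1", ["avg_order_value", "is_repeat_buyer"])
print(feature_vector)
```

Keeping the transformations registered by name means training and inference code can request the same features and get them computed the same way.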
Feature engineering and machine learning: together at last
Feature engineering and machine learning: together at last! Lambdo is a workflow engine which significantly simplifies data analysis by unifying featu
Dive into Machine Learning
Dive into Machine Learning Hi there! You might find this guide helpful if: You know Python or you're learning it 🐍 You're new to Machine Learning You
Demonstrate the breadth and depth of your data science skills by earning all of the Databricks Data Scientist credentials
Data Scientist Learning Plan Demonstrate the breadth and depth of your data science skills by earning all of the Databricks Data Scientist credentials
Google AI Open Images - Object Detection Track: Open Solution
Google AI Open Images - Object Detection Track: Open Solution This is an open solution to the Google AI Open Images - Object Detection Track 😃 More c
TGS Salt Identification Challenge
TGS Salt Identification Challenge This is an open solution to the TGS Salt Identification Challenge. Note Unfortunately, we can no longer provide supp
Airbus Ship Detection Challenge
Airbus Ship Detection Challenge This is an open solution to the Airbus Ship Detection Challenge. Our goals We are building entirely open solution to t
Open solution to the Toxic Comment Classification Challenge
Starter code: Kaggle Toxic Comment Classification Challenge More competitions 🎇 Check collection of public projects 🎁 , where you can find multiple
Data-driven Computer Science UoB
COMS20011_2021 Data-driven Computer Science UoB Staff Laurence Aitchison (unit director) Majid Mirmehdi [m.mirmehdi
FFCV: Fast Forward Computer Vision (and other ML workloads!)
Fast Forward Computer Vision: train models at a fraction of the cost with accele
100 Days of Code Learning program to keep a habit of coding daily and learn things at your own pace with help from our remote community.
Used Logistic Regression, Random Forest, and XGBoost to predict the outcome of Search & Destroy games from the Call of Duty World League for the 2018 and 2019 seasons.
Call of Duty World League: Search & Destroy Outcome Predictions Growing up as an avid Call of Duty player, I was always curious about what factors led
SynapseML - an open source library to simplify the creation of scalable machine learning pipelines
Synapse Machine Learning SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines. Sy
PyGRANSO: A PyTorch-enabled port of GRANSO with auto-differentiation
PyGRANSO PyGRANSO: A PyTorch-enabled port of GRANSO with auto-differentiation Please check https://ncvx.org/PyGRANSO for detailed instructions (introd
Exploring the Top ML and DL GitHub Repositories
This repository contains my project work, in which I scraped data on the most popular machine learning and deep learning GitHub repositories to visualize and analyze it.
Completed task 1 and task 2 at LetsGrowMore as a data science intern.
LetsGrowMore-Internship Completed task 1 and task 2 at LetsGrowMore as a data science intern. Task 1- Task 2- Creating a Decision Tree classifier and
Mercury: easily convert Python notebook to web app and share with others
Mercury Share your Python notebooks with others Easily convert your Python notebooks into interactive web apps by adding parameters in YAML. Simply ad
An awesome Data Science repository to learn from and apply to real-world problems.
AWESOME DATA SCIENCE An open source Data Science repository to learn and apply towards solving real world problems. This is a shortcut path to start s
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
applied-ml Curated papers, articles, and blogs on data science & machine learning in production. ⚙️ Figuring out how to implement your ML project? Lea
Credit EDA Case Study Using Python
This case study aims to identify patterns that indicate whether a client has difficulty paying their installments. These patterns may be used to take actions such as denying the loan, reducing the loan amount, or lending (to risky applicants) at a higher interest rate.
The final project of "Applying AI to EHR Data" of "AI for Healthcare" nanodegree - Udacity.
Patient Selection for Diabetes Drug Testing Project Overview EHR data is becoming a key source of real-world evidence (RWE) for the pharmaceutical ind
Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages.
Video Games Web Scraper Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages. This
Python Machine Learning Jupyter Notebooks (ML website)
Python Machine Learning Jupyter Notebooks (ML website) Dr. Tirthajyoti Sarkar, Fremont, California (Please feel free to connect on LinkedIn here) Also
Tech Resources for Academic Communities
Free tech resources for faculty, students, researchers, life-long learners, and academic community builders for use in tech based courses, workshops, and hackathons.
Semi-Automated Data Processing
Perform semi-automated exploratory data analysis, feature engineering, and feature selection on a provided dataset by visualizing every possibility at each step and helping the user make meaningful decisions to achieve a low-bias, low-variance model.
Python Library to get fast extensive Dummy Data for testing
Dumda Python Library to get fast extensive Dummy Data for testing https://pypi.org/project/dumda/ Installation pip install dumda Usage: Cities from d