2534 Python Data-catalog Libraries

Analysis of a dataset of 10000 passwords to find common trends and mistakes people generally make while setting up a password.

7 Sep 4, 2022

Geospatial Data Visualization using PyGMT

Example script to visualize topographic data, earthquake data, and tomographic data on a map

2 Jul 30, 2022

Evaluation of three different ML models for feature selection using breast cancer data.

Anomaly-detection-Feature-Selection Evaluation of three different ML models for feature selection using breast cancer data. ML models: SVM, KNN and MLP.

1 Mar 17, 2022

A solution designed to extract, transform and load Chicago crime data from an RDS instance to other services in AWS.

This project implements a solution designed to extract, transform and load Chicago crime data from an RDS instance to other services in AWS.

1 Feb 4, 2022

ADB-IP-ROTATION - Use your mobile phone to gain a temporary IP address using ADB and data tethering

ADB IP ROTATE This is a Python script based on Android Debug Bridge (adb) shell sc

2 Jul 12, 2022

A data structure that extends pyspark.sql.DataFrame with metadata information.

MetaFrame A data structure that extends pyspark.sql.DataFrame with metadata info

8 Feb 15, 2022

sfgp is a package that aggregates individual scripts and notebooks, primarily written for the basic analysis tasks of genetics and pharmacogenomics data.

1 Mar 31, 2022

🌍 Create 3d-printable STLs from satellite elevation data 🌏

mapa 🌍 Create 3d-printable STLs from satellite elevation data Installation pip install mapa Usage mapa uses numpy and numba under the hood to crunch

13 Dec 15, 2022

Soccerdata - Efficiently scrape soccer data from various sources

SoccerData is a collection of wrappers over soccer data from Club Elo, ESPN, FBr

195 Jan 4, 2023

Data from "Datamodels: Predicting Predictions with Training Data"

Data from "Datamodels: Predicting Predictions with Training Data" Here we provid

51 Dec 9, 2022

Data-depth-inference - Data depth inference with python

Welcome! This readme will guide you through the use of the code in this reposito

3 Feb 8, 2022

Catalogue data - Python scripts to prepare catalogue data

catalogue_data Scripts to prepare catalogue data. Setup Clone this repo. Install

3 Mar 3, 2022

Python/Selenium script to scrape data about university courses

university-courses Python/Selenium script to scrape data about university courses. The script first extracts the URL of each course's homepage, then trawls ea

1 Feb 2, 2022

Download Web-10K data by querying Bing Image Search

gpv2-web10k This repository contains the script to download images from the Web-10K dataset. The script takes in a list of queries, queries Bing Image

8 Sep 6, 2022

The aim is to extract timeseries water level 2D information for any designed boundaries within the EasyGSH model domain

bct_file_generator_for_EasyGSH The aim is to extract timeseries water level 2D information for any designed boundaries within the EasyGSH model domain

1 Jul 8, 2022

Convert monolithic Jupyter notebooks into Ploomber pipelines.

65 Dec 16, 2022

This repository collects together basic linguistic processing data for using dataset dumps from the Common Voice project

Common Voice Utils This repository collects together basic linguistic processing data for using dataset dumps from the Common Voice project. It aims t

40 Dec 20, 2022

Hatchet is a Python-based library that allows Pandas dataframes to be indexed by structured tree and graph data.

Hatchet Hatchet is a Python-based library that allows Pandas dataframes to be indexed by structured tree and graph data. It is intended for analyzing

14 Aug 19, 2022

Upload comma-delimited files to biglocalnews.org in your GitHub Action

Upload comma-delimited files to biglocalnews.org in your GitHub Action Inputs api-key: Your biglocalnews.org API token. project-id: The identifier of

1 Apr 20, 2022

Used for data processing in machine learning; helps construct ML models more easily from scratch

Used for data processing in machine learning; helps construct ML models more easily from scratch. Can be used with linear models, logistic regression models, and decision trees.

0 Jul 5, 2022

A Login/Registration GUI Application with SQLite database for manipulating data.

Login-Register_Tk A Login/Registration GUI Application with SQLite database for manipulating data. What is this program? This program is a GUI applica

1 Feb 1, 2022

Generates, filters, parses, and cleans data regarding the financial disclosures of judges in the American Judicial System

This repository contains code that gets data regarding financial disclosures from the Court Listener API main.py: contains driver code that interacts

2 Aug 6, 2022

Enable geospatial data mining through Google Earth Engine in Grasshopper 3D, via its most recent Hops component.

AALU_Geo Mining This repository is produced for a masterclass at the Architectural Association Landscape Urbanism programme. Requirements Rhinoceros (

4 Nov 16, 2022

MoRecon - A tool for reconstructing missing frames in motion capture data.

38 Dec 3, 2022

NFCDS Workshop Beginners Guide Bioinformatics Data Analysis

Genomics Workshop FIXME: overview of workshop Code of Conduct All participants s

2 Jun 13, 2022

Code for You Only Cut Once: Boosting Data Augmentation with a Single Cut

You Only Cut Once (YOCO) YOCO is a simple method/strategy of performing augmenta

88 Dec 28, 2022

Reverse engineering the dengue virus (under development)

Reverse engineering the dengue virus (under development 🚧 ) What is dengue? Dengue is a viral infection transmitted to humans through the bite of inf

4 Feb 9, 2022

Project - Text analysis of historical events

Open the code in Google Colab to view all the visualizations properly and interact with them. Access link: https:

1 Jan 31, 2022

Project - Employee attrition and performance with IBM HR Analytics

Open the code in Google Colab to view all the visualizations properly and interact with them. Access links: Noteb

1 Jan 31, 2022

A command line tool that can convert Day One data into markdown files.

Introduction Features Before Start Export data from Day One Check Integrity Special Cases for Photo Extension Name Audio Extension Name Usage Known Is

26 Dec 31, 2022

A simple app to scrape data from Twitter.

Twitter-Scraping-App A simple app to scrape data from Twitter. Available Features Search query. Select number of data you want to fetch from Twitter. C

2 Oct 31, 2022

This is the core of the program, which takes 5k SYMBOLS, looks back N years to pull in the daily OHLC data of those symbols, and saves them to disk.

1 Jan 31, 2022

Integrating C buffer data into the instructions of the `.text` segment instead of `.data` or `.rodata` to avoid a copy.

gcc-bufdata-integrating2text Integrating C buffer data into the instructions of the .text segment instead of .data or .rodata to avoid a copy. Usage In your

1 Jan 31, 2022

A Simple and User-Friendly Google Colab Notebook with UI to transfer your data from Mega to Google Drive.

Mega to Google Drive (UI Added! 😊 ) A Simple and User-Friendly Google Colab Notebook with UI to transfer your data from Mega to Google Drive. ⚙️ How

18 Aug 16, 2022

DrLeaf makes the farmer's work easier and reduces the effort and expense of identifying a plant disease and its treatment methods.

DrLeaf makes the farmer's work easier and reduces the effort and expense of identifying a plant disease and its treatment methods. The user uploads an image of an affected plant's leaf through the website, and the plant disease prediction model predicts and returns the disease name. Along with the disease name, it also provides the most suitable methods to cure the disease.

2 Feb 2, 2022

Learning -- Numpy January 2022 - winter'22

Numerical-Python Numpy NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along

0 Mar 12, 2022

To-Be is a machine learning challenge on CodaLab Platform about Mortality Prediction

To-Be is a machine learning challenge on CodaLab Platform about Mortality Prediction. The challenge aims to address the problems of imbalanced medical data classification.

1 Jan 31, 2022

This GitHub repository contains the Data Analysis projects that I have completed so far! While most of the projects are focused on Data Analysis, some of them are also here to show off other skills that I have learned.

Welcome to my Data Analysis projects page! This GitHub repository contains the Data Analysis projects that I have completed so far! While most of the proje

1 Jan 31, 2022

Senator Trades Monitor

Senator Trades Monitor This monitor will grab the most recent trades by senators and send them as a webhook to discord. Installation To use the monito

5 Jun 11, 2022

This open source Python project allows you to create JSON data trees using Minmup.com

This open source Python project allows you to create JSON data trees using Minmup.com. I work on this project continuously, but feel free to use it :).

1 Jan 30, 2022

PySpark Structured Streaming ROS Kafka ApacheSpark Cassandra

PySpark-Structured-Streaming-ROS-Kafka-ApacheSpark-Cassandra The purpose of this project is to demonstrate a structured streaming pipeline with Apache

5 Nov 13, 2022

Eureka is a Rest-API framework scraper based on FastAPI for cleaning and organizing data, designed for the Eureka by Turing project of the National University of Colombia

3 May 4, 2022

An ML & Correlation platform for transforming disparate data points of interest into usable intelligence.

SSIDprobeCollector An ML & Correlation platform for transforming disparate data points of interest into usable intelligence. At a High level the platf

1 Jan 30, 2022

Discord webhooks for alerting on cryptocurrency price changes & historical data.

Crypto-Discord Discord webhooks for alerting on cryptocurrency price changes & historical data. Create virtual environment and install requirements. $ s

1 Sep 2, 2022

Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python

Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python 📊

2 May 26, 2022

List of Land Cover datasets in the GEE Catalog

List of Land Cover datasets in the GEE Catalog A list of all the Land Cover (or discrete) datasets in Google Earth Engine. Values, Colors and Descript

5 Aug 24, 2022

Realtime data read and write without page refresh using Ajax in Django.

Realtime read-write with AJAX Hey, this is a basic implementation of AJAX realtime read-write from the database, where you can insert or view re

3 Dec 13, 2022

Robust and blazing fast open-redirect vulnerability scanner with the ability to recursively crawl all web forms, entry points, or links with data.

Since the Golismero project died, there is no longer any up-to-date open-source tool that can collect links with parameters and web forms and then test th

34 Aug 25, 2022

Repository for the Demo of using DVC with PyCaret & MLOps (DVC Office Hours - 20th Jan, 2022)

Using DVC with PyCaret & FastAPI (Demo) This repo contains all the resources for my demo explaining how to use DVC along with other interesting tools

6 Jul 22, 2022

The mitosheet package, trymito.io, and other public Mito code.

Mito Monorepo Mito is a spreadsheet that lives inside your JupyterLab notebooks. It allows you to edit Pandas dataframes like an Excel file, and gener

1.4k Dec 31, 2022

Implemented Exploratory Data Analysis (EDA) using Python. Built a dashboard in Tableau and found that 45.87% of people suffer from heart disease.

Heart_Disease_Diagnostic_Analysis Objective 🎯 The aim of this project is to use the given data and perform ETL and data analysis to infer key metrics

4 Jan 28, 2022

Explaining Deep Neural Networks - A comparison of different CAM methods based on an insect data set

Explaining Deep Neural Networks - A comparison of different CAM methods based on an insect data set This is the repository for the Deep Learning proje

3 Feb 6, 2022

CMSC320 - Introduction to Data Science - Fall 2021

CMSC320 - Introduction to Data Science - Fall 2021 Instructors: Elias Jonatan Gonzalez and José Manuel Calderón Trilla Lectures: MW 3:30-4:45 & 5:00-6

6 Sep 12, 2022

Clustering Moroccan stocks time series data using k-means with DTW (dynamic time warping)

Moroccan Stocks Clustering Context Hey! We don't always have to forecast time series, am I right? We use k-means to cluster about 70 Moroccan stock pr

7 Oct 18, 2022

This repository contains a project created during the Data Challenge module at London School of Hygiene & Tropical Medicine

LSHTM_RCS This repository contains a project created during the Data Challenge module at London School of Hygiene & Tropical Medicine (LSHTM) in collabo

3 Jan 30, 2022

Data Engineering ZoomCamp

Data Engineering ZoomCamp I'm partaking in a Data Engineering Bootcamp / Zoomcamp and will be tracking my progress here. I can't promise these notes w

61 Jan 6, 2023

Processed, version controlled history of Minecraft's generated data and assets

mcmeta Processed, version controlled history of Minecraft's generated data and assets Repository structure Each of the following branches has a commit

75 Dec 28, 2022

A web app built using the Streamlit API with a Python backend to analyze and pick insights from multiple data formats.

Data-Analysis-Web-App Data Analysis Web App can analyze data in multiple formats (csv, txt, xls, xlsx, ods, odt) and shows you the analysis in

19 Dec 9, 2022

TIANCHI Purchase Redemption Forecast Challenge

4 Aug 26, 2022

WikiPron - a command-line tool and Python API for mining multilingual pronunciation data from Wiktionary

WikiPron WikiPron is a command-line tool and Python API for mining multilingual pronunciation data from Wiktionary, as well as a database of pronuncia

213 Jan 1, 2023

Import key/value data into Prometheus with a custom-built Node Exporter in Python

About the app In one particular project, I had to import some key/value data to Prometheus, so I decided to create my own custom-built Node Exporter

1 May 19, 2022

Machine Learning and Data Science with Python

Machine Learning and Data Science with Python Files from the Data Science and Machine Learning with Python course on Udemy, click here to access it. The prin

1 Jan 27, 2022

Clean and reusable data-sciency notebooks.

KPACUBO KPACUBO is a set of Jupyter notebooks focused on the best practices in both software development and data science, namely, code reuse, explicit d

1 Jan 28, 2022

This repository contains the raw data and a python notebook to ingest historical A&E attendance data and then use a simple Prophet model to predict the number of A&E attendances in England if the COVID-19 pandemic had not happened

ae_attendances_modelling This repository contains the raw data and a python notebook to ingest historical A&E attendance data and then use a simple Pr

2 Mar 29, 2022

FAIR Enough Metrics is an API for various FAIR Metrics Tests, written in python

☑️ FAIR Enough metrics for research FAIR Enough Metrics is an API for various FAIR Metrics Tests, written in python, conforming to the specifications

3 Jul 6, 2022

This repository contains the implementation of the HealthGen model, a generative model to synthesize realistic EHR time series data with missingness

HealthGen: Conditional EHR Time Series Generation This repository contains the implementation of the HealthGen model, a generative model to synthesize

0 Jan 20, 2022

A PyTorch implementation for our paper "Dual Contrastive Learning: Text Classification via Label-Aware Data Augmentation".

Dual-Contrastive-Learning A PyTorch implementation for our paper "Dual Contrastive Learning: Text Classification via Label-Aware Data Augmentation". Y

85 Dec 26, 2022

PyTorch implementation for the paper Visual Representation Learning with Self-Supervised Attention for Low-Label High-Data Regime

Visual Representation Learning with Self-Supervised Attention for Low-Label High-Data Regime Created by Prarthana Bhattacharyya. Disclaimer: This is n

5 Nov 8, 2022

coldcuts is an R package to automatically generate and plot segmentation drawings in R

coldcuts coldcuts is an R package that allows you to draw and plot automatically segmentations from 3D voxel arrays. The name is inspired by one of It

2 Sep 3, 2022

Tackling data scarcity in Speech Translation using zero-shot multilingual Machine Translation techniques

Tackling data scarcity in Speech Translation using zero-shot multilingual Machine Translation techniques This repository is derived from the NMTGMinor

1 Sep 7, 2022

Semantic code search implementation using Tensorflow framework and the source code data from the CodeSearchNet project

Semantic Code Search Semantic code search implementation using Tensorflow framework and the source code data from the CodeSearchNet project. The model

24 Nov 29, 2022

Code for the paper titled "Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages"

Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages Code for the paper titled "Prabhupadavani: A Code-mixed Speech Translation Data

12 Dec 1, 2022

A classification model capable of accurately predicting the price of secondhand cars

The purpose of this project is to create a classification model capable of accurately predicting the price of secondhand cars. The data used for model building is open source and has been added to this repository. Most packages used are pre-installed in most development environments and tools such as Colab, Jupyter, etc. This can be useful for people looking to improve the way they code their predictive models and to find efficient ways to deal with tabular data!

2 Sep 13, 2022

Analyzing Earth Observation (EO) data is complex and solutions often require custom tailored algorithms.

eo-grow Earth observation framework for scaled-up processing in Python. Analyzing Earth Observation (EO) data is complex and solutions often require c

18 Dec 23, 2022

Statistical & Probabilistic Analysis of Store Sales, University Survey, & Manufacturing data

Statistical_Modelling Statistical & Probabilistic Analysis of Store Sales, University Survey, & Manufacturing data Statistical Methods for Decision Ma

1 Jan 27, 2022

Melanoma Skin Cancer Detection using Convolutional Neural Networks and Transfer Learning🕵🏻‍♂️

This is a Kaggle competition in which we have to identify if the given lesion image is malignant or not for Melanoma which is a type of skin cancer.

1 Jan 27, 2022

Final Project for Practical Python Programming and Algorithms for Data Analysis

Final Project for Practical Python Programming and Algorithms for Data Analysis (PHW2781L, Summer 2020) Redlining, Race-Exclusive Deed Restriction Lan

1 Jan 27, 2022

Fetch fund data from avanza.se using Python and some web scraping with bs4

Py(A)vanza Fetch fund data from avanza.se using Python and some web scraping with bs4. The default way is to display the data in the terminal, apply -

1 Jan 27, 2022

Storing, versioning, and downloading files from S3 made as easy as using open() in Python. Caching included.

open(LARGE) Storing, versioning, and downloading files from S3 made as easy as using open() in Python. Caching included. Motivation Oftentimes, especi

2 Jan 30, 2022

This library provides an abstraction to perform Model Versioning using Weights & Biases.

Description This library provides an abstraction to perform Model Versioning using Weights & Biases. Features Version a new trained model Promote a mod

2 Jan 28, 2022

Python-geoarrow - Storing geometry data in Apache Arrow format

geoarrow Storing geometry data in Apache Arrow format Installation $ pip install

11 Mar 3, 2022

Scikit-event-correlation - Event Correlation and Forecasting over High Dimensional Streaming Sensor Data algorithms

scikit-event-correlation Event Correlation and Changing Detection Algorithm Theo

5 Oct 30, 2022

Graviti-python-sdk - Graviti Data Platform Python SDK

Graviti Python SDK Graviti Python SDK is a python library to access Graviti Data

13 Dec 15, 2022

DomainMonitor is a web project that has a RESTful API to get a domain's subdomains and whois data.

2 Feb 5, 2022

OptiPLANT is a cloud-based system that empowers professional and non-professional data scientists to build high-quality predictive models

OptiPLANT OptiPLANT is a cloud-based system that empowers professional and non-professional data scientists to build high-quality predictive mod

1 Jan 26, 2022

Dynamic vae - Dynamic VAE algorithm is used for anomaly detection of battery data

Dynamic VAE frame Automatic feature extraction can be achieved by probability di

10 Oct 7, 2022

This repo provides the source code & data of our paper "GreaseLM: Graph REASoning Enhanced Language Models"

GreaseLM: Graph REASoning Enhanced Language Models This repo provides the source code & data of our paper "GreaseLM: Graph REASoning Enhanced Language

137 Jan 2, 2023

Clustering is a popular approach to detect patterns in unlabeled data

Visual Clustering Clustering is a popular approach to detect patterns in unlabeled data. Existing clustering methods typically treat samples in a data

24 Nov 11, 2022

SpyQL - SQL with Python in the middle

SpyQL SQL with Python in the middle Concept SpyQL is a query language that combines: the simplicity and structure of SQL with the power and readabilit

853 Dec 30, 2022

Data-sets from the survey and analysis

bachelor-thesis "Umfragewerte.xlsx" contains the original survey results. "umfrage_alle.csv" contains the survey results but one participant is cancele

1 Jan 26, 2022

Compute execution plan: A DAG representation of work that you want to get done. Individual nodes of the DAG could be simple python or shell tasks or complex deeply nested parallel branches or embedded DAGs themselves.

Hello from magnus Magnus provides four capabilities for data teams: Compute execution plan: A DAG representation of work that you want to get done. In

12 Feb 8, 2022

Es-schema - Common Data Schemas for Elasticsearch

Common Data Schemas for Elasticsearch The Common Data Schema for Elasticsearch i

2 Jan 25, 2022

Data analysis and visualisation projects from a range of individual projects and applications

Python-Data-Analysis-and-Visualisation-Projects Data analysis and visualisation projects from a range of individual projects and applications. Python

1 Jan 25, 2022

Predicting Auction Sale Price using the kaggle bulldozer auction sales data: Modeling with Ensembles vs Neural Network

Predicting Auction Sale Price using the kaggle bulldozer auction sales data: Modeling with Ensembles vs Neural Network The performances of tree ensemb

2 Sep 13, 2022

A program made in PYTHON🐍 that automatically performs data insertions into a POSTGRES database 🐘 , using as base a .CSV file 📁 , useful in mass data insertions

A program made in PYTHON🐍 that automatically performs data insertions into a POSTGRES database 🐘 , using as base a .CSV file 📁 , useful in mass data insertions.

1 Oct 17, 2022

Labelbox is the fastest way to annotate data to build and ship artificial intelligence applications

Labelbox Labelbox is the fastest way to annotate data to build and ship artificial intelligence applications. Use this github repository to help you s

1.7k Dec 29, 2022

FairLens is an open source Python library for automatically discovering bias and measuring fairness in data

FairLens FairLens is an open source Python library for automatically discovering bias and measuring fairness in data. The package can be used to quick

69 Dec 15, 2022

This is a curated list of medical data for machine learning

Medical Data for Machine Learning This is a curated list of medical data for machine learning. This list is provided for informational purposes only,

5.4k Dec 26, 2022

Segment axon and myelin from microscopy data using deep learning

Segment axon and myelin from microscopy data using deep learning. Written in Python. Using the TensorFlow framework. Based on a convolutional neural network architecture. Pixels are classified as either axon, myelin or background.

103 Nov 29, 2022

Equipped customers with insights about their EVs' hourly energy consumption and helped predict future charging behavior using an LSTM model

Equipped customers with insights about their EVs' hourly energy consumption and helped predict future charging behavior using an LSTM model. Designed a sample dashboard with insights and recommendations for customers.

2 Apr 7, 2022
