3466 Repositories
Python data-pre-processing Libraries
This is the core of the program which takes 5k SYMBOLS and looks back N years to pull in the daily OHLC data of those symbols and saves them to disc.
Integrates C buffer data into the instructions of the `.text` segment instead of `.data` or `.rodata`, to avoid a copy.
gcc-bufdata-integrating2text Integrating C Buffer Data Into the instruction of .text segment instead of on .data, .rodata to avoid copy. Usage In your
A Simple and User-Friendly Google Colab Notebook with UI to transfer your data from Mega to Google Drive.
Mega to Google Drive (UI Added! 😊 ) A Simple and User-Friendly Google Colab Notebook with UI to transfer your data from Mega to Google Drive. ⚙️ How
Our product, DrLeaf, not only makes the work easier but also reduces the effort and expenditure a farmer needs to identify a disease and its treatment methods.
Our product, DrLeaf, not only makes the work easier but also reduces the effort and expenditure a farmer needs to identify a disease and its treatment methods. Users upload an image of an affected plant's leaf through our website, and our plant disease prediction model predicts and returns the disease name. Along with the disease name, we also provide the most suitable methods to cure the disease.
Learning -- Numpy January 2022 - winter'22
Numerical-Python Numpy NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along
Multilingual finetuning of Machine Translation model on low-resource languages. Project for Deep Natural Language Processing course.
Low-resource-Machine-Translation This repository contains the code for the project related to the course Deep Natural Language Processing. The goal o
To-Be is a machine learning challenge on CodaLab Platform about Mortality Prediction
To-Be is a machine learning challenge on CodaLab Platform about Mortality Prediction. The challenge aims to address the problems of imbalanced medical data classification.
LSTM model - IMDB review sentiment analysis
NLP - Movie review sentiment analysis The colab notebook contains the code for building an LSTM Recurrent Neural Network that gives 87-88% accuracy on
This GitHub Repository contains Data Analysis projects that I have completed so far! While most of the projects are focused on Data Analysis, some of them are also put here to show off other skills that I have learned.
Welcome to my Data Analysis projects page! This GitHub Repository contains Data Analysis projects that I have completed so far! While most of th proje
Senator Trades Monitor
Senator Trades Monitor This monitor will grab the most recent trades by senators and send them as a webhook to discord. Installation To use the monito
Image processing is one of the most common terms in computer vision
Image processing is one of the most common terms in computer vision. Computer vision is the field concerned with how computers understand images and videos: how they are stored and manipulated, and how details are retrieved from them. OpenCV is an open-source computer vision and image processing library for machine learning, deep learning, and AI applications; it plays a major role in real-time operation, which is very important in today's systems.
spaCy-wrap: For Wrapping fine-tuned transformers in spaCy pipelines
spaCy-wrap: For Wrapping fine-tuned transformers in spaCy pipelines spaCy-wrap is a minimal library intended for wrapping fine-tuned transformers from t
This open source Python project allows you to create JSON data trees using Minmup.com
This open source Python project allows you to create JSON data trees using Minmup.com. I work on this project continuously, but feel free to use it :).
PySpark Structured Streaming ROS Kafka ApacheSpark Cassandra
PySpark-Structured-Streaming-ROS-Kafka-ApacheSpark-Cassandra The purpose of this project is to demonstrate a structured streaming pipeline with Apache
Eureka is a Rest-API framework scraper based on FastAPI for cleaning and organizing data, designed for the Eureka by Turing project of the National University of Colombia
An ML & Correlation platform for transforming disparate data points of interest into usable intelligence.
SSIDprobeCollector An ML & Correlation platform for transforming disparate data points of interest into usable intelligence. At a High level the platf
Discord webhooks for alerting crypto currency price changes & historical data.
Crypto-Discord Discord Webhooks for alerting crypto currency price changes & historical data. Create virtual environment and install requirements. $ s
Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python
Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python 📊
Realtime data read and write without page refresh using Ajax in Django.
Realtime read-write with AJAX Hey, this is a basic implementation of AJAX real-time read/write against the database, where you can insert or view re
Robust and blazing-fast open-redirect vulnerability scanner that can recursively crawl all web forms, entry points, or links with data.
Since the Golismero project died, there has been no up-to-date open-source tool that can collect links with parameters and web forms and then test th
Repository for the Demo of using DVC with PyCaret & MLOps (DVC Office Hours - 20th Jan, 2022)
Using DVC with PyCaret & FastAPI (Demo) This repo contains all the resources for my demo explaining how to use DVC along with other interesting tools
The mitosheet package, trymito.io, and other public Mito code.
Mito Monorepo Mito is a spreadsheet that lives inside your JupyterLab notebooks. It allows you to edit Pandas dataframes like an Excel file, and gener
Implemented Exploratory Data Analysis (EDA) using Python. Built a dashboard in Tableau and found that 45.87% of people suffer from heart disease.
Heart_Disease_Diagnostic_Analysis Objective 🎯 The aim of this project is to use the given data and perform ETL and data analysis to infer key metrics
Explaining Deep Neural Networks - A comparison of different CAM methods based on an insect data set
Explaining Deep Neural Networks - A comparison of different CAM methods based on an insect data set This is the repository for the Deep Learning proje
CMSC320 - Introduction to Data Science - Fall 2021
CMSC320 - Introduction to Data Science - Fall 2021 Instructors: Elias Jonatan Gonzalez and José Manuel Calderón Trilla Lectures: MW 3:30-4:45 & 5:00-6
Clustering Moroccan stock time series data using k-means with DTW (dynamic time warping)
Moroccan Stocks Clustering Context Hey! We don't always have to forecast time series, am I right? We use k-means to cluster about 70 Moroccan stock pr
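As a rough illustration of the dynamic time warping (DTW) distance named in the entry above, here is a minimal pure-Python sketch. It is not code from the repository: the function name `dtw_distance` is hypothetical, and the absolute difference is assumed as the pointwise cost. A k-means variant for time series would use a distance like this in place of the Euclidean distance when assigning series to clusters.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Assumption: the pointwise cost is the absolute difference |a[i] - b[j]|.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Two series with the same shape but different lengths align at zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

Unlike a point-by-point Euclidean comparison, this distance lets one series stretch or compress in time, which is why it suits price curves that move similarly but not in lockstep.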
This repository contains the project created during the Data Challenge module at the London School of Hygiene & Tropical Medicine
LSHTM_RCS This repository contains the project created during the Data Challenge module at the London School of Hygiene & Tropical Medicine (LSHTM) in collabo
Data Engineering ZoomCamp
Data Engineering ZoomCamp I'm partaking in a Data Engineering Bootcamp / Zoomcamp and will be tracking my progress here. I can't promise these notes w
🤖 Project template for your next awesome AI project. 🦾
🤖 AI Awesome Project Template 👋 Template author You may want to adjust badge links in a README.md file. 💎 Installation with pip Installation is as
Processed, version controlled history of Minecraft's generated data and assets
mcmeta Processed, version controlled history of Minecraft's generated data and assets Repository structure Each of the following branches has a commit
A web app built with the Streamlit API and a Python backend to analyze and extract insights from multiple data formats.
Data-Analysis-Web-App Data Analysis Web App can analyze data in multiple formats (csv, txt, xls, xlsx, ods, odt) and shows you the analysis in
Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners
DART Implementation for ICLR2022 paper Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners. Environment [email protected] Use pi
TIANCHI Purchase Redemption Forecast Challenge
NLP applications using deep learning.
NLP-Natural-Language-Processing NLP applications using deep learning like text generation etc. 1- Poetry Generation: Using a collection of Irish Poem
Natural Language Processing Specialization
Natural Language Processing Specialization In this folder, Natural Language Processing Specialization projects and notes can be found. WHAT I LEARNED
WikiPron - a command-line tool and Python API for mining multilingual pronunciation data from Wiktionary
WikiPron WikiPron is a command-line tool and Python API for mining multilingual pronunciation data from Wiktionary, as well as a database of pronuncia
Import some key/value data to Prometheus custom-built Node Exporter in Python
About the app In one particular project, I had to import some key/value data to Prometheus. So I decided to create my custom-built Node Exporter
Machine Learning and Data Science with Python
Machine Learning e Data Science com Python Files from the Data Science and Machine Learning with Python course on Udemy; click here to access it. The prin
Clean and reusable data-sciency notebooks.
KPACUBO KPACUBO is a set of Jupyter notebooks focused on the best practices in both software development and data science, namely, code reuse, explicit d
This repository contains the raw data and a python notebook to ingest historical A&E attendance data and then use a simple Prophet model to predict the number of A&E attendances in England if the COVID-19 pandemic had not happened
ae_attendances_modelling This repository contains the raw data and a python notebook to ingest historical A&E attendance data and then use a simple Pr
Multi-processing capable print-like logger for Python
MPLogger Multi-processing capable print-like logger for Python Requirements and Installation Python 3.8+ is required Pip pip install mplogger Manual P
FAIR Enough Metrics is an API for various FAIR Metrics Tests, written in python
☑️ FAIR Enough metrics for research FAIR Enough Metrics is an API for various FAIR Metrics Tests, written in python, conforming to the specifications
NaturalCC is a sequence modeling toolkit that allows researchers and developers to train custom models
NaturalCC NaturalCC is a sequence modeling toolkit that allows researchers and developers to train custom models for many software engineering tasks,
This repository contains the implementation of the HealthGen model, a generative model to synthesize realistic EHR time series data with missingness
HealthGen: Conditional EHR Time Series Generation This repository contains the implementation of the HealthGen model, a generative model to synthesize
JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation
JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation This is the repository for this paper. Find extensions of this w
A PyTorch implementation for our paper "Dual Contrastive Learning: Text Classification via Label-Aware Data Augmentation".
Dual-Contrastive-Learning A PyTorch implementation for our paper "Dual Contrastive Learning: Text Classification via Label-Aware Data Augmentation". Y
Real-Time Seizure Detection using EEG: A Comprehensive Comparison of Recent Approaches under a Realistic Setting
Real-Time Seizure Detection using Electroencephalogram (EEG) This is the repository for "Real-Time Seizure Detection using EEG: A Comprehensive Compar
PyTorch implementation for the paper Visual Representation Learning with Self-Supervised Attention for Low-Label High-Data Regime
Visual Representation Learning with Self-Supervised Attention for Low-Label High-Data Regime Created by Prarthana Bhattacharyya. Disclaimer: This is n
coldcuts is an R package to automatically generate and plot segmentation drawings in R
coldcuts coldcuts is an R package that allows you to automatically draw and plot segmentations from 3D voxel arrays. The name is inspired by one of It
Vision transformers (ViTs) have found only limited practical use in processing images
CXV Convolutional Xformers for Vision Vision transformers (ViTs) have found only limited practical use in processing images, in spite of their state-o
Official code of Team Yao at Multi-Modal-Fact-Verification-2022
Official code of Team Yao at Multi-Modal-Fact-Verification-2022 A Multi-Modal Fact Verification dataset released as part of the De-Factify workshop in
Tackling data scarcity in Speech Translation using zero-shot multilingual Machine Translation techniques
Tackling data scarcity in Speech Translation using zero-shot multilingual Machine Translation techniques This repository is derived from the NMTGMinor
Semantic code search implementation using Tensorflow framework and the source code data from the CodeSearchNet project
Semantic Code Search Semantic code search implementation using Tensorflow framework and the source code data from the CodeSearchNet project. The model
Code for the paper titled "Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages"
Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages Code for the paper titled "Prabhupadavani: A Code-mixed Speech Translation Data
A classification model capable of accurately predicting the price of secondhand cars
The purpose of this project is to create a classification model capable of accurately predicting the price of secondhand cars. The data used for model building is open source and has been added to this repository. Most packages used come pre-installed in common development environments and tools such as Colab and Jupyter. This can be useful for people looking to improve the way they code their predictive models and to find efficient ways to deal with tabular data!
Analyzing Earth Observation (EO) data is complex and solutions often require custom tailored algorithms.
eo-grow Earth observation framework for scaled-up processing in Python. Analyzing Earth Observation (EO) data is complex and solutions often require c
Audio Source Separation is the process of separating a mixture into isolated sounds from individual sources
Audio Source Separation is the process of separating a mixture into isolated sounds from individual sources (e.g. just the lead vocals).
Statistical & Probabilistic Analysis of Store Sales, University Survey, & Manufacturing data
Statistical_Modelling Statistical & Probabilistic Analysis of Store Sales, University Survey, & Manufacturing data Statistical Methods for Decision Ma
Melanoma Skin Cancer Detection using Convolutional Neural Networks and Transfer Learning 🕵🏻‍♂️
This is a Kaggle competition in which we have to identify whether a given lesion image shows malignant melanoma, a type of skin cancer.
Final Project for Practical Python Programming and Algorithms for Data Analysis
Final Project for Practical Python Programming and Algorithms for Data Analysis (PHW2781L, Summer 2020) Redlining, Race-Exclusive Deed Restriction Lan
Fetch fund data from avanza.se using Python and some web scraping with bs4
Py(A)vanza Fetch fund data from avanza.se using Python and some web scraping with bs4. The default way is to display the data in the terminal, apply -
Storing, versioning, and downloading files from S3 made as easy as using open() in Python. Caching included.
open(LARGE) Storing, versioning, and downloading files from S3 made as easy as using open() in Python. Caching included. Motivation Oftentimes, especi
This library provides an abstraction to perform Model Versioning using Weights & Biases.
Description This library provides an abstraction to perform Model Versioning using Weights & Biases. Features Version a new trained model Promote a mod
Python-geoarrow - Storing geometry data in Apache Arrow format
geoarrow Storing geometry data in Apache Arrow format Installation $ pip install
Scikit-event-correlation - Event Correlation and Forecasting over High Dimensional Streaming Sensor Data algorithms
scikit-event-correlation Event Correlation and Changing Detection Algorithm Theo
Detic ros - A simple ROS wrapper for Detic instance segmentation using pre-trained dataset
Graviti-python-sdk - Graviti Data Platform Python SDK
Graviti Python SDK Graviti Python SDK is a python library to access Graviti Data
DomainMonitor is a web project that has a RESTful API to get a domain's subdomains and whois data.
OptiPLANT is a cloud-based system that empowers professional and non-professional data scientists to build high-quality predictive models
OptiPLANT OptiPLANT is a cloud-based system that empowers professional and non-professional data scientists to build high-quality predictive mod
A reference implementation for processing the content.log files found at opendata.dwd.de/weather
Command line tool to automate transforming the effects of one color profile to another, possibly more standard one.
Finished rendering the frames of that animation, and now the colors look washed out and ugly? This terminal program will solve exactly that.
Towards Fine-Grained Reasoning for Fake News Detection
FinerFact This is the PyTorch implementation for the FinerFact model in the AAAI 2022 paper Towards Fine-Grained Reasoning for Fake News Detection (Ar
Dynamic vae - Dynamic VAE algorithm is used for anomaly detection of battery data
Dynamic VAE frame Automatic feature extraction can be achieved by probability di
SAS: Self-Augmentation Strategy for Language Model Pre-training
SAS: Self-Augmentation Strategy for Language Model Pre-training This repository
This repo provides the source code & data of our paper "GreaseLM: Graph REASoning Enhanced Language Models"
GreaseLM: Graph REASoning Enhanced Language Models This repo provides the source code & data of our paper "GreaseLM: Graph REASoning Enhanced Language
This repository contains pre-trained models and some evaluation code for our paper Towards Unsupervised Dense Information Retrieval with Contrastive Learning
Contriever: Towards Unsupervised Dense Information Retrieval with Contrastive Learning This repository contains pre-trained models and some evaluation
Annotating the Tweebank Corpus on Named Entity Recognition and Building NLP Models for Social Media Analysis
TweebankNLP This repo contains the new Tweebank-NER dataset and off-the-shelf Twitter-Stanza pipeline for state-of-the-art Tweet NLP, as described in
Revisiting Weakly Supervised Pre-Training of Visual Perception Models
SWAG: Supervised Weakly from hashtAGs This repository contains SWAG models from the paper Revisiting Weakly Supervised Pre-Training of Visual Percepti
Clustering is a popular approach to detect patterns in unlabeled data
Visual Clustering Clustering is a popular approach to detect patterns in unlabeled data. Existing clustering methods typically treat samples in a data
SpyQL - SQL with Python in the middle
SpyQL SQL with Python in the middle Concept SpyQL is a query language that combines: the simplicity and structure of SQL with the power and readabilit
Data-sets from the survey and analysis
bachelor-thesis "Umfragewerte.xlsx" contains the original survey results. "umfrage_alle.csv" contains the survey results but one participant is cancele
A shimmer pre-load component for Plotly Dash
dash-loading-shimmer A shimmer pre-load component for Plotly Dash Installation Get it with pip: pip install dash-loading-extras Or maybe you prefer Pi
Generate pixel-style avatars with python.
face2pixel Generate pixel-style avatars with python. Run: Clone the project: git clone https://github.com/theodorecooper/face2pixel install requiremen
RoNER is a Named Entity Recognition model based on a pre-trained BERT transformer model trained on RONECv2
RoNER RoNER is a Named Entity Recognition model based on a pre-trained BERT transformer model trained on RONECv2. It is meant to be an easy to use, hi
Compute execution plan: A DAG representation of work that you want to get done. Individual nodes of the DAG could be simple python or shell tasks or complex deeply nested parallel branches or embedded DAGs themselves.
Hello from magnus Magnus provides four capabilities for data teams: Compute execution plan: A DAG representation of work that you want to get done. In
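To make the idea of a DAG-based execution plan concrete, here is a small sketch using Python's standard-library `graphlib` (Python 3.9+). It does not use or imitate the Magnus API; the node names and the `execution_order` helper are hypothetical and only show how dependencies between tasks determine a valid run order.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key maps to the set of nodes it depends on.
dag = {
    "extract": set(),
    "clean": {"extract"},
    "train": {"clean"},
    "report": {"clean"},
}


def execution_order(graph):
    """Return one valid order in which every node runs after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dag)
```

Here `train` and `report` both depend only on `clean`, so either may come first; a scheduler could also run them as parallel branches.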
Es-schema - Common Data Schemas for Elasticsearch
Common Data Schemas for Elasticsearch The Common Data Schema for Elasticsearch i
Data analysis and visualisation projects from a range of individual projects and applications
Python-Data-Analysis-and-Visualisation-Projects Data analysis and visualisation projects from a range of individual projects and applications. Python
Predicting Auction Sale Price using the kaggle bulldozer auction sales data: Modeling with Ensembles vs Neural Network
Predicting Auction Sale Price using the kaggle bulldozer auction sales data: Modeling with Ensembles vs Neural Network The performances of tree ensemb
Deep Learning for Natural Language Processing SS 2021 (TU Darmstadt)
Deep Learning for Natural Language Processing SS 2021 (TU Darmstadt) Task Training huge unsupervised deep neural networks yields strong progress in
A program made in PYTHON🐍 that automatically performs data insertions into a POSTGRES database 🐘 , using as base a .CSV file 📁 , useful in mass data insertions
A program made in PYTHON🐍 that automatically performs data insertions into a POSTGRES database 🐘 , using as base a .CSV file 📁 , useful in mass data insertions.
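The general pattern of bulk-loading a CSV file into a database table can be sketched as follows. This is not the repository's code: to keep the example runnable without a database server, it swaps in the standard-library `sqlite3` module as a stand-in for PostgreSQL, and the `load_csv` helper, the `people` table, and the sample data are all hypothetical. With PostgreSQL, a driver would be used instead and the placeholder syntax would differ.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_file):
    """Create `table` from the CSV header and insert every data row.

    Assumption: the first CSV row holds the column names, and the table and
    column names come from a trusted source (they are interpolated into SQL).
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    rows = list(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
    # executemany sends all rows through one parameterized statement.
    conn.executemany(
        f'INSERT INTO "{table}" ({columns}) VALUES ({placeholders})', rows
    )
    conn.commit()
    return len(rows)


# In-memory database and CSV text so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
sample = io.StringIO("id,name\n1,alice\n2,bob\n")
inserted = load_csv(conn, "people", sample)
names = conn.execute('SELECT name FROM people ORDER BY id').fetchall()
```

Passing the values as parameters, rather than formatting them into the SQL string, avoids quoting problems in the data itself.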
Labelbox is the fastest way to annotate data to build and ship artificial intelligence applications
Labelbox Labelbox is the fastest way to annotate data to build and ship artificial intelligence applications. Use this github repository to help you s
CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs
CLIP [Blog] [Paper] [Model Card] [Colab] CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pair
FairLens is an open source Python library for automatically discovering bias and measuring fairness in data
FairLens FairLens is an open source Python library for automatically discovering bias and measuring fairness in data. The package can be used to quick
Automated Melanoma Recognition in Dermoscopy Images via Very Deep Residual Networks
Introduction This repository contains the modified caffe library and network architectures for our paper "Automated Melanoma Recognition in Dermoscopy
Pre-trained models for a Cascaded-FCN in caffe and tensorflow that segments
Cascaded-FCN This repository contains the pre-trained models for a Cascaded-FCN in caffe and tensorflow that segments the liver and its lesions out of
This is a curated list of medical data for machine learning
Medical Data for Machine Learning This is a curated list of medical data for machine learning. This list is provided for informational purposes only,
This repository contains code, network definitions and pre-trained models for working on remote sensing images using deep learning
Deep learning for Earth Observation This repository contains code, network definitions and pre-trained models for working on remote sensing images usi
Segment axon and myelin from microscopy data using deep learning
Segment axon and myelin from microscopy data using deep learning. Written in Python. Using the TensorFlow framework. Based on a convolutional neural network architecture. Pixels are classified as either axon, myelin or background.
Built a deep neural network (DNN) that functions as an end-to-end machine translation pipeline
Built a deep neural network (DNN) that functions as an end-to-end machine translation pipeline. The pipeline accepts English text as input and returns the French translation.
Equipped customers with insights about their EVs' hourly energy consumption and helped predict future charging behavior using an LSTM model
Equipped customers with insights about their EVs' hourly energy consumption and helped predict future charging behavior using an LSTM model. Designed a sample dashboard with insights and recommendations for customers.