3316 Repositories
Python data-quality-analysis Libraries
Community and sentiment analysis based on tweets
The project aims to analyze the thoughts and interactions of Italian users through their posts on Twitter on the day the new measures entered into force. In particular, we want to identify the reference hubs present on the network, as well as people's sentiment and emotions regarding the new limitations.
Geospatial data-science analysis on reasons behind delay in Grab ride-share services
Grab x Pulis Detailed analysis done to investigate possible reasons for delay in Grab services for NUS Data Analytics Competition 2022, to be found in
QDax is a tool to accelerate Quality-Diversity (QD) algorithms through hardware accelerators and massive parallelism
QDax: Accelerated Quality-Diversity QDax is a tool to accelerate Quality-Diversity (QD) algorithms through hardware accelerators and massive paralleli
metedraw is a project mainly for data visualization projects of Atmospheric Science, Marine Science, Environmental Science or other majors
It is mainly for data visualization projects of Atmospheric Science, Marine Science, Environmental Science or other majors.
OpenStats is a library built on top of streamlit that extracts data from the Github API and shows the main KPIs
Open Stats Discover and share the KPIs of your OpenSource project. OpenStats is a library built on top of streamlit that extracts data from the Github
A study of the IT labor market based mainly on hh.ru: salaries, technology rankings, etc.
hh_ru_research The project was implemented for educational purposes to analyze the labor market, especially on hh.ru Input data The input data are seriali
To build a regression model to predict the concrete compressive strength based on the different features in the training data.
Cement-Strength-Prediction Problem Statement To build a regression model to predict the concrete compressive strength based on the different features
A Python implementation of red-black trees
Python red-black trees A Python implementation of red-black trees. This code was originally copied from programiz.com, but I have made a few tweaks to
Contains a Jupyter Notebook for calculating remaining plants required based on field/lathhouse data.
Davis-Sunflowers-Su21 Project goals: Plants influence their reproduction and mating system in many ways. Various factors such as time of flowering, ab
An ETL Pipeline of a large data set from a fictitious music streaming service named Sparkify.
An ETL Pipeline of a large data set from a fictitious music streaming service named Sparkify. The ETL process flows from AWS's S3 into staging tables in AWS Redshift.
Implementation of SOMs (Self-Organizing Maps) with neighborhood-based map topologies.
py-self-organizing-maps Simple implementation of self-organizing maps (SOMs) A SOM is an unsupervised method for learning a mapping from a discrete ne
Using Data Science with Machine Learning techniques (ETL pipeline and ML pipeline) to classify received messages after disasters.
Project: Netflix Data Analysis and Visualization with Python
Project: Netflix Data Analysis and Visualization with Python Table of Contents General Info Installation Demo Usage and Main Functionalities Contribut
PLStream: A Framework for Fast Polarity Labelling of Massive Data Streams
PLStream: A Framework for Fast Polarity Labelling of Massive Data Streams Motivation When dataset freshness is critical, the annotating of high speed
This tool is for beginners and helps them gather information: Email Header Analysis, Instagram Information, Instagram Username Check, IP Information, Phone Number Information, Port Scan
This tool is for beginners and helps them gather information: Email Header Analysis, Instagram Information, Instagram Username Check, IP Information, Phone Number Information, Port Scan. The tool first shows your hostname and public IP, then takes user input and runs the selected option. It works on different operating systems.
MRC approach for Aspect-based Sentiment Analysis (ABSA)
B-MRC MRC approach for Aspect-based Sentiment Analysis (ABSA) Paper: Bidirectional Machine Reading Comprehension for Aspect Sentiment Triplet Extracti
This project uses ViT to perform image classification tasks on DATA set CIFAR10.
Vision-Transformer-Multiprocess-DistributedDataParallel-Apex Introduction This project uses ViT to perform image classification tasks on DATA set CIFA
CLASSIX is a fast and explainable clustering algorithm based on sorting
CLASSIX Fast and explainable clustering based on sorting CLASSIX is a fast and explainable clustering algorithm based on sorting. Here are a few highl
Twitter Sentiment Analysis using #tag, words and username
Twitter Sentiment Analysis Web App that uses a #tag, words, or a username to fetch data, finds insights in the data, and reports the sentiment of the particular #tag, words, or username.
PyTorch implementation of the ExORL: Exploratory Data for Offline Reinforcement Learning
ExORL: Exploratory Data for Offline Reinforcement Learning This is an original PyTorch implementation of the ExORL framework from Don't Change the Alg
Geowifi 📡 💘 🌎 Search WiFi geolocation data by BSSID and SSID on different public databases.
Python code to fuse multiple RGB-D images into a TSDF voxel volume.
Volumetric TSDF Fusion of RGB-D Images in Python This is a lightweight python script that fuses multiple registered color and depth images into a proj
The source code for Generating Training Data with Language Models: Towards Zero-Shot Language Understanding.
SuperGen The source code for Generating Training Data with Language Models: Towards Zero-Shot Language Understanding. Requirements Before running, you
Interactive Dashboard for Visualizing OSM Data Change
Dashboard and intuitive data downloader for more interactive experience with interpreting osm change data.
Data Analysis: Data Visualization of Airlines
Data Analysis: Data Visualization of Airlines Anderson Cruz | London-UK | Linkedin | Nowa Capital Project: Traffic Airlines Airline Reporting Carrier
Deepface is a lightweight face recognition and facial attribute analysis (age, gender, emotion and race) framework for python
deepface Deepface is a lightweight face recognition and facial attribute analysis (age, gender, emotion and race) framework for python. It is a hybrid
This program will help you to properly scrape all data from a specific website
Extract GoPro highlights and GPMF data.
Python script that parses the gpmd stream for GOPRO moov track (MP4) and extract the GPS info into a GPX (and kml) file.
SubOmiEmbed: Self-supervised Representation Learning of Multi-omics Data for Cancer Type Classification
L3Cube-MahaCorpus a Marathi monolingual data set scraped from different internet sources.
L3Cube-MahaCorpus L3Cube-MahaCorpus a Marathi monolingual data set scraped from different internet sources. We expand the existing Marathi monolingual
Python package for concise, transparent, and accurate predictive modeling
Python package for concise, transparent, and accurate predictive modeling. All sklearn-compatible and easy to use. 📚 docs • 📖 demo notebooks Modern
PyTorch implementation of the paper: Label Noise Transition Matrix Estimation for Tasks with Lower-Quality Features
Label Noise Transition Matrix Estimation for Tasks with Lower-Quality Features Estimate the noise transition matrix with f-mutual information. This co
Nested cross-validation is necessary to avoid biased model performance in embedded feature selection in high-dimensional data with tiny sample sizes
Pruner for nested cross-validation - Sphinx-Doc Nested cross-validation is necessary to avoid biased model performance in embedded feature selection i
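A minimal sketch of the nested splitting idea, written in plain Python rather than with this project's own API. The helper names `k_fold_indices` and `nested_cv_splits` are hypothetical. The point it illustrates is that feature selection and tuning only ever see the inner splits, which are drawn from the outer training part, so the outer test fold stays untouched.

```python
def k_fold_indices(indices, k):
    """Split a list of indices into k (train, test) pairs."""
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def nested_cv_splits(n_samples, outer_k, inner_k):
    """Return (inner_splits, outer_test) pairs.

    Inner splits are built only from the outer training indices,
    so the outer test indices never leak into model selection.
    """
    result = []
    for outer_train, outer_test in k_fold_indices(list(range(n_samples)), outer_k):
        inner_splits = k_fold_indices(outer_train, inner_k)
        result.append((inner_splits, outer_test))
    return result


# Hypothetical sizes chosen only for illustration.
splits = nested_cv_splits(n_samples=12, outer_k=3, inner_k=2)
for inner_splits, outer_test in splits:
    print("outer test:", outer_test, "| inner folds:", len(inner_splits))
```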
Data and code accompanying the paper Politics and Virality in the Time of Twitter
Politics and Virality in the Time of Twitter Data and code accompanying the paper Politics and Virality in the Time of Twitter. In specific: the code
Image Data Augmentation in Keras
Image data augmentation is a technique that can be used to artificially expand the size of a training dataset by creating modified versions of images in the dataset.
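As a rough illustration of the idea, the sketch below expands a tiny dataset by adding flipped copies of each image. It uses plain Python lists instead of Keras, and the helper names (`flip_horizontal`, `flip_vertical`, `augment`) are hypothetical, not part of any library.

```python
def flip_horizontal(image):
    """Mirror each row of the image left-to-right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Reverse the order of the rows (top-to-bottom mirror)."""
    return image[::-1]


def augment(dataset):
    """Return each original image plus its two flipped versions."""
    augmented = []
    for image in dataset:
        augmented.extend([image, flip_horizontal(image), flip_vertical(image)])
    return augmented


# A single 2x2 grayscale "image" stands in for a real training set.
dataset = [[[1, 2], [3, 4]]]
expanded = augment(dataset)
print(len(dataset), "->", len(expanded))
```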
Data Augmentation Using Keras and Python
Data-Augmentation-Using-Keras-and-Python Data augmentation is the process of increasing the size of the training dataset. Keras library offers a simple
Definitive Guide to Creating a SQL Database on Cloud with AWS and Python
Definitive Guide to Creating a SQL Database on Cloud with AWS and Python An easy-to-follow comprehensive guide on integrating Amazon RDS, MySQL Workbe
Analysis of Antarctica sequencing samples contaminated with SARS-CoV-2
Analysis of SARS-CoV-2 reads in sequencing of 2018-2019 Antarctica samples in PRJNA692319 The samples analyzed here are described in this preprint, wh
A convolutional recurrent neural network for classifying A/B phases in EEG signals recorded for sleep analysis.
CAP-Classification-CRNN A deep learning model based on Inception modules paired with gated recurrent units (GRU) for the classification of CAP phases
Structured Data Gradient Pruning (SDGP)
Structured Data Gradient Pruning (SDGP) Weight pruning is a technique to make Deep Neural Network (DNN) inference more computationally efficient by re
LinkScope allows you to perform online investigations by representing information as discrete pieces of data, called Entities.
LinkScope Client Description This is the repository for the LinkScope Client Online Investigation software. LinkScope allows you to perform online inv
IADS 2021-22 Algorithm and Data structure collection
A collection of algorithms and datastructures introduced during UoE's Introduction to Datastructures and Algorithms class.
Prometheus exporter for chess.com player data
chess-exporter Prometheus exporter for chess.com player data implemented via chess.com's published data API and Prometheus Python Client Example use c
Kartothek - a Python library to manage large amounts of tabular data in a blob store
Kartothek - a Python library to manage (create, read, update, delete) large amounts of tabular data in a blob store
Traffic prediction analysis using hybrid models - Machine Learning
Hybrid Machine learning Model Clone the Repository Create a new Directory as assests and download the model from the below link Model Link To Start th
Temporal Dynamic Convolutional Neural Network for Text-Independent Speaker Verification and Phonemetic Analysis
TDY-CNN for Text-Independent Speaker Verification Official implementation of Temporal Dynamic Convolutional Neural Network for Text-Independent Speake
Generate database table diagram from SQL data definition.
sql2diagram Generate database table diagram from SQL data definition. e.g. "CREATE TABLE ..." See Example below How does it work? Analyze the SQL to
PATC: Introduction to Big Data Analytics. Practical Data Analytics for Solving Real World Problems
Text Analysis & Topic Extraction on Android App user reviews
AndroidApp_TextAnalysis Hi, there! This is code archive for Text Analysis and Topic Extraction from user_reviews of Android App. Dataset Source : http
DietPDF aims at reducing PDF file size while not degrading quality nor losing metadata
A Radare2 based Python module for Binary Analysis and Reverse Engineering.
Zepu1chr3 A Radare2 based Python module for Binary Analysis and Reverse Engineering. Installation You can simply run this command. pip3 install zepu1c
MidTerm Project for the Data Analysis FT Bootcamp, Adam Tycner and Florent ZAHOUI
MidTerm Project for the Data Analysis FT Bootcamp, Adam Tycner and Florent ZAHOUI Hallo
This repo has the source code for the crawler and data crawled from auto-data.net
This repo contains the source code for crawler and crawled data of cars specifications from autodata. The data has roughly 45k cars
Python command line tool and python engine to label table fields and fields in data files.
Python command line tool and python engine to label table fields and fields in data files. It could help to find meaningful data in your tables and data files or to find personally identifiable information (PII).
Automatically measure the facial Width-To-Height ratio and get facial analysis results provided by Microsoft Azure
fwhr-calc-website This project is to automatically measure the facial Width-To-Height ratio and get facial analysis results provided by Microsoft Azur
Scraping and visualising India's real-time COVID-19 data from the MOHFW dataset.
COVID19-WEB-SCRAPER Open Source Tech Lab - Project [SEMESTER IV] OSTL Assignments OSTL Assignments - 1 OSTL Assignments - 2 Project COVID19 India Data
Create artistic visualisations with your exercise data (Python version)
strava_py Create artistic visualisations with your exercise data (Python version). This is a port of the R strava package to Python. Examples Facets A
Conducted ANOVA and logistic regression analysis, using the Matplotlib library to visualize the results.
Intro-to-Data-Science Conducted ANOVA and Logistic regression analysis. Project ANOVA The main aim of this project is to perform One-Way ANOVA analysi
1900-2016 Olympic Data Analysis in Python by plotting different graphs
🔥 Olympics Data Analysis 🔥 In the data science field, data analysis is a major step before creating a model for future prediction. We can find o
Analysis of voices based on the Mel-frequency band
Speaker_partition_module Analysis of voices based on the Mel-frequency band. Goal: Identification of voices speaking (diarization) and calculation of
Collection of data visualizing projects through Tableau, Data Wrapper, and Power BI
Data-Visualization-Projects Collection of data visualizing projects through Tableau, Data Wrapper, and Power BI Indigenous-Brands-Social-Movements Pyt
Pokehandy - Data web app about the Pokémon TCG that I develop during Twitch streams, 2022
⚡️ Pokéhandy – Pokémon Hand Simulator [WIP 🚧 ] This application aims to simulat
InverterApi - This project has been designed to take monitoring data from Voltronic, Axpert, Mppsolar PIP, Voltacon, Effekta
Weather Image Recognition - Python weather application using series of data
Analysis of a dataset of 10000 passwords to find common trends and mistakes people generally make while setting up a password.
Geospatial Data Visualization using PyGMT
Example script to visualize topographic data, earthquake data, and tomographic data on a map
Evaluates three different ML models for feature selection using breast cancer data.
Anomaly-detection-Feature-Selection Evaluates three different ML models for feature selection using breast cancer data. ML models: SVM, KNN and MLP.
Binjago - Set of tools aiding in analysis of stripped Golang binaries with Binary Ninja
Binjago 🥷 Set of tools aiding in analysis of stripped Golang binaries with Bina
A solution designed to extract, transform and load Chicago crime data from an RDS instance to other services in AWS.
This project is intended to implement a solution designed to extract, transform and load Chicago crime data from an RDS instance to other services in AWS.
ADB-IP-ROTATION - Use your mobile phone to gain a temporary IP address using ADB and data tethering
ADB IP ROTATE This is a Python script based on Android Debug Bridge (adb) shell sc
A data structure that extends pyspark.sql.DataFrame with metadata information.
MetaFrame A data structure that extends pyspark.sql.DataFrame with metadata info
sfgp is a package that aggregates individual scripts and notebooks, primarily written for the basic analysis tasks of genetics and pharmacogenomics data.
🌍 Create 3d-printable STLs from satellite elevation data 🌏
mapa 🌍 Create 3d-printable STLs from satellite elevation data Installation pip install mapa Usage mapa uses numpy and numba under the hood to crunch
TheMachineScraper 🐱👤 is an Information Grabber built for Machine Analysis
TheMachineScraper 🐱👤 is a tool made purely for analysing machine data for any reason.
Soccerdata - Efficiently scrape soccer data from various sources
SoccerData is a collection of wrappers over soccer data from Club Elo, ESPN, FBr
Data from "Datamodels: Predicting Predictions with Training Data"
Data from "Datamodels: Predicting Predictions with Training Data" Here we provid
Data-depth-inference - Data depth inference with python
Welcome! This readme will guide you through the use of the code in this reposito
Catalogue data - Python scripts to prepare catalogue data
catalogue_data Scripts to prepare catalogue data. Setup Clone this repo. Install
Code and outputs from analysis determining that the wordle game can always be won in six moves.
wordle_worst_case_analysis Code and outputs from analysis determining that the wordle game can always be won in six moves. This is for the general cas
Python/Selenium script to scrape data about university courses
university-courses Python/Selenium script to scrape data about university courses. Script first extracts URLs of each courses homepage, then trawls ea
Download Web-10K data by querying Bing Image Search
gpv2-web10k This repository contains the script to download images from the Web-10K dataset. The script takes in a list of queries, queries Bing Image
The aim is to extract timeseries water level 2D information for any designed boundaries within the EasyGSH model domain
bct_file_generator_for_EasyGSH The aim is to extract timeseries water level 2D information for any designed boundaries within the EasyGSH model domain
Convert monolithic Jupyter notebooks into Ploomber pipelines.
Soorgeon Join our community | Newsletter | Contact us | Blog | Website | YouTube Convert monolithic Jupyter notebooks into Ploomber pipelines. soorgeo
This repository collects together basic linguistic processing data for using dataset dumps from the Common Voice project
Common Voice Utils This repository collects together basic linguistic processing data for using dataset dumps from the Common Voice project. It aims t
This repo contains a powerful tool made using python which is used to visualize, analyse and finally assess the quality of the product depending upon the given observations
📈 Statistical Quality Control 📉 This repo contains a simple but effective tool made using python which can be used for quality control in statistica
Hatchet is a Python-based library that allows Pandas dataframes to be indexed by structured tree and graph data.
Hatchet Hatchet is a Python-based library that allows Pandas dataframes to be indexed by structured tree and graph data. It is intended for analyzing
Upload comma-delimited files to biglocalnews.org in your GitHub Action
Upload comma-delimited files to biglocalnews.org in your GitHub Action Inputs api-key: Your biglocalnews.org API token. project-id: The identifier of
Used for data processing in machine learning; helps construct ML models more easily from scratch
Used for data processing in machine learning; helps construct ML models more easily from scratch. Can be used with linear models, logistic regression models, and decision trees.
A Login/Registration GUI Application with SQLite database for manipulating data.
Login-Register_Tk A Login/Registration GUI Application with SQLite database for manipulating data. What is this program? This program is a GUI applica
PCAfold is an open-source Python library for generating, analyzing and improving low-dimensional manifolds obtained via Principal Component Analysis (PCA).
Generates, filters, parses, and cleans data regarding the financial disclosures of judges in the American Judicial System
This repository contains code that gets data regarding financial disclosures from the Court Listener API main.py: contains driver code that interacts
Enable geospatial data mining through Google Earth Engine in Grasshopper 3D, via its most recent Hops component.
AALU_Geo Mining This repository is produced for a masterclass at the Architectural Association Landscape Urbanism programme. Requirements Rhinoceros (
MoRecon - A tool for reconstructing missing frames in motion capture data.
NFCDS Workshop Beginners Guide Bioinformatics Data Analysis
Genomics Workshop FIXME: overview of workshop Code of Conduct All participants s
Malware-analysis-writeups - Some of my Malware Analysis writeups
About This repo contains some malware analysis writeups I've created over time m
Docov - Light-weight, recursive docstring coverage analysis for python modules
docov Light-weight, recursive docstring coverage analysis for python modules. Ov
Code for You Only Cut Once: Boosting Data Augmentation with a Single Cut
You Only Cut Once (YOCO) YOCO is a simple method/strategy of performing augmenta
Reverse engineering the dengue virus (under development)
Reverse engineering the dengue virus (under development 🚧 ) What is dengue? Dengue is a viral infection transmitted to humans through the bite of inf
Project - Text analysis of historical events
Open the code in Google Colab to view all the visualizations properly and interact with them. Access link: https:
Project - IBM HR Analytics employee attrition and performance
Open the code in Google Colab to view all the visualizations properly and interact with them. Access links: Noteb