3466 Repositories
Python data-pre-processing Libraries
Code & Data for Enhancing Photorealism Enhancement
Enhancing Photorealism Enhancement Stephan R. Richter, Hassan Abu AlHaija, Vladlen Koltun Paper | Website (with side-by-side comparisons) | Video (Pap
Here I plotted data for the average test scores across schools and class sizes across school districts.
HW_02 Here I plotted data for the average test scores across schools and class sizes across school districts. Average Test Score by Race This graph re
Image Processing - Make noise images clean
影像處理-影像降躁化(去躁化) (Image Processing - Make Noise Images Clean) 得力於電腦效能的大幅提升以及GPU的平行運算架構,讓我們能夠更快速且有效地訓練AI,並將AI技術應用於不同領域。本篇將帶給大家的是 「將深度學習應用於影像處理中的影像降躁化 」,
A simple project on Data Visualization for CSCI-40 course.
Simple-Data-Visualization A simple project on Data Visualization for CSCI-40 course - the instructions can be found here SAT results in New York in 20
A small Python Library to process Game Boy Camera images
GameBEye GameBEye is a Python Library to process Game Boy Camera images. Source code 📁 : https://github.com/mtouzot/GameBEye Issues 🆘 : https://gith
Data and a Twitter bot for the EPA's DOCUMERICA (1972-1977) program.
documerica This repository holds JSON(L) artifacts and a few scripts related to managing archival data from the EPA's DOCUMERICA program. Contents: Ma
Scraping weather data using Python to receive umbrella reminders
A Python package which scrapes weather data from google and sends umbrella reminders to specified email at specified time daily.
Python App To Encrypt Data (image, text, all data)
Python App To Encrypt Data (image, text, all data)
This repository holds code and data for our PETS'22 article 'From "Onion Not Found" to Guard Discovery'.
From "Onion Not Found" to Guard Discovery (PETS'22) This repository holds the code and data for our PETS'22 paper titled 'From "Onion Not Found" to Gu
LoL API is a Python application made to serve League of Legends data.
LoL API is a Python application made to serve League of Legends data.
Data Intelligence Applications - Online Product Advertising and Pricing with Context Generation
Data Intelligence Applications - Online Product Advertising and Pricing with Context Generation Overview Consider the scenario in which advertisement
WaveFake: A Data Set to Facilitate Audio DeepFake Detection
WaveFake: A Data Set to Facilitate Audio DeepFake Detection This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper
This is a backend for VCode Editor for saving & retriving data.
This is a backend for VCode Editor for saving & retriving data through the API.
psgresizer - a PySimpleGUI application that will resize your images and BASE64 encode them.
psgresizer A PySimpleGUI Application Resize your images quickly and easily with this GUI application. Resizes and encodes to Base64 so that the result
Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data based on Pytorch Framework
VFedPCA+VFedAKPCA This is the official source code for the Paper: Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-
Scrapes mcc-mnc.com and outputs 3 files with the data (JSON, CSV & XLSX)
mcc-mnc.com-webscraper Scrapes mcc-mnc.com and outputs 3 files with the data (JSON, CSV & XLSX) A Python script for web scraping mcc-mnc.com Link: mcc
Exploratory analysis and data visualization of aircraft accidents and incidents in Brazil.
Exploring aircraft accidents in Brazil Occurrencies with aircraft in Brazil are investigated by the Center for Investigation and Prevention of Aircraf
Open-source linguistic ethnography tool for framing public opinion in mediatized groups.
Open-source linguistic ethnography tool for framing public opinion in mediatized groups. Table of Contents Installing Quickstart Links Installing Pyth
small package with utility functions for analyzing (fly) calcium imaging data
fly2p Tools for analyzing two-photon (2p) imaging data collected with Vidrio Scanimage software and micromanger. Loading scanimage data relies on scan
Tool for automatically reordering python imports. Similar to isort but uses static analysis more.
reorder_python_imports Tool for automatically reordering python imports. Similar to isort but uses static analysis more. Installation pip install reor
A tool for checking if the external data used in Flatpak manifests is still up to date
Flatpak External Data Checker This is a tool for checking for outdated or broken links of external data in Flatpak manifests. Motivation Flatpak apps
A Python module and command line utility for working with web archive data using the WACZ format specification
py-wacz The py-wacz repository contains a Python module and command line utility for working with web archive data using the WACZ format specification
A real data analysis and modeling project - restaurant inspections
A real data analysis and modeling project - restaurant inspections Jafar Pourbemany 9/27/2021 This project represents data analysis and modeling of re
Svector (pronounced Swag-tor) provides extension methods to pyrsistent data structures
Svector Svector (pronounced Swag-tor) provides extension methods to pyrsistent data structures. Easily chain your methods confidently with tons of add
Exploratory data analysis
Exploratory data analysis An Exploratory data analysis APP TAPIWA CHAMBOKO 🚀 About Me I'm a full stack developer experienced in deploying artificial
Simple Python image processing & automatization project for a simple web based game
What is this? Simple Python image processing & automatization project for a simple web based game Made using only Github Copilot (except the color and
A napari plugin to inspect data within a cisTEM project
napari-cistem A plugin to inspect data within a cisTEM project This napari plugin was generated with Cookiecutter using with @napari's cookiecutter-na
Web Crawlers for Data Labelling of Malicious Domain Detection & IP Reputation Evaluation
Web Crawlers for Data Labelling of Malicious Domain Detection & IP Reputation Evaluation This repository provides two web crawlers to label domain nam
edaSQL is a library to link SQL to Exploratory Data Analysis and further more in the Data Engineering.
edaSQL is a python library to bridge the SQL with Exploratory Data Analysis where you can connect to the Database and insert the queries. The query results can be passed to the EDA tool which can give greater insights to the user.
A GUI love Calculator which saves all the User Data in text file(sql based script will be uploaded soon). Interative GUI. Even For Admin Panel
Love-Calculator A GUI love Calculator which saves all the User Data in text file(sql based script will be uploaded soon). Interative GUI, even For Adm
clesperanto is a graphical user interface for GPU-accelerated image processing.
clesperanto is a graphical user interface for a multi-platform multi-language framework for GPU-accelerated image processing. It is based on napari and the pyclesperanto-prototype.
Chinese Pre-Trained Language Models (CPM-LM) Version-I
CPM-Generate 为了促进中文自然语言处理研究的发展,本项目提供了 CPM-LM (2.6B) 模型的文本生成代码,可用于文本生成的本地测试,并以此为基础进一步研究零次学习/少次学习等场景。[项目首页] [模型下载] [技术报告] 若您想使用CPM-1进行推理,我们建议使用高效推理工具BMI
WaveFake: A Data Set to Facilitate Audio DeepFake Detection
WaveFake: A Data Set to Facilitate Audio DeepFake Detection This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper
Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation
OSCAR Project Page | Paper This repository contains the codebase used in OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Ma
Calculates JMA (Japan Meteorological Agency) seismic intensity (shindo) scale from acceleration data recorded in NumPy array
shindo.py Calculates JMA (Japan Meteorological Agency) seismic intensity (shindo) scale from acceleration data stored in NumPy array Introduction Japa
ZenML 🙏: MLOps framework to create reproducible ML pipelines for production machine learning.
ZenML is an extensible, open-source MLOps framework to create production-ready machine learning pipelines. It has a simple, flexible syntax, is cloud and tool agnostic, and has interfaces/abstractions that are catered towards ML workflows.
Python functions to run WASS stereo wave processing executables, and load and post process WASS output files.
wass-pyfuns Python functions to run the WASS stereo wave processing executables, and load and post process the WASS output files. General WASS (Waves
image-processing exercises.
image_processing Assignment 21 Checkered Board Create a chess table using numpy and opencv. view: Color Correction Reverse black and white colors with
A compact version of EDI-Vetter, which uses the TLS output to quickly vet transit signals.
A compact version of EDI-Vetter, which uses the TLS output to quickly vet transit signals. All your favorite hits in a simplified format.
Tool which allow you to detect and translate text.
Text detection and recognition This repository contains tool which allow to detect region with text and translate it one by one. Description Two pretr
This tool parses log data and allows to define analysis pipelines for anomaly detection.
logdata-anomaly-miner This tool parses log data and allows to define analysis pipelines for anomaly detection. It was designed to run the analysis wit
Tools for analyzing data collected with a custom unity-based VR for insects.
unityvr Tools for analyzing data collected with a custom unity-based VR for insects. Organization: The unityvr package contains the following submodul
A tool for batch processing large fasta files and accompanying metadata table to upload to repositories via API
Fasta Uploader A tool for batch processing large fasta files and accompanying metadata table to repositories via API The python fasta_uploader.py scri
A collection of tools for biomedical research assay analysis in Python.
waltlabtools A collection of tools for biomedical research assay analysis in Python. Key Features Analysis for assays such as digital ELISA, including
A learning-based data collection tool for human segmentation
FullBodyFilter A Learning-Based Data Collection Tool For Human Segmentation Contents Documentation Source Code and Scripts Overview of Project Usage O
A collection of robust and fast processing tools for parsing and analyzing web archive data.
ChatNoir Resiliparse A collection of robust and fast processing tools for parsing and analyzing web archive data. Resiliparse is part of the ChatNoir
Suite of tools for retrieving USGS NWIS observations and evaluating National Water Model (NWM) data.
Documentation OWPHydroTools GitHub pages documentation Motivation We developed OWPHydroTools with data scientists in mind. We attempted to ensure the
A tool and a library for SVG path data transformations.
SVG path data transformation toolkit A tool and a library for SVG path data transformations. Currently it supports a translation and a scaling. Usage
A python tool for batch importing excel/csv files into mysql database.
PySimpleGUI numpy pandas pymysql chardet
A Simple Framwork for CV Pre-training Model (SOCO, VirTex, BEiT)
A Simple Framwork for CV Pre-training Model (SOCO, VirTex, BEiT)
Code for EMNLP2021 paper "Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training"
VoCapXLM Code for EMNLP2021 paper Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training Environment DockerFile: dancingso
NLP Overview
NLP-Overview Introduction The field of NPL encompasses a variety of topics which involve the computational processing and understanding of human langu
An Open-Source Toolkit for Prompt-Learning.
An Open-Source Framework for Prompt-learning. Overview • Installation • How To Use • Docs • Paper • Citation • What's New? Nov 2021: Now we have relea
Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data.
causal-bald | Abstract | Installation | Example | Citation | Reproducing Results DUE An implementation of the methods presented in Causal-BALD: Deep B
python's memory-saving dictionary data structure
ConstDict python代替的Dict数据结构 若字典不会增加字段,只读/原字段修改 使用ConstDict可节省内存 Dict()内存主要消耗的地方: 1、Dict扩容机制,预留内存空间 2、Dict也是一个对象,内部会动态维护__dict__,增加slot类属性可以节省内容 节省内存大小
This repository has datasets containing information of Uber pickups in NYC from April 2014 to September 2014 and January to June 2015. data Analysis , virtualization and some insights are gathered here
uber-pickups-analysis Data Source: https://www.kaggle.com/fivethirtyeight/uber-pickups-in-new-york-city Information about data set The dataset contain
LegalNLP - Natural Language Processing Methods for the Brazilian Legal Language
LegalNLP - Natural Language Processing Methods for the Brazilian Legal Language ⚖️ The library of Natural Language Processing for Brazilian legal lang
Streamz helps you build pipelines to manage continuous streams of data
Streamz helps you build pipelines to manage continuous streams of data. It is simple to use in simple cases, but also supports complex pipelines that involve branching, joining, flow control, feedback, back pressure, and so on.
MongoDB utility to inflate the contents of small collection to a new larger collection
MongoDB Data Inflater ("data-inflater") The data-inflater tool is a MongoDB utility to automate the creation of a new large database collection using
Visualization Data Drug in thailand during 2014 to 2020
Visualization Data Drug in thailand during 2014 to 2020 Data sorce from ข้อมูลเปิดภาครัฐ สำนักงาน ป.ป.ส Inttroducing program Using tkinter module for
Official implementation of Generalized Data Weighting via Class-level Gradient Manipulation (NeurIPS 2021).
Generalized Data Weighting via Class-level Gradient Manipulation This repository is the official implementation of Generalized Data Weighting via Clas
nrgpy is the Python package for processing NRG Data Files
nrgpy nrgpy is the Python package for processing NRG Data Files Website and source: https://github.com/nrgpy/nrgpy Documentation: https://nrgpy.github
The Metabolomics Integrator (MINT) is a post-processing tool for liquid chromatography-mass spectrometry (LCMS) based metabolomics.
MINT (Metabolomics Integrator) The Metabolomics Integrator (MINT) is a post-processing tool for liquid chromatography-mass spectrometry (LCMS) based m
Data science, Data manipulation and Machine learning package.
duality Data science, Data manipulation and Machine learning package. Use permitted according to the terms of use and conditions set by the attached l
Extract the temperature data of each wire from the thermal imager raw data.
Wire-Tempurature-Detection Extract the temperature data of each wire from the thermal imager raw data. The motivation of this computer vision project
VevestaX is an open source Python package for ML Engineers and Data Scientists.
VevestaX Track failed and successful experiments as well as features. VevestaX is an open source Python package for ML Engineers and Data Scientists.
Connects to a local SenseCap M1 Helium Hotspot and pulls API Data.
sensecap_api_checker_HELIUM Connects to a local SenseCap M1 Helium Hotspot and pulls API Data.
scikit-multimodallearn is a Python package implementing algorithms multimodal data.
scikit-multimodallearn is a Python package implementing algorithms multimodal data. It is compatible with scikit-learn, a popul
Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation
DistMIS Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation. DistriMIS Distributing Deep Learning Hyperparameter Tuning
Generalized Data Weighting via Class-level Gradient Manipulation
Generalized Data Weighting via Class-level Gradient Manipulation This repository is the official implementation of Generalized Data Weighting via Clas
Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.
EfficientZero (NeurIPS 2021) Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021. Thank you for you
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models
DSEE Codes for [Preprint] DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models Xuxi Chen, Tianlong Chen, Yu Cheng, Weizhu Ch
Personal thermal comfort models using digital twins: Preference prediction with BIM-extracted spatial-temporal proximity data from Build2Vec
Personal thermal comfort models using digital twins: Preference prediction with BIM-extracted spatial-temporal proximity data from Build2Vec This repo
SW components and demos for visual kinship recognition. An emphasis is put on the FIW dataset-- data loaders, benchmarks, results in summary.
FIW Data Development Kit Table of Contents Introduction Families In the Wild Database Publications Organization To Do License Getting Involved Introdu
[NeurIPS 2021] “Improving Contrastive Learning on Imbalanced Data via Open-World Sampling”
Improving Contrastive Learning on Imbalanced Data via Open-World Sampling Introduction Contrastive learning approaches have achieved great success in
Data and Code for paper Outlining and Filling: Hierarchical Query Graph Generation for Answering Complex Questions over Knowledge Graph is available for research purposes.
Data and Code for paper Outlining and Filling: Hierarchical Query Graph Generation for Answering Complex Questions over Knowledge Graph is available f
Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity
Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity Indic TTS Samples can be found at https://peter-yh-wu.github.io/cross-
Code for a seq2seq architecture with Bahdanau attention designed to map stereotactic EEG data from human brains to spectrograms, using the PyTorch Lightning.
stereoEEG2speech We provide code for a seq2seq architecture with Bahdanau attention designed to map stereotactic EEG data from human brains to spectro
Bayes-Newton—A Gaussian process library in JAX, with a unifying view of approximate Bayesian inference as variants of Newton's algorithm.
Bayes-Newton Bayes-Newton is a library for approximate inference in Gaussian processes (GPs) in JAX (with objax), built and actively maintained by Wil
The Timescale NFT Starter Kit is a step-by-step guide to get up and running with collecting, storing, analyzing and visualizing NFT data from OpenSea, using PostgreSQL and TimescaleDB.
Timescale NFT Starter Kit The Timescale NFT Starter Kit is a step-by-step guide to get up and running with collecting, storing, analyzing and visualiz
This is the accompanying repository for the Bloomberg Global Coal Countdown website.
This is the accompanying repository for the Bloomberg Global Coal Countdown (BGCC) website. Data Sources Dashboard Data Schema and Validation License
Privacy as Code for DSAR Orchestration: Privacy Request automation to fulfill GDPR, CCPA, and LGPD data subject requests.
Meet Fidesops: Privacy as Code for DSAR Orchestration A part of the greater Fides ecosystem. ⚡ Overview Fidesops (fee-dez-äps, combination of the Lati
A tool for hiding data inside of images
Stegenography-tool a tool for hiding data inside of images Quick test: do python steg-encode.py test/message.txt test/covid19.png to generate the test
Planning Algorithms in AI and Robotics. MSc course at Skoltech Data Science program
Planning Algorithms in AI and Robotics course T2 2021-22 The Planning Algorithms in AI and Robotics course at Skoltech, MS in Data Science, during T2,
PyTorch implementation of "A Two-Stage End-to-End System for Speech-in-Noise Hearing Aid Processing"
Implementation of the Sheffield entry for the first Clarity enhancement challenge (CEC1) This repository contains the PyTorch implementation of "A Two
EOD Historical Data Python Library (Unofficial)
EOD Historical Data Python Library (Unofficial) https://eodhistoricaldata.com Installation python3 -m pip install eodhistoricaldata Note Demo API key
A selection of SQLite3 databases to practice querying from.
Dummy SQL Databases This is a collection of dummy SQLite3 databases, for learning and practicing SQL querying, generated with the VS Code extension Ge
This is an auto-ML tool specialized in detecting of outliers
Auto-ML tool specialized in detecting of outliers Description This tool will allows you, with a Dash visualization, to compare 10 models of machine le
Official repository of the paper "A Variational Approximation for Analyzing the Dynamics of Panel Data". Mixed Effect Neural ODE. UAI 2021.
Official repository of the paper (UAI 2021) "A Variational Approximation for Analyzing the Dynamics of Panel Data", Mixed Effect Neural ODE. Panel dat
Data Poisoning based on Adversarial Attacks using Non-Robust Features
Data Poisoning based on Adversarial Attacks using Non-Robust Features Usage python main.py [-h] [--gpu | -g GPU] [--eps |-e EPSILON] [--pert | -p PER
this repository has datasets containing information of Uber pickups in NYC from April 2014 to September 2014 and January to June 2015. data Analysis , virtualization and some insights are gathered here
uber-pickups-analysis Data Source: https://www.kaggle.com/fivethirtyeight/uber-pickups-in-new-york-city Information about data set The dataset contain
A terminal spreadsheet multitool for discovering and arranging data
VisiData v2.6.1 A terminal interface for exploring and arranging tabular data. VisiData supports tsv, csv, sqlite, json, xlsx (Excel), hdf5, and many
DiscoNet: Learning Distilled Collaboration Graph for Multi-Agent Perception [NeurIPS 2021]
DiscoNet: Learning Distilled Collaboration Graph for Multi-Agent Perception [NeurIPS 2021] Yiming Li, Shunli Ren, Pengxiang Wu, Siheng Chen, Chen Feng
A large-scale database for graph representation learning
A large-scale database for graph representation learning
In this project, ETL pipeline is build on data warehouse hosted on AWS Redshift.
ETL Pipeline for AWS Project Description In this project, ETL pipeline is build on data warehouse hosted on AWS Redshift. The data is loaded from S3 t
Data on Free Food at MIT
MIT Free Food Timing Procrastinating research by plotting data on how long it takes emails on the free-food at mit edu mailing list to go through. Dat
Minimal working example of data acquisition with nidaqmx python API
Data Aquisition using NI-DAQmx python API Based on this project It is a minimal working example for data acquisition using the NI-DAQmx python API. It
A toolkit for geo ML data processing and model evaluation (fork of solaris)
An open source ML toolkit for overhead imagery. This is a beta version of lunular which may continue to develop. Please report any bugs through issues
Data cleaning tools for Business analysis
Datacleaning datacleaning tools for Business analysis This program is made for Vicky's work. You can use it, too. 数据清洗 该数据清洗工具是为了商业分析 这个程序是为了Vicky的工作而