4120 Repositories
Python data-build-tool Libraries
A Big Data ETL project in PySpark on the historical NYC Taxi Rides data
Processing NYC Taxi Data using PySpark ETL pipeline Description This is a project to extract, transform, and load large amounts of data from NYC Taxi
Example of scraping a paginated API endpoint and dumping the data into a DB
Provider API Scraper Example Example of scraping a paginated API endpoint and dumping the data into a DB. Prerequisites Python = 3.9 Pipenv Setup # i
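As a rough illustration of the scrape-and-store pattern this entry describes, the sketch below walks a paginated source until it returns an empty page and writes the rows to SQLite. The names `iter_pages` and `dump_to_db`, the `providers` table, and the `id`/`name` fields are assumptions made for illustration and are not taken from the repository. The page fetcher is passed in as a function so the example runs without network access.

```python
import sqlite3
from typing import Callable, Iterable, Iterator


def iter_pages(fetch_page: Callable[[int], list], start: int = 1) -> Iterator[dict]:
    """Yield items page by page until the source returns an empty page."""
    page = start
    while True:
        items = fetch_page(page)  # hypothetical fetcher: page number -> list of dicts
        if not items:
            break
        yield from items
        page += 1


def dump_to_db(conn: sqlite3.Connection, items: Iterable[dict]) -> int:
    """Insert items into an assumed `providers` table and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS providers (id INTEGER PRIMARY KEY, name TEXT)"
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO providers (id, name) VALUES (?, ?)",
            ((item["id"], item["name"]) for item in items),
        )
    return conn.execute("SELECT COUNT(*) FROM providers").fetchone()[0]
```

In real use, `fetch_page` would wrap an HTTP call to the paginated endpoint. Using `INSERT OR REPLACE` keeps the load idempotent if the scrape is run again.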
Nest - A flexible tool for building and sharing deep learning modules
Nest - A flexible tool for building and sharing deep learning modules Nest is a flexible deep learning module manager, which aims at encouraging code
Vector AI — A platform for building vector based applications. Encode, query and analyse data using vectors.
Vector AI is a framework designed to make the process of building production grade vector based applications as quickly and easily as possible. Create
TagLab: an image segmentation tool oriented to marine data analysis
TagLab: an image segmentation tool oriented to marine data analysis TagLab was created to support the activity of annotation and extraction of statist
10x faster matrix and vector operations
Bolt is an algorithm for compressing vectors of real-valued data and running mathematical operations directly on the compressed representations. If yo
Synthetic structured data generators
Join us on What is Synthetic Data? Synthetic data is artificially generated data that is not collected from real world events. It replicates the stati
High-quality implementations of standard and SOTA methods on a variety of tasks.
Uncertainty Baselines The goal of Uncertainty Baselines is to provide a template for researchers to build on. The baselines can be a starting point fo
GDB python tool to pretty print and debug c++ xtensor containers
gdb_xt2np GDB python tool to pretty print, examine, and debug c++ Xtensor containers. Xtensor is a c++ library for scientific computing using multidim
OpenPort scanner GUI tool (CNMAP)
CNMAP-GUI- OpenPort scanner GUI tool (CNMAP). As you know, it is an advanced tool for finding open ports and firewalls, and we also added heartbleed scanni
MetaStalk is a tool that can be used to generate graphs from the metadata of JPEG, TIFF, and HEIC images
MetaStalk About MetaStalk is a tool that can be used to generate graphs from the metadata of JPEG, TIFF, and HEIC images, which are tested. More forma
A rule learning algorithm for the deduction of syndrome definitions from time series data.
README This project provides a rule learning algorithm for the deduction of syndrome definitions from time series data. Large parts of the algorithm a
Scalable implementation of Lee / Mykland (2012) and Ait-Sahalia / Jacod (2012) Jump tests for noisy high frequency data
JumpDetectR Name of QuantLet : JumpDetectR Published in : 'To be published as "Jump dynamics in high frequency crypto markets"' Description : 'Scala
abess: Fast Best-Subset Selection in Python and R
abess: Fast Best-Subset Selection in Python and R Overview abess (Adaptive BEst Subset Selection) library aims to solve general best subset selection,
SPRING is a seq2seq model for Text-to-AMR and AMR-to-Text (AAAI2021).
SPRING This is the repo for SPRING (Symmetric ParsIng aNd Generation), a novel approach to semantic parsing and generation, presented at AAAI 2021. Wi
EODAG is a command line tool and a plugin-oriented Python framework for searching, aggregating results and downloading remote sensed images while offering a unified API for data access regardless of the data provider
EODAG (Earth Observation Data Access Gateway) is a command line tool and a plugin-oriented Python framework for searching, aggregating results and downloading remote sensed images while offering a unified API for data access regardless of the data provider
NudeNet: Neural Nets for Nudity Classification, Detection and selective censoring
NudeNet: Neural Nets for Nudity Classification, Detection and selective censoring Uncensored version of the following image can be found at https://i.
Amazon Forest Computer Vision: Satellite Image tagging code using PyTorch / Keras with lots of PyTorch tricks
Amazon Forest Computer Vision Satellite Image tagging code using PyTorch / Keras Here is a sample of images we had to work with Source: https://www.ka
A Web API for automatic background removal using Deep Learning. App is made using Flask and deployed on Heroku.
Automatic_Background_Remover A Web API for automatic background removal using Deep Learning. App is made using Flask and deployed on Heroku. 👉 https:
missing-pixel-filler is a python package that, given images that may contain missing data regions (like satellite imagery with swath gaps), returns these images with the regions filled.
Missing Pixel Filler This is the official code repository for the Missing Pixel Filler by SpaceML. missing-pixel-filler is a python package that, give
ImageScraper is a cross-platform tool for downloading a specified number of images from xkcd, Astronomy Picture of the Day and Existential Comics
ImageScraper The ImageScraper is a cross-platform tool for downloading a specified number of images from xkcd, Astronomy Picture of the Day and Existential Comic
A tf publisher GUI tool for ROS that publishes /tf_static messages. The software is based on PyQt5.
tf_publisher_gui for ROS Introduction How to use cd catkin_ws/src git clone https://github.com/yinwu33/tf_publisher_gui.git cd catkin_ws catkin_make s
Techdegree Data Analysis Project 2
Basketball Team Stats Tool In this project you will be writing a program that reads from the "constants" data (PLAYERS and TEAMS) in constants.py. Thi
Python program for installing many tools automatically
Tool Installer is a Python script that helps users install tools automatically
Inject custom C++ code into GameMaker Studio 2 YYC builds
YYC Boost Inject custom C++ code into GameMaker Studio 2 YYC builds! WARNING: This tool is currently in an early stage of development and it is not gu
This repository requires you to solve a problem by writing some basic python code.
Can You Solve a Problem? A beginner friendly repository that requires you to solve familiar problems with python. This could be as simple as implement
B-Pkg is a simple Python tool for installing all basic packages in Termux
Basic-Pkg 👉🏻 Basic-Pkg 👈🏻 B-Pkg is a simple Python tool for installing all basic packages in Termux. This is my first tool, I hope you will like
PoseViz – Multi-person, multi-camera 3D human pose visualization tool built using Mayavi.
PoseViz – 3D Human Pose Visualizer Multi-person, multi-camera 3D human pose visualization tool built using Mayavi. As used in MeTRAbs visualizations.
arweave-nft-uploader is a Python tool to improve the experience of uploading NFTs to the Arweave storage for use with the Metaplex Candy Machine.
arweave-nft-uploader arweave-nft-uploader is a Python tool to improve the experience of uploading NFTs to the Arweave storage for use with the Metaple
Animation retargeting tool for Autodesk Maya. Retargets mocap to a custom rig with a few clicks.
Animation Retargeting Tool for Maya A tool for transferring animation data and mocap from a skeleton to a custom rig in Autodesk Maya. Installation: A
A CLI tool to disable and enable security standards controls in AWS Security Hub
Security Hub Controls CLI A CLI tool to disable and enable security standards controls in AWS Security Hub. It is designed to work together with AWS S
[ICCV 2021] A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation
[ICCV 2021] A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation
The implementation of the submitted paper "Deep Multi-Behaviors Graph Network for Voucher Redemption Rate Prediction" in SIGKDD 2021 Applied Data Science Track.
DMBGN: Deep Multi-Behaviors Graph Networks for Voucher Redemption Rate Prediction The implementation of the accepted paper "Deep Multi-Behaviors Graph
Kimimaro: Skeletonize Densely Labeled Images
Kimimaro: Skeletonize Densely Labeled Images # Produce SWC files from volumetric images. kimimaro forge labels.npy --progress # writes to ./kimimaro_o
ScreenTeX is a tool that grabs all text when taking a screenshot rather than getting an image.
The ScreenTeX project By: Seanpm2001 / ScreenTeX, et al.
Lale is a Python library for semi-automated data science.
Lale is a Python library for semi-automated data science. Lale makes it easy to automatically select algorithms and tune hyperparameters of pipelines that are compatible with scikit-learn, in a type-safe fashion.
Glue is a python project to link visualizations of scientific datasets across many files.
Glue Glue is a python project to link visualizations of scientific datasets across many files. Click on the image for a quick demo: Features Interacti
Transform-Invariant Non-Negative Matrix Factorization
Transform-Invariant Non-Negative Matrix Factorization A comprehensive Python package for Non-Negative Matrix Factorization (NMF) with a focus on learn
A collection of neat and practical data science and machine learning projects
Data Science A collection of neat and practical data science and machine learning projects
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
Cookiecutter Data Science A logical, reasonably standardized, but flexible project structure for doing and sharing data science work. Project homepage
eyes is a Public Opinion Mining System focusing on Taiwanese forums such as PTT and Dcard.
eyes is a Public Opinion Mining System focusing on Taiwanese forums such as PTT and Dcard. Features 🔥 Article monitor: helps you capture the trend at a
🌌 Economics Observatory Visualisation Repository
Economics Observatory Visualisation Repository Website | Visualisations | Data | Here you will find all the data visualisations and infographics attac
This tool analyzes the json files generated by stream-lnd-htlcs to find hidden channel demand.
analyze_lnd_htlc Introduction Rebalancing channels is an important part of running a Lightning Network node. While it would be great if all channels c
A Discord bot to easily and quickly format your JSON data
Invite PrettyJSON to your Discord server Table of contents About the project What is JSON? What is pretty printing? How to use Input options Command I
Code associated with the paper "Towards Understanding the Data Dependency of Mixup-style Training".
Mixup-Data-Dependency Code associated with the paper "Towards Understanding the Data Dependency of Mixup-style Training". Running Alternating Line Exp
The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization".
Codebase for learning control flow in transformers The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformer
KaziText is a tool for modelling common human errors.
KaziText KaziText is a tool for modelling common human errors. It estimates probabilities of individual error types (so called aspects) from grammatic
Tools for working with MARC data in Catalogue Bridge.
catbridge_tools Tools for working with MARC data in Catalogue Bridge. Borrows heavily from PyMarc
Collect training and calibration data for gaze tracking
Collect Training and Calibration Data for Gaze Tracking This tool allows collecting gaze data necessary for personal calibration or training of eye-tr
Data and code for ICCV 2021 paper Distant Supervision for Scene Graph Generation.
Distant Supervision for Scene Graph Generation Data and code for ICCV 2021 paper Distant Supervision for Scene Graph Generation. Introduction The pape
Automatic CPU speed & power optimizer for Linux
Automatic CPU speed & power optimizer for Linux based on active monitoring of the laptop's battery state, CPU usage, CPU temperature and system load, ultimately allowing you to improve battery life without making any compromises.
Unofficial Open Corporates CLI: OpenCorporates is a website that shares data on corporations under the copyleft Open Database License. This is an unofficial OpenCorporates Python command-line tool.
Unofficial Open Corporates CLI OpenCorporates is a website that shares data on corporations under the copyleft Open Database License. This is an unoff
Home Assistant custom integration to fetch data from Powerpal
Powerpal custom component for Home Assistant Component to integrate with powerpal. This repository and integration is not affiliated with Powerpal. Th
Cryptocurrency Exchange Websocket Data Feed Handler
Cryptocurrency Exchange Websocket Data Feed Handler
A mini tool for fans of debe from eksisozluk (one of the most famous websites in Turkey, described as a collaborative dictionary similar to Reddit) that pushes debe (Most Liked Entries of Yesterday) to Kindle every day via GitHub Actions.
debe to kindle A mini tool for fans of debe from eksisozluk (one of the most famous websites, described as a collaborative dictionary similar to Reddit, in Turkey
A module for building a spam filter using Python and the multinomial Naive Bayes algorithm.
Naive-Bayes Spam Classificator A module for building a spam filter using Python and the multinomial Naive Bayes algorithm. The main goal is to code a
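For readers unfamiliar with the technique named in this entry, here is a minimal sketch of a multinomial Naive Bayes text classifier with Laplace (add-one) smoothing. It is not the repository's code: the function names `train` and `classify`, the whitespace tokenizer, and the toy messages are assumptions chosen to keep the example short.

```python
import math
from collections import Counter, defaultdict


def train(messages):
    """Count labels and per-label word frequencies from (label, text) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, text in messages:
        label_counts[label] += 1
        for word in text.lower().split():  # naive whitespace tokenizer (assumption)
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(text, model, alpha=1.0):
    """Return the label with the highest log-posterior under multinomial NB."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


toy_model = train([
    ("spam", "win money now"),
    ("spam", "free money prize"),
    ("ham", "meeting at noon"),
    ("ham", "lunch at noon tomorrow"),
])
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the smoothing term `alpha` keeps a word that is unseen for one label from zeroing out that label's score.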
This tool can be used to extract information from any website
WEB-INFO- This tool can be used to extract information from any website Install Termux and run the command --- $ apt-get update $ apt-get upgrade $ pk
Username reconnaissance tool that checks the availability of a specified username on over 200 websites.
Username reconnaissance tool that checks the availability of a specified username on over 200 websites. Installation & Usage Clone from Github: $ git c
Datashredder is a simple data corruption engine written in Python. You can corrupt anything: text, images, and video.
Datashredder is a simple data corruption engine written in Python. You can corrupt anything: text, images, and video. You can choose the cha
A Python tool to generate and refresh Amazon access tokens.
amazon_auth A Python tool to generate and refresh Amazon access tokens. Description This tool generates and outputs Amazon access and refresh tokens f
Monitor creation, deletion and changes to LDAP objects live during your pentest or system administration!
LDAP Monitor Monitor creation, deletion and changes to LDAP objects live during your pentest or system administration! With this tool you can quickly
You can encode and decode base85, ascii85, base64, base32, and base16 with this tool.
You can encode and decode base85, ascii85, base64, base32, and base16 with this tool.
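The encodings listed in this entry are all available in Python's standard `base64` module. The sketch below shows one way to round-trip a byte string through each of them. The helper name `encode_all` is an assumption for illustration and says nothing about how the listed tool is implemented.

```python
import base64

# Map each encoding name to its (encode, decode) pair from the standard library.
CODECS = {
    "base16": (base64.b16encode, base64.b16decode),
    "base32": (base64.b32encode, base64.b32decode),
    "base64": (base64.b64encode, base64.b64decode),
    "ascii85": (base64.a85encode, base64.a85decode),
    "base85": (base64.b85encode, base64.b85decode),
}


def encode_all(data: bytes) -> dict:
    """Encode `data` with every codec and verify that decoding restores it."""
    results = {}
    for name, (encode, decode) in CODECS.items():
        encoded = encode(data)
        if decode(encoded) != data:
            raise ValueError(f"{name} round trip failed")
        results[name] = encoded.decode("ascii")
    return results
```

For example, `encode_all(b"hello")["base64"]` gives `"aGVsbG8="`. Note that `a85encode` and `b85encode` use different alphabets, so their outputs are not interchangeable.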
A small command-line tool for interacting with GQL APIs
igqloo A small tool for interacting with GQL APIs Arguments, mutations, aliases are all supported. Other features, such as fragments, are left unsuppo
A helper library to scrape data from TikTok in one line, using the Influencer Hunters APIs.
TikTok Scraper A utility library to scrape data from TikTok hassle-free
ONYX SMTP Sender is a tool for sending HTML emails to a list of email addresses (in a .txt file). This is the first version of the tool and I am releasing it in a bit of a rush, so if the software seems outdated that is normal; I am still working on it ;)
SMTP-Sender ONYX SMTP Sender is a tool for sending HTML emails to a list of email addresses (in a .txt file). This is the first version of the tool and
Tiny Git is a simplified version of Git with only the basic functionalities to gain better understanding of git internals.
Tiny Git is a simplified version of Git with only the basic functionalities to gain better understanding of git internals. Implemented Functi
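Novel deep learning research work with PaddlePaddle
Research Releases cutting-edge research work based on PaddlePaddle, including top-conference papers and competition-winning models in CV, NLP, KG, STDM, and other fields. Contents: Computer Vision, Natural Language Processing, Knowledge Graph, spatio-temporal data mining (Spa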
Code and Data for NeurIPS2021 Paper "A Dataset for Answering Time-Sensitive Questions"
Time-Sensitive-QA The repo contains the dataset and code for NeurIPS2021 (dataset track) paper Time-Sensitive Question Answering dataset. The dataset
Official Repository for our ICCV2021 paper: Continual Learning on Noisy Data Streams via Self-Purified Replay
Continual Learning on Noisy Data Streams via Self-Purified Replay This repository contains the official PyTorch implementation for our ICCV2021 paper.
A tutorial presenting several practical examples of how to build DAGs in Apache Airflow
Apache Airflow - Python Brasil 2021 This tutorial presents several practical examples of how to build DAGs in Apache Airflow. Background Apache Airfl
International Space Station data with Python research 🌎
International Space Station data with Python research 🌎 Plotting ISS trajectory, calculating the velocity over the earth and more. Plotting trajector
Shellmon is a tool used to create and control a webshell remotely, written in Python 3
A Simple PHP Webshell Manager Description Shellmon is a tool used to create and control a webshell remotely, created using the Python3 programming la
Code and data for learning to search in local branching
Code and data for learning to search in local branching
Simple yet efficient tool used to check and sort tokens by their validity.
Discord Token Checker Simple yet efficient tool used to check and sort tokens by their validity. When the program is done, go to the "output"
Python package for processing UC module spectral data.
UC Module Python Package How To Install clone repo. cd UC-module pip install . How to Use uc.module.UC(measurment=str, dark=str, reference=str, heade
Trellox Tool is written in Python3 and designed to pull and list Trello boards.
TrelloX Trellox Tool is written in Python3 and designed to list and pull Trello boards. It can be used by penetration testers/bug bounty hunters to de
The Black shade analyser and comparison tool.
diff-shades The Black shade analyser and comparison tool. AKA Richard's personal take at a better black-primer (by stealing ideas from mypy-primer) :p
tetrados is a tool to generate a density of states using the linear tetrahedron method from a band structure.
tetrados tetrados is a tool to generate a density of states using the linear tetrahedron method from a band structure. Currently, only VASP calculatio
Easily report Instagram pages and close the page
Program Features - 📌 Delete target post on Instagram. - 📌 Delete Media Target post on Instagram - 📌 Complete deletion of the target account on Inst
A simple demonstration of integrating a sentiment analysis tool in a Django project
sentiment-analysis A simple demonstration of integrating a sentiment analysis tool in a django project (watch the video .mp4) To run this project : pi
AI Flow is an open source framework that bridges big data and artificial intelligence.
Flink AI Flow Introduction Flink AI Flow is an open source framework that bridges big data and artificial intelligence. It manages the entire machine
Script for scraping user data such as "id, username, fullname, followers, tweets, etc." via Twitter's search engine.
TwitterScraper Script for scraping user data such as "id, username, fullname, followers, tweets, etc." via Twitter's search engine. Screenshot Data Users Only
A D3.js plugin that produces flame graphs from hierarchical data.
d3-flame-graph A D3.js plugin that produces flame graphs from hierarchical data. If you don't know what flame graphs are, check Brendan Gregg's post.
wikirepo is a Python package that provides a framework to easily source and leverage standardized Wikidata information
Python based Wikidata framework for easy dataframe extraction wikirepo is a Python package that provides a framework to easily source and leverage sta
A Python framework to build Slack apps in a flash with the latest platform features.
Bolt for Python A Python framework to build Slack apps in a flash with the latest platform features. Read the getting started guide and look at our co
Python-based Space Physics Environment Data Analysis Software
pySPEDAS pySPEDAS is an implementation of the SPEDAS framework for Python. The Space Physics Environment Data Analysis Software (SPEDAS) framework is
Nature-inspired algorithms are a very popular tool for solving optimization problems.
Nature-inspired algorithms are a very popular tool for solving optimization problems. Numerous variants of nature-inspired algorithms have been develo
Framework to collect and process weather data from wttr.in.
Weathercrawler Automatic extraction and processing framework for weather data from wttr.in Installation tested with: Python 3.7.3 Python 3.9.4 git clo
Code and data for ACL2021 paper Cross-Lingual Abstractive Summarization with Limited Parallel Resources.
Multi-Task Framework for Cross-Lingual Abstractive Summarization (MCLAS) The code for ACL2021 paper Cross-Lingual Abstractive Summarization with Limit
A practical and feature-rich paraphrasing framework to augment human intents in text form to build robust NLU models for conversational engines. Created by Prithiviraj Damodaran. Open to pull requests and other forms of collaboration.
Parrot Parrot is a paraphrase based utterance augmentation framework purpose built to accelerate training NLU models. A paraphrase framework is more t
EasyBuild is a software build and installation framework that allows you to manage (scientific) software on High Performance Computing (HPC) systems in an efficient way.
EasyBuild is a software build and installation framework that allows you to manage (scientific) software on High Performance Computing (HPC) systems in an efficient way.
MaD GUI is a basis for graphical annotation and computational analysis of time series data.
MaD GUI Machine Learning and Data Analytics Graphical User Interface MaD GUI is a basis for graphical annotation and computational analysis of time se
Multiview 3D object detection on MultiviewC dataset through moft3d.
Multiview Orthographic Feature Transformation for 3D Object Detection Multiview 3D object detection on MultiviewC dataset through moft3d. Introduction
A python tool used for hacking WhatsApp by diverting otp
W-HACK A python tool used for hacking WhatsApp by diverting otp You can hack WhatsApp easily with this tool Note:OTP expires after 5 seconds HOW TO IN
A utility for quickly cropping large collections of images.
Crop Tool A utility for quickly cropping large collections of images. Inspired by Derrick Schultz's dataset-tools. Setup It's suggested that you use A
Efficient matrix representations for working with tabular data
Efficient matrix representations for working with tabular data
Code and data of the EMNLP 2021 paper "Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer"
StyleAttack Code and data of the EMNLP 2021 paper "Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer" Prepare Pois
Benchmark datasets, data loaders, and evaluators for graph machine learning
Overview The Open Graph Benchmark (OGB) is a collection of benchmark datasets, data loaders, and evaluators for graph machine learning. Datasets cover
A simple tool for searching images inside a local folder with text/image input using CLIP
clip-search (WIP) A simple tool for searching images inside a local folder with text/image input using CLIP 10 results for "a blonde woman" in a folde