373 Repositories
Python large-scale-ml Libraries
Global Tracking Transformers, CVPR 2022
Global Tracking Transformers Global Tracking Transformers, Xingyi Zhou, Tianwei Yin, Vladlen Koltun, Philipp Krähenbühl, CVPR 2022 (arXiv 2203.13250)
JF⚡can - Super fast port scanning & service discovery using Masscan and Nmap. Scan large networks with Masscan and use Nmap's scripting abilities to discover information about services. Generate report.
Description Killing features Perform a large-scale scans using Nmap! Allows you to use Masscan to scan targets and execute Nmap on detected ports with
MAT: Mask-Aware Transformer for Large Hole Image Inpainting
MAT: Mask-Aware Transformer for Large Hole Image Inpainting (CVPR2022, Oral) Wenbo Li, Zhe Lin, Kun Zhou, Lu Qi, Yi Wang, Jiaya Jia [Paper] News This
BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training
BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training By Likun Cai, Zhi Zhang, Yi Zhu, Li Zhang, Mu Li, Xiangyang Xue. This
Georeferencing large amounts of data for free.
Geolocate Georeferencing large amounts of data for free. Special thanks to @brunodepauloalmeida and the whole team for the contributions. How? It's us
Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement
MTFAA-Net Unofficial PyTorch implementation of Baidu's MTFAA-Net: "Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speec
Use the state-of-the-art m2m100 to translate large data on CPU/GPU/TPU. Super Easy!
Easy-Translate is a script for translating large text files in your machine using the M2M100 models from Facebook/Meta AI. We also privide a script fo
A large-scale (194k), Multiple-Choice Question Answering (MCQA) dataset designed to address realworld medical entrance exam questions.
MedMCQA MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering A large-scale, Multiple-Choice Question Answe
SentimentArcs: a large ensemble of dozens of sentiment analysis models to analyze emotion in text over time
SentimentArcs - Emotion in Text An end-to-end pipeline based on Jupyter notebooks to detect, extract, process and anlayze emotion over time in text. E
This is the course repository for the Spring 2022 iteration of MACS 30123 "Large-Scale Computing for the Social Sciences" at the University of Chicago.
Large-Scale Computing for the Social Sciences Spring 2022 - MACS 30123/MAPS 30123/PLSC 30123 Instructor Information TA Information TA Information Cour
Guide to using pre-trained large language models of source code
Large Models of Source Code I occasionally train and publicly release large neural language models on programs, including PolyCoder. Here, I describe
Enterprise Scale NLP with Hugging Face & SageMaker Workshop series
Workshop: Enterprise-Scale NLP with Hugging Face & Amazon SageMaker Earlier this year we announced a strategic collaboration with Amazon to make it ea
code for Image Manipulation Detection by Multi-View Multi-Scale Supervision
MVSS-Net Code and models for ICCV 2021 paper: Image Manipulation Detection by Multi-View Multi-Scale Supervision Update 22.02.17, Pretrained model for
Large-scale language modeling tutorials with PyTorch
Large-scale language modeling tutorials with PyTorch 안녕하세요. 저는 TUNiB에서 머신러닝 엔지니어로 근무 중인 고현웅입니다. 이 자료는 대규모 언어모델 개발에 필요한 여러가지 기술들을 소개드리기 위해 마련하였으며 기본적으로
Framework for evaluating ANNS algorithms on billion scale datasets.
Billion-Scale ANN http://big-ann-benchmarks.com/ Install The only prerequisite is Python (tested with 3.6) and Docker. Works with newer versions of Py
[SIGGRAPH'22] StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets
[Project] [PDF] This repository contains code for our SIGGRAPH'22 paper "StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets" by Axel Sauer, Katja
TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.
TorchMultimodal (Alpha Release) Introduction TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.
HSC4D: Human-centered 4D Scene Capture in Large-scale Indoor-outdoor Space Using Wearable IMUs and LiDAR. CVPR 2022
HSC4D: Human-centered 4D Scene Capture in Large-scale Indoor-outdoor Space Using Wearable IMUs and LiDAR. CVPR 2022 [Project page | Video] Getting sta
The official codes of our CVPR2022 paper: A Differentiable Two-stage Alignment Scheme for Burst Image Reconstruction with Large Shift
TwoStageAlign The official codes of our CVPR2022 paper: A Differentiable Two-stage Alignment Scheme for Burst Image Reconstruction with Large Shift Pa
Large-Scale Pre-training for Person Re-identification with Noisy Labels (LUPerson-NL)
LUPerson-NL Large-Scale Pre-training for Person Re-identification with Noisy Labels (LUPerson-NL) The repository is for our CVPR2022 paper Large-Scale
DeepGNN is a framework for training machine learning models on large scale graph data.
DeepGNN Overview DeepGNN is a framework for training machine learning models on large scale graph data. DeepGNN contains all the necessary features in
PyTorch reimplementation of the Smooth ReLU activation function proposed in the paper "Real World Large Scale Recommendation Systems Reproducibility and Smooth Activations" [arXiv 2022].
Smooth ReLU in PyTorch Unofficial PyTorch reimplementation of the Smooth ReLU (SmeLU) activation function proposed in the paper Real World Large Scale
Audio-analytics for music-producers! Automate tedious tasks such as musical scale detection, BPM rate classification and audio file conversion.
Click here to be re-directed to the Beat Inspect Streamlit Web-App You are a music producer? Let's get in touch via LinkedIn Fundamental Analytics for
Repository for DNN training, theory to practice, part of the Large Scale Machine Learning class at Mines Paritech
DNN Training, from theory to practice This repository is complementary to the deep learning training lesson given to les Mines ParisTech on the 11th o
graphical orbitational simulation of solar system planets with real values and physics implemented so you get a nice elliptical orbits. you can change timestamp value or scale from source code idc.
solarSystemOrbitalSimulation graphical orbitational simulation of solar system planets with real values and physics implemented so you get a nice elli
IPscan - This Script is Framework To automate IP process large scope For Bug Hunting
IPscan This Script is Framework To automate IP process large scope For Bug Hunti
Large-Scale Unsupervised Object Discovery
Large-Scale Unsupervised Object Discovery Huy V. Vo, Elena Sizikova, Cordelia Schmid, Patrick Pérez, Jean Ponce [PDF] We propose a novel ranking-based
An ETL Pipeline of a large data set from a fictitious music streaming service named Sparkify.
An ETL Pipeline of a large data set from a fictitious music streaming service named Sparkify. The ETL process flows from AWS's S3 into staging tables in AWS Redshift.
Large-scale Knowledge Graph Construction with Prompting
Large-scale Knowledge Graph Construction with Prompting across tasks (predictive and generative), and modalities (language, image, vision + language, etc.)
The mini-AlphaStar (mini-AS, or mAS) - mini-scale version (non-official) of the AlphaStar (AS)
A mini-scale reproduction code of the AlphaStar program. Note: the original AlphaStar is the AI proposed by DeepMind to play StarCraft II.
FewBit — a library for memory efficient training of large neural networks
FewBit FewBit — a library for memory efficient training of large neural networks. Its efficiency originates from storage optimizations applied to back
Code for the tech report Toward Training at ImageNet Scale with Differential Privacy
Differentially private Imagenet training Code for the tech report Toward Training at ImageNet Scale with Differential Privacy by Alexey Kurakin, Steve
Kartothek - a Python library to manage large amounts of tabular data in a blob store
Kartothek - a Python library to manage (create, read, update, delete) large amounts of tabular data in a blob store
YOLTv5 rapidly detects objects in arbitrarily large aerial or satellite images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks
YOLTv5 rapidly detects objects in arbitrarily large aerial or satellite images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks.
Opencontactbook - Bulk-manage large numbers of vCard contacts with built-in geolocation
Open Contact Book Open Contact Book is a buiness-oriented, cross-platform, Pytho
GTK and Python based, simple multiple image editor tool
System Monitoring Center GTK3 and Python3 based, simple multiple image editor tool. Note: Development of this application is not completed yet. The ap
Split large XML files into smaller ones for easy upload
Split large XML files into smaller ones for easy upload. Works for WordPress Posts Import and other XML files.
A large-scale benchmark for co-optimizing the design and control of soft robots, as seen in NeurIPS 2021.
Evolution Gym A large-scale benchmark for co-optimizing the design and control of soft robots. As seen in Evolution Gym: A Large-Scale Benchmark for E
WebUAV-3M: A Benchmark Unveiling the Power of Million-Scale Deep UAV Tracking
WebUAV-3M: A Benchmark Unveiling the Power of Million-Scale Deep UAV Tracking [Paper Link] Abstract In this work, we contribute a new million-scale Un
Very large and sparse networks appear often in the wild and present unique algorithmic opportunities and challenges for the practitioner
Sparse network learning with snlpy Very large and sparse networks appear often in the wild and present unique algorithmic opportunities and challenges
Python code for the paper How to scale hyperparameters for quickshift image segmentation
How to scale hyperparameters for quickshift image segmentation Python code for the paper How to scale hyperparameters for quickshift image segmentatio
A small scale relica of bank management system using the MySQL queries in the python language.
Bank_Management_system This is a Bank Management System Database Project. Abstract: The main aim of the Bank Management Mini project is to keep record
Storing, versioning, and downloading files from S3 made as easy as using open() in Python. Caching included.
open(LARGE) Storing, versioning, and downloading files from S3 made as easy as using open() in Python. Caching included. Motivation Oftentimes, especi
Continuous Security Group Rule Change Detection & Response at scale
Introduction Get notified of Security Group Changes across all AWS Accounts & Regions in an AWS Organization, with the ability to respond/revert those
Raster Vision is an open source Python framework for building computer vision models on satellite, aerial, and other large imagery sets
Raster Vision is an open source Python framework for building computer vision models on satellite, aerial, and other large imagery sets (including obl
GNES enables large-scale index and semantic search for text-to-text, image-to-image, video-to-video and any-to-any content form
GNES is Generic Neural Elastic Search, a cloud-native semantic search system based on deep neural network.
NeurIPS workshop paper 'Counter-Strike Deathmatch with Large-Scale Behavioural Cloning'
Counter-Strike Deathmatch with Large-Scale Behavioural Cloning Tim Pearce, Jun Zhu Offline RL workshop, NeurIPS 2021 Paper: https://arxiv.org/abs/2104
Qcover is an open source effort to help exploring combinatorial optimization problems in Noisy Intermediate-scale Quantum(NISQ) processor.
Qcover is an open source effort to help exploring combinatorial optimization problems in Noisy Intermediate-scale Quantum(NISQ) processor. It is devel
Th2En & Th2Zh: The large-scale datasets for Thai text cross-lingual summarization
Th2En & Th2Zh: The large-scale datasets for Thai text cross-lingual summarization 📥 Download Datasets 📥 Download Trained Models INTRODUCTION TH2ZH (
The model is designed to train a single and large neural network in order to predict correct translation by reading the given sentence.
Neural Machine Translation communication system The model is basically direct to convert one source language to another targeted language using encode
Official PyTorch implementation of Time-aware Large Kernel (TaLK) Convolutions (ICML 2020)
Time-aware Large Kernel (TaLK) Convolutions (Lioutas et al., 2020) This repository contains the source code, pre-trained models, as well as instructio
this repo store a Awoesome telegram bot for protect from your large group from bot attack.
this repo store a Awoesome telegram bot for protect from your large group from bot attack.
(3DV 2021 Oral) Filtering by Cluster Consistency for Large-Scale Multi-Image Matching
Scalable Cluster-Consistency Statistics for Robust Multi-Object Matching (3DV 2021 Oral Presentation) Filtering by Cluster Consistency (FCC) is a very
This repo is the official implementation for Multi-Scale Adaptive Graph Neural Network for Multivariate Time Series Forecasting
1 MAGNN This repo is the official implementation for Multi-Scale Adaptive Graph Neural Network for Multivariate Time Series Forecasting. 1.1 The frame
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Awesome production machine learning This repository contains a curated list of awesome open source libraries that will help you deploy, monitor, versi
I label phrases on a scale of five values: negative, somewhat negative, neutral, somewhat positive, positive
I label phrases on a scale of five values: negative, somewhat negative, neutral, somewhat positive, positive. Obstacles like sentence negation, sarcasm, terseness, language ambiguity, and many others make this task very challenging.
Helper to organize your windows on your desktop.
The script of positionsing windows on the screen. How does it work? Select your window to move/res
A clean, easy to scale discord bot template
A clean, easy to scale discord bot template. Develope using nextcord library and can be use with any other discord.py forked library.
Supervised 3D Pre-training on Large-scale 2D Natural Image Datasets for 3D Medical Image Analysis
Introduction This is an implementation of our paper Supervised 3D Pre-training on Large-scale 2D Natural Image Datasets for 3D Medical Image Analysis.
City Surfaces: City-scale Semantic Segmentation of Sidewalk Surfaces
City Surfaces: City-scale Semantic Segmentation of Sidewalk Surfaces Paper Temporary GitHub page for City Surfaces paper. More soon! While designing s
Large scale PTM - PPI relation extraction
Large-scale protein-protein post-translational modification extraction with distant supervision and confidence calibrated BioBERT The silver standard
Trained T5 and T5-large model for creating keywords from text
text to keywords Trained T5-base and T5-large model for creating keywords from text. Supported languages: ru Pretraining Large version | Pretraining B
Forecast dynamically at scale with this unique package. pip install scalecast
🌄 Scalecast: Dynamic Forecasting at Scale About This package uses a scaleable forecasting approach in Python with common scikit-learn and statsmodels
A Simulation Environment to train Robots in Large Realistic Interactive Scenes
iGibson: A Simulation Environment to train Robots in Large Realistic Interactive Scenes iGibson is a simulation environment providing fast visual rend
Vis2Mesh: Efficient Mesh Reconstruction from Unstructured Point Clouds of Large Scenes with Learned Virtual View Visibility ICCV2021
Vis2Mesh This is the offical repository of the paper: Vis2Mesh: Efficient Mesh Reconstruction from Unstructured Point Clouds of Large Scenes with Lear
CMS for everyone, easy to deploy and scale, robust modular system with many packages.
Django-Leonardo Full featured platform for fast and easy building extensible web applications. Don't waste your time searching stable solution for dai
This is a DemoCode for parsing through large log files and triggering an email whenever there's an error.
LogFileParserDemoCode This is a DemoCode for parsing through large log files and triggering an email whenever there's an error. There are a total of f
PyTorch code for 'Efficient Single Image Super-Resolution Using Dual Path Connections with Multiple Scale Learning'
Efficient Single Image Super-Resolution Using Dual Path Connections with Multiple Scale Learning This repository is for EMSRDPN introduced in the foll
Interactive dimensionality reduction for large datasets
BlosSOM 🌼 BlosSOM is a graphical environment for running semi-supervised dimensionality reduction with EmbedSOM. You can use it to explore multidimen
NFT-Image-Generator - Utility to generate a large collection of unique images
NFT-Image-Generator Utility for creating a generative art collection from suppli
PyTorch code for 'Efficient Single Image Super-Resolution Using Dual Path Connections with Multiple Scale Learning'
Efficient Single Image Super-Resolution Using Dual Path Connections with Multiple Scale Learning This repository is for EMSRDPN introduced in the foll
TSIT: A Simple and Versatile Framework for Image-to-Image Translation
TSIT: A Simple and Versatile Framework for Image-to-Image Translation This repository provides the official PyTorch implementation for the following p
A web-based analysis toolkit for the System Usability Scale providing calculation, plotting, interpretation and contextualization utility
System Usability Scale Analysis Toolkit The System Usability Scale (SUS) Analysis Toolkit is a web-based python application that provides a compilatio
BRNet - code for Automated assessment of BI-RADS categories for ultrasound images using multi-scale neural networks with an order-constrained loss function
BRNet code for "Automated assessment of BI-RADS categories for ultrasound images using multi-scale neural networks with an order-constrained loss func
Codeflare - Scale complex AI/ML pipelines anywhere
Scale complex AI/ML pipelines anywhere CodeFlare is a framework to simplify the integration, scaling and acceleration of complex multi-step analytics
Offical implementation of Shunted Self-Attention via Multi-Scale Token Aggregation
Shunted Transformer This is the offical implementation of Shunted Self-Attention via Multi-Scale Token Aggregation by Sucheng Ren, Daquan Zhou, Shengf
The model is designed to train a single and large neural network in order to predict correct translation by reading the given sentence.
Neural Machine Translation communication system The model is basically direct to convert one source language to another targeted language using encode
A simple python discord bot which give you a yogurt brand name, basing on a large database often updated.
YaourtBot A discord simple bot by Lopinosaurus Before using this code : ・Move env file to .env ・Change the channel ID on line 38 of bot.py to your #pi
An distributed automation framework.
Automation Kit Repository Welcome to the Automation Kit repository! Note: This package is progressing quickly but is not yet ready for full production
N-Omniglot is a large neuromorphic few-shot learning dataset
N-Omniglot [Paper] || [Dataset] N-Omniglot is a large neuromorphic few-shot learning dataset. It reconstructs strokes of Omniglot as videos and uses D
Official release of MSHT: Multi-stage Hybrid Transformer for the ROSE Image Analysis of Pancreatic Cancer axriv: http://arxiv.org/abs/2112.13513
MSHT: Multi-stage Hybrid Transformer for the ROSE Image Analysis This is the official page of the MSHT with its experimental script and records. We de
Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21)
Baleen Baleen is a state-of-the-art model for multi-hop reasoning, enabling scalable multi-hop search over massive collections for knowledge-intensive
Manage large and heterogeneous data spaces on the file system.
signac - simple data management The signac framework helps users manage and scale file-based workflows, facilitating data reuse, sharing, and reproduc
LynxKite: a complete graph data science platform for very large graphs and other datasets.
LynxKite is a complete graph data science platform for very large graphs and other datasets. It seamlessly combines the benefits of a friendly graphical interface and a powerful Python API.
A set of tests for evaluating large-scale algorithms for Wasserstein-2 transport maps computation.
Continuous Wasserstein-2 Benchmark This is the official Python implementation of the NeurIPS 2021 paper Do Neural Optimal Transport Solvers Work? A Co
Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning"
Prompt-Tuning Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning" Currently, we support the following huggigface models: Bart
CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images
CurriculumNet Introduction This repo contains related code and models from the ECCV 2018 CurriculumNet paper. CurriculumNet is a new training strategy
Official repository for "Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring".
RNN-MBP Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring (AAAI-2022) by Chao Zhu, Hang Dong, Jinshan Pan
A-ESRGAN aims to provide better super-resolution images by using multi-scale attention U-net discriminators.
A-ESRGAN: Training Real-World Blind Super-Resolution with Attention-based U-net Discriminators The authors are hidden for the purpose of double blind
A Semantic Segmentation Network for Urban-Scale Building Footprint Extraction Using RGB Satellite Imagery
A Semantic Segmentation Network for Urban-Scale Building Footprint Extraction Using RGB Satellite Imagery This repository is the official implementati
OSLO: Open Source framework for Large-scale transformer Optimization
O S L O Open Source framework for Large-scale transformer Optimization What's New: December 21, 2021 Released OSLO 1.0. What is OSLO about? OSLO is a
Django models and endpoints for working with large images -- tile serving
Django Large Image Models and endpoints for working with large images in Django -- specifically geared towards geospatial tile serving. DISCLAIMER: th
A simple CLI application helps you to find giant files that are eating up your system storage
Large file finder Sometimes it's very hard to find if some giant files are eating up your system storage. We might need to hunt those down. This simpl
MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV2020]
MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV2020] by Kaisiyuan Wang, Qianyi Wu, Linsen Song, Zhuoqian Yang, Wa
UniSpeech - Large Scale Self-Supervised Learning for Speech
UniSpeech The family of UniSpeech: WavLM (arXiv): WavLM: Large-Scale Self-Supervised Pre-training for Full Stack Speech Processing UniSpeech (ICML 202
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Hiring We are hiring at all levels (including FTE researchers and interns)! If you are interested in working with us on NLP and large-scale pre-traine
Large-scale pretraining for dialogue
A State-of-the-Art Large-scale Pretrained Response Generation Model (DialoGPT) This repository contains the source code and trained model for a large-
Large-scale open domain KNOwledge grounded conVERsation system based on PaddlePaddle
Knover Knover is a toolkit for knowledge grounded dialogue generation based on PaddlePaddle. Knover allows researchers and developers to carry out eff
Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation (ACM MM 2020)
Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation (ACM MM 2020) Official implementation of: Forest R-CNN: Large-Vo
Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. CVPR 2018
Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning Tensorflow code and models for the paper: Large Scale Fine-Grained Categ