154 Repositories
Python cuda-kernel Libraries
A fast poisson image editing implementation that can utilize multi-core CPU or GPU to handle a high-resolution image input.
Poisson Image Editing - A Parallel Implementation Jiayi Weng (jiayiwen), Zixu Chen (zixuc) Poisson Image Editing is a technique that can fuse two imag
WarpRNNT loss ported in Numba CPU/CUDA for Pytorch
RNNT loss in Pytorch - Numba JIT compiled (warprnnt_numba) Warp RNN Transducer Loss for ASR in Pytorch, ported from HawkAaron/warp-transducer and a re
Implements VQGAN+CLIP for image and video generation, and style transfers, based on text and image prompts. Emphasis on ease-of-use, documentation, and smooth video creation.
VQGAN-CLIP-GENERATOR Overview This is a package (with available notebook) for running VQGAN+CLIP locally, with a focus on ease of use, good documentat
Deploying PyTorch Model to Production with FastAPI in CUDA-supported Docker
Deploying PyTorch Model to Production with FastAPI in CUDA-supported Docker A example FastAPI PyTorch Model deploy with nvidia/cuda base docker. Model
codebase for "A Theory of the Inductive Bias and Generalization of Kernel Regression and Wide Neural Networks"
Eigenlearning This repo contains code for replicating the experiments of the paper A Theory of the Inductive Bias and Generalization of Kernel Regress
Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration 🚃
This repository provides a library for efficient training of masked language models (MLM), built with fairseq. We fork fairseq to give researchers mor
Official implementation of Unfolded Deep Kernel Estimation for Blind Image Super-resolution.
Unfolded Deep Kernel Estimation for Blind Image Super-resolution Hongyi Zheng, Hongwei Yong, Lei Zhang, "Unfolded Deep Kernel Estimation for Blind Ima
Map Ookla server locations as a Kernel Density Estimation (KDE) geographic map plot.
Ookla Server KDE Plotting This notebook was created to map Ookla server locations as a Kernel Density Estimation (KDE) geographic map plot. Currently,
Python code to fuse multiple RGB-D images into a TSDF voxel volume.
Volumetric TSDF Fusion of RGB-D Images in Python This is a lightweight python script that fuses multiple registered color and depth images into a proj
This program presents convolutional kernel density estimation, a method used to detect intercritical epilpetic spikes (IEDs)
Description This program presents convolutional kernel density estimation, a method used to detect intercritical epilpetic spikes (IEDs) in [Gardy et
Decorators for maximizing memory utilization with PyTorch & CUDA
torch-max-mem This package provides decorators for memory utilization maximization with PyTorch and CUDA by starting with a maximum parameter size and
A Python Jupyter Kernel in Slack. Just send Python code as a message.
Slack IPython bot 🤯 One Slack bot to rule them all. PyBot. Just send Python code as a message. Install pip install slack-ipython To start the bot, si
On the adaptation of recurrent neural networks for system identification
On the adaptation of recurrent neural networks for system identification This repository contains the Python code to reproduce the results of the pape
Snek-test - An operating system kernel made in python and assembly
pythonOS An operating system kernel made in python and assembly Wait what? It us
DaCe is a parallel programming framework that takes code in Python/NumPy and other programming languages
aCe - Data-Centric Parallel Programming Decoupling domain science from performance optimization. DaCe is a parallel programming framework that takes c
[内测中]前向式Python环境快捷封装工具,快速将Python打包为EXE并添加CUDA、NoAVX等支持。
QPT - Quick packaging tool 快捷封装工具 GitHub主页 | Gitee主页 QPT是一款可以“模拟”开发环境的多功能封装工具,最短只需一行命令即可将普通的Python脚本打包成EXE可执行程序,并选择性添加CUDA和NoAVX的支持,尽可能兼容更多的用户环境。 感觉还可
Proof of concept of CVE-2022-21907 Double Free in http.sys driver, triggering a kernel crash on IIS servers
CVE-2022-21907 - Double Free in http.sys driver Summary An unauthenticated attacker can send an HTTP request with an "Accept-Encoding" HTTP request he
Instant neural graphics primitives: lightning fast NeRF and more
Instant Neural Graphics Primitives Ever wanted to train a NeRF model of a fox in under 5 seconds? Or fly around a scene captured from photos of a fact
Official PyTorch implementation of Time-aware Large Kernel (TaLK) Convolutions (ICML 2020)
Time-aware Large Kernel (TaLK) Convolutions (Lioutas et al., 2020) This repository contains the source code, pre-trained models, as well as instructio
A Haskell kernel for IPython.
IHaskell You can now try IHaskell directly in your browser at CoCalc or mybinder.org. Alternatively, watch a talk and demo showing off IHaskell featur
🎃 Core identification module of AI powerful point reading system platform.
ppReader-Kernel Intro Core identification module of AI powerful point reading system platform. Usage 硬件: Windows10、GPU:nvdia GTX 1060 、普通RBG相机 软件: con
Pythonic particle-based (super-droplet) warm-rain/aqueous-chemistry cloud microphysics package with box, parcel & 1D/2D prescribed-flow examples in Python, Julia and Matlab
PySDM PySDM is a package for simulating the dynamics of population of particles. It is intended to serve as a building block for simulation systems mo
Awesome Graph Classification - A collection of important graph embedding, classification and representation learning papers with implementations.
A collection of graph classification methods, covering embedding, deep learning, graph kernel and factorization papers
Code that accompanies the paper Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance
Semi-supervised Deep Kernel Learning This is the code that accompanies the paper Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data
Lunar is a neural network aimbot that uses real-time object detection accelerated with CUDA on Nvidia GPUs.
Lunar Lunar is a neural network aimbot that uses real-time object detection accelerated with CUDA on Nvidia GPUs. About Lunar can be modified to work
A Bot to Track Kernel Upstreams from kernel.org and Post it on Telegram Channel
Channel Kernel Tracker is the channel where the bot will be sending the updates in. Introduction This is a Telegram Bot to Track Kernel Upstreams kern
Template repository to build PyTorch projects from source on any version of PyTorch/CUDA/cuDNN.
The Ultimate PyTorch Source-Build Template Translations: 한국어 TL;DR PyTorch built from source can be x4 faster than a naïve PyTorch install. This repos
pythonOS: An operating system kernel made in python and assembly
pythonOS An operating system kernel made in python and assembly Wait what? It uses a custom compiler called snek that implements a part of python3.9 (
Python Jupyter kernel using Poetry for reproducible notebooks
Poetry Kernel Use per-directory Poetry environments to run Jupyter kernels. No need to install a Jupyter kernel per Python virtual environment! The id
NumPy aware dynamic Python compiler using LLVM
Numba A Just-In-Time Compiler for Numerical Functions in Python Numba is an open source, NumPy-aware optimizing compiler for Python sponsored by Anaco
Home for cuQuantum Python & NVIDIA cuQuantum SDK C++ samples
Welcome to the cuQuantum repository! This public repository contains two sets of files related to the NVIDIA cuQuantum SDK: samples: All C/C++ sample
nettrace is a powerful tool to trace network packet and diagnose network problem inside kernel.
nettrace nettrace is is a powerful tool to trace network packet and diagnose network problem inside kernel on TencentOS. It make use of eBPF and BCC.
Neural network for digit classification powered by cuda
cuda_nn_mnist Neural network library for digit classification powered by cuda Resources The library was built to work with MNIST dataset. python-mnist
Lightweight Cuda Renderer with Python Wrapper.
pyRender Lightweight Cuda Renderer with Python Wrapper. Compile Change compile.sh line 5 to the glm library include path. This library can be download
Fastshap: A fast, approximate shap kernel
fastshap: A fast, approximate shap kernel fastshap was designed to be: Fast Calculating shap values can take an extremely long time. fastshap utilizes
Robotics with GPU computing
Robotics with GPU computing Cupoch is a library that implements rapid 3D data processing for robotics using CUDA. The goal of this library is to imple
An addernet CUDA version
Training addernet accelerated by CUDA Usage cd adder_cuda python setup.py install cd .. python main.py Environment pytorch 1.10.0 CUDA 11.3 benchmark
Hardware-accelerated DNN model inference ROS2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU
Isaac ROS DNN Inference Overview This repository provides two NVIDIA GPU-accelerated ROS2 nodes that perform deep learning inference using custom mode
The source code for Adaptive Kernel Graph Neural Network at AAAI2022
AKGNN The source code for Adaptive Kernel Graph Neural Network at AAAI2022. Please cite our paper if you think our work is helpful to you: @inproceedi
Massively parallel Monte Carlo diffusion MR simulator written in Python.
Disimpy Disimpy is a Python package for generating simulated diffusion-weighted MR signals that can be useful in the development and validation of dat
NewsMTSC: (Multi-)Target-dependent Sentiment Classification in News Articles
NewsMTSC: (Multi-)Target-dependent Sentiment Classification in News Articles NewsMTSC is a dataset for target-dependent sentiment classification (TSC)
An echo kernel for JupyterLite
jupyterlite-echo-kernel An echo kernel for JupyterLite. Requirements JupyterLite = 0.1.0a10 Install To install the extension, execute: pip install ju
IDA Pro Python plugin to analyze and annotate Linux kernel alternatives
About This is an IDA Pro (Interactive Disassembler) plugin allowing to automatically analyze and annotate Linux kernel alternatives (content of .altin
Deploying a Text Summarization NLP use case on Docker Container Utilizing Nvidia GPU
GPU Docker NLP Application Deployment Deploying a Text Summarization NLP use case on Docker Container Utilizing Nvidia GPU, to setup the enviroment on
CUda Matrix Multiply library.
cumm CUda Matrix Multiply library. cumm is developed during learning of CUTLASS, which use too much c++ template and make code unmaintainable. So I de
Self-Supervised Learning with Kernel Dependence Maximization
Self-Supervised Learning with Kernel Dependence Maximization This is the code for SSL-HSIC, a self-supervised learning loss proposed in the paper Self
Code base for NeurIPS 2021 publication titled Kernel Functional Optimisation (KFO)
KernelFunctionalOptimisation Code base for NeurIPS 2021 publication titled Kernel Functional Optimisation (KFO) We have conducted all our experiments
A iot Bike sytem based on RaspberryPi, Ardiuino
Cyclic 's Kernel ---- A iot Bike sytem based on RaspberryPi, Ardiuino, etc 0x1 What is This? Cyclic 's Kernel is an independent System With self-produ
Prevent `CUDA error: out of memory` in just 1 line of code.
🐨 Koila Koila solves CUDA error: out of memory error painlessly. Fix it with just one line of code, and forget it. 🚀 Features 🙅 Prevents CUDA error
Realtime Viewer Mandelbrot set with Python and Taichi (cpu, opengl, cuda, vulkan, metal)
Mandelbrot-set-Realtime-Viewer- Realtime Viewer Mandelbrot set with Python and Taichi (cpu, opengl, cuda, vulkan, metal) Control: "WASD" - movement, "
Accompanying code for the paper "A Kernel Test for Causal Association via Noise Contrastive Backdoor Adjustment".
#backdoor-HSIC (bd_HSIC) Accompanying code for the paper "A Kernel Test for Causal Association via Noise Contrastive Backdoor Adjustment". To generate
Kernel Point Convolutions
Created by Hugues THOMAS Introduction Update 27/04/2020: New PyTorch implementation available. With SemanticKitti, and Windows supported. This reposit
Paper: Cross-View Kernel Similarity Metric Learning Using Pairwise Constraints for Person Re-identification
Cross-View Kernel Similarity Metric Learning Using Pairwise Constraints for Person Re-identification T M Feroz Ali, Subhasis Chaudhuri, ICVGIP-20-21
Build and run Docker containers leveraging NVIDIA GPUs
NVIDIA Container Toolkit Introduction The NVIDIA Container Toolkit allows users to build and run GPU accelerated Docker containers. The toolkit includ
Pytorch cuda extension of grid_sample1d
Grid Sample 1d pytorch cuda extension of grid sample 1d. Since pytorch only supports grid sample 2d/3d, I extend the 1d version for efficiency. The fo
PointPillars inference with TensorRT
A project demonstrating how to use CUDA-PointPillars to deal with cloud points data from lidar.
Machine Learning Framework for Operating Systems - Brings ML to Linux kernel
KML: A Machine Learning Framework for Operating Systems & Storage Systems Storage systems and their OS components are designed to accommodate a wide v
Lightweight and Modern kernel for VK Bots
This is the kernel for creating VK Bots written in Python 3.9
PyTorch/TorchScript compiler for NVIDIA GPUs using TensorRT
PyTorch/TorchScript compiler for NVIDIA GPUs using TensorRT
Learning kernels to maximize the power of MMD tests
Code for the paper "Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy" (arXiv:1611.04488; published at ICLR 2017), by Douga
ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels
ROCKET + MINIROCKET ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels. Data Mining and Knowledge D
Data and code for the paper "Importance of Kernel Bandwidth in Quantum Machine Learning"
Reproducibility materials for "Importance of Kernel Bandwidth in Quantum Machine Learning" Repo structure: code contains Python scripts used to genera
Blind Image Super-resolution with Elaborate Degradation Modeling on Noise and Kernel
Blind Image Super-resolution with Elaborate Degradation Modeling on Noise and Kernel This repository is the official PyTorch implementation of BSRDM w
Code for our EMNLP 2021 paper "Learning Kernel-Smoothed Machine Translation with Retrieved Examples"
KSTER Code for our EMNLP 2021 paper "Learning Kernel-Smoothed Machine Translation with Retrieved Examples" [paper]. Usage Download the processed datas
MLSpace: Hassle-free machine learning & deep learning development
MLSpace: Hassle-free machine learning & deep learning development
Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data based on Pytorch Framework
VFedPCA+VFedAKPCA This is the official source code for the Paper: Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-
Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr\"om Method (NeurIPS 2021)
Skyformer This repository is the official implementation of Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr"om Method (NeurIPS 2021).
Public implementation of the Convolutional Motif Kernel Network (CMKN) architecture
CMKN Implementation of the convolutional motif kernel network (CMKN) introduced in Ditz et al., "Convolutional Motif Kernel Network", 2021. Testing Yo
Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr\"om Method (NeurIPS 2021)
Skyformer This repository is the official implementation of Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr"om Method (NeurIPS 2021).
GBK-GNN: Gated Bi-Kernel Graph Neural Networks for Modeling Both Homophily and Heterophily
GBK-GNN: Gated Bi-Kernel Graph Neural Networks for Modeling Both Homophily and Heterophily Abstract Graph Neural Networks (GNNs) are widely used on a
Simple tool, to update linux kernel on ubuntu
Kerbswap Simple tool, to update linux kernel on ubuntu Information At the moment, this tool only supports "Ubuntu" distributions, but will be expanded
OneFlow is a performance-centered and open-source deep learning framework.
OneFlow OneFlow is a performance-centered and open-source deep learning framework. Latest News Version 0.5.0 is out! First class support for eager exe
Driver Buddy Reloaded is an IDA Pro Python plugin that helps automate some tedious Windows Kernel Drivers reverse engineering tasks.
Driver Buddy Reloaded Quickstart Table of Contents Installation Usage About Driver Buddy Reloaded Finding DispatchDeviceControl Labelling WDM & WDF St
A Nim frontend for pytorch, aiming to be mostly auto-generated and internally using ATen.
Master Release Pytorch - Py + Nim A Nim frontend for pytorch, aiming to be mostly auto-generated and internally using ATen. Because Nim compiles to C+
an implementation of Video Frame Interpolation via Adaptive Separable Convolution using PyTorch
This work has now been superseded by: https://github.com/sniklaus/revisiting-sepconv sepconv-slomo This is a reference implementation of Video Frame I
GPU-accelerated image processing using cupy and CUDA
napari-cupy-image-processing GPU-accelerated image processing using cupy and CUDA This napari plugin was generated with Cookiecutter using with @napar
cairo_kernel is a simple Jupyter kernel for Cairo a smart contract programing language for STARKs.
cairo_kernel cairo_kernel is a simple Jupyter kernel for Cairo a smart contract programing language for STARKs. Installation Install virtualenv virtua
a reimplementation of Holistically-Nested Edge Detection in PyTorch
pytorch-hed This is a personal reimplementation of Holistically-Nested Edge Detection [1] using PyTorch. Should you be making use of this work, please
a reimplementation of LiteFlowNet in PyTorch that matches the official Caffe version
pytorch-liteflownet This is a personal reimplementation of LiteFlowNet [1] using PyTorch. Should you be making use of this work, please cite the paper
a reimplementation of UnFlow in PyTorch that matches the official TensorFlow version
pytorch-unflow This is a personal reimplementation of UnFlow [1] using PyTorch. Should you be making use of this work, please cite the paper according
an implementation of softmax splatting for differentiable forward warping using PyTorch
softmax-splatting This is a reference implementation of the softmax splatting operator, which has been proposed in Softmax Splatting for Video Frame I
an implementation of 3D Ken Burns Effect from a Single Image using PyTorch
3d-ken-burns This is a reference implementation of 3D Ken Burns Effect from a Single Image [1] using PyTorch. Given a single input image, it animates
FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes
FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes This repository contains the source code accompanying the paper: FlexConv: C
Spatial Attentive Single-Image Deraining with a High Quality Real Rain Dataset (CVPR'19)
Spatial Attentive Single-Image Deraining with a High Quality Real Rain Dataset (CVPR'19) Tianyu Wang*, Xin Yang*, Ke Xu, Shaozhe Chen, Qiang Zhang, Ry
A dead simple python wrapper for darknet that works with OpenCV 4.1, CUDA 10.1
What Dead simple python wrapper for Yolo V3 using AlexyAB's darknet fork. Works with CUDA 10.1 and OpenCV 4.1 or later (I use OpenCV master as of Jun
Collection of Docker images for ML/DL and video processing projects
Collection of Docker images for ML/DL and video processing projects. Overview of images Three types of images differ by tag postfix: base: Python with
An implementation of Video Frame Interpolation via Adaptive Separable Convolution using PyTorch
This work has now been superseded by: https://github.com/sniklaus/revisiting-sepconv sepconv-slomo This is a reference implementation of Video Frame I
an implementation of Revisiting Adaptive Convolutions for Video Frame Interpolation using PyTorch
revisiting-sepconv This is a reference implementation of Revisiting Adaptive Convolutions for Video Frame Interpolation [1] using PyTorch. Given two f
Template repository to build PyTorch projects from source on any version of PyTorch/CUDA/cuDNN.
Template repository to build PyTorch projects from source on any version of PyTorch/CUDA/cuDNN.
Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.
Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.
Inferred Model-based Fuzzer
IMF: Inferred Model-based Fuzzer IMF is a kernel API fuzzer that leverages an automated API model inferrence techinque proposed in our paper at CCS. I
Fuzzer for Linux Kernel Drivers
difuze: Fuzzer for Linux Kernel Drivers This repo contains all the sources (including setup scripts), you need to get difuze up and running. Tested on
Code for the USENIX 2017 paper: kAFL: Hardware-Assisted Feedback Fuzzing for OS Kernels
kAFL: Hardware-Assisted Feedback Fuzzing for OS Kernels Blazing fast x86-64 VM kernel fuzzing framework with performant VM reloads for Linux, MacOS an
Repo for FUZE project. I will also publish some Linux kernel LPE exploits for various real world kernel vulnerabilities here. the samples are uploaded for education purposes for red and blue teams.
Linux_kernel_exploits Some Linux kernel exploits for various real world kernel vulnerabilities here. More exploits are yet to come. This repo contains
A Kernel fuzzer focusing on race bugs
Razzer: Finding kernel race bugs through fuzzing Environment setup $ source scripts/envsetup.sh scripts/envsetup.sh sets up necessary environment var
Fuzzing the Kernel Using Unicornafl and AFL++
Unicorefuzz Fuzzing the Kernel using UnicornAFL and AFL++. For details, skim through the WOOT paper or watch this talk at CCCamp19. Is it any good? ye
a reimplementation of Optical Flow Estimation using a Spatial Pyramid Network in PyTorch
pytorch-spynet This is a personal reimplementation of SPyNet [1] using PyTorch. Should you be making use of this work, please cite the paper according
The code for the NSDI'21 paper "BMC: Accelerating Memcached using Safe In-kernel Caching and Pre-stack Processing".
BMC The code for the NSDI'21 paper "BMC: Accelerating Memcached using Safe In-kernel Caching and Pre-stack Processing". BibTex entry available here. B
Keval allows you to call arbitrary Windows kernel-mode functions from user mode, even (and primarily) on another machine.
Keval Keval allows you to call arbitrary Windows kernel-mode functions from user mode, even (and primarily) on another machine. The user mode portion
[ICCV 2021] Official Tensorflow Implementation for "Single Image Defocus Deblurring Using Kernel-Sharing Parallel Atrous Convolutions"
KPAC: Kernel-Sharing Parallel Atrous Convolutional block This repository contains the official Tensorflow implementation of the following paper: Singl