3246 Python Data-app-performance Libraries

Secret messaging app which you can use to communicate with your friends by encrypting / decrypting secret messages or sending secret message through mail.

Secret-Whisper A Secret messaging app which you can use to communicate with your friends by encrypting / decrypting secret messages 🤫 or sending secr

3 Jan 1, 2022

This repo contains the code and data used in the paper "Wizard of Search Engine: Access to Information Through Conversations with Search Engines"

Wizard of Search Engine: Access to Information Through Conversations with Search Engines by Pengjie Ren, Zhongkun Liu, Xiaomeng Song, Hongtao Tian, Zh

19 Oct 27, 2022

Lumen provides a framework for visual analytics, which allows users to build data-driven dashboards from a simple yaml specification

Lumen project provides a framework for visual analytics, which allows users to build data-driven dashboards from a simple yaml specification

120 Jan 4, 2023

Home Assistant for Opendata CWB. Get the weather forecast of the city in Taiwan.

Home assistant support for Opendata CWB. The readme in Traditional Chinese. This integration is based on OpenWeatherMap (@csparpa, pyowm) to develop.

11 Sep 30, 2022

A New, Interactive Approach to Learning Python

This is the repository for The Python Workshop, published by Packt. It contains all the supporting project files necessary to work through the course from start to finish.

231 Dec 26, 2022

Keras-1D-ACGAN-Data-Augmentation

Keras-1D-ACGAN-Data-Augmentation What is the ACGAN(Auxiliary Classifier GANs) ? Related Paper : [Abstract : Synthesizing high resolution photorealisti

7 Dec 23, 2022

DownTime-Score is a Small project aimed to Monitor the performance and the availabillity of a variety of the Vital and Critical Moroccan Web Portals

DownTime-Score DownTime-Score is a Small project aimed to Monitor the performance and the availabillity of a variety of the Vital and Critical Morocca

5 Apr 30, 2022

A Python script for rendering glTF files with V-Ray App SDK

V-Ray glTF viewer Overview The V-Ray glTF viewer is a set of Python scripts for the V-Ray App SDK that allow the parsing and rendering of glTF (.gltf

24 Dec 5, 2022

Uncertainty-aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving

SalsaNext: Fast, Uncertainty-aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving Abstract In this paper, we introduce SalsaNext f

308 Jan 4, 2023

The implementation of the algorithm in the paper "Safe Deep Semi-Supervised Learning for Unseen-Class Unlabeled Data" published in ICML 2020.

DS3L This is the code for paper "Safe Deep Semi-Supervised Learning for Unseen-Class Unlabeled Data" published in ICML 2020. Setups The code is implem

36 Oct 19, 2022

Pytorch codes for "Self-supervised Multi-view Stereo via Effective Co-Segmentation and Data-Augmentation"

Self-Supervised-MVS This repository is the official PyTorch implementation of our AAAI 2021 paper: "Self-supervised Multi-view Stereo via Effective Co

127 Jan 4, 2023

Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts The rapid progress in 3D scene understanding has come with growing dem

182 Dec 30, 2022

Deep functional residue identification

DeepFRI Deep functional residue identification Citing @article {Gligorijevic2019, author = {Gligorijevic, Vladimir and Renfrew, P. Douglas and Koscio

156 Dec 25, 2022

Data & Code for ACCENTOR Adding Chit-Chat to Enhance Task-Oriented Dialogues

ACCENTOR: Adding Chit-Chat to Enhance Task-Oriented Dialogues Overview ACCENTOR consists of the human-annotated chit-chat additions to the 23.8K dialo

69 Dec 29, 2022

pyprobables is a pure-python library for probabilistic data structures

pyprobables is a pure-python library for probabilistic data structures. The goal is to provide the developer with a pure-python implementation of common probabilistic data-structures to use in their work.

86 Dec 25, 2022

leafmap - A Python package for geospatial analysis and interactive mapping in a Jupyter environment.

A Python package for geospatial analysis and interactive mapping with minimal coding in a Jupyter environment

1.4k Jan 2, 2023

Code for the paper "Adapting Monolingual Models: Data can be Scarce when Language Similarity is High"

Wietse de Vries • Martijn Bartelds • Malvina Nissim • Martijn Wieling Adapting Monolingual Models: Data can be Scarce when Language Similarity is High

5 Aug 2, 2021

EOReader is a multi-satellite reader allowing you to open optical and SAR data.

Remote-sensing opensource python library reading optical and SAR sensors, loading and stacking bands, clouds, DEM and index.

152 Dec 30, 2022

Bachelor's Thesis in Computer Science: Privacy-Preserving Federated Learning Applied to Decentralized Data

federated is the source code for the Bachelor's Thesis Privacy-Preserving Federated Learning Applied to Decentralized Data (Spring 2021, NTNU) Federat

25 Nov 30, 2022

Orchest is a browser based IDE for Data Science.

Orchest is a browser based IDE for Data Science. It integrates your favorite Data Science tools out of the box, so you don’t have to. The application is easy to use and can run on your laptop as well as on a large scale cloud cluster.

3.6k Jan 9, 2023

Codes for our IJCAI21 paper: Dialogue Discourse-Aware Graph Model and Data Augmentation for Meeting Summarization

DDAMS This is the pytorch code for our IJCAI 2021 paper Dialogue Discourse-Aware Graph Model and Data Augmentation for Meeting Summarization [Arxiv Pr

55 Dec 27, 2022

Procedural 3D data generation pipeline for architecture

Synthetic Dataset Generator Authors: Stanislava Fedorova Alberto Tono Meher Shashwat Nigam Jiayao Zhang Amirhossein Ahmadnia Cecilia bolognesi Dominik

49 Nov 25, 2022

Image data augmentation scheduler for albumentations transforms

albu_scheduler Scheduler for albumentations transforms based on PyTorch schedulers interface Usage TransformMultiStepScheduler import albumentations a

19 Aug 4, 2021

An Unsupervised Graph-based Toolbox for Fraud Detection

An Unsupervised Graph-based Toolbox for Fraud Detection Introduction: UGFraud is an unsupervised graph-based fraud detection toolbox that integrates s

99 Dec 11, 2022

Apache Liminal is an end-to-end platform for data engineers & scientists, allowing them to build, train and deploy machine learning models in a robust and agile way

Apache Liminals goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful experiment to an automated pipeline of model training, validation, deployment and inference in production. Liminal provides a Domain Specific Language to build ML workflows on top of Apache Airflow.

121 Dec 28, 2022

A powerful bot to copy your google drive data to your team drive

⚛️ Clonebot - Heroku version ⚡ CloneBot is a telegram bot that allows you to copy folder/team drive to team drives. One of the main advantage of this

269 Dec 23, 2022

A Lucid Framework for Transparent and Interpretable Machine Learning Models.

Currently a Beta-Version lucidmode is an open-source, low-code and lightweight Python framework for transparent and interpretable machine learning mod

15 Aug 12, 2022

Cloud-native, data onboarding architecture for the Google Cloud Public Datasets program

Public Datasets Pipelines Cloud-native, data pipeline architecture for onboarding datasets to the Google Cloud Public Datasets Program. Overview Requi

109 Dec 30, 2022

This is a repository for the Duke University Cloud Computing course project on Serveless Data Engineering Pipeline. For this project, I recreated the below pipeline.

AWS Data Engineering Pipeline This is a repository for the Duke University Cloud Computing course project on Serverless Data Engineering Pipeline. For

15 Jul 28, 2021

Donors data of Tamil Nadu Chief Ministers Relief Fund scrapped from https://ereceipt.tn.gov.in/cmprf/Interface/CMPRF/MonthWiseReport

Tamil Nadu Chief Minister's Relief Fund Donors Scrapped data from https://ereceipt.tn.gov.in/cmprf/Interface/CMPRF/MonthWiseReport Scrapper scrapper.p

5 May 18, 2021

A minimal Streamlit app showing how to launch and stop a FastAPI process on demand

Simple Streamlit + FastAPI Integration A minimal Streamlit app showing how to launch and stop a FastAPI process on demand. The FastAPI /run route simu

18 Jan 2, 2023

Python utility function to communicate with a subprocess using iterables: for when data is too big to fit in memory and has to be streamed

iterable-subprocess Python utility function to communicate with a subprocess using iterables: for when data is too big to fit in memory and has to be

5 Jul 10, 2022

Streaming Anomaly Detection Framework in Python (Outlier Detection for Streaming Data)

Python Streaming Anomaly Detection (PySAD) PySAD is an open-source python framework for anomaly detection on streaming multivariate data. Documentatio

181 Dec 18, 2022

Aggregating gridded data (xarray) to polygons

A package to aggregate gridded data in xarray to polygons in geopandas using area-weighting from the relative area overlaps between pixels and polygons.

42 Nov 9, 2022

Sample Dockerized flask app deployed on Kubernetes on Azure using AKS

22 Sep 8, 2021

Graphsignal Logger

Graphsignal Logger Overview Graphsignal is an observability platform for monitoring and troubleshooting production machine learning applications. It h

143 Dec 5, 2022

Telegram bot + userbot for streaming audio in group calls.

Calls Music — Telegram bot + userbot for streaming audio in group calls ✍🏻 Requirements FFmpeg Python 3.7+ 🚀 Deployment 🛠 Configuration Copy exampl

30 May 17, 2021

Python function to stream unzip all the files in a ZIP archive: without loading the entire ZIP file or any of its files into memory at once

206 Jan 2, 2023

Show Data: Show your dataset in web browser!

Show Data is to generate html tables for large scale image dataset, especially for the dataset in remote server. It provides some useful commond line tools and fully customizeble API reference to generate html table different tasks.

83 Nov 26, 2022

Stream Framework is a Python library, which allows you to build news feed, activity streams and notification systems using Cassandra and/or Redis. The authors of Stream-Framework also provide a cloud service for feed technology:

Stream Framework Activity Streams & Newsfeeds Stream Framework is a Python library which allows you to build activity streams & newsfeeds using Cassan

4.7k Jan 2, 2023

The programm for collecting data from Tinkoff API and building Excel table.

tinkproject The program for portfolio analysis via Tinkoff API Hello! This is my first project, please, don't judge me. This project was developed for

214 Dec 2, 2022

Code from the paper "High-Performance Brain-to-Text Communication via Handwriting"

High-Performance Brain-to-Text Communication via Handwriting Overview This repo is associated with this manuscript, preprint and dataset. The code can

306 Jan 3, 2023

Code and data for ACL2021 paper Cross-Lingual Abstractive Summarization with Limited Parallel Resources.

Multi-Task Framework for Cross-Lingual Abstractive Summarization (MCLAS) The code for ACL2021 paper Cross-Lingual Abstractive Summarization with Limit

43 Nov 7, 2022

A curated list of programmatic weak supervision papers and resources

118 Jan 2, 2023

Deep learning toolbox based on PyTorch for hyperspectral data classification.

304 Dec 28, 2022

Code & Data for Enhancing Photorealism Enhancement

Intel ISL (Intel Intelligent Systems Lab)

1.1k Jan 8, 2023

A framework for cleaning Chinese dialog data

136 Dec 20, 2022

Code from the paper "High-Performance Brain-to-Text Communication via Handwriting"

305 Dec 22, 2022

Simple HTML and PDF document generator for Python - with built-in support for popular data analysis and plotting libraries.

Esparto is a simple HTML and PDF document generator for Python. Its primary use is for generating shareable single page reports with content from popular analytics and data science libraries.

76 Dec 12, 2022

Simple but maybe too simple config management through python data classes. We use it for machine learning.

👩‍✈️ Coqpit Simple, light-weight and no dependency config handling through python data classes with to/from JSON serialization/deserialization. Curre

67 Nov 29, 2022

The project help you to quickly build layouts in terminal，cross-platform

133 Nov 30, 2022

Repository for XLM-T, a framework for evaluating multilingual language models on Twitter data

This is the XLM-T repository, which includes data, code and pre-trained multilingual language models for Twitter. XLM-T - A Multilingual Language Mode

112 Dec 27, 2022

Parrot is a paraphrase based utterance augmentation framework purpose built to accelerate training NLU models

Parrot is a paraphrase based utterance augmentation framework purpose built to accelerate training NLU models. A paraphrase framework is more than just a paraphrasing model.

681 Jan 1, 2023

Hue Editor: Open source SQL Query Assistant for Databases/Warehouses

759 Jan 7, 2023

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

8.4k Jan 1, 2023

Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.

3.7k Jan 3, 2023

Cinder is Instagram's internal performance-oriented production version of CPython

Cinder is Instagram's internal performance-oriented production version of CPython 3.8. It contains a number of performance optimizations, including bytecode inline caching, eager evaluation of coroutines, a method-at-a-time JIT, and an experimental bytecode compiler that uses type annotations to emit type-specialized bytecode that performs better in the JIT.

2.2k Dec 30, 2022

Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/

Texar-PyTorch is a toolkit aiming to support a broad set of machine learning, especially natural language processing and text generation tasks. Texar

726 Dec 30, 2022

Data manipulation and transformation for audio signal processing, powered by PyTorch

torchaudio: an audio library for PyTorch The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the

1.9k Jan 8, 2023

Statsmodels: statistical modeling and econometrics in Python

About statsmodels statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics an

8.1k Dec 30, 2022

Ethereum ETL lets you convert blockchain data into convenient formats like CSVs and relational databases.

Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions.

2.3k Jan 1, 2023

The best way to have DRY Django forms. The app provides a tag and filter that lets you quickly render forms in a div format while providing an enormous amount of capability to configure and control the rendered HTML.

django-crispy-forms The best way to have Django DRY forms. Build programmatic reusable layouts out of components, having full control of the rendered

4.6k Dec 31, 2022

Ray provides a simple, universal API for building distributed applications.

An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

23.5k Jan 5, 2023

erdantic is a simple tool for drawing entity relationship diagrams (ERDs) for Python data model classes

erdantic is a simple tool for drawing entity relationship diagrams (ERDs) for Python data model classes. Diagrams are rendered using the venerable Graphviz library.

129 Jan 4, 2023

A Django app that integrates with Dramatiq.

django_dramatiq django_dramatiq is a Django app that integrates with Dramatiq. Requirements Django 1.11+ Dramatiq 0.18+ Example You can find an exampl

261 Dec 25, 2022

A simple app that provides django integration for RQ (Redis Queue)

Django-RQ Django integration with RQ, a Redis based Python queuing library. Django-RQ is a simple app that allows you to configure your queues in djan

1.6k Dec 28, 2022

A modular, high performance, headless e-commerce platform built with Python, GraphQL, Django, and React.

Saleor Commerce Customer-centric e-commerce on a modern stack A headless, GraphQL-first e-commerce platform delivering ultra-fast, dynamic, personaliz

17.7k Dec 31, 2022

AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data

AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data [WIP] Unofficial Pytorch implementation of AdaSpeech 2. Requirements : All code written i

63 Dec 28, 2022

CLEAR algorithm for multi-view data association

CLEAR: Consistent Lifting, Embedding, and Alignment Rectification Algorithm The Matlab, Python, and C++ implementation of the CLEAR algorithm, as desc

30 Jan 2, 2023

Source codes for the paper "Local Additivity Based Data Augmentation for Semi-supervised NER"

LADA This repo contains codes for the following paper: Jiaao Chen*, Zhenghui Wang*, Ran Tian, Zichao Yang, Diyi Yang: Local Additivity Based Data Augm

36 Dec 2, 2022

Learning Representational Invariances for Data-Efficient Action Recognition

Learning Representational Invariances for Data-Efficient Action Recognition Official PyTorch implementation for Learning Representational Invariances

27 Nov 22, 2022

A Django blog app implemented in Wagtail

Puput Puput is a powerful and simple Django app to manage a blog. It uses the awesome Wagtail CMS as content management system. Puput is the catalan n

535 Jan 8, 2023

No effort, no worry, maximum performance.

Django Cachalot Caches your Django ORM queries and automatically invalidates them. Documentation: http://django-cachalot.readthedocs.io Table of Conte

979 Jan 3, 2023

Django app for handling the server headers required for Cross-Origin Resource Sharing (CORS)

django-cors-headers A Django App that adds Cross-Origin Resource Sharing (CORS) headers to responses. This allows in-browser requests to your Django a

4.8k Jan 5, 2023

Django app that enables staff to log in as other users using their own credentials.

Impostor Impostor is a Django application which allows staff members to login as a different user by using their own username and password. Login Logg

144 Dec 13, 2022

Django application and library for importing and exporting data with admin integration.

django-import-export django-import-export is a Django application and library for importing and exporting data with included admin integration. Featur

2.6k Jan 7, 2023

Packages of Example Data for The Effect

causaldata This repository will contain R, Stata, and Python packages, all called causaldata, which contain data sets that can be used to implement th

103 Dec 24, 2022

hyppo is an open-source software package for multivariate hypothesis testing.

hyppo (HYPothesis Testing in PythOn, pronounced "Hippo") is an open-source software package for multivariate hypothesis testing.

137 Dec 18, 2022

LightSeq: A High-Performance Inference Library for Sequence Processing and Generation

LightSeq is a high performance inference library for sequence processing and generation implemented in CUDA. It enables highly efficient computation of modern NLP models such as BERT, GPT2, Transformer, etc. It is therefore best useful for Machine Translation, Text Generation, Dialog， Language Modelling, and other related tasks using these models.

2.5k Jan 3, 2023

A command line tool for memorizing algorithms in Python by typing them.

Algo Drills A command line tool for memorizing algorithms in Python by typing them. In alpha and things will change. How it works Type out an algorith

43 Dec 2, 2022

Official repo for the work titled "SharinGAN: Combining Synthetic and Real Data for Unsupervised GeometryEstimation"

SharinGAN Official repo for the work titled "SharinGAN: Combining Synthetic and Real Data for Unsupervised GeometryEstimation" The official project we

23 Oct 19, 2022

🤗 The largest hub of ready-to-use NLP datasets for ML models with fast, easy-to-use and efficient data manipulation tools

15k Jan 2, 2023

Research shows Google collects 20x more data from Android than Apple collects from iOS. Block this non-consensual telemetry using pihole blocklists.

pihole-antitelemetry Research shows Google collects 20x more data from Android than Apple collects from iOS. Block both using these pihole lists. Proj

290 Jan 9, 2023

Request execution of Galaxy SARS-CoV-2 variation analysis workflows on input data you provide.

SARS-CoV-2 processing requests Request execution of Galaxy SARS-CoV-2 variation analysis workflows on input data you provide. Prerequisites This autom

17 Aug 13, 2022

CLI and Streamlit applications to create APIs from Excel data files within seconds, using FastAPI

FastAPI-Wrapper CLI & APIness Streamlit App Arvindra Sehmi, Oxford Economics Ltd. | Website | LinkedIn (Updated: 21 April, 2021) fastapi-wrapper is mo

49 Dec 3, 2022

Xarray backend to Copernicus Sentinel-1 satellite data products

xarray-sentinel WARNING: this product is a "technology preview" / pre-Alpha Xarray backend to explore and load Copernicus Sentinel-1 satellite data pr

191 Dec 15, 2022

A scikit-learn-compatible module for estimating prediction intervals.

|Anaconda|_ MAPIE - Model Agnostic Prediction Interval Estimator MAPIE allows you to easily estimate prediction intervals using your favourite sklearn

584 Dec 27, 2022

Chat app for Django, powered by Django Channels, Websockets & Asyncio

Django Private Chat2 New and improved https://github.com/Bearle/django-private-chat Chat app for Django, powered by Django Channels, Websockets & Asyn

205 Dec 30, 2022

Code and data of the Fine-Grained R2R Dataset proposed in paper Sub-Instruction Aware Vision-and-Language Navigation

Fine-Grained R2R Code and data of the Fine-Grained R2R Dataset proposed in the EMNLP2020 paper Sub-Instruction Aware Vision-and-Language Navigation. C

34 Nov 15, 2022

🏅 The Most Comprehensive List of Kaggle Solutions and Ideas 🏅

🏅 Collection of Kaggle Solutions and Ideas 🏅

2.3k Jan 8, 2023

Coreference resolution for English, German and Polish, optimised for limited training data and easily extensible for further languages

Coreferee Author: Richard Paul Hudson, msg systems ag 1. Introduction 1.1 The basic idea 1.2 Getting started 1.2.1 English 1.2.2 German 1.2.3 Polish 1

169 Dec 21, 2022

stock data on eink with raspberry

small python skript to display tradegate data on a waveshare e-ink important you need locale "de_AT.UTF-8 UTF-8" installed. do so in raspi-config's Lo

24 Feb 22, 2022

Y. Zhang, Q. Yao, W. Dai, L. Chen. AutoSF: Searching Scoring Functions for Knowledge Graph Embedding. IEEE International Conference on Data Engineering (ICDE). 2020

AutoSF The code for our paper "AutoSF: Searching Scoring Functions for Knowledge Graph Embedding" and this paper has been accepted by ICDE2020. News:

64 Dec 17, 2022

Focus on Algorithm Design, Not on Data Wrangling

The dataTap Python library is the primary interface for using dataTap's rich data management tools. Create datasets, stream annotations, and analyze model performance all with one library.

37 Nov 25, 2022

The repo contains the code of the ACL2020 paper `Dice Loss for Data-imbalanced NLP Tasks`

Dice Loss for NLP Tasks This repository contains code for Dice Loss for Data-imbalanced NLP Tasks at ACL2020. Setup Install Package Dependencies The c

223 Dec 17, 2022

skweak: A software toolkit for weak supervision applied to NLP tasks

Labelled data remains a scarce resource in many practical NLP scenarios. This is especially the case when working with resource-poor languages (or text domains), or when using task-specific labels without pre-existing datasets. The only available option is often to collect and annotate texts by hand, which is expensive and time-consuming.

Norsk Regnesentral (Norwegian Computing Center)

850 Dec 28, 2022

Python Data-app-performance Resources

Python data-app-performance Libraries

Secret messaging app which you can use to communicate with your friends by encrypting / decrypting secret messages or sending secret message through mail.

This repo contains the code and data used in the paper "Wizard of Search Engine: Access to Information Through Conversations with Search Engines"

Lumen provides a framework for visual analytics, which allows users to build data-driven dashboards from a simple yaml specification

Home Assistant for Opendata CWB. Get the weather forecast of the city in Taiwan.

A New, Interactive Approach to Learning Python

Keras-1D-ACGAN-Data-Augmentation

DownTime-Score is a Small project aimed to Monitor the performance and the availabillity of a variety of the Vital and Critical Moroccan Web Portals

A Python script for rendering glTF files with V-Ray App SDK

Uncertainty-aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving

The implementation of the algorithm in the paper "Safe Deep Semi-Supervised Learning for Unseen-Class Unlabeled Data" published in ICML 2020.

Pytorch codes for "Self-supervised Multi-view Stereo via Effective Co-Segmentation and Data-Augmentation"

Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Deep functional residue identification

Data & Code for ACCENTOR Adding Chit-Chat to Enhance Task-Oriented Dialogues

pyprobables is a pure-python library for probabilistic data structures

leafmap - A Python package for geospatial analysis and interactive mapping in a Jupyter environment.

Code for the paper "Adapting Monolingual Models: Data can be Scarce when Language Similarity is High"

EOReader is a multi-satellite reader allowing you to open optical and SAR data.

Bachelor's Thesis in Computer Science: Privacy-Preserving Federated Learning Applied to Decentralized Data

Orchest is a browser based IDE for Data Science.

Codes for our IJCAI21 paper: Dialogue Discourse-Aware Graph Model and Data Augmentation for Meeting Summarization

Procedural 3D data generation pipeline for architecture

Image data augmentation scheduler for albumentations transforms

An Unsupervised Graph-based Toolbox for Fraud Detection

Apache Liminal is an end-to-end platform for data engineers & scientists, allowing them to build, train and deploy machine learning models in a robust and agile way

A powerful bot to copy your google drive data to your team drive

A Lucid Framework for Transparent and Interpretable Machine Learning Models.

Cloud-native, data onboarding architecture for the Google Cloud Public Datasets program

This is a repository for the Duke University Cloud Computing course project on Serveless Data Engineering Pipeline. For this project, I recreated the below pipeline.

Donors data of Tamil Nadu Chief Ministers Relief Fund scrapped from https://ereceipt.tn.gov.in/cmprf/Interface/CMPRF/MonthWiseReport

A minimal Streamlit app showing how to launch and stop a FastAPI process on demand

Python utility function to communicate with a subprocess using iterables: for when data is too big to fit in memory and has to be streamed

Streaming Anomaly Detection Framework in Python (Outlier Detection for Streaming Data)

Aggregating gridded data (xarray) to polygons

Sample Dockerized flask app deployed on Kubernetes on Azure using AKS

Graphsignal Logger

Telegram bot + userbot for streaming audio in group calls.

Python function to stream unzip all the files in a ZIP archive: without loading the entire ZIP file or any of its files into memory at once

Show Data: Show your dataset in web browser!

Stream Framework is a Python library, which allows you to build news feed, activity streams and notification systems using Cassandra and/or Redis. The authors of Stream-Framework also provide a cloud service for feed technology:

The programm for collecting data from Tinkoff API and building Excel table.

Code from the paper "High-Performance Brain-to-Text Communication via Handwriting"

Code and data for ACL2021 paper Cross-Lingual Abstractive Summarization with Limited Parallel Resources.

A curated list of programmatic weak supervision papers and resources

Deep learning toolbox based on PyTorch for hyperspectral data classification.

Code & Data for Enhancing Photorealism Enhancement

A framework for cleaning Chinese dialog data

Code from the paper "High-Performance Brain-to-Text Communication via Handwriting"

Simple HTML and PDF document generator for Python - with built-in support for popular data analysis and plotting libraries.

Simple but maybe too simple config management through python data classes. We use it for machine learning.

The project help you to quickly build layouts in terminal，cross-platform

Repository for XLM-T, a framework for evaluating multilingual language models on Twitter data

Parrot is a paraphrase based utterance augmentation framework purpose built to accelerate training NLU models

Hue Editor: Open source SQL Query Assistant for Databases/Warehouses

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.

Cinder is Instagram's internal performance-oriented production version of CPython

Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/

Data manipulation and transformation for audio signal processing, powered by PyTorch

Statsmodels: statistical modeling and econometrics in Python

Ethereum ETL lets you convert blockchain data into convenient formats like CSVs and relational databases.

The best way to have DRY Django forms. The app provides a tag and filter that lets you quickly render forms in a div format while providing an enormous amount of capability to configure and control the rendered HTML.

Ray provides a simple, universal API for building distributed applications.

erdantic is a simple tool for drawing entity relationship diagrams (ERDs) for Python data model classes

A Django app that integrates with Dramatiq.

A simple app that provides django integration for RQ (Redis Queue)

A modular, high performance, headless e-commerce platform built with Python, GraphQL, Django, and React.

AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data

CLEAR algorithm for multi-view data association

Source codes for the paper "Local Additivity Based Data Augmentation for Semi-supervised NER"

Learning Representational Invariances for Data-Efficient Action Recognition

A Django blog app implemented in Wagtail

No effort, no worry, maximum performance.

Django app for handling the server headers required for Cross-Origin Resource Sharing (CORS)

Django app that enables staff to log in as other users using their own credentials.

Django application and library for importing and exporting data with admin integration.

Packages of Example Data for The Effect

hyppo is an open-source software package for multivariate hypothesis testing.