2632 Python Cryptocurrency-candlesticks-data Libraries

A (PyTorch) imbalanced dataset sampler for oversampling low frequent classes and undersampling high frequent ones.

Imbalanced Dataset Sampler Introduction In many machine learning applications, we often come across datasets where some types of data may be seen more

2k Jan 8, 2023

Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 200 universities.

D2L.ai: Interactive Deep Learning Book with Multi-Framework Code, Math, and Discussions Book website | STAT 157 Course at UC Berkeley | Latest version

16k Jan 3, 2023

Using / reproducing ACD from the paper "Hierarchical interpretations for neural network predictions" 🧠 (ICLR 2019)

Hierarchical neural-net interpretations (ACD) 🧠 Produces hierarchical interpretations for a single prediction made by a pytorch neural network. Offic

111 Jan 3, 2023

Implementation of linear CorEx and temporal CorEx.

Correlation Explanation Methods Official implementation of linear correlation explanation (linear CorEx) and temporal correlation explanation (T-CorEx

34 Nov 15, 2022

The source code and data of the paper "Instance-wise Graph-based Framework for Multivariate Time Series Forecasting".

IGMTF The source code and data of the paper "Instance-wise Graph-based Framework for Multivariate Time Series Forecasting". Requirements The framework

24 Dec 5, 2022

WRENCH: Weak supeRvision bENCHmark

🔧 What is it? Wrench is a benchmark platform containing diverse weak supervision tasks. It also provides a common and easy framework for development

176 Dec 28, 2022

Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" in EMNLP 2021

16 Jul 16, 2022

Jackrabbit Relay is an API endpoint for stock, forex and cryptocurrency exchanges that accept REST webhooks.

JackrabbitRelay Jackrabbit Relay is an API endpoint for stock, forex and cryptocurrency exchanges that accept REST webhooks. Disclaimer Please note RA

23 Jan 4, 2023

Pro Football Reference Game Data Webscraper

Pro Football Reference Game Data Webscraper Code Copyright Yeetzsche This is a simple Pro Football Reference Webscraper that can either collect all ga

6 Dec 21, 2022

A Tools that help Data Scientists and ML engineers train and deploy ML models.

Domino Research This repo contains projects under active development by the Domino R&D team. We build tools that help Data Scientists and ML engineers

73 Oct 17, 2022

Run with one command grafana, prometheus, and a python script to collect and display cryptocurrency prices and track your wallet balance.

CryptoWatch Track your favorite crypto coin price and your wallet balance. Install Create .env: ADMIN_USER=admin ADMIN_PASSWORD=admin Configure you

13 Dec 13, 2022

Code and data form the paper BERT Got a Date: Introducing Transformers to Temporal Tagging

BERT Got a Date: Introducing Transformers to Temporal Tagging Satya Almasian*, Dennis Aumiller*, and Michael Gertz Heidelberg University Contact us vi

54 Dec 4, 2022

Python library which makes it possible to dynamically mask/anonymize data using JSON string or python dict rules in a PySpark environment.

pyspark-anonymizer Python library which makes it possible to dynamically mask/anonymize data using JSON string or python dict rules in a PySpark envir

6 Jun 30, 2022

High capacity, high availability, well connected, fast lightning node.

LND ⚡ Routing High capacity, high availability, well connected, fast lightning node. We aim to become a top liquidity provider for the lightning netwo

18 Dec 16, 2022

Meli Data Challenge 2021 - First Place Solution

My solution for the Meli Data Challenge 2021

23 Mar 9, 2022

The PASS dataset: pretrained models and how to get the data - PASS: Pictures without humAns for Self-Supervised Pretraining

249 Dec 22, 2022

Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation, available for both PyTorch and Tensorflow.

730 Jan 9, 2023

An esoteric data type built entirely of NaNs.

NaNsAreNumbers An esoteric data type built entirely of NaNs. Installation pip install nans_are_numbers Explanation A floating point number is just co

72 Jan 1, 2023

Labelling platform for text using distant supervision

With DataQA, you can label unstructured text documents using rule-based distant supervision.

245 Aug 5, 2022

Implementation of Squeezenet in pytorch, pretrained models on Cifar 10 data to come

Pytorch Squeeznet Pytorch implementation of Squeezenet model as described in https://arxiv.org/abs/1602.07360 on cifar-10 Data. The definition of Sque

86 Oct 28, 2022

Recurrent Variational Autoencoder that generates sequential data implemented with pytorch

Pytorch Recurrent Variational Autoencoder Model: This is the implementation of Samuel Bowman's Generating Sentences from a Continuous Space with Kim's

347 Nov 14, 2022

Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx

Anchored CorEx: Hierarchical Topic Modeling with Minimal Domain Knowledge Correlation Explanation (CorEx) is a topic model that yields rich topics tha

592 Dec 18, 2022

TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP

TextAttack 🐙 Generating adversarial examples for NLP models [TextAttack Documentation on ReadTheDocs] About • Setup • Usage • Design About TextAttack

2.2k Jan 3, 2023

Faker is a Python package that generates fake data for you.

Faker is a Python package that generates fake data for you. Whether you need to bootstrap your database, create good-looking XML documents, fill-in yo

15.2k Jan 1, 2023

Model Agnostic Confidence Estimator (MACEST) - A Python library for calibrating Machine Learning models' confidence scores

95 Dec 28, 2022

PySpark Cheat Sheet - learn PySpark and develop apps faster

This cheat sheet will help you learn PySpark and write PySpark apps faster. Everything in here is fully functional PySpark code you can run or adapt to your programs.

168 Jan 1, 2023

TorchGeo is a PyTorch domain library, similar to torchvision, that provides datasets, transforms, samplers, and pre-trained models specific to geospatial data.

1.3k Dec 30, 2022

The project is investigating methods to extract human-marked data from document forms such as surveys and tests.

The project is investigating methods to extract human-marked data from document forms such as surveys and tests. They can read questions, multiple-choice exam papers, and grade.

5 Mar 27, 2022

Python reader for Linked Data in HDF5 files

Linked Data are becoming more popular for user-created metadata in HDF5 files.

8 May 17, 2022

Django package to log request values such as device, IP address, user CPU time, system CPU time, No of queries, SQL time, no of cache calls, missing, setting data cache calls for a particular URL with a basic UI.

django-web-profiler's documentation: Introduction: django-web-profiler is a django profiling tool which logs, stores debug toolbar statistics and also

77 Oct 29, 2022

Data from popular CS:GO website hltv.org

Welcome to hltv-data 👋 🎮 Data from popular CS:GO website hltv.org Install pip install hltv-data Usage The public methods can be reached using HLTVCl

28 Dec 23, 2022

Obsidian tools - a Python package for analysing an Obsidian.md vault

obsidiantools is a Python package for getting structured metadata about your Obsidian.md notes and analysing your vault.

153 Jan 4, 2023

A demo of a data science project using Kedro

iris Overview This is your new Kedro project, which was generated using Kedro 0.17.4. Take a look at the Kedro documentation to get started. Rules and

14 Oct 14, 2022

django-dashing is a customisable, modular dashboard application framework for Django to visualize interesting data about your project. Inspired in the dashboard framework Dashing

django-dashing django-dashing is a customisable, modular dashboard application framework for Django to visualize interesting data about your project.

703 Dec 22, 2022

Produces a summary CSV report of an Amber Electric customer's energy consumption and cost data.

Amber Electric Usage Summary This is a command line tool that produces a summary CSV report of an Amber Electric customer's energy consumption and cos

12 May 26, 2022

Rafael Project- Classifying rockets to different types using data science algorithms.

Rocket-Classify Rafael Project- Classifying rockets to different types using data science algorithms. In this project we received data base with data

5 Sep 18, 2021

[ArXiv 2021] Data-Efficient Instance Generation from Instance Discrimination

InsGen - Data-Efficient Instance Generation from Instance Discrimination Data-Efficient Instance Generation from Instance Discrimination Ceyuan Yang,

GenForce: May Generative Force Be with You

93 Dec 25, 2022

Shape Matching of Real 3D Object Data to Synthetic 3D CADs (3DV project @ ETHZ)

Real2CAD-3DV Shape Matching of Real 3D Object Data to Synthetic 3D CADs (3DV project @ ETHZ) Group Member: Yue Pan, Yuanwen Yue, Bingxin Ke, Yujie He

24 Jun 22, 2022

A daily updated JSON dataset of all the Open House London venues, events, and metadata

Open House London listings data All of it. Automatically scraped hourly with updates committed to git, autogenerated per-day CSV's, and autogenerated

4 Jan 1, 2022

lightweight, fast and robust columnar dataframe for data analytics with online update

streamdf Streamdf is a lightweight data frame library built on top of the dictionary of numpy array, developed for Kaggle's time-series code competiti

23 May 19, 2022

CS 506 - Computational Tools for Data Science

CS 506 - Computational Tools for Data Science Code, slides, and notes for Boston University CS506 Fall 2021 The Final Project Repository can be found

14 Mar 23, 2022

An all-inclusive Python framework for the Riot Games League of Legends API. We focus on making the data easy and fun to work with, while providing all the tools necessary to create a website or do data analysis.

Cassiopeia A Python adaptation of the Riot Games League of Legends API (https://developer.riotgames.com/). Cassiopeia is the sister library to Orianna

473 Jan 7, 2023

Learn how to responsibly deliver value with ML.

Made With ML Applied ML · MLOps · Production Join 30K+ developers in learning how to responsibly deliver value with ML. 🔥 Among the top MLOps reposit

32k Dec 30, 2022

Visions provides an extensible suite of tools to support common data analysis operations

Visions And these visions of data types, they kept us up past the dawn. Visions provides an extensible suite of tools to support common data analysis

168 Dec 28, 2022

An open-source Python project series where beginners can contribute and practice coding.

Python Mini Projects A collection of easy Python small projects to help you improve your programming skills. Table Of Contents Aim Of The Project Cont

491 Jan 4, 2023

An unofficial python API for trading on the DeGiro platform, with the ability to get real time data and historical data.

DegiroAPI An unofficial API for the trading platform Degiro written in Python with the ability to get real time data and historical data for products.

5 Dec 16, 2022

jsoooooooon derulo - Make sure your 'jason derulo' is featured as the first part of your json data

jsonderulo Make sure your 'jason derulo' is featured as the first part of your json data Install: # python pip install jsonderulo poetry add jsonderul

3 Sep 13, 2021

Tools to help record data from Qiskit jobs

archiver4qiskit Tools to help record data from Qiskit jobs. Install with pip install git+https://github.com/NCCR-SPIN/archiver4qiskit.git Import the

0 Dec 10, 2021

This reporistory contains the test-dev data of the paper "xGQA: Cross-lingual Visual Question Answering".

18 Dec 9, 2022

Stor is a community-driven green cryptocurrency based on a proof of space and time consensus algorithm.

Stor Blockchain Stor is a community-driven green cryptocurrency based on a proof of space and time consensus algorithm. For more information, see our

15 May 18, 2022

“Data Augmentation for Cross-Domain Named Entity Recognition” (EMNLP 2021)

Data Augmentation for Cross-Domain Named Entity Recognition Authors: Shuguang Chen, Gustavo Aguilar, Leonardo Neves and Thamar Solorio This repository

18 Sep 10, 2022

Python solutions to Codeforces problems

CodeForces This repository is dedicated to my Python solutions for CodeForces problems. Feel free to copy, contribute and/or comment. If you find any

15 Dec 20, 2022

Python code written to utilize the Korlan usb2can hardware to send and receive data over the can-bus on a 2008 Nissan 350z

nissan_ecu_hacking Python code written to utilize the Korlan usb2can hardware to send and receive data over the can-bus on a 2008 Nissan 350z My goal

11 Sep 24, 2022

Pokemon catch events project to demonstrate data pipeline on AWS

Pokemon Catches Data Pipeline This is a sample project to practice end-to-end data project; Terraform is used to deploy infrastructure; Kafka is the t

4 Sep 3, 2021

Sample project showing reliable data ingestion application using FastAPI and dramatiq

Create and deploy a reliable data ingestion service with FastAPI, SQLModel and Dramatiq This is the source code for the data ingestion service explain

31 Nov 30, 2022

Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code

A Python framework for creating reproducible, maintainable and modular data science code.

7.9k Jan 1, 2023

Code and data for "Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning" (EMNLP 2021).

GD-VCR Code for Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning (EMNLP 2021). Research Questions and Aims: How well can a model perform o

24 Oct 13, 2022

A simple script & container to pull COVID data from covidlive.com.au and post a summary to a slack channel

CovidLive AU Summary Slackbot This bot is a very simple slackbot that pulls data, summarises and posts up to date AU COVID stats to a provided slack c

3 Dec 18, 2021

Python wrapper for Coinex APIs

coinexpy - Python wrapper for Coinex APIs Through coinexpy you can simply buy or sell crypto in your Coinex account Features place limit order place m

16 Jan 2, 2023

Related resources for our EMNLP 2021 paper Plan-then-Generate: Controlled Data-to-Text Generation via Planning

Plan-then-Generate: Controlled Data-to-Text Generation via Planning Authors: Yixuan Su, David Vandyke, Sihui Wang, Yimai Fang, and Nigel Collier Code

61 Jan 3, 2023

A visualization of people a user follows on Twitter

Twitter-Map This software allows the user to create maps of Twitter accounts. Installation git clone [email protected]:OGreenwood672/Twitter-Map.git cd T

12 Jul 20, 2022

DYA ( Ditch YouTube API ) is a package created to power the user with YouTube Data API functionality without any API Key

Ditch YouTubeAPI (BETA) DYA ( Ditch YouTube API ) is a package created to power the user with YouTube Data API functionality without any API Key Detai

23 Dec 22, 2022

Using Python to scrape some basic player information from www.premierleague.com and then use Pandas to analyse said data.

PremiershipPlayerAnalysis Using Python to scrape some basic player information from www.premierleague.com and then use Pandas to analyse said data. No

5 Sep 6, 2021

Selene is a Python library and command line interface for training deep neural networks from biological sequence data such as genomes.

323 Jan 1, 2023

Home Assistant integration for energy consumption data from UK SMETS (Smart) meters using the Hildebrand Glow API.

Hildebrand Glow (DCC) Integration Home Assistant integration for energy consumption data from UK SMETS (Smart) meters using the Hildebrand Glow API. T

153 Dec 30, 2022

Fix datetime EXIF data in photos downloaded from Google Takeout

fix-google-takeout Warning Use at your own risk. Backup your photos. Overview Google takeout for photos

20 Nov 5, 2022

DataCLUE: 国内首个以数据为中心的AI测评（含模型分析报告）

DataCLUE 以数据为中心的AI测评(DataCLUE) DataCLUE: A Chinese Data-centric Language Evaluation Benchmark 内容导引章节描述简介介绍以数据为中心的AI测评(DataCLUE)的背景任务描述任务描述实验结果

135 Dec 22, 2022

load .txt to train YOLOX, same as Yolo others

YOLOX train your data you need generate data.txt like follow format (per line- one image). prepare one data.txt like this: img_path1 x1,y1,x2,y2,clas

18 Aug 18, 2022

A simple Python library to integrate with the Heron Data API

Heron Python This library provides easy access to the Heron Data API from applications written in Python. Documentation No language-specific docs are

11 Nov 11, 2022

Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data

Real-ESRGAN Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data Ported from https://github.com/xinntao/Real-ESRGAN Depend

44 Dec 27, 2022

G-NIA model from "Single Node Injection Attack against Graph Neural Networks" (CIKM 2021)

Single Node Injection Attack against Graph Neural Networks This repository is our Pytorch implementation of our paper: Single Node Injection Attack ag

18 Nov 21, 2022

Fully reproducible, Dockerized, step-by-step, tutorial on how to mock a "real-time" Kafka data stream from a timestamped csv file. Detailed blog post published on Towards Data Science.

time-series-kafka-demo Mock stream producer for time series data using Kafka. I walk through this tutorial and others here on GitHub and on my Medium

26 Nov 15, 2022

Lux Academy & Data Science East Africa Python Boot Camp, Building and Deploying Flask Application Using Docker Demo App.

Flask and Docker Application Demo A Docker image is a read-only, inert template that comes with instructions for deploying containers. In Docker, ever

11 Oct 29, 2022

Code for EMNLP 2021 main conference paper "Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification"

105 Jan 3, 2023

In this Github repository I will share my freqtrade files with you. I want to help people with this repository who don't know Freqtrade so much yet.

My Freqtrade stuff In this Github repository I will share my freqtrade files with you. I want to help people with this repository who don't know Freqt

104 Dec 31, 2022

A library for interacting with Path of Exile game and economy data, and a unique loot filter generation framework.

wraeblast A library for interfacing with Path of Exile game and economy data, and a set of item filters geared towards trade league players. Filter Ge

29 Aug 28, 2022

We protect the privacy of the data on your computer by using the camera of your Debian based Pardus operating system. 🕵️

Pardus Lookout We protect the privacy of the data on your computer by using the camera of your Debian based Pardus operating system. The application i

19 Nov 18, 2022

Code for EMNLP 2021 main conference paper "Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification"

Text-AutoAugment (TAA) This repository contains the code for our paper Text AutoAugment: Learning Compositional Augmentation Policy for Text Classific

105 Jan 3, 2023

Functional Data Analysis, or FDA, is the field of Statistics that analyses data that depend on a continuous parameter.

Functional Data Analysis Python package

Grupo de Aprendizaje Automático - Universidad Autónoma de Madrid

184 Dec 27, 2022

The FinQA dataset from paper: FinQA: A Dataset of Numerical Reasoning over Financial Data

Data and code for EMNLP 2021 paper "FinQA: A Dataset of Numerical Reasoning over Financial Data"

114 Dec 29, 2022

txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications.

3.1k Dec 31, 2022

Bot for Telegram data Analysis

Bot Scraper for telegram This bot use an AI to Work powered by BOG Team you must do the following steps to make the bot functional: Install the requir

8 Nov 28, 2022

Kyrie Eleison - The best and unique way to encrypt some data or a file safely

Encrypt your important data and files easily and safely with Kyrie Eleison.

39 Oct 27, 2022

Case studies with Bayesian methods

8 Nov 26, 2022

This is the code for the EMNLP 2021 paper AEDA: An Easier Data Augmentation Technique for Text Classification

The baseline code is for EDA: Easy Data Augmentation techniques for boosting performance on text classification tasks

81 Dec 9, 2022

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

img2dataset Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine. Also supports

1.4k Jan 1, 2023

✨Rubrix is a production-ready Python framework for exploring, annotating, and managing data in NLP projects.

✨A Python framework to explore, label, and monitor data for NLP projects

1.5k Jan 2, 2023

Data augmentation for NLP, accepted at EMNLP 2021 Findings

AEDA: An Easier Data Augmentation Technique for Text Classification This is the code for the EMNLP 2021 paper AEDA: An Easier Data Augmentation Techni

81 Dec 9, 2022

CLabel is a terminal-based cluster labeling tool that allows you to explore text data interactively and label clusters based on reviewing that data.

CLabel is a terminal-based cluster labeling tool that allows you to explore text data interactively and label clusters based on reviewing that

29 Aug 9, 2022

TorchIO is a Medical image preprocessing and augmentation toolkit for deep learning. Part of the PyTorch Ecosystem.

Medical image preprocessing and augmentation toolkit for deep learning. Part of the PyTorch Ecosystem.

1.6k Jan 6, 2023

FLAML is a lightweight Python library that finds accurate machine learning models automatically, efficiently and economically

FLAML - Fast and Lightweight AutoML

2.2k Jan 9, 2023

The purpose of this code base is to add a specified signal-to-noise ratio noise from MUSAN dataset to a pure speech signal and to generate far-field speech data using room impulse response data from BUT Speech@FIT Reverb Database.

Add_noise_and_rir_to_speech The purpose of this code base is to add a specified signal-to-noise ratio noise from MUSAN dataset to a pure speech signal

7 Oct 30, 2022

skimpy is a light weight tool that provides summary statistics about variables in data frames within the console.

skimpy Welcome Welcome to skimpy! skimpy is a light weight tool that provides summary statistics about variables in data frames within the console. Th

267 Dec 29, 2022

The Devils Eye is an OSINT tool that searches the Darkweb for onion links and descriptions that match with the users query without requiring the use for Tor.

The Devil's Eye searches the darkweb for information relating to the user's query and returns the results including .onion links and their description

135 Dec 31, 2022

Python based tool to extract forensic info from EventTranscript.db (Windows Diagnostic Data)

EventTranscriptParser EventTranscriptParser is python based tool to extract forensically useful details from EventTranscript.db (Windows Diagnostic Da

24 Nov 18, 2022

A Python interface between Earth Engine and xarray for processing weather and climate data

wxee What is wxee? wxee was built to make processing gridded, mesoscale time series weather and climate data quick and easy by integrating the data ca

160 Dec 31, 2022

A data parser for the internal syncing data format used by Fog of World.

A data parser for the internal syncing data format used by Fog of World. The parser is not designed to be a well-coded library with good performance, it is more like a demo for showing the data structure.

40 Dec 12, 2022

A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

About This repository provides data and code for the paper: Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development (subm

86 Dec 7, 2022

code for paper "Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning" by Zhongzheng Ren, Raymond A. Yeh, Alexander G. Schwing.

Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning Overview This code is for paper: Not All Unlabeled Data are Equa

22 Nov 23, 2022

Goblyn is a Python tool focused to enumeration and capture of website files metadata.

Goblyn Metadata Enumeration What's Goblyn? Goblyn is a tool focused to enumeration and capture of website files metadata. How it works? Goblyn will se

46 Nov 22, 2022

Python Cryptocurrency-candlesticks-data Resources

Python cryptocurrency-candlesticks-data Libraries

A (PyTorch) imbalanced dataset sampler for oversampling low frequent classes and undersampling high frequent ones.

Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 200 universities.

Using / reproducing ACD from the paper "Hierarchical interpretations for neural network predictions" 🧠 (ICLR 2019)

Implementation of linear CorEx and temporal CorEx.

The source code and data of the paper "Instance-wise Graph-based Framework for Multivariate Time Series Forecasting".

WRENCH: Weak supeRvision bENCHmark

Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" in EMNLP 2021

Jackrabbit Relay is an API endpoint for stock, forex and cryptocurrency exchanges that accept REST webhooks.

Pro Football Reference Game Data Webscraper

A Tools that help Data Scientists and ML engineers train and deploy ML models.

Run with one command grafana, prometheus, and a python script to collect and display cryptocurrency prices and track your wallet balance.

Code and data form the paper BERT Got a Date: Introducing Transformers to Temporal Tagging

Python library which makes it possible to dynamically mask/anonymize data using JSON string or python dict rules in a PySpark environment.

High capacity, high availability, well connected, fast lightning node.

Meli Data Challenge 2021 - First Place Solution

The PASS dataset: pretrained models and how to get the data - PASS: Pictures without humAns for Self-Supervised Pretraining

Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation, available for both PyTorch and Tensorflow.

An esoteric data type built entirely of NaNs.

Labelling platform for text using distant supervision

Implementation of Squeezenet in pytorch, pretrained models on Cifar 10 data to come

Recurrent Variational Autoencoder that generates sequential data implemented with pytorch

Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx

TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP

Faker is a Python package that generates fake data for you.

Model Agnostic Confidence Estimator (MACEST) - A Python library for calibrating Machine Learning models' confidence scores

PySpark Cheat Sheet - learn PySpark and develop apps faster

TorchGeo is a PyTorch domain library, similar to torchvision, that provides datasets, transforms, samplers, and pre-trained models specific to geospatial data.

The project is investigating methods to extract human-marked data from document forms such as surveys and tests.

Python reader for Linked Data in HDF5 files

Django package to log request values such as device, IP address, user CPU time, system CPU time, No of queries, SQL time, no of cache calls, missing, setting data cache calls for a particular URL with a basic UI.

Data from popular CS:GO website hltv.org

Obsidian tools - a Python package for analysing an Obsidian.md vault

A demo of a data science project using Kedro

django-dashing is a customisable, modular dashboard application framework for Django to visualize interesting data about your project. Inspired in the dashboard framework Dashing

Produces a summary CSV report of an Amber Electric customer's energy consumption and cost data.

Rafael Project- Classifying rockets to different types using data science algorithms.

[ArXiv 2021] Data-Efficient Instance Generation from Instance Discrimination

Shape Matching of Real 3D Object Data to Synthetic 3D CADs (3DV project @ ETHZ)

A daily updated JSON dataset of all the Open House London venues, events, and metadata

lightweight, fast and robust columnar dataframe for data analytics with online update

CS 506 - Computational Tools for Data Science

An all-inclusive Python framework for the Riot Games League of Legends API. We focus on making the data easy and fun to work with, while providing all the tools necessary to create a website or do data analysis.

Learn how to responsibly deliver value with ML.

Visions provides an extensible suite of tools to support common data analysis operations

An open-source Python project series where beginners can contribute and practice coding.

An unofficial python API for trading on the DeGiro platform, with the ability to get real time data and historical data.

jsoooooooon derulo - Make sure your 'jason derulo' is featured as the first part of your json data

Tools to help record data from Qiskit jobs

This reporistory contains the test-dev data of the paper "xGQA: Cross-lingual Visual Question Answering".

Stor is a community-driven green cryptocurrency based on a proof of space and time consensus algorithm.

“Data Augmentation for Cross-Domain Named Entity Recognition” (EMNLP 2021)

Python solutions to Codeforces problems

Python code written to utilize the Korlan usb2can hardware to send and receive data over the can-bus on a 2008 Nissan 350z

Pokemon catch events project to demonstrate data pipeline on AWS

Sample project showing reliable data ingestion application using FastAPI and dramatiq

Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code

Code and data for "Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning" (EMNLP 2021).

A simple script & container to pull COVID data from covidlive.com.au and post a summary to a slack channel

Python wrapper for Coinex APIs

Related resources for our EMNLP 2021 paper Plan-then-Generate: Controlled Data-to-Text Generation via Planning

A visualization of people a user follows on Twitter

DYA ( Ditch YouTube API ) is a package created to power the user with YouTube Data API functionality without any API Key

Using Python to scrape some basic player information from www.premierleague.com and then use Pandas to analyse said data.

Selene is a Python library and command line interface for training deep neural networks from biological sequence data such as genomes.

Home Assistant integration for energy consumption data from UK SMETS (Smart) meters using the Hildebrand Glow API.

Fix datetime EXIF data in photos downloaded from Google Takeout

DataCLUE: 国内首个以数据为中心的AI测评（含模型分析报告）

load .txt to train YOLOX, same as Yolo others

A simple Python library to integrate with the Heron Data API

Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data

G-NIA model from "Single Node Injection Attack against Graph Neural Networks" (CIKM 2021)

Fully reproducible, Dockerized, step-by-step, tutorial on how to mock a "real-time" Kafka data stream from a timestamped csv file. Detailed blog post published on Towards Data Science.

Lux Academy & Data Science East Africa Python Boot Camp, Building and Deploying Flask Application Using Docker Demo App.

Code for EMNLP 2021 main conference paper "Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification"

In this Github repository I will share my freqtrade files with you. I want to help people with this repository who don't know Freqtrade so much yet.

A library for interacting with Path of Exile game and economy data, and a unique loot filter generation framework.

We protect the privacy of the data on your computer by using the camera of your Debian based Pardus operating system. 🕵️

Code for EMNLP 2021 main conference paper "Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification"

code for paper "Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning" by Zhongzheng Ren, Raymond A. Yeh, Alexander G. Schwing.