3094 Python Swahili-nlp-data Libraries

A simple Python tool to transfer data from MySQL to SQLite 3.

MySQL to SQLite3 A simple Python tool to transfer data from MySQL to SQLite 3. This is the long overdue complementary tool to my SQLite3 to MySQL. It

126 Jan 3, 2023

Obsei is a low code AI powered automation tool.

Obsei is a low code AI powered automation tool. It can be used in various business flows like social listening, AI based alerting, brand image analysis, comparative study and more.

782 Dec 31, 2022

🦅 Pretrained BigBird Model for Korean (up to 4096 tokens)

Pretrained BigBird Model for Korean What is BigBird • How to Use • Pretraining • Evaluation Result • Docs • Citation 한국어 | English What is BigBird? Bi

183 Dec 14, 2022

A forecasting system dedicated to smart city data

smart-city-predictions A forecasting system dedicated to smart city data. Engineering thesis carried out by Michał Stawikowski and

1 Nov 8, 2021

[ICCV21] Official implementation of the "Social NCE: Contrastive Learning of Socially-aware Motion Representations" in PyTorch.

Social-NCE + CrowdNav Website | Paper | Video | Social NCE + Trajectron | Social NCE + STGCNN This is an official implementation for Social NCE: Contr

125 Dec 23, 2022

Parametric Contrastive Learning (ICCV2021)

Parametric-Contrastive-Learning This repository contains the implementation code for ICCV2021 paper: Parametric Contrastive Learning (https://arxiv.or

156 Dec 21, 2022

Sample code to extract data directly from the NetApp AIQUM MySQL Database

This sample code shows how to connect to the AIQUM Database and pull user quota details from it. AIQUM Requirements: 1. AIQUM 9.7 or higher. 2. An

1 Nov 8, 2021

Complete the code of prefix-tuning in low data setting

Prefix Tuning Note: In the paper, the authors mention initializing the prefix with real words ("Initializing the prefix with activations of real words significantly improves generation"). When I used the author-provided

4 Jul 11, 2022

Getting started with Python, Dash and Plot.ly for the Data Dashboards team

data_dashboards Getting started with Python, Dash and Plot.ly for the Data Dashboards team Getting started MacOS users: # Install the pyenv version ma

Department for Levelling Up, Housing and Communities

1 Nov 8, 2021

Sample data for the napari image viewer.

napari-demo-data Sample data for the napari image viewer. This napari plugin was generated with Cookiecutter using @napari's cookiecutter-napari-plugi

1 Nov 8, 2021

An open-source NLP library: fast text cleaning and preprocessing.

An open-source NLP library: fast text cleaning and preprocessing

21 Mar 18, 2022

PyTorch implementation of the paper Dynamic Data Augmentation with Gating Networks

Dynamic Data Augmentation with Gating Networks This is an official PyTorch implementation of the paper Dynamic Data Augmentation with Gating Networks

3 Oct 26, 2022

Repository of the paper Compressing Sensor Data for Remote Assistance of Autonomous Vehicles using Deep Generative Models at ML4AD @ NeurIPS 2021.

Compressing Sensor Data for Remote Assistance of Autonomous Vehicles using Deep Generative Models Code and supplementary materials Repository of the p

4 Jul 13, 2022

Code & Data for Enhancing Photorealism Enhancement

Enhancing Photorealism Enhancement Stephan R. Richter, Hassan Abu AlHaija, Vladlen Koltun Paper | Website (with side-by-side comparisons) | Video (Pap

1.1k Dec 31, 2022

Here I plotted data for the average test scores across schools and class sizes across school districts.

HW_02 Here I plotted data for the average test scores across schools and class sizes across school districts. Average Test Score by Race This graph re

7 Oct 27, 2021

A simple project on Data Visualization for CSCI-40 course.

Simple-Data-Visualization A simple project on Data Visualization for CSCI-40 course - the instructions can be found here SAT results in New York in 20

8 Oct 27, 2021

Data and a Twitter bot for the EPA's DOCUMERICA (1972-1977) program.

documerica This repository holds JSON(L) artifacts and a few scripts related to managing archival data from the EPA's DOCUMERICA program. Contents: Ma

2 Oct 27, 2021

Scraping weather data using Python to receive umbrella reminders

A Python package that scrapes weather data from Google and sends umbrella reminders to a specified email address at a specified time each day.

1 Aug 23, 2022

Python App To Encrypt Data (image, text, all data)

1 Oct 29, 2021

Code to reproduce the results of the paper 'Towards Realistic Few-Shot Relation Extraction' (EMNLP 2021)

Realistic Few-Shot Relation Extraction This repository contains code to reproduce the results in the paper "Towards Realistic Few-Shot Relation Extrac

8 Nov 9, 2022

This repository holds code and data for our PETS'22 article 'From "Onion Not Found" to Guard Discovery'.

From "Onion Not Found" to Guard Discovery (PETS'22) This repository holds the code and data for our PETS'22 paper titled 'From "Onion Not Found" to Gu

3 May 4, 2022

LoL API is a Python application made to serve League of Legends data.

1 Nov 6, 2021

Data Intelligence Applications - Online Product Advertising and Pricing with Context Generation

Data Intelligence Applications - Online Product Advertising and Pricing with Context Generation Overview Consider the scenario in which advertisement

2 Nov 18, 2021

WaveFake: A Data Set to Facilitate Audio DeepFake Detection

WaveFake: A Data Set to Facilitate Audio DeepFake Detection This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper

27 Dec 22, 2022

This is a backend for VCode Editor for saving & retrieving data.

This is a backend for VCode Editor for saving & retrieving data through the API.

1 Nov 22, 2021

Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data based on Pytorch Framework

VFedPCA+VFedAKPCA This is the official source code for the Paper: Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-

9 Sep 18, 2022

Scrapes mcc-mnc.com and outputs 3 files with the data (JSON, CSV & XLSX)

mcc-mnc.com-webscraper Scrapes mcc-mnc.com and outputs 3 files with the data (JSON, CSV & XLSX) A Python script for web scraping mcc-mnc.com Link: mcc

1 Nov 7, 2021

Exploratory analysis and data visualization of aircraft accidents and incidents in Brazil.

Exploring aircraft accidents in Brazil Occurrences with aircraft in Brazil are investigated by the Center for Investigation and Prevention of Aircraf

5 Dec 14, 2021

small package with utility functions for analyzing (fly) calcium imaging data

fly2p Tools for analyzing two-photon (2p) imaging data collected with Vidrio Scanimage software and Micro-Manager. Loading scanimage data relies on scan

3 Dec 14, 2022

A tool for checking if the external data used in Flatpak manifests is still up to date

Flatpak External Data Checker This is a tool for checking for outdated or broken links of external data in Flatpak manifests. Motivation Flatpak apps

76 Dec 24, 2022

A Python module and command line utility for working with web archive data using the WACZ format specification

py-wacz The py-wacz repository contains a Python module and command line utility for working with web archive data using the WACZ format specification

14 Oct 24, 2022

A real data analysis and modeling project - restaurant inspections

A real data analysis and modeling project - restaurant inspections Jafar Pourbemany 9/27/2021 This project represents data analysis and modeling of re

2 Aug 21, 2022

Svector (pronounced Swag-tor) provides extension methods to pyrsistent data structures

Svector Svector (pronounced Swag-tor) provides extension methods to pyrsistent data structures. Easily chain your methods confidently with tons of add

5 Dec 9, 2022

Exploratory data analysis

Exploratory data analysis An Exploratory data analysis APP TAPIWA CHAMBOKO 🚀 About Me I'm a full stack developer experienced in deploying artificial

1 Nov 7, 2021

Simple NLP based project without any use of AI

1 Apr 26, 2022

A napari plugin to inspect data within a cisTEM project

napari-cistem A plugin to inspect data within a cisTEM project This napari plugin was generated with Cookiecutter using @napari's cookiecutter-na

1 Nov 7, 2021

Web Crawlers for Data Labelling of Malicious Domain Detection & IP Reputation Evaluation

Web Crawlers for Data Labelling of Malicious Domain Detection & IP Reputation Evaluation This repository provides two web crawlers to label domain nam

1 Nov 5, 2021

edaSQL is a library that links SQL to Exploratory Data Analysis and, more broadly, to Data Engineering.

edaSQL is a Python library that bridges SQL and Exploratory Data Analysis: you connect to a database and run queries, then pass the query results to the EDA tool for further insight.

8 Dec 12, 2022

A GUI love calculator that saves all user data in a text file (a SQL-based script will be uploaded soon). Interactive GUI, even for the admin panel

Love-Calculator A GUI love calculator that saves all user data in a text file (a SQL-based script will be uploaded soon). Interactive GUI, even For Adm

1 Mar 22, 2022

Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation

OSCAR Project Page | Paper This repository contains the codebase used in OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Ma

74 Dec 22, 2022

Calculates JMA (Japan Meteorological Agency) seismic intensity (shindo) scale from acceleration data recorded in a NumPy array

shindo.py Calculates JMA (Japan Meteorological Agency) seismic intensity (shindo) scale from acceleration data stored in a NumPy array Introduction Japa

3 Sep 23, 2022

NLP code implemented with PyTorch (without libraries such as Hugging Face)

NLP_scratch NLP code implemented with PyTorch (without libraries such as Hugging Face) scripts ├── models: Neural Network models ├── data: codes for dataloa

3 Dec 28, 2021

ZenML 🙏: MLOps framework to create reproducible ML pipelines for production machine learning.

ZenML is an extensible, open-source MLOps framework to create production-ready machine learning pipelines. It has a simple, flexible syntax, is cloud and tool agnostic, and has interfaces/abstractions that are catered towards ML workflows.

2.6k Jan 8, 2023

A compact version of EDI-Vetter, which uses the TLS output to quickly vet transit signals.

A compact version of EDI-Vetter, which uses the TLS output to quickly vet transit signals. All your favorite hits in a simplified format.

2 Aug 3, 2022

A tool that lets you detect and translate text.

Text detection and recognition This repository contains a tool that detects regions of text and translates them one by one. Description Two pretr

176 Nov 28, 2022

This tool parses log data and lets you define analysis pipelines for anomaly detection.

logdata-anomaly-miner This tool parses log data and lets you define analysis pipelines for anomaly detection. It was designed to run the analysis wit

32 Nov 27, 2022

Tools for analyzing data collected with a custom unity-based VR for insects.

unityvr Tools for analyzing data collected with a custom unity-based VR for insects. Organization: The unityvr package contains the following submodul

1 Dec 14, 2022

This is a text summarizing tool written in Python

Summarize Written by: Ling Li Ya This is a text summarizing tool written in Python. User Guide Some things to note: The application is accessible here

2 Feb 18, 2022

A collection of tools for biomedical research assay analysis in Python.

waltlabtools A collection of tools for biomedical research assay analysis in Python. Key Features Analysis for assays such as digital ELISA, including

1 Apr 18, 2022

A learning-based data collection tool for human segmentation

FullBodyFilter A Learning-Based Data Collection Tool For Human Segmentation Contents Documentation Source Code and Scripts Overview of Project Usage O

4 Jun 24, 2022

A collection of robust and fast processing tools for parsing and analyzing web archive data.

ChatNoir Resiliparse A collection of robust and fast processing tools for parsing and analyzing web archive data. Resiliparse is part of the ChatNoir

24 Nov 29, 2022

Suite of tools for retrieving USGS NWIS observations and evaluating National Water Model (NWM) data.

Documentation OWPHydroTools GitHub pages documentation Motivation We developed OWPHydroTools with data scientists in mind. We attempted to ensure the

36 Dec 11, 2022

A tool and a library for SVG path data transformations.

SVG path data transformation toolkit A tool and a library for SVG path data transformations. Currently it supports a translation and a scaling. Usage

2 Mar 7, 2022

Measuring if attention is explanation with ROAR

NLP ROAR Interpretability Official code for: Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Toke

19 Nov 13, 2022

NLP Overview

NLP-Overview Introduction The field of NLP encompasses a variety of topics which involve the computational processing and understanding of human langu

1 Jan 13, 2022

An Open-Source Toolkit for Prompt-Learning.

An Open-Source Framework for Prompt-learning. Overview • Installation • How To Use • Docs • Paper • Citation • What's New? Nov 2021: Now we have relea

2.3k Jan 7, 2023

Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data.

13 Oct 7, 2022

python's memory-saving dictionary data structure

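ConstDict A replacement for Python's dict data structure. If a dictionary will never gain new fields (it is read-only, or only existing fields are modified), using ConstDict can save memory. Where dict() mainly consumes memory: 1. The dict resizing mechanism, which reserves memory space in advance. 2. A dict is also an object that dynamically maintains __dict__ internally; adding a __slots__ class attribute can reduce memory usage.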

1 Nov 3, 2021

This repository has datasets containing information on Uber pickups in NYC from April 2014 to September 2014 and January to June 2015. Data analysis, visualization, and some insights are gathered here.

uber-pickups-analysis Data Source: https://www.kaggle.com/fivethirtyeight/uber-pickups-in-new-york-city Information about data set The dataset contain

1 Nov 3, 2021

Streamz helps you build pipelines to manage continuous streams of data

Streamz helps you build pipelines to manage continuous streams of data. It is simple to use in simple cases, but also supports complex pipelines that involve branching, joining, flow control, feedback, back pressure, and so on.

1.1k Dec 28, 2022

MongoDB utility to inflate the contents of small collection to a new larger collection

MongoDB Data Inflater ("data-inflater") The data-inflater tool is a MongoDB utility to automate the creation of a new large database collection using

3 Nov 28, 2021

This library is testing the ethics of language models by using natural adversarial texts.

prompt2slip This library is testing the ethics of language models by using natural adversarial texts. This tool allows for short and simple code and v

9 Dec 28, 2021

Visualization of drug data in Thailand from 2014 to 2020

Visualization of drug data in Thailand from 2014 to 2020 Data source: Thai open government data, Office of the Narcotics Control Board (ข้อมูลเปิดภาครัฐ สำนักงาน ป.ป.ส) Introducing the program Using tkinter module for

1 Jan 5, 2022

Official implementation of Generalized Data Weighting via Class-level Gradient Manipulation (NeurIPS 2021).

Generalized Data Weighting via Class-level Gradient Manipulation This repository is the official implementation of Generalized Data Weighting via Clas

9 Nov 3, 2021

nrgpy is the Python package for processing NRG Data Files

nrgpy nrgpy is the Python package for processing NRG Data Files Website and source: https://github.com/nrgpy/nrgpy Documentation: https://nrgpy.github

23 Dec 8, 2022

The Metabolomics Integrator (MINT) is a post-processing tool for liquid chromatography-mass spectrometry (LCMS) based metabolomics.

MINT (Metabolomics Integrator) The Metabolomics Integrator (MINT) is a post-processing tool for liquid chromatography-mass spectrometry (LCMS) based m

0 May 4, 2022

Grading tools for Advanced NLP (11-711)

Grading tools for Advanced NLP (11-711) Installation You'll need docker and unzip to use this repo. For docker, visit the official guide to get starte

2 Sep 27, 2022

Data science, Data manipulation and Machine learning package.

duality Data science, Data manipulation and Machine learning package. Use permitted according to the terms of use and conditions set by the attached l

3 Oct 19, 2022

Extract the temperature data of each wire from the thermal imager raw data.

Wire-Tempurature-Detection Extract the temperature data of each wire from the thermal imager raw data. The motivation of this computer vision project

1 Nov 3, 2021

Spert NLP Relation Extraction API deployed with torchserve for inference

SpERT torchserve Spert_torchserve is the relation extraction model SpERT (Span-based Entity and Relation Transformer), served as an API with pytorch/serve.

1 Nov 24, 2021

VevestaX is an open source Python package for ML Engineers and Data Scientists.

VevestaX Track failed and successful experiments as well as features. VevestaX is an open source Python package for ML Engineers and Data Scientists.

24 Dec 14, 2022

Connects to a local SenseCap M1 Helium Hotspot and pulls API Data.

sensecap_api_checker_HELIUM Connects to a local SenseCap M1 Helium Hotspot and pulls API Data.

1 Nov 3, 2021

scikit-multimodallearn is a Python package implementing algorithms for multimodal data.

scikit-multimodallearn is a Python package implementing algorithms for multimodal data. It is compatible with scikit-learn, a popul

12 Jun 29, 2022

Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation

DistMIS Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation. DistriMIS Distributing Deep Learning Hyperparameter Tuning

2 Sep 9, 2022

Generalized Data Weighting via Class-level Gradient Manipulation

Generalized Data Weighting via Class-level Gradient Manipulation This repository is the official implementation of Generalized Data Weighting via Clas

18 Nov 12, 2022

Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.

EfficientZero (NeurIPS 2021) Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021. Thank you for you

671 Jan 3, 2023

Personal thermal comfort models using digital twins: Preference prediction with BIM-extracted spatial-temporal proximity data from Build2Vec

Personal thermal comfort models using digital twins: Preference prediction with BIM-extracted spatial-temporal proximity data from Build2Vec This repo

Building and Urban Data Science (BUDS) Group

5 Dec 2, 2022

SW components and demos for visual kinship recognition. An emphasis is put on the FIW dataset-- data loaders, benchmarks, results in summary.

FIW Data Development Kit Table of Contents Introduction Families In the Wild Database Publications Organization To Do License Getting Involved Introdu

12 Jun 4, 2022

[NeurIPS 2021] “Improving Contrastive Learning on Imbalanced Data via Open-World Sampling”

Improving Contrastive Learning on Imbalanced Data via Open-World Sampling Introduction Contrastive learning approaches have achieved great success in

4 Nov 3, 2021

Data and Code for paper Outlining and Filling: Hierarchical Query Graph Generation for Answering Complex Questions over Knowledge Graph is available for research purposes.

Data and Code for paper Outlining and Filling: Hierarchical Query Graph Generation for Answering Complex Questions over Knowledge Graph is available f

5 Nov 10, 2022

Code for a seq2seq architecture with Bahdanau attention designed to map stereotactic EEG data from human brains to spectrograms, using PyTorch Lightning.

stereoEEG2speech We provide code for a seq2seq architecture with Bahdanau attention designed to map stereotactic EEG data from human brains to spectro

15 Nov 11, 2022

The Timescale NFT Starter Kit is a step-by-step guide to get up and running with collecting, storing, analyzing and visualizing NFT data from OpenSea, using PostgreSQL and TimescaleDB.

Timescale NFT Starter Kit The Timescale NFT Starter Kit is a step-by-step guide to get up and running with collecting, storing, analyzing and visualiz

102 Dec 24, 2022

This is the accompanying repository for the Bloomberg Global Coal Countdown website.

This is the accompanying repository for the Bloomberg Global Coal Countdown (BGCC) website. Data Sources Dashboard Data Schema and Validation License

7 Jun 1, 2022

Privacy as Code for DSAR Orchestration: Privacy Request automation to fulfill GDPR, CCPA, and LGPD data subject requests.

Meet Fidesops: Privacy as Code for DSAR Orchestration A part of the greater Fides ecosystem. ⚡ Overview Fidesops (fee-dez-äps, combination of the Lati

44 Dec 6, 2022

This codebase facilitates fast experimentation of differentially private training of Hugging Face transformers.

private-transformers This codebase facilitates fast experimentation of differentially private training of Hugging Face transformers. What is this? Why

73 Dec 28, 2022

A tool for hiding data inside of images

Stegenography-tool a tool for hiding data inside of images Quick test: do python steg-encode.py test/message.txt test/covid19.png to generate the test

2 Nov 2, 2021

Planning Algorithms in AI and Robotics. MSc course at Skoltech Data Science program

Planning Algorithms in AI and Robotics course T2 2021-22 The Planning Algorithms in AI and Robotics course at Skoltech, MS in Data Science, during T2,

6 Sep 21, 2022

Explore different ways to mix speech models (wav2vec2, HuBERT) and NLP models (BART, T5, GPT) together

SpeechMix Explore different ways to mix speech models (wav2vec2, HuBERT) and NLP models (BART, T5, GPT) together. Introduction For the same input: from datas

31 Nov 7, 2022

EOD Historical Data Python Library (Unofficial)

EOD Historical Data Python Library (Unofficial) https://eodhistoricaldata.com Installation python3 -m pip install eodhistoricaldata Note Demo API key

20 Dec 22, 2022

A selection of SQLite3 databases to practice querying from.

Dummy SQL Databases This is a collection of dummy SQLite3 databases, for learning and practicing SQL querying, generated with the VS Code extension Ge

1 Feb 26, 2022

This is an auto-ML tool specialized in detecting outliers

Auto-ML tool specialized in detecting outliers Description This tool allows you, with a Dash visualization, to compare 10 models of machine le

1 Nov 3, 2021

Official repository of the paper "A Variational Approximation for Analyzing the Dynamics of Panel Data". Mixed Effect Neural ODE. UAI 2021.

Official repository of the paper (UAI 2021) "A Variational Approximation for Analyzing the Dynamics of Panel Data", Mixed Effect Neural ODE. Panel dat

7 Nov 26, 2022

Data Poisoning based on Adversarial Attacks using Non-Robust Features

Data Poisoning based on Adversarial Attacks using Non-Robust Features Usage python main.py [-h] [--gpu | -g GPU] [--eps |-e EPSILON] [--pert | -p PER

1 Nov 2, 2021

A terminal spreadsheet multitool for discovering and arranging data

VisiData v2.6.1 A terminal interface for exploring and arranging tabular data. VisiData supports tsv, csv, sqlite, json, xlsx (Excel), hdf5, and many

6.2k Jan 4, 2023

Implemented shortest-circuit disambiguation, maximum probability disambiguation, HMM-based lexical annotation and BiLSTM+CRF-based named entity recognition

0 Feb 13, 2022
