19 Repositories
Python massive-summ Libraries
Neighbor2Seq: Deep Learning on Massive Graphs by Transforming Neighbors to Sequences
This repository is the official PyTorch implementation of Neighbor2Seq.
QDax is a tool to accelerate Quality-Diversity (QD) algorithms through hardware accelerators and massive parallelism
QDax: Accelerated Quality-Diversity. QDax is a tool to accelerate Quality-Diversity (QD) algorithms through hardware accelerators and massive parallelism.
PLStream: A Framework for Fast Polarity Labelling of Massive Data Streams
Motivation: when dataset freshness is critical, fast polarity labelling of high-speed data streams is essential.
Code for Massive-scale Decoding for Text Generation using Lattices
Massive-scale Decoding for Text Generation using Lattices. Jiacheng Xu, Greg Durrett. TL;DR: a new search algorithm to construct lattices encoding many generation options.
Script to generate a massive volume of data in SQL, CSV, JSON or XML format
DataGenerator. Made with Python; open for pull requests. 1. Dependencies: install the required dependencies with pip install -r requirements.txt. 2. Execution: run the script to generate the data files.
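For the general idea only, here is a hypothetical Python sketch (not the repository's actual interface) that generates random rows and writes them out as CSV and JSON:

    # Hypothetical sketch, not DataGenerator's actual code: build random rows
    # and dump them to CSV and JSON using only the standard library.
    import csv
    import json
    import random
    import string

    def random_row(i):
        name = "".join(random.choices(string.ascii_lowercase, k=8))
        return {"id": i, "name": name, "score": round(random.uniform(0, 100), 2)}

    rows = [random_row(i) for i in range(1000)]

    with open("data.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "name", "score"])
        writer.writeheader()
        writer.writerows(rows)

    with open("data.json", "w") as f:
        json.dump(rows, f, indent=2)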
MassiveSumm: a very large-scale, very multilingual, news summarisation dataset
This repository contains links to data and code to fetch and reproduce the dataset.
GMailBomber automates mail bombing, a form of Internet abuse perpetrated by sending massive volumes of email to a specific email address with the goal of overflowing the mailbox and overwhelming the mail server hosting the address, making it a form of denial-of-service attack.
apricot implements submodular optimization for the purpose of selecting subsets of massive data sets to train machine learning models quickly.
Please consider citing the manuscript if you use apricot in your academic work. More thorough documentation is available in the project docs. apricot implements submodular optimization for selecting subsets of massive data sets to train machine learning models quickly.
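As a rough sketch of how apricot is typically used, assuming its scikit-learn-style FacilityLocationSelection API (check the documentation for exact parameters):

    # Sketch: pick a representative subset of a large dataset with apricot.
    import numpy as np
    from apricot import FacilityLocationSelection

    X = np.random.RandomState(0).randn(5000, 20)  # stand-in for a massive dataset

    # Select 100 representative rows via facility-location submodular optimization.
    selector = FacilityLocationSelection(100, metric='euclidean', verbose=False)
    X_subset = selector.fit_transform(X)
    print(X_subset.shape)  # expected: (100, 20)

The selected rows can then be used to train a model in place of the full dataset.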
SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs
SMORE is a versatile framework that scales multi-hop query embedding and knowledge graph completion to massive knowledge graphs.
Deep Unsupervised 3D SfM Face Reconstruction Based on Massive Landmark Bundle Adjustment.
(ACMMM 2021 Oral) SfM Face Reconstruction Based on Massive Landmark Bundle Adjustment. This repository covers two tasks: face landmark detection and face reconstruction.
Corset is a web-based data selection portal that helps you get relevant data from massive amounts of parallel data.
So, if you don't need the whole corpus but just a suitable subset (indeed, a cor(pus sub)set), that is what Corset will do for you, and the reason for the tool's name.
A New Approach to Overgenerating and Scoring Abstractive Summaries
We provide the source code for the paper "A New Approach to Overgenerating and Scoring Abstractive Summaries", accepted at NAACL'21. If you find the code useful, please cite the paper.
XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale
XtremeDistilTransformers for Distilling Massive Multilingual Neural Networks (ACL 2020, Microsoft Research) [Paper] [Video]. Releasing XtremeDistilTransformers.
Automated Phrase Mining from Massive Text Corpora in Python.
This project is part of EleutherAI's quest to create a massive repository of high-quality text data for training language models.
A Python library for detecting patterns and anomalies in massive datasets using the Matrix Profile
matrixprofile-ts is a Python 2 and 3 library for evaluating time series data using the Matrix Profile algorithms developed by the Keogh and Mueen research groups at UC-Riverside and the University of New Mexico.
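A minimal sketch of computing a matrix profile with matrixprofile-ts, with function names assumed from the project's README (verify against the installed version):

    # Sketch: compute the matrix profile of a toy series and flag a discord.
    import numpy as np
    from matrixprofile import matrixProfile

    ts = np.sin(np.linspace(0, 20 * np.pi, 2000)) + 0.1 * np.random.randn(2000)
    m = 32  # subsequence (window) length

    mp, mp_index = matrixProfile.stomp(ts, m)  # profile and nearest-neighbor index
    print("most anomalous subsequence starts near:", int(np.argmax(mp)))

High matrix profile values mark subsequences with no close match elsewhere in the series, i.e. likely anomalies; low values mark recurring motifs.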
Library to scrape and clean web pages to create massive datasets.
lazynlp is a straightforward library that allows you to crawl, clean up, and deduplicate webpages to create massive monolingual datasets.
pyinfra automates infrastructure super fast at massive scale. It can be used for ad-hoc command execution, service deployment, configuration management and more.
pyinfra automates/provisions/manages/deploys infrastructure super fast at massive scale. It can be used for ad-hoc command execution, service deployment, configuration management and more.
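A minimal deploy sketch, using operation names from pyinfra's documented apt and server modules (argument conventions differ slightly between major versions):

    # deploy.py: sketch of a pyinfra deploy; run with: pyinfra @local deploy.py
    from pyinfra.operations import apt, server

    apt.packages(
        name="Ensure vim and curl are installed",
        packages=["vim", "curl"],
        update=True,
    )

    server.shell(
        name="Print the kernel version",
        commands=["uname -r"],
    )

The same deploy file can be pointed at any inventory of hosts, which is how pyinfra scales from one machine to many.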