Finding Label and Model Errors in Perception Data With Learned Observation Assertions

This is the project page for Finding Label and Model Errors in Perception Data With Learned Observation Assertions.

Please read the paper for full technical details.

Installation

In the root directory, run

pip install -e .

Examples

We provide an example using the Lyft Level 5 perception dataset. Model predictions are included for convenience, but you will need to download the dataset yourself.

All of the scripts are available in examples/lyft_level5. In order to run the scripts, do the following:

Set the data directories in constants.py.
Learn the priors with learn_priors.py.
Run LOA with prior_lyft.py.

You can visualize the results with viz_track.py.
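
The run steps above can be chained in a small driver script. The following is a minimal sketch, not part of the repository: the helper names `build_commands` and `run_pipeline` are hypothetical, and it assumes the scripts take no required arguments and are invoked from the repository root. Setting the data directories in constants.py remains a manual step that must be done first.

```python
import subprocess
import sys
from pathlib import Path

# Directory and script names come from the README; the order follows its steps.
EXAMPLE_DIR = Path("examples/lyft_level5")
STEPS = ["learn_priors.py", "prior_lyft.py"]


def build_commands(example_dir=EXAMPLE_DIR, python=sys.executable):
    """Return one command (as an argument list) per step, in run order."""
    return [[python, str(Path(example_dir) / name)] for name in STEPS]


def run_pipeline(example_dir=EXAMPLE_DIR):
    """Run each step in turn; check=True stops at the first failing script."""
    for cmd in build_commands(example_dir):
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` from the repository root would learn the priors and then run LOA. Because `check=True` raises on a non-zero exit code, LOA is never run against priors from a failed learning step.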

Citation

If you find this project useful, please cite us at

@article{kang2021finding,
  title={Finding Label and Model Errors in Perception Data With Learned Observation Assertions},
  author={Kang, Daniel and Arechiga, Nikos and Pillai, Sudeep and Bailis, Peter and Zaharia, Matei},
}

and contact us if you deploy LOA!

Comments

Missing predictions from `preds.p`

Do you have a reference to the model predictions (or model parameters) that you used and serialized in preds.p? If you can't share that information, could you give as much detail as possible about the model you used, the accuracy of the prediction model, and the format of the prediction output file (preds.p)? It is currently hard to run the code or reproduce your results without those model predictions.

opened by jonathanzhang99

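For readers with the same question about the file format, the top-level structure of a prediction file can be inspected directly. This is a minimal sketch under one unconfirmed assumption: that preds.p is a Python pickle, which the `.p` extension suggests but the README does not state. The helper names `summarize` and `inspect_pickle` are hypothetical. Unpickling can execute arbitrary code, so only load files from a trusted source.

```python
import pickle
from pathlib import Path


def summarize(obj, max_items=3):
    """Return a short, human-readable description of a loaded object."""
    if isinstance(obj, dict):
        keys = list(obj)[:max_items]
        return f"dict with {len(obj)} keys, e.g. {keys!r}"
    if isinstance(obj, (list, tuple)):
        kind = type(obj).__name__
        first = type(obj[0]).__name__ if obj else "n/a"
        return f"{kind} of length {len(obj)}, first element type: {first}"
    return f"object of type {type(obj).__name__}"


def inspect_pickle(path):
    """Load a pickle file (assumed format) and describe its top-level structure."""
    with Path(path).open("rb") as f:
        obj = pickle.load(f)  # only safe for files from a trusted source
    return summarize(obj)
```

Calling `inspect_pickle("preds.p")` reports whether the file holds a dict, a list, or some other object, which is a starting point for working out the expected prediction format.
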
[Setup] Update requirements.txt to include necessary
The previous requirements.txt included direct references and packages that are needed in the dependency tree. Pruned to make sure that the dev environment is cleaner and easier.

Tested with python 3.9.1. Ran setup with:

python -m venv env
source env/bin/activate
pip install -r requirements.txt
python examples/lyft_level5/learn_priors.py
python examples/lyft_level5/prior_lyft.py

opened by jonathanzhang99

Owner

Stanford Future Data Systems
