A DNN project that classifies which food people are eating in audio recordings

Marco Tröster

Last update: Oct 24, 2021

Deep Learning - EAT Challenge

About

This project is part of an AI challenge in the Deep Learning course 2021 at the University of Augsburg. The task is to classify, from audio recordings, which food a person is eating.

Students

This project was created by:

Benjamin Möckl
Julian Göser
Marco Tröster

EAT Dataset Setup

A shell script automates the download of all external project assets (dataset and evaluation metrics). After running it, you should be ready to run and develop the project code.

# download and unpack the dataset and metric files
./init_dataset_and_metrics.sh <dataset zip password>

How to Run

First, cache the input dataset as TFRecord files for the training configuration you want to run (e.g. naive). This should considerably speed up training, especially on machines with limited CPU / GPU resources.

# cache the preprocessed audio dataset as TFRecord files
python src/main.py preprocess_dataset naive
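
To illustrate what caching preprocessed audio as TFRecord files can look like, here is a minimal sketch using the TensorFlow API. It is not the project's actual preprocessing code: the feature names (`melspec`, `label`), the 64x128 spectrogram shape, the random stand-in data, and the file name are all assumptions made for the example.

```python
import os
import tempfile

import tensorflow as tf


def serialize_example(melspec: tf.Tensor, label: int) -> bytes:
    """Pack one (melspectrogram, label) pair into a serialized tf.train.Example."""
    feature = {
        # The float tensor is stored as a single serialized byte string.
        "melspec": tf.train.Feature(
            bytes_list=tf.train.BytesList(
                value=[tf.io.serialize_tensor(melspec).numpy()]
            )
        ),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    return example.SerializeToString()


def parse_example(record: tf.Tensor):
    """Inverse of serialize_example: restore the tensor and its label."""
    spec = {
        "melspec": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    parsed = tf.io.parse_single_example(record, spec)
    melspec = tf.io.parse_tensor(parsed["melspec"], out_type=tf.float32)
    return melspec, parsed["label"]


# Hypothetical stand-in data: three random 64x128 "melspectrograms" with class ids.
samples = [(tf.random.uniform((64, 128)), label) for label in (0, 1, 2)]

# Write the cache once (assumed file name, temporary directory).
cache_path = os.path.join(tempfile.mkdtemp(), "naive_train.tfrecord")
with tf.io.TFRecordWriter(cache_path) as writer:
    for melspec, label in samples:
        writer.write(serialize_example(melspec, label))

# Later training runs read the cached records instead of re-decoding audio.
dataset = tf.data.TFRecordDataset(cache_path).map(parse_example)
restored = [(m.numpy(), int(l)) for m, l in dataset]
```

The expensive step (decoding audio and computing spectrograms) runs once at write time, so each training epoch only has to read and parse the cached records.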

Now, you can launch a training session (e.g. naive training).

# run a training session
python src/main.py run_training naive

After that, you can run a trained model on every input of the unknown test dataset and export the predictions for submission to the EAT challenge.

# evaluate the results for submission
python src/main.py eval_results naive

Valid training configurations are:

naive
noisy
autoenc
amplitude

Remark: Use a GPU-equipped machine for amplitude training (although the results are unlikely to be rewarding either way). Tested on Ubuntu 20.04. To run on Windows, replace the Keras ModelCheckpoint callback with our SaveBestAccuracyCallback.
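
The commands above all follow the pattern `python src/main.py <task> <configuration>`. The sketch below shows one way such a command line could be parsed and validated with `argparse`. It is only an illustration: the task and configuration names come from this README, while `build_parser`, `dispatch`, and the placeholder handlers are hypothetical and do not reflect the actual contents of `src/main.py`.

```python
import argparse

# Task and configuration names as listed in this README.
TASKS = ("preprocess_dataset", "run_training", "eval_results")
CONFIGS = ("naive", "noisy", "autoenc", "amplitude")


def build_parser() -> argparse.ArgumentParser:
    """Create a parser that accepts exactly one task and one configuration."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("task", choices=TASKS, help="pipeline step to run")
    parser.add_argument("config", choices=CONFIGS, help="training configuration")
    return parser


def dispatch(argv):
    """Parse argv and return a description of the step that would run."""
    args = build_parser().parse_args(argv)
    # Hypothetical placeholder handlers; a real entry point would call
    # the preprocessing, training, and evaluation code here instead.
    handlers = {
        "preprocess_dataset": lambda cfg: f"caching TFRecords for '{cfg}'",
        "run_training": lambda cfg: f"training with configuration '{cfg}'",
        "eval_results": lambda cfg: f"exporting predictions for '{cfg}'",
    }
    return handlers[args.task](args.config)


print(dispatch(["run_training", "naive"]))
```

Restricting both arguments with `choices` means a mistyped configuration is rejected with a usage message before any expensive work starts.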

Training Results

Training	Approach Description	Test Acc.	Real Acc.
Naive	Train on audio melspectrograms using Conv2D	0.41	0.36
Noisy	Train on audio melspectrograms using custom noisy Conv2D	0.44	0.39
Amplitude	Train on audio amplitude using Conv1D	0.23	?.??
AutoEnc	Train on audio melspectrograms using an Auto Encoder	0.25	?.??
