ED-MTT
A novel Engagement Detection with Multi-Task Training (ED-MTT) system that minimizes MSE and triplet loss together to determine the engagement level of students in an e-learning environment. You can check the Colab notebook below for detailed explanations of data loading and code execution.
Introduction & Problem Definition
With the COVID-19 outbreak, online working and learning environments have become essential in our lives. For this reason, automatic analysis of non-verbal communication has become crucial in online environments.
Engagement level is a type of social signal that can be predicted from facial expressions and body pose. To this end, we propose an end-to-end deep learning-based system that detects the engagement level of a subject in an e-learning environment.
Engagement level feedback is important because it:
- Makes students aware of their performance in classes.
- Helps instructors detect confusing or unclear parts of the teaching material.
Model Architecture
The proposed system first extracts features with OpenFace, then aggregates the frames within a window to compute feature statistics as additional features, and finally uses a Bi-LSTM to generate vector embeddings from the input sequences. In this system, we introduce triplet loss as an auxiliary task and design the system as a multi-task training framework, taking inspiration from prior work on self-supervised contrastive learning of multi-view facial expressions. To the best of our knowledge, this is a novel approach in the engagement detection literature. The key novelty of this work is the multi-task training framework that uses triplet loss together with Mean Squared Error (MSE); a minimal sketch of this objective follows the contribution list below. The main contributions of this paper are as follows:
- Multi-task training with triplet and MSE losses introduces additional regularization and reduces over-fitting due to the very small sample size.
- Using triplet loss mitigates the label reliability problem since it measures relative similarity between samples.
- A system with lightweight feature extraction is efficient and highly suitable for real-life applications.
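The sketch below is only an illustration of this multi-task objective, not the implementation in code/model.py: the class and function names, layer sizes, triplet margin, and loss weight `alpha` are all assumptions made for clarity.

```python
# Illustrative sketch of the ED-MTT idea: per-window feature statistics,
# a Bi-LSTM encoder, and a combined MSE + triplet objective.
# All names and hyper-parameters here are assumptions, not values from code/model.py.
import torch
import torch.nn as nn

class EngagementModel(nn.Module):
    def __init__(self, n_features: int, hidden: int = 128):
        super().__init__()
        # Bi-LSTM turns a sequence of per-window features into one embedding.
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True, bidirectional=True)
        self.regressor = nn.Linear(2 * hidden, 1)  # predicted engagement level

    def forward(self, x):                       # x: (batch, windows, n_features)
        _, (h, _) = self.lstm(x)                # h: (2, batch, hidden)
        emb = torch.cat([h[0], h[1]], dim=-1)   # concat forward/backward states
        return emb, self.regressor(emb).squeeze(-1)

def window_statistics(frames: torch.Tensor) -> torch.Tensor:
    """Aggregate OpenFace frame features inside each window into statistics."""
    # frames: (batch, windows, frames_per_window, n_raw_features)
    mean, std = frames.mean(dim=2), frames.std(dim=2)
    return torch.cat([mean, std], dim=-1)       # (batch, windows, 2 * n_raw_features)

mse = nn.MSELoss()
triplet = nn.TripletMarginLoss(margin=1.0)      # margin is an assumption

def multi_task_loss(model, anchor, positive, negative, y_anchor, alpha=0.5):
    """MSE on the anchor's engagement label plus triplet loss on the embeddings."""
    emb_a, pred_a = model(anchor)
    emb_p, _ = model(positive)   # sample with a similar engagement label
    emb_n, _ = model(negative)   # sample with a different engagement label
    return mse(pred_a, y_anchor) + alpha * triplet(emb_a, emb_p, emb_n)
```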
Dataset
We evaluate the performance of ED-MTT on the publicly available "Engagement in the Wild" dataset, which is comprised of separate training and validation sets.
The dataset contains 78 subjects (25 females and 53 males) whose ages range from 19 to 27. Each subject was recorded while watching an approximately 5-minute-long stimulus video of a Korean language lecture.
Results
We compare the performance of ED-MTT with 9 different works from the state of the art (reviewed in detail in the paper). Our results show that ED-MTT outperforms these state-of-the-art methods with at least a 5.74% improvement on MSE.
Repository structure
ED-MTT
│   README.md
│   Engagement_Labels.txt
│   ED-MTT.ipynb
│
└───code
│   │   dataloader.py
│   │   model.py
│   │   train.py
│   │   test.py
│   │   fix_path.py
│   │   utils.py
│   │   requirements.txt
│
└───configs
    │   batchnorm_default.yaml
    │   sweep.yaml
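The actual hyper-parameter search space lives in configs/sweep.yaml; the snippet below is only a hedged illustration of how a comparable Weights & Biases sweep could be defined and launched from Python, with hypothetical parameter names and ranges that are not taken from the repository.

```python
# Hedged illustration only: the real search space is defined in configs/sweep.yaml.
# The parameter names, ranges, and metric name below are hypothetical placeholders.
import wandb

sweep_config = {
    "method": "bayes",  # e.g. grid / random / bayes
    "metric": {"name": "val_mse", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"min": 1e-4, "max": 1e-2},
        "hidden_size": {"values": [64, 128, 256]},
        "triplet_weight": {"values": [0.1, 0.5, 1.0]},
    },
}

sweep_id = wandb.sweep(sweep_config, project="ED-MTT")
# wandb.agent(sweep_id, function=train_fn)  # train_fn would wrap code/train.py
```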
Running the Code
To train the models and manage the experiments, we used PyTorch Lightning together with Weights & Biases. Detailed explanations of how to:
- Load data and pre-trained weights,
- Train the model from scratch,
- Manage experiments and hyper-parameter search with wandb,
- Reproduce the results presented in the paper,
are provided in the ED-MTT.ipynb Colab notebook.
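For orientation only, a minimal PyTorch Lightning + Weights & Biases training loop might look like the sketch below. The real entry point is code/train.py and the notebook above; the class name LitEDMTT, the loss weight `alpha`, the learning rate, and the logged metric names are assumptions, and EngagementModel refers to the illustrative model sketched earlier.

```python
# Minimal sketch of wiring PyTorch Lightning to Weights & Biases.
# Names and hyper-parameters are assumptions; see code/train.py for the real script.
import pytorch_lightning as pl
import torch
from pytorch_lightning.loggers import WandbLogger

class LitEDMTT(pl.LightningModule):
    def __init__(self, model, alpha: float = 0.5, lr: float = 1e-3):
        super().__init__()
        self.model, self.alpha, self.lr = model, alpha, lr
        self.mse = torch.nn.MSELoss()
        self.triplet = torch.nn.TripletMarginLoss(margin=1.0)

    def training_step(self, batch, batch_idx):
        anchor, positive, negative, y = batch
        emb_a, pred = self.model(anchor)
        emb_p, _ = self.model(positive)
        emb_n, _ = self.model(negative)
        loss = self.mse(pred, y) + self.alpha * self.triplet(emb_a, emb_p, emb_n)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.lr)

# trainer = pl.Trainer(max_epochs=100, logger=WandbLogger(project="ED-MTT"))
# trainer.fit(LitEDMTT(EngagementModel(n_features)), train_dataloader)
```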