# Generalizing Gaze Estimation with Outlier-guided Collaborative Adaptation

Our paper has been accepted by ICCV 2021.
*Figure: Overview of the proposed Plug-and-Play (PnP) adaptation framework for generalizing gaze estimation to a new domain.*

*Figure: The proposed architecture.*
## Results
Angular gaze estimation errors in degrees (lower is better) on four domain adaptation tasks: DE→DM (ETH→MPII), DE→DD (ETH→EyeDiap), DG→DM (Gaze360→MPII), and DG→DD (Gaze360→EyeDiap).

| Input | Method | DE→DM | DE→DD | DG→DM | DG→DD |
| --- | --- | --- | --- | --- | --- |
| Face | Baseline | 8.767 | 8.578 | 7.662 | 8.977 |
| Face | Baseline + PnP-GA | 5.529 (↓36.9%) | 5.867 (↓31.6%) | 6.176 (↓19.4%) | 7.922 (↓11.8%) |
| Face | ResNet50 | 8.017 | 8.310 | 8.328 | 7.549 |
| Face | ResNet50 + PnP-GA | 6.000 (↓25.2%) | 6.172 (↓25.7%) | 5.739 (↓31.1%) | 7.042 (↓6.7%) |
| Face | SWCNN | 10.939 | 24.941 | 10.021 | 13.473 |
| Face | SWCNN + PnP-GA | 8.139 (↓25.6%) | 15.794 (↓36.7%) | 8.740 (↓12.8%) | 11.376 (↓15.6%) |
| Face + Eye | CA-Net | -- | -- | 21.276 | 30.890 |
| Face + Eye | CA-Net + PnP-GA | -- | -- | 17.597 (↓17.3%) | 16.999 (↓44.9%) |
| Face + Eye | Dilated-Net | -- | -- | 16.683 | 18.996 |
| Face + Eye | Dilated-Net + PnP-GA | -- | -- | 15.461 (↓7.3%) | 16.835 (↓11.4%) |
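For reference, the numbers above are 3D angular errors. A minimal sketch of how such an error is commonly computed between a predicted and a ground-truth gaze direction vector (this helper is illustrative, not taken from the repo):

```python
# Illustrative helper (not from the repo): angular error in degrees between
# a predicted and a ground-truth 3D gaze direction vector.
import numpy as np

def angular_error(pred, gt):
    """Return the angle in degrees between two 3D gaze vectors."""
    pred = pred / np.linalg.norm(pred)
    gt = gt / np.linalg.norm(gt)
    cos_sim = np.clip(np.dot(pred, gt), -1.0, 1.0)  # guard against rounding error
    return np.degrees(np.arccos(cos_sim))

print(angular_error(np.array([0.0, 0.0, -1.0]), np.array([0.1, 0.0, -1.0])))  # ~5.7 degrees
```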
This repository contains the official PyTorch implementation of the following paper:
Generalizing Gaze Estimation with Outlier-guided Collaborative Adaptation
Yunfei Liu, Ruicong Liu, Haofei Wang, Feng Lu
Abstract: Deep neural networks have significantly improved appearance-based gaze estimation accuracy. However, it still suffers from unsatisfactory performance when generalizing the trained model to new domains, e.g., unseen environments or persons. In this paper, we propose a plug-and-play gaze adaptation framework (PnP-GA), which is an ensemble of networks that learn collaboratively with the guidance of outliers. Since our proposed framework does not require ground-truth labels in the target domain, the existing gaze estimation networks can be directly plugged into PnP-GA and generalize the algorithms to new domains. We test PnP-GA on four gaze domain adaptation tasks, ETH-to-MPII, ETH-to-EyeDiap, Gaze360-to-MPII, and Gaze360-to-EyeDiap. The experimental results demonstrate that the PnP-GA framework achieves considerable performance improvements of 36.9%, 31.6%, 19.4%, and 11.8% over the baseline system. The proposed framework also outperforms the state-of-the-art domain adaptation approaches on gaze domain adaptation tasks.
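To make the idea concrete, here is a deliberately simplified conceptual sketch of collaborative ensemble adaptation on unlabeled target data. It is *not* the paper's implementation (the actual outlier-guided losses are defined in the paper), and all names below are illustrative:

```python
# Conceptual sketch only -- NOT the paper's implementation.
# It shows the general shape of unsupervised collaborative adaptation:
# an ensemble of gaze networks adapts on unlabeled target-domain images
# by pulling each member toward the ensemble consensus.
import torch

def collaborative_step(models, optimizer, target_images):
    """One adaptation step on a batch of unlabeled target-domain images."""
    preds = torch.stack([m(target_images) for m in models])  # (members, batch, 2)
    consensus = preds.mean(dim=0, keepdim=True).detach()     # ensemble mean prediction
    deviation = (preds - consensus).norm(dim=-1)             # per-member disagreement
    loss = (deviation ** 2).mean()  # the paper instead uses outlier-guided losses
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```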
## Resources
Material related to our paper is available via the following links:
- Paper: https://arxiv.org/abs/2107.13780
- Project: https://liuyunfei.net/publication/iccv2021_pnp-ga/
- Code: https://github.com/DreamtaleCore/PnP-GA
## System requirements

- Only Linux has been tested; Windows support is still being tested.
- 64-bit Python 3.6 installation.
## Playing with pre-trained networks and training

### Config
You need to modify `config.yaml` first, especially the `xxx/image`, `xxx/label`, and `xxx_pretrains` params:

- `xxx/image` represents the path of the label file.
- `xxx/root` represents the path of the image files.
- `xxx_pretrains` represents the path of the pre-trained models.
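For illustration, a minimal sketch of reading these fields from `config.yaml` (the exact key layout here is an assumption; check the repo's `config.yaml` for the real structure):

```python
# Hypothetical sketch: the key layout of config.yaml is an assumption,
# following the xxx/image, xxx/root, xxx_pretrains pattern described above.
import yaml

with open("config.yaml") as f:
    config = yaml.safe_load(f)

# e.g. for the "mpii" dataset:
label_path = config["mpii"]["image"]   # path of the label file
image_root = config["mpii"]["root"]    # root folder of the images
pretrained = config["mpii_pretrains"]  # path of the pre-trained models
```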
An example of the label file is in the `data` folder. Each line in the label file is formatted as:

```
p00/face/1.jpg 0.2558059438789034,-0.05467275933864655 -0.05843388117618364,0.46745964684693614 ... ...
```

where our code reads the image data from `os.path.join(xxx/root, "p00/face/1.jpg")` and reads the ground-truth gaze direction labels from the rest of the line.
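A minimal sketch of parsing one such line (this parser is illustrative, not taken from the repo; it assumes the first field is the image path and each remaining field is a comma-separated pair of floats):

```python
# Illustrative parser (not from the repo): split a label line into the
# relative image path and a list of 2D label vectors.
def parse_label_line(line):
    fields = line.strip().split()
    image_path = fields[0]  # e.g. "p00/face/1.jpg"
    labels = [tuple(map(float, f.split(","))) for f in fields[1:]]
    return image_path, labels

path, labels = parse_label_line(
    "p00/face/1.jpg 0.2558059438789034,-0.05467275933864655 "
    "-0.05843388117618364,0.46745964684693614"
)
print(path, labels[0])  # p00/face/1.jpg (0.2558059438789034, -0.05467275933864655)
```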
### Train

We provide three optional arguments: `--oma2`, `--js`, and `--sg`. They represent three different network components, which are described in our paper.

`--source` and `--target` represent the datasets used as the source domain and the target domain. You can choose among `eth`, `gaze360`, `mpii`, and `edp`.

`--i` represents the index of the person used as the training set. You can set it to -1 to use all persons as the training set.

`--pics` represents the number of target domain samples used for adaptation.

We also provide other arguments for adjusting the hyper-parameters of the PnP-GA architecture, which are described in our paper.

For example, you can run the code like:

```
python3 adapt.py --i 0 --pics 10 --savepath path/to/save --source eth --target mpii --gpu 0 --js --oma2 --sg
```
### Test

`--i`, `--savepath`, and `--target` are the same as in training.

`--p` represents the index of the person used as the training set in the adaptation process.

For example, you can run the code like:

```
python3 test.py --i -1 --p 0 --savepath path/to/save --target mpii
```
## Citation

If you find this work or code helpful in your research, please cite:

```
@inproceedings{liu2021PnP_GA,
  title={Generalizing Gaze Estimation with Outlier-guided Collaborative Adaptation},
  author={Liu, Yunfei and Liu, Ruicong and Wang, Haofei and Lu, Feng},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}
```
## Contact

If you have any questions, feel free to e-mail me at lyunfei(at)buaa.edu.cn.