AI-UPV at IberLEF-2021 EXIST task: Sexism Prediction in Spanish and English Tweets Using Monolingual and Multilingual BERT and Ensemble Models

Angel de Paula

Last update: Jun 8, 2022

Related tags

Deep Learning BERT-tweets-sexims-classification

Overview

AI-UPV at IberLEF-2021 EXIST task: Sexism Prediction in Spanish and English Tweets Using Monolingual and Multilingual BERT and Ensemble Models

Description

This repository contains the code for the paper Sexism Prediction in Spanish and English Tweets Using Monolingual and Multilingual BERT and Ensemble Models. This paper will be published at the SEPLN-WS-IberLEF 2021 (the 3rd Workshop on Iberian Languages Evaluation Forum at the SEPLN 2021 Conference) scientific event. Descriptions of the implementation and the dataset are contained in the paper (link: Paper is soon...).

Paper Abstract

The popularity of social media has created problems such as hate speech and sexism. The identification and classification of sexism in social media are very relevant tasks, as they would allow building a healthier social environment. Nevertheless, these tasks are considerably challenging. This work proposes a system to use multilingual and monolingual BERT and data points translation and ensemble strategies for sexism identification and classification in English and Spanish. It was conducted in the context of the sEXism Identification in Social neTworks shared 2021 (EXIST 2021) task, proposed by the Iberian Languages Evaluation Forum (IberLEF). The proposed system and its main components are described, and an in-depth hyperparameters analysis is conducted. The main results observed were: (i) the system obtained better results than the baseline model (multilingual BERT); (ii) ensemble models obtained better results than monolingual models; and (iii) the E6 model (ensemble model considering all individual models and the best standardized values) obtained the best accuracies and F1-scores for both tasks. This work obtained first place in both tasks at EXIST, with the highest accuracies (0.780 for task 1 and 0.658 for task 2) and F1-scores (F1-binary of 0.780 for task 1 and F1-macro of 0.579 for task 2).

Credits

EXIST shared Task Organizers

Task website: http://nlp.uned.es/exist2021/

Contact: [email protected]

XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale

XtremeDistilTransformers for Distilling Massive Multilingual Neural Networks ACL 2020 Microsoft Research [Paper] [Video] Releasing [XtremeDistilTransf

125 Jan 4, 2023

This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Computational Linguistics: ACL 2021.

XL-Sum This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Lang

190 Jan 3, 2023

Example Of Fine-Tuning BERT For Named-Entity Recognition Task And Preparing For Cloud Deployment Using Flask, React, And Docker

Example Of Fine-Tuning BERT For Named-Entity Recognition Task And Preparing For Cloud Deployment Using Flask, React, And Docker This repository contai

12 Dec 14, 2022

VD-BERT: A Unified Vision and Dialog Transformer with BERT

AI-UPV at IberLEF-2021 EXIST task: Sexism Prediction in Spanish and English Tweets Using Monolingual and Multilingual BERT and Ensemble Models

Related tags

Overview

AI-UPV at IberLEF-2021 EXIST task: Sexism Prediction in Spanish and English Tweets Using Monolingual and Multilingual BERT and Ensemble Models

Description

Paper Abstract

Credits

You might also like...

XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale

This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Computational Linguistics: ACL 2021.

Example Of Fine-Tuning BERT For Named-Entity Recognition Task And Preparing For Cloud Deployment Using Flask, React, And Docker

VD-BERT: A Unified Vision and Dialog Transformer with BERT

I-BERT: Integer-only BERT Quantization

Exploring Relational Context for Multi-Task Dense Prediction [ICCV 2021]

Code for pre-training CharacterBERT models (as well as BERT models).

Intrusion Detection System using ensemble learning (machine learning)

Code and models used in "MUSS Multilingual Unsupervised Sentence Simplification by Mining Paraphrases".

Owner

Angel de Paula

Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.

Pytorch implementation of our paper under review — Lottery Jackpots Exist in Pre-trained Models

Deep Text Search is an AI-powered multilingual text search and recommendation engine with state-of-the-art transformer-based multilingual text embedding (50+ languages).

Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek"

The source codes for ACL 2021 paper 'BoB: BERT Over BERT for Training Persona-based Dialogue Models from Limited Personalized Data'

Code Repo for the ACL21 paper "Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning"

The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"

A simple python library for fast image generation of people who do not exist.

Source code for NAACL 2021 paper "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference"

LV-BERT: Exploiting Layer Variety for BERT (Findings of ACL 2021)