RaceBERT -- A transformer-based model to predict race and ethnicity from names

Last update: Nov 2, 2022

Installation

pip install racebert

Using a virtual environment is highly recommended. You may need to install PyTorch first, as instructed here: https://pytorch.org/get-started/locally/

Paper

Todo

Usage

raceBERT predicts race (U.S. Census race categories) and ethnicity from names.

from racebert import RaceBERT

model = RaceBERT()

# To predict race
model.predict_race("Barack Obama")

>>> {"label": "nh_black", "score": 0.5196923613548279}

The race categories are:

Race	Label
Non-Hispanic White	nh_white
Hispanic	hispanic
Non-Hispanic Black	nh_black
Asian & Pacific Islander	api
American Indian & Alaskan Native	aian
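
When post-processing predictions, the label strings in the table can be mapped back to readable category names. The following is a minimal sketch, assuming a prediction is a dict with `label` and `score` keys as in the example output above. `RACE_LABELS` and `describe_race` are hypothetical names for illustration and are not part of the racebert package.

```python
# Hypothetical helper for illustration; not part of the racebert package.
RACE_LABELS = {
    "nh_white": "Non-Hispanic White",
    "hispanic": "Hispanic",
    "nh_black": "Non-Hispanic Black",
    "api": "Asian & Pacific Islander",
    "aian": "American Indian & Alaskan Native",
}


def describe_race(prediction, min_score=0.0):
    """Return the readable category, or None if the score is below min_score."""
    if prediction["score"] < min_score:
        return None
    return RACE_LABELS[prediction["label"]]


example = {"label": "nh_black", "score": 0.5196923613548279}
print(describe_race(example))                 # Non-Hispanic Black
print(describe_race(example, min_score=0.9))  # None
```

A score threshold such as `min_score` is one way to discard low-confidence predictions; the right cutoff depends on the application and is not specified by the source.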

# Predict ethnicity
model.predict_ethnicity("Arjun Gupta")

>>> {"label": "Asian,IndianSubContinent", "score": 0.9612812399864197}

The ethnicity categories are:

Ethnicity
GreaterEuropean,British
GreaterEuropean,WestEuropean,French
GreaterEuropean,WestEuropean,Italian
GreaterEuropean,WestEuropean,Hispanic
GreaterEuropean,Jewish
GreaterEuropean,EastEuropean
Asian,IndianSubContinent
Asian,GreaterEastAsian,Japanese
GreaterAfrican,Muslim
Asian,GreaterEastAsian,EastAsian
GreaterEuropean,WestEuropean,Nordic
GreaterEuropean,WestEuropean,Germanic
GreaterAfrican,Africans
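
The ethnicity labels are comma-separated strings. Reading them as a hierarchy from broadest to most specific group is an interpretation of the label format, not something the README states. Under that assumption, a label can be split into its levels as sketched below; `split_ethnicity` and `broadest_group` are hypothetical helper names.

```python
# Hypothetical helpers; they assume the comma-separated label is ordered
# from the broadest group to the most specific one.
def split_ethnicity(label):
    """Split an ethnicity label into its comma-separated levels."""
    return label.split(",")


def broadest_group(label):
    """Return the first (assumed broadest) level of an ethnicity label."""
    return split_ethnicity(label)[0]


print(split_ethnicity("GreaterEuropean,WestEuropean,French"))
# ['GreaterEuropean', 'WestEuropean', 'French']
print(broadest_group("Asian,IndianSubContinent"))
# Asian
```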

GPU

If you have a GPU, you can speed up the computation by specifying the CUDA device when you instantiate the model.

from racebert import RaceBERT

model = RaceBERT(device=0)

# predict race in batch
model.predict_race(["Barack Obama", "George Bush"])

>>>
[
    {"label": "nh_black", "score": 0.5196923613548279},
    {"label": "nh_white", "score": 0.8365859389305115}
]

# predict ethnicity in batch
model.predict_ethnicity(["Barack Obama", "George Bush"])
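
For long lists of names, it may help to pick the device automatically and feed the model in fixed-size chunks. The sketch below assumes only what the examples above show: `RaceBERT(device=0)` selects the first CUDA device, and the predict methods accept a list and return a list. `pick_device`, `chunks`, and `predict_in_chunks` are hypothetical helpers, and the chunk size of 64 is an arbitrary choice.

```python
# Hypothetical helpers for illustration; not part of the racebert package.
def pick_device():
    """Return 0 if PyTorch reports a CUDA device, otherwise None."""
    try:
        import torch
    except ImportError:
        return None
    return 0 if torch.cuda.is_available() else None


def chunks(names, size):
    """Yield consecutive slices of `names` with at most `size` items each."""
    for start in range(0, len(names), size):
        yield names[start:start + size]


def predict_in_chunks(predict, names, size=64):
    """Call `predict` on each chunk and concatenate the results in order."""
    results = []
    for batch in chunks(names, size):
        results.extend(predict(batch))
    return results


# Usage sketch (requires racebert to be installed):
# from racebert import RaceBERT
# device = pick_device()
# model = RaceBERT(device=device) if device is not None else RaceBERT()
# predictions = predict_in_chunks(model.predict_race, ["Barack Obama", "George Bush"])
```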

HuggingFace

Alternatively, you can work directly with the Transformers models hosted on the Hugging Face Hub.

Race Model: https://huggingface.co/pparasurama/raceBERT
Ethnicity Model: https://huggingface.co/pparasurama/raceBERT-ethnicity

Please refer to the transformers documentation.
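
As a rough starting point, the hosted models can be loaded with the `pipeline` function from the transformers library. This is a sketch under assumptions, not the package's own loading code: `MODEL_IDS` and `build_classifier` are hypothetical names, and the racebert package may preprocess names before calling the model, so passing raw strings to the pipeline could give different results from `RaceBERT.predict_race`.

```python
# Model identifiers taken from the Hugging Face URLs listed above.
MODEL_IDS = {
    "race": "pparasurama/raceBERT",
    "ethnicity": "pparasurama/raceBERT-ethnicity",
}


def build_classifier(task):
    """Build a text-classification pipeline for "race" or "ethnicity".

    Hypothetical helper. The model is downloaded from the Hugging Face Hub
    on first use, so this needs network access and the transformers package.
    """
    model_id = MODEL_IDS[task]  # raises KeyError for an unknown task
    from transformers import pipeline  # imported lazily
    return pipeline("text-classification", model=model_id)


# Usage sketch:
# classifier = build_classifier("race")
# classifier("Barack Obama")
```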

Comments

Multi-GPU training is not supported

Hi Prasanna,

The following line appears to be intended to run the model on multiple GPUs: https://github.com/parasurama/raceBERT/blob/261861b55733fa69b812edb99a2d4c19c908d4f0/models/nameBERT_train.py#L80

However, when I run python ./models/nameBERT_train.py char-tokenizer roberta, only GPU 0 is used (I have 4 GPUs on one machine).

Do you know how to solve this problem? Thanks.

opened by zhiyuanpeng 0

Is raceBERT trained on both the Wiki and Florida datasets?

Hi Prasanna,

Thanks for sharing. In your paper, was the raceBERT model used in Table 4 trained on only the Florida dataset, or on both the Wiki and Florida datasets? Thanks.

opened by zhiyuanpeng 0

Owner

Prasanna Parasurama