NER for Indian languages

Overview

CL-NERIL: A Cross-Lingual Model for NER in Indian Languages

Code for the paper: https://arxiv.org/abs/2111.11815

Setup

  1. Set up a virtual environment.
  2. The implementation is based on Python 3.7. To install the dependencies, run: pip install -r requirements.txt
  3. Run the code as described under "Steps to run".

Steps to run

  1. Add the aligned, weakly labeled data for each target language to the corresponding aligned data directory. Sample data files for 3 Indian languages are provided.
  2. Download the CoNLL-2003 English [en] data and save it to the 'en' directory in source/data. This data is used to train the Teacher model.
  3. For training and evaluation, run:
$ cd source
$ ./scripts/run_clneril.sh

The results and logs will be stored in the output directory specified.
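
Before launching a training run, it can help to check that the English data saved in the 'en' directory parses cleanly. The following is a minimal sketch, not part of this repo. It assumes the standard CoNLL-2003 layout: whitespace-separated columns with the token first and the NER tag last, blank lines between sentences, and -DOCSTART- lines between documents. The function name read_conll and the sample lines are illustrative only; the loader in this repo may expect a different layout.

```python
from typing import Iterable, List, Tuple

# One sentence is a list of (token, NER tag) pairs.
Sentence = List[Tuple[str, str]]


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Parse CoNLL-style lines into sentences of (token, NER tag) pairs.

    Assumption: the token is the first column and the NER tag is the last.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        # A blank line or a -DOCSTART- marker ends the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split()
        current.append((fields[0], fields[-1]))
    # Flush the last sentence if the input does not end with a blank line.
    if current:
        sentences.append(current)
    return sentences


if __name__ == "__main__":
    # Illustrative lines in the CoNLL-2003 column layout (not real data files).
    sample = [
        "-DOCSTART- -X- -X- O",
        "",
        "EU NNP B-NP B-ORG",
        "rejects VBZ B-VP O",
        "",
        "Peter NNP B-NP B-PER",
        "Blackburn NNP I-NP I-PER",
    ]
    for sentence in read_conll(sample):
        print(sentence)
```

To check a real file, open it and pass the file object to read_conll, for example `with open(path, encoding="utf-8") as f: sentences = read_conll(f)`, where path points to one of the downloaded CoNLL-2003 files.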

Citation

If you find this repo helpful, please cite the following:

@misc{prabhakar2021clneril,
      title={CL-NERIL: A Cross-Lingual Model for NER in Indian Languages}, 
      author={Akshara Prabhakar and Gouri Sankar Majumder and Ashish Anand},
      year={2021},
      eprint={2111.11815},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Credits

The code framework is adapted from Teacher-Student.

Owner
Akshara P
IT undergrad @ NITK, Surathkal