803 Python Twitter-nlp Libraries

Simple multilingual lemmatizer for Python, especially useful for speed and efficiency

Simplemma: a simple multilingual lemmatizer for Python Purpose Lemmatization is the process of grouping together the inflected forms of a word so they

70 Dec 29, 2022

Faster, modernized fork of the language identification tool langid.py

py3langid py3langid is a fork of the standalone language identification tool langid.py by Marco Lui. Original license: BSD-2-Clause. Fork license: BSD

12 Nov 5, 2022

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.

TWINT - Twitter Intelligence Tool No authentication. No API. No limits. Twint is an advanced Twitter scraping tool written in Python that allows for s

14.2k Jan 3, 2023

Quick insights from Zoom meeting transcripts using Graph + NLP

Transcript Analysis - Graph + NLP This program extracts insights from Zoom Meeting Transcripts (.vtt) using TigerGraph and NLTK. In order to run this

7 Sep 17, 2022

Dynamic Twitter banner, to show off your spotify status. Banner updated every 5 minutes.

Spotify Twitter Banner Dynamic Twitter banner, to show off your spotify status. Banner updated every 5 minutes. Installation and Usage Install the dep

23 Jan 5, 2023

Unofficial Python library for using the Polish Wordnet (plWordNet / Słowosieć)

Polish Wordnet Python library Simple, easy-to-use and reasonably fast library for using the Słowosieć (also known as PlWordNet) - a lexico-semantic da

12 Dec 23, 2022

A Japanese tokenizer based on recurrent neural networks

Nagisa is a python module for Japanese word segmentation/POS-tagging. It is designed to be a simple and easy-to-use tool. This tool has the following

325 Jan 5, 2023

Crawl BookCorpus

These are scripts to reproduce BookCorpus by yourself.

590 Jan 3, 2023

⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).

BERT-of-Theseus Code for paper "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing". BERT-of-Theseus is a new compressed BERT by progre

284 Nov 25, 2022

Basic yet complete Machine Learning pipeline for NLP tasks

Basic yet complete Machine Learning pipeline for NLP tasks This repository accompanies the article on building basic yet complete ML pipelines for sol

20 Aug 22, 2022

Code for the project carried out fulfilling the course requirements for Fall 2021 NLP at NYU

Introduction Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization,

1 Apr 25, 2022

Twitter API with fastAPI

Twitter API with fastAPI Content Forms Cookies and headers management Files edition Status codes HTTPExceptions Docstrings or documentation Deprecate

1 Dec 21, 2021

Sentiment Analysis web app using Streamlit - American Airlines Tweets

Analyse des sentiments à partir des Tweets L'application est développée par Streamlit L'analyse sentimentale est effectuée sur l'ensemble de données d

2 Feb 4, 2022

Twitter feed of newly published articles in Limnology

limnopapers Code to monitor limnology RSS feeds and tweet new articles. Scope The keywords and journal choices herein aim to focus on limnology (the s

7 Dec 20, 2022

The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)

Language Models are Few-shot Multilingual Learners Paper This is the source code of the paper [Arxiv] [ACL Anthology]: This code has been written usin

45 Nov 21, 2022

Lightweight utility tools for the detection of multiple spellings, meanings, and language-specific terminology in British and American English

Breame ( British English and American English) Breame is a lightweight Python package with a number of utility tools to aid in the detection of words

8 Oct 10, 2022

Materials (slides, code, assignments) for the NYU class I teach on NLP and ML Systems (Master of Engineering).

FREE_7773 Repo containing material for the NYU class (Master of Engineering) I teach on NLP, ML Sys etc. For context on what the class is trying to ac

90 Dec 19, 2022

BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese

Table of contents Introduction Using BARTpho with fairseq Using BARTpho with transformers Notes BARTpho: Pre-trained Sequence-to-Sequence Models for V

58 Dec 23, 2022

Contains links to publicly available datasets for modeling health outcomes using speech and language.

speech-nlp-datasets Contains links to publicly available datasets for modeling various health outcomes using speech and language. Speech-based Corpora

77 Dec 7, 2022

This is a Python package to create a snowflake identifier similar to Discord's or Twitter's.

snowflake2 Based on falcondai and fenhl's Python snowflake tool, but with documentation and simliarities to Discord. Installation instructions Install

2 Mar 19, 2022

A PyTorch-based model pruning toolkit for pre-trained language models

English | 中文说明 TextPruner是一个为预训练语言模型设计的模型裁剪工具包，通过轻量、快速的裁剪方法对模型进行结构化剪枝，从而实现压缩模型体积、提升模型速度。其他相关资源：知识蒸馏工具TextBrewer：https://github.com/airaria/TextBrewe

231 Jan 8, 2023

Based on falcondai and fenhl's Python snowflake tool, but with documentation and simliarities to Discord.

python-snowflake-2 Based on falcondai and fenhl's Python snowflake tool, but with documentation and simliarities to Discord. Docs make_snowflake This

2 Mar 19, 2022

BERT, LDA, and TFIDF based keyword extraction in Python

BERT, LDA, and TFIDF based keyword extraction in Python kwx is a toolkit for multilingual keyword extraction based on Google's BERT and Latent Dirichl

41 Dec 27, 2022

BERT-based Financial Question Answering System

BERT-based Financial Question Answering System In this example, we use Jina, PyTorch, and Hugging Face transformers to build a production-ready BERT-b

61 Sep 18, 2022

Infinity: a Twitter retweet bot that can be used by anyone

INSTAMATE Requires Firefox Instapy Python3 How To Use? Fork the repository Add your credentials in the bot.py file Save commits Clone your fork cd int

3 Jun 23, 2022

Chinese NER with albert/electra or other bert descendable model (keras)

Chinese NLP (albert/electra with Keras) Named Entity Recognization Project Structure ./ ├── NER │ ├── __init__.py │ ├── log

2 Nov 20, 2022

Label data using HuggingFace's transformers and automatically get a prediction service

Label Studio for Hugging Face's Transformers Website • Docs • Twitter • Join Slack Community Transfer learning for NLP models by annotating your textu

135 Dec 29, 2022

Code for CodeT5: a new code-aware pre-trained encoder-decoder model.

CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation This is the official PyTorch implementation

564 Jan 8, 2023

The code for the Subformer, from the EMNLP 2021 Findings paper: "Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers", by Machel Reid, Edison Marrese-Taylor, and Yutaka Matsuo

Subformer This repository contains the code for the Subformer. To help overcome this we propose the Subformer, allowing us to retain performance while

10 Dec 27, 2022

Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downstream tasks like translation and summarisation.

PART 2: CHAIN LINKING AUDIO-TO-TEXT NLP TASKS 2A: TRANSCRIBE-TRANSLATE-SENTIMENT-ANALYSIS In notebook3.0, I demo a simple workflow to: transcribe a lo

30 Jul 13, 2022

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

English | 简体中文 | 繁體中文 | 한국어 State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained models

77.1k Dec 31, 2022

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Hiring We are hiring at all levels (including FTE researchers and interns)! If you are interested in working with us on NLP and large-scale pre-traine

7.8k Jan 9, 2023

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

ELECTRA Introduction ELECTRA is a method for self-supervised language representation learning. It can be used to pre-train transformer networks using

2.1k Dec 28, 2022

Awesome Treasure of Transformers Models Collection

💁 Awesome Treasure of Transformers Models for Natural Language processing contains papers, videos, blogs, official repo along with colab Notebooks. 🛫☑️

577 Jan 7, 2023

The projects lets you extract glossary words and their definitions from a given piece of text automatically using NLP techniques

Unsupervised technique to Glossary and Definition Extraction Code Files GPT2-DefinitionModel.ipynb - GPT-2 model for definition generation. Data_Gener

28 May 25, 2021

Powerful unsupervised domain adaptation method for dense retrieval.

Powerful unsupervised domain adaptation method for dense retrieval

191 Dec 28, 2022

Converts a Bangla numeric string to literal words.

Bangla Number in Words Converts a Bangla numeric string to literal words. Install $ pip install banglanum2words Usage

3 Aug 29, 2022

Text completion with Hugging Face and TensorFlow.js running on Node.js

Katana ML Text Completion 🤗 Description Runs with with Hugging Face DistilBERT and TensorFlow.js on Node.js distilbert-model - converter from Hugging

2 Nov 4, 2022

Using Bert as the backbone model for lime, designed for NLP task explanation (sentence pair text classification task)

Lime Comparing deep contextualized model for sentences highlighting task. In addition, take the classic explanation model "LIME" with bert-base model

2 Jan 18, 2022

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

AliceMind AliceMind: ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab This repository provides pre-trained encode

1.4k Jan 1, 2023

📖 GitHub action schedular (cron) that posts a Hadith every hour on Twitter & Facebook.

Hadith Every Hour 📖 A bot that posts a Hadith every hour on Twitter & Facebook (Every 3 hours for now to avoid spamming) Follow on Twitter @HadithEve

13 Dec 14, 2022

Idea is to build a model which will take keywords as inputs and generate sentences as outputs.

keytotext Idea is to build a model which will take keywords as inputs and generate sentences as outputs. Potential use case can include: Marketing Sea

364 Jan 3, 2023

A toolkit for document-level event extraction, containing some SOTA model implementations

Document-level Event Extraction via Heterogeneous Graph-based Interaction Model with a Tracker Source code for ACL-IJCNLP 2021 Long paper: Document-le

84 Dec 15, 2022

This repos is auto action which generating a wordcloud made by Twitter.

auto_tweet_wordcloud This repos is auto action which generating a wordcloud made by Twitter. Preconditions Install Python dependencies pip install -r

0 Apr 29, 2022

Leverage Twitter API v2 to analyze tweet metrics such as impressions and profile clicks over time.

Tweetmetric Tweetmetric allows you to track various metrics on your most recent tweets, such as impressions, retweets and clicks on your profile. The

29 Oct 18, 2022

RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into tables through jointly extracting intervention, outcome and outcome measure entities and their relations.

Randomised controlled trial abstract result tabulator RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into

2 Sep 16, 2022

Ethereum transactions and wallet information for people you follow on Twitter.

ethFollowing Ethereum transactions and wallet information for people you follow on Twitter. Set up Setup python environment (requires python 3.8): vir

2 Dec 28, 2021

NLP tool to extract emotional phrase from tweets 🤩

Emotional phrase extractor Extract phrase in the given text that is used to express the sentiment. Capturing sentiment in language is important in the

38 Oct 17, 2022

A bot that updates about the most subscribed artist' channels on YouTube

A bot that updates about the most subscribed artist' channels on YouTube. A weekly top chart report is provided every Monday. It posts updates on Twitter

5 Dec 14, 2022

Python API Client for Twitter API v2

🐍 Python Client For Twitter API v2 🚀 Why Twitter Stream ? Twitter-Stream.py a python API client for Twitter API v2 now supports FilteredStream, Samp

31 Nov 19, 2022

Detecting Potentially Harmful and Protective Suicide-related Content on Twitter

TwitterSuicideML Scripts for reproducing the Machine Learning analysis of the paper: Detecting Potentially Harmful and Protective Suicide-related Cont

3 Oct 17, 2022

A Phyton script working for stream twits from twitter by tweepy to mongoDB

twitter-to-mongo A python script that uses the Tweepy library to pull Tweets with specific keywords from Twitter's Streaming API, and then stores the

3 Feb 3, 2022

Continuously update some NLP practice based on different tasks.

NLP_practice We will continuously update some NLP practice based on different tasks. prerequisites Software pytorch = 1.10 torchtext = 0.11.0 sklear

0 Jan 5, 2022

State-of-the-art NLP through transformer models in a modular design and consistent APIs.

Trapper (Transformers wRAPPER) Trapper is an NLP library that aims to make it easier to train transformer based models on downstream tasks. It wraps h

42 Sep 21, 2022

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

AliceMind AliceMind: ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab This repository provides pre-trained encode

922 Dec 10, 2021

Twitter Eye is a Twitter Information Gathering Tool With Twitter Eye

Twitter Eye is a Twitter Information Gathering Tool With Twitter Eye, you can search with various keywords and usernames on Twitter.

19 Dec 12, 2022

This is a project of data parallel that running on NLP tasks.

2 Dec 12, 2021

Hashformers is a framework for hashtag segmentation with transformers.

Hashtag segmentation is the task of automatically inserting the missing spaces between the words in a hashtag. Hashformers applies Transformer models

41 Nov 9, 2022

Suite of 500 procedurally-generated NLP tasks to study language model adaptability

TaskBench500 The TaskBench500 dataset and code for generating tasks. Data The TaskBench dataset is available under wget http://web.mit.edu/bzl/www/Tas

20 May 17, 2022

Suite of 500 procedurally-generated NLP tasks to study language model adaptability

TaskBench500 The TaskBench500 dataset and code for generating tasks. Data The TaskBench dataset is available under wget http://web.mit.edu/bzl/www/Tas

20 May 17, 2022

This repository contains Python scripts for extracting linguistic features from Filipino texts.

Filipino Text Linguistic Feature Extractors This repository contains scripts for extracting linguistic features from Filipino texts. The scripts were

1 Oct 5, 2021

Chatbot in 200 lines of code using TensorLayer

Seq2Seq Chatbot This is a 200 lines implementation of Twitter/Cornell-Movie Chatbot, please read the following references before you read the code: Pr

820 Dec 17, 2022

End-To-End Memory Network using Tensorflow

MemN2N Implementation of End-To-End Memory Networks with sklearn-like interface using Tensorflow. Tasks are from the bAbl dataset. Get Started git clo

339 Oct 27, 2022

Tensorflow implementation of Character-Aware Neural Language Models.

Character-Aware Neural Language Models Tensorflow implementation of Character-Aware Neural Language Models. The original code of author can be found h

751 Dec 26, 2022

A design of MIDI language for music generation task, specifically for Natural Language Processing (NLP) models.

MIDI Language Introduction Reference Paper: Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions: code This

3 May 25, 2022

Course project of NLP@UCAS

NaiveMT Prepare Clone this repository git clone [email protected]:Poeroz/NaiveMT.git Install Please first install PyTorch = 1.5.0, then type the followi

2 Apr 24, 2022

Simple bots or Simbots is a library designed to create simple bots using the power of python. This library utilises Intent, Entity, Relation and Context model to create bots .

Simple bots or Simbots is a library designed to create simple chat bots using the power of python. This library utilises Intent, Entity, Relation and

14 Dec 15, 2021

Reading streams of Twitter data, save them to Kafka, then process with Kafka Stream API and Spark Streaming

Using Streaming Twitter Data with Kafka and Spark Reading streams of Twitter data, publishing them to Kafka topic, process message using Kafka Stream

1 Dec 6, 2021

A Chinese to English Neural Model Translation Project

ZH-EN NMT Chinese to English Neural Machine Translation This project is inspired by Stanford's CS224N NMT Project Dataset used in this project: News C

29 Nov 26, 2022

An assignment from my grad-level data mining course demonstrating some experience with NLP/neural networks/Pytorch

NLP-Pytorch-Assignment An assignment from my grad-level data mining course (before I started personal projects) demonstrating some experience with NLP

0 Feb 6, 2022

Deploying a Text Summarization NLP use case on Docker Container Utilizing Nvidia GPU

GPU Docker NLP Application Deployment Deploying a Text Summarization NLP use case on Docker Container Utilizing Nvidia GPU, to setup the enviroment on

9 Oct 14, 2022

Natural Language Processing Tasks and Examples.

Natural Language Processing Tasks and Examples With the advancement of A.I. technology in recent years, natural language processing technology has bee

53 Dec 20, 2022

An automated bot for twitter using Tweepy!

Tweeby An automated bot for twitter using Tweepy! About This bot will look for tweets that contain certain hashtags, if found. It'll send them a messa

1 Dec 6, 2021

Pipeline for training LSA models using Scikit-Learn.

Latent Semantic Analysis Pipeline for training LSA models using Scikit-Learn. Usage Instead of writing custom code for latent semantic analysis, you j

23 Sep 5, 2022

This is a project built for FALLABOUT2021 event under SRMMIC, This project deals with NLP poetry generation.

FALLABOUT-SRMMIC 21 POETRY-GENERATION HINGLISH DESCRIPTION We have developed a NLP(natural language processing) model which automatically generates a

7 Sep 28, 2021

Pre-Training with Whole Word Masking for Chinese BERT

7.7k Dec 31, 2022

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

English|简体中文 ERNIE是百度开创性提出的基于知识增强的持续学习语义理解框架，该框架将大数据预训练与多源丰富知识相结合，通过持续学习技术，不断吸收海量文本数据中词汇、结构、语义等方面的知识，实现模型效果不断进化。ERNIE在累积 40 余个典型 NLP 任务取得 SOTA 效果，并在 G

5.4k Jan 3, 2023

Revisiting Pre-trained Models for Chinese Natural Language Processing (Findings of EMNLP 2020)

This repository contains the resources in our paper "Revisiting Pre-trained Models for Chinese Natural Language Processing", which will be published i

463 Dec 30, 2022

SentAugment is a data augmentation technique for semi-supervised learning in NLP.

SentAugment SentAugment is a data augmentation technique for semi-supervised learning in NLP. It uses state-of-the-art sentence embeddings to structur

363 Dec 30, 2022

Sorce code and datasets for "K-BERT: Enabling Language Representation with Knowledge Graph",

K-BERT Sorce code and datasets for "K-BERT: Enabling Language Representation with Knowledge Graph", which is implemented based on the UER framework. R

834 Jan 9, 2023

Modified GPT using average pooling to reduce the softmax attention memory constraints.

NLP-GPT-Upsampling This repository contains an implementation of Open AI's GPT Model. In particular, this implementation takes inspiration from the Ny

1 Dec 3, 2021

Code for Editing Factual Knowledge in Language Models

KnowledgeEditor Code for Editing Factual Knowledge in Language Models (https://arxiv.org/abs/2104.08164). @inproceedings{decao2021editing, title={Ed

86 Nov 28, 2022

Code for the paper "Are Sixteen Heads Really Better than One?"

Are Sixteen Heads Really Better than One? This repository contains code to reproduce the experiments in our paper Are Sixteen Heads Really Better than

143 Dec 14, 2022

The official implementation of "BERT is to NLP what AlexNet is to CV: Can Pre-Trained Language Models Identify Analogies?, ACL 2021 main conference"

BERT is to NLP what AlexNet is to CV This is the official implementation of BERT is to NLP what AlexNet is to CV: Can Pre-Trained Language Models Iden

20 Nov 3, 2022

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

ELECTRA Introduction ELECTRA is a method for self-supervised language representation learning. It can be used to pre-train transformer networks using

2.1k Dec 28, 2022

Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021

efficient-task-transfer This repository contains code for the experiments in our paper "What to Pre-Train on? Efficient Intermediate Task Selection".

26 Dec 24, 2022

Adapter-BERT: Parameter-Efficient Transfer Learning for NLP.

340 Jan 3, 2023

This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference"

Pattern-Exploiting Training (PET) This repository contains the code for Exploiting Cloze Questions for Few-Shot Text Classification and Natural Langua

1.4k Dec 30, 2022

EasyTransfer is designed to make the development of transfer learning in NLP applications easier.

EasyTransfer is designed to make the development of transfer learning in NLP applications easier. The literature has witnessed the success of applying

819 Jan 3, 2023

Dataset for the Research2Clinics @ NeurIPS 2021 Paper: What Do You See in this Patient? Behavioral Testing of Clinical NLP Models

Behavioral Testing of Clinical NLP Models This repository contains code for testing the behavior of clinical prediction models based on patient letter

2 Sep 20, 2022

Behavioral Testing of Clinical NLP Models

Behavioral Testing of Clinical NLP Models This repository contains code for testing the behavior of clinical prediction models based on patient letter

2 Sep 20, 2022

Source Code and data for my paper titled Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chinese Question Matching

Description The source code and data for my paper titled Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chin

3 Jun 28, 2022

Python Twitter-nlp Resources

Python twitter-nlp Libraries

Simple multilingual lemmatizer for Python, especially useful for speed and efficiency

Faster, modernized fork of the language identification tool langid.py

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.

Quick insights from Zoom meeting transcripts using Graph + NLP

Dynamic Twitter banner, to show off your spotify status. Banner updated every 5 minutes.

Unofficial Python library for using the Polish Wordnet (plWordNet / Słowosieć)

A Japanese tokenizer based on recurrent neural networks

Crawl BookCorpus

⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).

Basic yet complete Machine Learning pipeline for NLP tasks

Code for the project carried out fulfilling the course requirements for Fall 2021 NLP at NYU

Twitter API with fastAPI

Sentiment Analysis web app using Streamlit - American Airlines Tweets

Twitter feed of newly published articles in Limnology

The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)

Lightweight utility tools for the detection of multiple spellings, meanings, and language-specific terminology in British and American English

Materials (slides, code, assignments) for the NYU class I teach on NLP and ML Systems (Master of Engineering).

BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese

Contains links to publicly available datasets for modeling health outcomes using speech and language.

This is a Python package to create a snowflake identifier similar to Discord's or Twitter's.

A PyTorch-based model pruning toolkit for pre-trained language models

Based on falcondai and fenhl's Python snowflake tool, but with documentation and simliarities to Discord.

BERT, LDA, and TFIDF based keyword extraction in Python

BERT-based Financial Question Answering System

Infinity: a Twitter retweet bot that can be used by anyone

Chinese NER with albert/electra or other bert descendable model (keras)

Label data using HuggingFace's transformers and automatically get a prediction service

Code for CodeT5: a new code-aware pre-trained encoder-decoder model.

The code for the Subformer, from the EMNLP 2021 Findings paper: "Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers", by Machel Reid, Edison Marrese-Taylor, and Yutaka Matsuo

Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downstream tasks like translation and summarisation.

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

Awesome Treasure of Transformers Models Collection

The projects lets you extract glossary words and their definitions from a given piece of text automatically using NLP techniques

Powerful unsupervised domain adaptation method for dense retrieval.

Converts a Bangla numeric string to literal words.

Text completion with Hugging Face and TensorFlow.js running on Node.js

Using Bert as the backbone model for lime, designed for NLP task explanation (sentence pair text classification task)

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

📖 GitHub action schedular (cron) that posts a Hadith every hour on Twitter & Facebook.

Idea is to build a model which will take keywords as inputs and generate sentences as outputs.

A toolkit for document-level event extraction, containing some SOTA model implementations

This repos is auto action which generating a wordcloud made by Twitter.

Leverage Twitter API v2 to analyze tweet metrics such as impressions and profile clicks over time.

RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into tables through jointly extracting intervention, outcome and outcome measure entities and their relations.

Ethereum transactions and wallet information for people you follow on Twitter.

NLP tool to extract emotional phrase from tweets 🤩

A bot that updates about the most subscribed artist' channels on YouTube

Python API Client for Twitter API v2

Detecting Potentially Harmful and Protective Suicide-related Content on Twitter

A Phyton script working for stream twits from twitter by tweepy to mongoDB

Continuously update some NLP practice based on different tasks.

State-of-the-art NLP through transformer models in a modular design and consistent APIs.

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

Twitter Eye is a Twitter Information Gathering Tool With Twitter Eye

This is a project of data parallel that running on NLP tasks.

Hashformers is a framework for hashtag segmentation with transformers.

Suite of 500 procedurally-generated NLP tasks to study language model adaptability

Suite of 500 procedurally-generated NLP tasks to study language model adaptability

This repository contains Python scripts for extracting linguistic features from Filipino texts.

Chatbot in 200 lines of code using TensorLayer

End-To-End Memory Network using Tensorflow

Tensorflow implementation of Character-Aware Neural Language Models.

A design of MIDI language for music generation task, specifically for Natural Language Processing (NLP) models.

Course project of NLP@UCAS

Simple bots or Simbots is a library designed to create simple bots using the power of python. This library utilises Intent, Entity, Relation and Context model to create bots .

Reading streams of Twitter data, save them to Kafka, then process with Kafka Stream API and Spark Streaming

A Chinese to English Neural Model Translation Project

An assignment from my grad-level data mining course demonstrating some experience with NLP/neural networks/Pytorch

Deploying a Text Summarization NLP use case on Docker Container Utilizing Nvidia GPU

Natural Language Processing Tasks and Examples.

An automated bot for twitter using Tweepy!

Pipeline for training LSA models using Scikit-Learn.

This is a project built for FALLABOUT2021 event under SRMMIC, This project deals with NLP poetry generation.

Pre-Training with Whole Word Masking for Chinese BERT

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

Revisiting Pre-trained Models for Chinese Natural Language Processing (Findings of EMNLP 2020)