Machine learning models from Singapore's NLP research community

AI Singapore | AI Makerspace

Last update: Dec 17, 2022

Related tags

Text Data & NLP sgnlp

Overview

SG-NLP

Machine learning models from Singapore's natural language processing (NLP) research community.

sgnlp is a Python package that lets you get started quickly with various NLP models implemented using the PyTorch and Transformers frameworks.

We have an accompanying demo site where you can interact with our models and get a better understanding of how they work.

Installation

Python >= 3.8

pip install sgnlp

Documentation

Visit our documentation for tutorials.

License

Code and models from this project are released under the MIT License unless otherwise stated. If a model's code is under a separate license, it can be found in the respective model's folder.

Comments

Change demo api to use gevent worker
Using multiple workers of the default type 'sync' in gunicorn is not working on Kubernetes

Workers constantly terminated due to signal 9

Try gevent to see if it works out
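
A minimal sketch of what this change could look like, assuming the demo API is started with a gunicorn config file (gunicorn config files are plain Python). The file name, worker count, bind address and timeout are illustrative assumptions and are not taken from the project; the gevent worker class also requires the gevent package to be installed.

```python
# gunicorn.conf.py (hypothetical file; all values are illustrative assumptions)

# Use gevent's cooperative workers instead of the default "sync" class.
worker_class = "gevent"

# Number of worker processes; the issue does not state the real count.
workers = 2

# Address to listen on inside the container (assumed).
bind = "0.0.0.0:8000"

# Seconds a worker may stay silent before the gunicorn master restarts it.
timeout = 120
```

The same switch can be made on the command line with gunicorn's `--worker-class gevent` option.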
opened by jonheng 2
UFD use case tutorial and usability improvement
Added additional tutorial on how to use UFD to train and evaluate on custom dataset

Bug fix for UFD parse_args_and_load_config util function

Added feature to create folder if folder doesn't exist

Added some train args param in eval args param to improve usability

Made caching optional

Added validation to make debugging easier

Added links to config file examples for reccon models
opened by vincenttzc 1
Wrong assert comparison for SenticGCN dataclass
This concerns the latest SenticGCN implementation on the dev branch. In dataclass.py, the __post_init__ method of SenticGCNTrainArgs contains the following assertions:

assert self.repeats > 1, "Repeats value must be at least 1."
assert self.patience > 1, "Patience value must be at least 1."

The comparison operator should be >= instead.
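
The proposed fix can be sketched as follows. `TrainArgsSketch` and `is_valid` are hypothetical names used only for illustration; the real SenticGCNTrainArgs has more fields, and the default values here are assumptions.

```python
from dataclasses import dataclass


@dataclass
class TrainArgsSketch:
    """Hypothetical stand-in for SenticGCNTrainArgs with only two fields."""

    repeats: int = 1
    patience: int = 1

    def __post_init__(self):
        # ">=" lets a value of exactly 1 pass, which matches the messages.
        assert self.repeats >= 1, "Repeats value must be at least 1."
        assert self.patience >= 1, "Patience value must be at least 1."


def is_valid(repeats: int, patience: int) -> bool:
    """Return True when the arguments pass the __post_init__ checks."""
    try:
        TrainArgsSketch(repeats=repeats, patience=patience)
    except AssertionError:
        return False
    return True
```

With the original `>` comparison, `is_valid(1, 1)` would be False even though both messages say a value of 1 is allowed.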
bug
opened by raymondng76 0
47 centralized logging
Create a centralized 'sgnlp' base logger

The 'sgnlp' logger is created from a JSON config and is initialized in the 'sgnlp' module's __init__.py

Replace all direct logging method calls with a script-specific logger
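
A minimal sketch of this pattern using only the standard library. The config contents, the inline JSON string (the pull request reads it from a file) and the child logger name `sgnlp.models.example` are assumptions made for illustration.

```python
import json
import logging
import logging.config

# Hypothetical config; in the pull request it is read from a JSON file.
LOGGING_CONFIG_JSON = """
{
  "version": 1,
  "disable_existing_loggers": false,
  "formatters": {
    "simple": {"format": "%(asctime)s %(name)s %(levelname)s %(message)s"}
  },
  "handlers": {
    "console": {"class": "logging.StreamHandler", "formatter": "simple"}
  },
  "loggers": {
    "sgnlp": {"level": "INFO", "handlers": ["console"], "propagate": false}
  }
}
"""

# Would run once, for example in the package's __init__.py.
logging.config.dictConfig(json.loads(LOGGING_CONFIG_JSON))

# Each script then requests its own child logger (the dotted name is assumed).
logger = logging.getLogger("sgnlp.models.example")

# The child has no level or handlers of its own: it uses the effective level
# of 'sgnlp', and its records propagate up to the 'sgnlp' handlers.
logger.info("message from a script-specific logger")
```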
opened by raymondng76 0
Add parent class for preprocessor
[x] Create a module named sgnlp.base

[x] Add abstractmethods for preprocess, save, load

[x] Add batch iteration to parent __call__

[x] Parent __call__ should return a dictionary
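
The checklist above can be sketched as an abstract base class. This is not the code in sgnlp.base: the class names, method signatures and the `batch_size` parameter are assumptions chosen to illustrate the four items.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator, List, Sequence


class PreprocessorBase(ABC):
    """Hypothetical parent class; names and signatures are assumptions."""

    def __init__(self, batch_size: int = 2):
        self.batch_size = batch_size

    @abstractmethod
    def preprocess(self, batch: Sequence[Any]) -> Dict[str, List[Any]]:
        """Turn one batch of raw inputs into a dict of feature lists."""

    @abstractmethod
    def save(self, path: str) -> None:
        """Persist any state the preprocessor needs."""

    @classmethod
    @abstractmethod
    def load(cls, path: str) -> "PreprocessorBase":
        """Recreate a preprocessor from saved state."""

    def _batches(self, data: Sequence[Any]) -> Iterator[Sequence[Any]]:
        # Yield consecutive slices of at most batch_size items.
        for start in range(0, len(data), self.batch_size):
            yield data[start:start + self.batch_size]

    def __call__(self, data: Sequence[Any]) -> Dict[str, List[Any]]:
        # Iterate over batches and merge the per-batch dicts into one dict.
        merged: Dict[str, List[Any]] = {}
        for batch in self._batches(data):
            for key, values in self.preprocess(batch).items():
                merged.setdefault(key, []).extend(values)
        return merged


class WhitespaceTokenizer(PreprocessorBase):
    """Toy subclass used only to exercise the parent class."""

    def preprocess(self, batch):
        return {"tokens": [text.split() for text in batch]}

    def save(self, path):
        pass  # nothing to persist in this toy example

    @classmethod
    def load(cls, path):
        return cls()
```

Calling `WhitespaceTokenizer(batch_size=2)(["a b", "c", "d e f"])` processes two batches and returns a single dictionary keyed by feature name.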

enhancement
opened by jonheng 0
46 senticgcn bugfix
Add multi-word aspect support

Update documentation to reflect multi-word support

Update unit tests

Update usage example to include multi-word support
opened by raymondng76 0
Fix multi-word aspect issue with Sentic-GCN preprocessor

The current implementation of the preprocessor matches a single aspect index when matching postprocessor output. The aspect index field of the process_input payload should be expanded to handle aspects that span multiple indexes.
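
To illustrate what "multiple indexes" means here, the sketch below locates every token position covered by an aspect. It is not the Sentic-GCN preprocessor: `find_aspect_indices` is a hypothetical helper, and whitespace tokenization with exact matching is an assumption.

```python
from typing import List


def find_aspect_indices(tokens: List[str], aspect: str) -> List[List[int]]:
    """Return the token indices of every occurrence of `aspect` in `tokens`.

    A single-word aspect yields one index per match; a multi-word aspect
    yields one index per word, which is the case a single index field
    cannot represent.
    """
    aspect_tokens = aspect.split()
    width = len(aspect_tokens)
    if width == 0:
        return []
    matches = []
    for start in range(len(tokens) - width + 1):
        if tokens[start:start + width] == aspect_tokens:
            matches.append(list(range(start, start + width)))
    return matches
```

For the tokens of "the battery life is great", the aspect "battery life" covers two positions, so a field that stores a single index cannot describe it.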
bug

opened by raymondng76 0
Add Sentic-GCN demo_api to SGNlp
Close #43

This pull request adds Sentic-GCN demo_api models to sgnlp. It includes the following components:

model_card

api.py

dockerfiles

requirements.txt

usage.py
opened by K-WeiMing 0
Add Sentic-GCN to SGNlp
close #41

This pull request adds Sentic-GCN models to sgnlp. It includes the following components:

Models

Configs

Tokenizers

Embedding models

Trainer/Evaluator

Unit test

documentation

This does not include demo_api, as it is covered in a separate issue.
opened by raymondng76 0
download_pretrained for demo API does not cache downloaded files/models
To allow the containers to start up quicker, models and files were downloaded and cached during build time.

Recent changes in the huggingface transformers package have broken this functionality:

Released in v4.22.0

Issue

Possible choices moving forward:

Write a simple caching utility function

Stick to versions of transformers before 4.22.0
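
The first option could look roughly like the sketch below. It is an assumption-laden illustration, not the project's download_pretrained: `cached_download`, its parameters and the cache file naming scheme are all hypothetical, and the fetch step is injectable so the caching logic can be exercised without network access.

```python
import hashlib
import urllib.request
from pathlib import Path
from typing import Callable, Union


def _download(url: str, destination: Path) -> None:
    """Default fetcher: save the contents of `url` to `destination`."""
    urllib.request.urlretrieve(url, str(destination))


def cached_download(
    url: str,
    cache_dir: Union[str, Path],
    fetch: Callable[[str, Path], None] = _download,
) -> Path:
    """Return a local path for `url`, downloading only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Prefix with a hash of the URL so equal file names do not collide.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    name = url.rstrip("/").rsplit("/", 1)[-1]
    target = cache_dir / f"{digest}-{name}"

    if not target.exists():
        # Write to a temporary name first so an interrupted download
        # is never mistaken for a complete cached file.
        partial = target.with_name(target.name + ".part")
        fetch(url, partial)
        partial.replace(target)
    return target
```

Run at image build time, a helper like this would populate the cache directory once; at container start-up the same call would return the existing file without downloading again.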
opened by jonheng 0
Add Stance Detection model

Paper: https://aclanthology.org/2020.emnlp-main.108.pdf

Prof: Jiang Jing from SMU

Repo: GitHub - jefferyYu/DualHierarchicalTransformer: Predicting Stance and Rumor Veracity via Dual Hierarchical Transformer

opened by atenzer 0

Releases(v0.4.0)

v0.4.0(Oct 7, 2022)

New model: Coherence Momentum Model
v0.3.0(Apr 22, 2022)
New models:

Sentic GCN

LIF

UFD

v0.2.0(Oct 19, 2021)
New models:

RST Pointer

GEC

v0.1.1(Aug 26, 2021)

Bug fix on rumour detection module paths
v0.1.0(Aug 26, 2021)

Removed UFD for further review.

Refactoring and improvements to LSR and Rumour detection models.
v0.0.1(Aug 5, 2021)
Initial release of sgnlp.

Models included:

RECCON

LSR

UFD

Rumour detection twitter


Owner

AI Singapore | AI Makerspace

Growing local AI talent and empowering start-ups, SMEs and enterprises with AI components, frameworks, platforms and advisory services.


A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

MMF is a modular framework for vision and language multimodal research from Facebook AI Research. MMF contains reference implementations of state-of-t

5.1k Dec 26, 2022

An open-source NLP research library, built on PyTorch.

An Apache 2.0 NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks. Quic

11.4k Jan 1, 2023

An open-source NLP research library, built on PyTorch.

An Apache 2.0 NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks. Quic

9.7k Feb 18, 2021

Grading tools for Advanced NLP (11-711)

Grading tools for Advanced NLP (11-711) Installation You'll need docker and unzip to use this repo. For docker, visit the official guide to get starte

2 Sep 27, 2022

Text-Summarization-using-NLP - Text Summarization using NLP to fetch BBC News Article and summarize its text and also it includes custom article Summarization

Text-Summarization-using-NLP Text Summarization using NLP to fetch BBC News Arti

21 Aug 6, 2022

NLP, Machine learning

Netflix-recommendation-system NLP, Machine learning About Recommendation algorithms are at the core of the Netflix product. It provides their members

6 Jan 12, 2022

NLP - Machine learning

Flipkart-product-reviews NLP - Machine learning About Product reviews is an essential part of an online store like Flipkart’s branding and marketing.

1 Oct 29, 2021

🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

(Framework for Adapting Representation Models) What is it? FARM makes Transfer Learning with BERT & Co simple, fast and enterprise-ready. It's built u

1.6k Dec 27, 2022

🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

(Framework for Adapting Representation Models) What is it? FARM makes Transfer Learning with BERT & Co simple, fast and enterprise-ready. It's built u

1.1k Feb 14, 2021

Prompt-learning is the latest paradigm to adapt pre-trained language models (PLMs) to downstream NLP tasks

Prompt-learning is the latest paradigm to adapt pre-trained language models (PLMs) to downstream NLP tasks, which modifies the input text with a textual template and directly uses PLMs to conduct pre-trained tasks. This library provides a standard, flexible and extensible framework to deploy the prompt-learning pipeline. OpenPrompt supports loading PLMs directly from huggingface transformers. In the future, we will also support PLMs implemented by other libraries.

2.3k Jan 8, 2023

Neural-Machine-Translation - Implementation of revolutionary machine translation models

Neural Machine Translation Framework: PyTorch Repository containing my implementa

1 Feb 17, 2022

Ongoing research training transformer language models at scale, including: BERT & GPT-2

What is this fork of Megatron-LM and Megatron-DeepSpeed This is a detached fork of https://github.com/microsoft/Megatron-DeepSpeed, which in itself is

316 Jan 3, 2023

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Megatron (1 and 2) is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA.

3.5k Dec 30, 2022

Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller`` which allows you to build, view, manipulate and query pattern models.

Colibri Core by Maarten van Gompel, [email protected], Radboud University Nijmegen Licensed under GPLv3 (See http://www.gnu.org/licenses/gpl-3.0.html

122 Nov 17, 2022

Super easy library for BERT based NLP models

Fast-Bert New - Learning Rate Finder for Text Classification Training (borrowed with thanks from https://github.com/davidtvs/pytorch-lr-finder) Suppor

1.8k Dec 27, 2022

Super easy library for BERT based NLP models

Fast-Bert New - Learning Rate Finder for Text Classification Training (borrowed with thanks from https://github.com/davidtvs/pytorch-lr-finder) Suppor

1.5k Feb 18, 2021

🤗 The largest hub of ready-to-use NLP datasets for ML models with fast, easy-to-use and efficient data manipulation tools

🤗 The largest hub of ready-to-use NLP datasets for ML models with fast, easy-to-use and efficient data manipulation tools

15k Jan 2, 2023

Interpretable Models for NLP using PyTorch

This repo is deprecated. Please find the updated package here. https://github.com/EdGENetworks/anuvada Anuvada: Interpretable Models for NLP using PyT

19 Dec 17, 2022
