Code for a paper about leveraging discourse markers for training new models


Overview

TSLM-DISCOURSE-MARKERS

Scope

This repository contains:

(1) Code to extract discourse markers from Wikipedia (TSA).

(2) Code to extract significant discourse markers from predictions over a sample.
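To make the first item concrete, here is a minimal sketch of one way sentence-initial discourse markers could be pulled out of text. It assumes a marker is a known word at the start of a sentence followed by a comma, as in the paper title. The marker list and the function names (`split_marker`, `extract_marked_sentences`) are hypothetical and are not taken from this repository's code.

```python
import re

# Hypothetical marker list for illustration; the repository's own list may differ.
DISCOURSE_MARKERS = ("fortunately", "unfortunately", "sadly", "luckily", "however")

# Assumption: a sentence qualifies when it opens with a known marker followed by a comma.
_MARKER_RE = re.compile(
    r"^\s*(" + "|".join(map(re.escape, DISCOURSE_MARKERS)) + r")\s*,\s*(.+)$",
    re.IGNORECASE | re.DOTALL,
)


def split_marker(sentence):
    """Return (marker, remainder) if the sentence starts with a marker, else None."""
    match = _MARKER_RE.match(sentence)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2).strip()


def extract_marked_sentences(sentences):
    """Collect (marker, remainder) pairs from an iterable of sentences."""
    pairs = []
    for sentence in sentences:
        result = split_marker(sentence)
        if result is not None:
            pairs.append(result)
    return pairs
```

Each returned pair could then serve as a weakly labeled example, with the marker as the label and the remainder as the input text.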

Usage

Evaluation code:

Installation

Using pip:

pip install git+ssh://git@github.com/IBM/tslm-discourse-markers.git#egg=tslm-discourse-markers

Alternatively, you can first clone the code, and install the requirements:

1. git clone git@github.com:IBM/tslm-discourse-markers.git
2. cd tslm-discourse-markers
3. pip install -r requirements.txt

You also need to download the fastText language identification model:

curl https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin -o ~/Downloads/lid.176.bin

and the spaCy English model:

python -m spacy download en_core_web_sm
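Once both models are downloaded, they might be loaded and used roughly as follows. This is a sketch under assumptions: the path matches the `curl` destination above, and the helper names (`label_to_language`, `is_english`, `load_models`) and the 0.5 threshold are illustrative, not part of this repository. The only library calls used are `fasttext.load_model`, the fastText model's `predict` method, and `spacy.load`.

```python
from pathlib import Path

# Assumed location: the destination used in the curl command.
LID_MODEL_PATH = Path.home() / "Downloads" / "lid.176.bin"

_LABEL_PREFIX = "__label__"


def label_to_language(label):
    """Turn a fastText label such as '__label__en' into the language code 'en'."""
    if label.startswith(_LABEL_PREFIX):
        return label[len(_LABEL_PREFIX):]
    return label


def is_english(text, lid_model, threshold=0.5):
    """Return True if the top predicted language is English with enough confidence.

    fastText predicts one line at a time, so newlines are replaced with spaces.
    """
    labels, probabilities = lid_model.predict(text.replace("\n", " "))
    return label_to_language(labels[0]) == "en" and float(probabilities[0]) >= threshold


def load_models():
    """Load the fastText language-ID model and the spaCy English pipeline."""
    # Imported lazily so the helpers above work without these packages installed.
    import fasttext
    import spacy

    lid_model = fasttext.load_model(str(LID_MODEL_PATH))
    nlp = spacy.load("en_core_web_sm")
    return lid_model, nlp
```

A typical call would be `lid_model, nlp = load_models()` followed by `is_english(some_text, lid_model)` before passing the text to `nlp` for sentence splitting.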

Running

Citing tslm-discourse-markers

If you are using tslm-discourse-markers in a publication, please cite the following paper:

Liat Ein-Dor, Ilya Shnayderman, Artem Spector, Lena Dankin, Ranit Aharonov and Noam Slonim. 2022. Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis. AAAI-2022.

Model

The SenDM model can be found at: https://huggingface.co/ibm/tslm-discourse-markers

Loading dataset

import datasets

# Map each split name to the compressed CSV shards stored in its folder.
directory = 'dataset/WIKI_ENGLISH'
dataset = datasets.load_dataset(
    'csv',
    data_files={folder: [f'{directory}/{folder}/{folder}_*.csv.gz'] for folder in ['train', 'dev', 'test']},
)
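The loading call relies on one glob pattern per split, so a missing or misnamed folder only surfaces when `load_dataset` fails. The sketch below checks the patterns first. The helper names (`build_data_files`, `missing_splits`, `load_splits`) are hypothetical; the directory layout is the one implied by the snippet above.

```python
import glob

SPLITS = ("train", "dev", "test")


def build_data_files(directory):
    """Build the split -> glob pattern mapping used by the loading snippet."""
    return {split: [f"{directory}/{split}/{split}_*.csv.gz"] for split in SPLITS}


def missing_splits(directory):
    """Return the splits whose glob patterns match no file on disk."""
    missing = []
    for split, patterns in build_data_files(directory).items():
        if not any(glob.glob(pattern) for pattern in patterns):
            missing.append(split)
    return missing


def load_splits(directory="dataset/WIKI_ENGLISH"):
    """Load all splits, failing early with a clear message if files are absent."""
    # Imported lazily so the path helpers work without the datasets package.
    import datasets

    missing = missing_splits(directory)
    if missing:
        raise FileNotFoundError(f"No CSV shards found for splits: {missing}")
    return datasets.load_dataset("csv", data_files=build_data_files(directory))
```

The object returned by `load_splits()` is indexed by split name, for example `load_splits()["train"]`.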

Contributing

This project welcomes external contributions. If you would like to contribute, please see further instructions here.

Pull requests are very welcome! Make sure your patches are well tested. Ideally create a topic branch for every separate change you make. For example:

Fork the repo
Create your feature branch (git checkout -b my-new-feature)
Commit your changes (git commit -am 'Added some feature')
Push to the branch (git push origin my-new-feature)
Create new Pull Request

Changelog

Major changes are documented here.

Notes

If you have any questions or issues you can create a new issue here.

License

This code is distributed under Apache License 2.0. If you would like to see the detailed LICENSE click here.

Authors

The YASO dataset was collected by Liat Ein-Dor, Ilya Shnayderman, Artem Spector, Lena Dankin, Ranit Aharonov and Noam Slonim.

The code was written by Ilya Shnayderman.

Owner

International Business Machines
