TDmatch is a Python library developed to perform matching tasks in three categories:

Overview

TDmatch

TDmatch is a Python library developed to perform matching tasks in three categories:

  • Text to Data which matches tuples of a table to text docuemts
  • Text to Structured text matches hierarchical taxonomy concepts to text docuemtns
  • Text to Text matches two copora of text documents

Folder notebooks contains notebooks for running different scenarios.

First, the model creates a graph from document copora, next it trains a word embedding model on random walks generated by tracersing the graph and fainally, by employing the generated model we can match metadata between two corpora.

We used 5 datasets in testing different tasks:

  • Two fact checking datasets: Politifact and Snopes which we use for Text to Text matching. These datasets are presented in That-is-a-Known-Lie. We also used STS dataset from GLUE as a text-to-text matching dataset.
  • Two datasets for Text to Data matching: IMDB which is created form IMDB top 1000 movies of all time. CoronaCheck dataset is presented in Scrutinizer

How to run

Use the notebook for the required task to generate the results for the required dataset.

All the notebooks have the similar structure:

  1. Creating the gaph
  2. (optional) Expanding the graph with external sources
  3. (optional) Compressing the graph with MSP
  4. Generating random walks on the graph and training Word embedding model on random walks.
  5. Matching metadata nodes with model and printing the results.

Using SSuM compression

  • First install the library following instructions Here
  • Use the code in SSuM block to generate input
  • Generate the compressed graph: ./run.sh input_path compression_ratio reconstruction_error

Expanding with ConceptNet

You might also like...
Bayesian-Torch is a library of neural network layers and utilities extending the core of PyTorch to enable the user to perform stochastic variational inference in Bayesian deep neural networks

Bayesian-Torch is a library of neural network layers and utilities extending the core of PyTorch to enable the user to perform stochastic variational inference in Bayesian deep neural networks. Bayesian-Torch is designed to be flexible and seamless in extending a deterministic deep neural network architecture to corresponding Bayesian form by simply replacing the deterministic layers with Bayesian layers.

This library provides an abstraction to perform Model Versioning using Weight & Biases.
This library provides an abstraction to perform Model Versioning using Weight & Biases.

Description This library provides an abstraction to perform Model Versioning using Weight & Biases. Features Version a new trained model Promote a mod

Dense matching library based on PyTorch
Dense matching library based on PyTorch

Dense Matching A general dense matching library based on PyTorch. For any questions, issues or recommendations, please contact Prune at prune.truong@v

Torchserve server using a YoloV5 model running on docker with GPU and static batch inference to perform production ready inference.
Torchserve server using a YoloV5 model running on docker with GPU and static batch inference to perform production ready inference.

Yolov5 running on TorchServe (GPU compatible) ! This is a dockerfile to run TorchServe for Yolo v5 object detection model. (TorchServe (PyTorch librar

This is the official implementation for "Do Transformers Really Perform Bad for Graph Representation?".

Graphormer By Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng*, Guolin Ke, Di He*, Yanming Shen and Tie-Yan Liu. This repo is the official impl

Lightweight tool to perform MITM attack on local network

ARPSpy - A lightweight tool to perform MITM attack Using many library to perform ARP Spoof and auto-sniffing HTTP packet containing credential. (Never

This is the official implementation for
This is the official implementation for "Do Transformers Really Perform Bad for Graph Representation?".

Graphormer By Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng*, Guolin Ke, Di He*, Yanming Shen and Tie-Yan Liu. This repo is the official impl

This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-clustering.

EEND-vector clustering The EEND-vector clustering (End-to-End-Neural-Diarization-vector clustering) is a speaker diarization framework that integrates

Owner
Naser Ahmadi
Naser Ahmadi
I created My own Virtual Artificial Intelligence named genesis, He can assist with my Tasks and also perform some analysis,,

Virtual-Artificial-Intelligence-genesis- I created My own Virtual Artificial Intelligence named genesis, He can assist with my Tasks and also perform

AKASH M 1 Nov 5, 2021
This project uses ViT to perform image classification tasks on DATA set CIFAR10.

Vision-Transformer-Multiprocess-DistributedDataParallel-Apex Introduction This project uses ViT to perform image classification tasks on DATA set CIFA

Kaicheng Yang 3 Jun 3, 2022
A Python implementation of the Locality Preserving Matching (LPM) method for pruning outliers in image matching.

LPM_Python A Python implementation of the Locality Preserving Matching (LPM) method for pruning outliers in image matching. The code is established ac

AoxiangFan 11 Nov 7, 2022
Deep Learning Visuals contains 215 unique images divided in 23 categories

Deep Learning Visuals contains 215 unique images divided in 23 categories (some images may appear in more than one category). All the images were originally published in my book "Deep Learning with PyTorch Step-by-Step: A Beginner's Guide".

Daniel Voigt Godoy 1.3k Dec 28, 2022
Employs neural networks to classify images into four categories: ship, automobile, dog or frog

Neural Net Image Classifier Employs neural networks to classify images into four categories: ship, automobile, dog or frog Viterbi_1.py uses a classic

Riley Baker 1 Jan 18, 2022
Code for C2-Matching (CVPR2021). Paper: Robust Reference-based Super-Resolution via C2-Matching.

C2-Matching (CVPR2021) This repository contains the implementation of the following paper: Robust Reference-based Super-Resolution via C2-Matching Yum

Yuming Jiang 151 Dec 26, 2022
Vehicle direction identification consists of three module detection , tracking and direction recognization.

Vehicle-direction-identification Vehicle direction identification consists of three module detection , tracking and direction recognization. Algorithm

null 5 Nov 15, 2022
Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It can use GPUs and perform efficient symbolic differentiation.

============================================================================================================ `MILA will stop developing Theano <https:

null 9.6k Dec 31, 2022
Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It can use GPUs and perform efficient symbolic differentiation.

============================================================================================================ `MILA will stop developing Theano <https:

null 9.6k Jan 6, 2023
Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It can use GPUs and perform efficient symbolic differentiation.

============================================================================================================ `MILA will stop developing Theano <https:

null 9.3k Feb 12, 2021