Company clustering with K-means/GMM and visualization with PCA, t-SNE, using SSAN relation extraction

Jieun Han

Last update: Oct 6, 2022


Overview

Graph visualization of relation extraction (RE) results, and company clustering

Installation

pip install -r requirements.txt
python -m nltk.downloader stopwords
python3.7 main.py

1. Paragraph-Level Relation Extraction using rule-based patterns and SSAN

|- df4rule.py

Prerequisite
- You need CSV files generated with finiancial_news_api.
- Those files should be located in "visualization_code/rule_base_datasets/*.csv".

This code extracts relations with rule-based patterns.
- (S + V + O) -> (head: S, relation: V, tail: O)
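
The (S + V + O) mapping above can be illustrated with a minimal sketch. The actual patterns in df4rule.py are not shown here, so the verb list, the function name `extract_triple`, and the example sentence are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical verb list, for illustration only; df4rule.py may use other patterns.
RELATION_VERBS = ("acquires", "acquired", "partners with", "invests in")


def extract_triple(sentence: str) -> Optional[dict]:
    """Map the first 'S V O' match to a head/relation/tail dict, else return None."""
    text = sentence.strip()
    for verb in RELATION_VERBS:
        # Subject = text before the verb, object = text after it (minus final punctuation).
        match = re.search(rf"^(.+?)\s+{re.escape(verb)}\s+(.+?)[.!?]?$", text)
        if match:
            return {"head": match.group(1), "relation": verb, "tail": match.group(2)}
    return None


print(extract_triple("Acme acquired Globex."))
```

A real rule set would likely rely on part-of-speech tags or a dependency parse rather than a fixed verb list; the sketch only shows how a matched pattern becomes a triple.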

|- df4ssan.py

Prerequisite
- We recommend running SSAN independently and making sure every relation extraction JSON file from the SSAN code is saved as "output/*/SSAN_result_all_relation.json".

This code converts each JSON file to a dataframe and concatenates the dataframes from the various companies.
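
The conversion step might look roughly like the sketch below. The JSON layout is an assumption (a list of objects with "head", "relation", and "tail" keys), as is using the directory name as the company identifier; `load_ssan_results` is a hypothetical name, not necessarily what df4ssan.py defines.

```python
import glob
import json
import os

import pandas as pd


def load_ssan_results(pattern: str = "output/*/SSAN_result_all_relation.json") -> pd.DataFrame:
    """Read every matching JSON file and concatenate the rows into one dataframe."""
    frames = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as f:
            records = json.load(f)  # assumed: list of {"head", "relation", "tail"} dicts
        frame = pd.DataFrame(records)
        # Assumed: the parent directory name identifies the company.
        frame["company"] = os.path.basename(os.path.dirname(path))
        frames.append(frame)
    if not frames:
        return pd.DataFrame(columns=["head", "relation", "tail", "company"])
    return pd.concat(frames, ignore_index=True)
```

Passing `ignore_index=True` gives the combined dataframe one continuous row index instead of repeating each file's own index.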

2. Graph visualization by degree and betweenness centrality using networkx

|- visualize_cent.py

output
- degree_centrality: "./graph_png/degree.png"
- betweenness_centrality: "./graph_png/between.png"
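
As a rough illustration of the two centrality measures, the sketch below builds an undirected networkx graph from a few made-up triples and computes both scores. The triples are hypothetical, and treating the relation graph as undirected is an assumption; visualize_cent.py may build and draw the graph differently.

```python
import networkx as nx

# Hypothetical (head, relation, tail) triples, for illustration only.
triples = [
    ("Acme", "acquired", "Globex"),
    ("Acme", "partners with", "Initech"),
    ("Globex", "supplies", "Initech"),
    ("Initech", "invests in", "Umbrella"),
]

graph = nx.Graph()
for head, relation, tail in triples:
    graph.add_edge(head, tail, relation=relation)

# Degree centrality: fraction of the other nodes each node is connected to.
degree = nx.degree_centrality(graph)
# Betweenness centrality: share of shortest paths that pass through each node.
between = nx.betweenness_centrality(graph)

for node in graph.nodes:
    print(f"{node}: degree={degree[node]:.2f}, betweenness={between[node]:.2f}")
```

When drawing the graph, either score dictionary can be used to scale node sizes so that central companies stand out.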

3. Get embedding vectors with Node2vec; company clustering with K-means and GMM

|- node.py

|- similarity.py

output
- cosine similarity: "./similarity_result/consine_similarity.csv"
- l2 norm: "./similarity_result/l2_norm.csv"
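
For reference, both measures can be computed for all pairs of embeddings with NumPy. This is a sketch, not the code in similarity.py: it assumes that "l2 norm" means the Euclidean distance between two company embeddings, and the function names are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(vectors: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / norms  # assumes no all-zero embedding
    return unit @ unit.T


def l2_distance_matrix(vectors: np.ndarray) -> np.ndarray:
    """Pairwise Euclidean (L2) distance between the rows of `vectors`."""
    diff = vectors[:, None, :] - vectors[None, :, :]
    return np.linalg.norm(diff, axis=2)


# Toy embeddings, one row per company (hypothetical values).
embeddings = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(cosine_similarity_matrix(embeddings))
print(l2_distance_matrix(embeddings))
```

Cosine similarity ignores vector length and compares direction only, while the L2 distance is sensitive to both.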

|- company_cluster.py

k: number of clusters

GMM (soft clustering)

main.py: company_clustering(com_list, com_vec, 4, 'gmm')

K-means (hard clustering)

main.py: company_clustering(com_list, com_vec, 4, 'kmeans')
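
The difference between the two modes can be sketched with scikit-learn. The function below is a hypothetical stand-in named `cluster_companies`, not the repository's `company_clustering`; it assumes `com_list` is a list of company names and `com_vec` the matching embedding vectors.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture


def cluster_companies(com_list, com_vec, k, method="kmeans", seed=0):
    """Return a {company: cluster_label} dict using K-means or a GMM."""
    vectors = np.asarray(com_vec, dtype=float)
    if method == "kmeans":
        # Hard clustering: each company belongs to exactly one cluster.
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(vectors)
    elif method == "gmm":
        # Soft clustering: the model holds per-cluster probabilities;
        # predict() reports the most likely cluster for each company.
        model = GaussianMixture(n_components=k, random_state=seed).fit(vectors)
        labels = model.predict(vectors)
    else:
        raise ValueError(f"unknown method: {method!r}")
    return dict(zip(com_list, labels.tolist()))
```

With a fitted `GaussianMixture`, `predict_proba` returns the full membership probabilities, which is what makes the GMM variant "soft".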

4. Visualize with PCA and TSNE

|- cluster_visualize.py

output
- PCA: "./graph_png/company_cluster_pca.png"
- TSNE: "./graph_png/company_cluster_tsne.png"
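
Both projections reduce the company embeddings to two dimensions before plotting. The sketch below covers only the projection step with scikit-learn; `project_2d` is a hypothetical helper, and the perplexity heuristic is an assumption chosen so that t-SNE also works on small company lists.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE


def project_2d(vectors, method="pca", seed=0):
    """Project embedding vectors to 2-D coordinates with PCA or t-SNE."""
    vectors = np.asarray(vectors, dtype=float)
    if method == "pca":
        return PCA(n_components=2).fit_transform(vectors)
    if method == "tsne":
        # t-SNE needs perplexity < number of samples; scale it down for small inputs.
        perplexity = min(30.0, max(1.0, (len(vectors) - 1) / 3))
        tsne = TSNE(n_components=2, perplexity=perplexity, random_state=seed)
        return tsne.fit_transform(vectors)
    raise ValueError(f"unknown method: {method!r}")
```

The returned coordinates can then be drawn as a scatter plot colored by cluster label. PCA is a linear projection that preserves the directions of greatest variance, whereas t-SNE is nonlinear and mainly preserves local neighborhoods, so distances between far-apart clusters in a t-SNE plot should be read with care.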

Output

degree_centrality: "./graph_png/degree.png"
betweenness_centrality: "./graph_png/between.png"
cosine similarity: "./similarity_result/consine_similarity.csv"
l2 norm: "./similarity_result/l2_norm.csv"
PCA: "./graph_png/company_cluster_pca.png"
TSNE: "./graph_png/company_cluster_tsne.png"
