Build an Amazon SageMaker Pipeline to Transform Raw Texts to A Knowledge Graph

AWS Samples

Last update: Jan 1, 2022

Related tags

Deep Learning amazon-sagemaker-build-a-knowledge-graph-pipeline

Overview

This repository provides a pipeline that creates a knowledge graph from raw text. The pipeline chains together the following major steps:

Data processing: transform labeled text data into the Subject-Predicate-Object (SPO) format.
Training: use an RNN-based algorithm to train a model that predicts SPOs from given texts.
Create a Neptune database: if the training metric (F1 score) passes the threshold, create a Neptune database.
Batch Transform: use the model trained in the Training step to run inference on the test data.
Bulk load: convert the inference results into the format accepted by Neptune's bulk load function, and load the converted data into the Neptune database.
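
The F1 gate and the bulk-load conversion described above can be sketched in plain Python. This is a minimal illustration, not the repository's code: the `SPO` tuple, the function names, the `entity` vertex label, the `name` property, and the threshold value are all hypothetical. The CSV headers (`~id`, `~label`, `~from`, `~to`) follow the Neptune Gremlin bulk load format.

```python
import csv
import io
from typing import Iterable, NamedTuple, Set, Tuple


class SPO(NamedTuple):
    """One Subject-Predicate-Object triple (hypothetical container)."""
    subject: str
    predicate: str
    obj: str


def f1_score(predicted: Set[SPO], gold: Set[SPO]) -> float:
    """F1 over sets of triples; a triple counts only on an exact match."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def to_neptune_csv(triples: Iterable[SPO]) -> Tuple[str, str]:
    """Render triples as (vertex CSV, edge CSV) text for a Gremlin bulk load."""
    triples = list(triples)
    # One vertex per distinct subject or object, in first-seen order.
    entities = list(dict.fromkeys(e for t in triples for e in (t.subject, t.obj)))

    vertices = io.StringIO()
    writer = csv.writer(vertices, lineterminator="\n")
    writer.writerow(["~id", "~label", "name:String"])
    for name in entities:
        writer.writerow([name, "entity", name])  # "entity" label is an assumption

    edges = io.StringIO()
    writer = csv.writer(edges, lineterminator="\n")
    writer.writerow(["~id", "~from", "~to", "~label"])
    for index, triple in enumerate(triples):
        writer.writerow([f"e{index}", triple.subject, triple.obj, triple.predicate])

    return vertices.getvalue(), edges.getvalue()


F1_THRESHOLD = 0.4  # hypothetical value; the source does not state the threshold

gold = {SPO("Marie Curie", "born_in", "Warsaw"), SPO("Marie Curie", "field", "physics")}
predicted = {SPO("Marie Curie", "born_in", "Warsaw"), SPO("Marie Curie", "field", "chemistry")}

score = f1_score(predicted, gold)
if score >= F1_THRESHOLD:
    # Only when the gate passes would the database be created and loaded.
    vertex_csv, edge_csv = to_neptune_csv(sorted(predicted & gold))
```

In the actual pipeline the gate would be a condition step that reads the training metric, and the two CSV files would be written to S3 before the Neptune loader is invoked; both are omitted here to keep the sketch self-contained.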

Prerequisites

Create an AWS account or use an existing AWS account.
Create a SageMaker Notebook instance. When you set up the notebook instance, pay attention to the following configurations:
1. IAM role: attach the AmazonSageMakerFullAccess, IAMFullAccess, AmazonS3FullAccess, AmazonSNSFullAccess and NeptuneFullAccess policies to the IAM role.
2. Network: the notebook must run in a VPC so that it can access the Neptune database created in the pipeline.
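
The policies can also be attached programmatically. The following is a rough sketch, assuming boto3 is installed, credentials with IAM permissions are configured, and the role lives in the standard `aws` partition. The role name `my-notebook-role` and the helper names are hypothetical; `attach_role_policy` is the boto3 IAM client call.

```python
from typing import Any, List, Optional

# Managed policy names taken from the prerequisites above.
MANAGED_POLICIES = [
    "AmazonSageMakerFullAccess",
    "IAMFullAccess",
    "AmazonS3FullAccess",
    "AmazonSNSFullAccess",
    "NeptuneFullAccess",
]


def policy_arn(name: str) -> str:
    """Build the ARN of an AWS managed policy (assumes the 'aws' partition)."""
    return f"arn:aws:iam::aws:policy/{name}"


def attach_policies(role_name: str, iam_client: Optional[Any] = None) -> List[str]:
    """Attach every policy in MANAGED_POLICIES to the role; return the ARNs used."""
    if iam_client is None:
        # Imported lazily so the helpers above work without boto3 installed.
        import boto3
        iam_client = boto3.client("iam")
    arns = [policy_arn(name) for name in MANAGED_POLICIES]
    for arn in arns:
        iam_client.attach_role_policy(RoleName=role_name, PolicyArn=arn)
    return arns


# Usage (hypothetical role name, requires AWS credentials):
# attach_policies("my-notebook-role")
```

These are broad full-access policies, so a dedicated role used only for this notebook is a reasonable choice.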

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
