Library extending Jupyter notebooks to integrate with Apache TinkerPop and RDF SPARQL.

Amazon Web Services

Last update: Dec 28, 2022

Related tags

Deep Learning jupyter graph sparql neptune gremlin jupyter-widgets

Overview

Graph Notebook: easily query and visualize graphs

The graph notebook provides an easy way to interact with graph databases using Jupyter notebooks. Using this open-source Python package, you can connect to any graph database that supports the Apache TinkerPop, openCypher or the RDF SPARQL graph models. These databases could be running locally on your desktop or in the cloud. Graph databases can be used to explore a variety of use cases including knowledge graphs and identity graphs.

Visualizing Gremlin queries:

Visualizing openCypher queries:

Visualizing SPARQL queries:

Instructions for connecting to the following graph databases:

Endpoint	Graph model	Query language
Gremlin Server	property graph	Gremlin
Blazegraph	RDF	SPARQL
Amazon Neptune	property graph or RDF	Gremlin or SPARQL

We encourage others to contribute configurations they find useful. There is an additional-databases folder where more information can be found.

Features

Notebook cell 'magic' extensions in the IPython 3 kernel

%%sparql - Executes a SPARQL query against your configured database endpoint.

%%gremlin - Executes a Gremlin query against your database using web sockets. The results are similar to those a Gremlin console would return.

%%opencypher or %%oc - Executes an openCypher query against your database.

%%graph_notebook_config - Sets the executing notebook's database configuration to the JSON payload provided in the cell body.

%%graph_notebook_vis_options - Sets the executing notebook's vis.js options to the JSON payload provided in the cell body.

%%neptune_ml - Set of commands to integrate with NeptuneML functionality. Documentation

TIP 👉 There is syntax highlighting for %%sparql, %%gremlin and %%oc cells to help you structure your queries more easily.

Notebook line 'magic' extensions in the IPython 3 kernel

%gremlin_status - Obtain the status of Gremlin queries. Documentation

%sparql_status - Obtain the status of SPARQL queries. Documentation

%opencypher_status or %oc_status - Obtain the status of openCypher queries.

%load - Generate a form to submit a bulk loader job. Documentation

%load_ids - Get ids of bulk load jobs. Documentation

%load_status - Get the status of a provided load_id. Documentation

%neptune_ml - Set of commands to integrate with NeptuneML functionality. You can find a set of tutorial notebooks here. Documentation

%status - Check the Health Status of the configured host endpoint. Documentation

%seed - Provides a form to add data to your graph without the use of a bulk loader. Supports both RDF and Property Graph data models.

%stream_viewer - Interactively explore the Neptune CDC stream (if enabled)

%graph_notebook_config - Returns a JSON payload that contains connection information for your host.

%graph_notebook_host - Set the host endpoint to send queries to.

%graph_notebook_version - Print the version of the graph-notebook package

%graph_notebook_vis_options - Print the Vis.js options being used for rendered graphs

TIP 👉 You can list all the magics installed in the Python 3 kernel using the %lsmagic command.

TIP 👉 Many of the magic commands support a --help option in order to provide additional information.

Example notebooks

This project includes many example Jupyter notebooks. It is recommended to explore them. All of the commands and features supported by graph-notebook are explained in detail with examples within the sample notebooks. You can find them here. As this project has evolved, many new features have been added. If you are already familiar with graph-notebook but want a quick summary of new features added, a good place to start is the Air-Routes notebooks in the 02-Visualization folder.

Keeping track of new features

It is recommended to check the ChangeLog.md file periodically to keep up to date as new features are added.

Prerequisites

You will need:

Python 3.6.13-3.9.7
RDFLib 5.0.0
A graph database that provides one or more of:
- A SPARQL 1.1 endpoint
- An Apache TinkerPop Gremlin Server compatible endpoint
- An endpoint compatible with openCypher
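As a quick sanity check before installing, the Python range listed above can be compared against the running interpreter. This is a minimal sketch: the helper name `is_supported_python` is hypothetical and not part of graph-notebook, and the bounds are simply copied from the prerequisites list.

```python
import sys

# Bounds copied from the prerequisites list above.
MIN_SUPPORTED = (3, 6, 13)
MAX_SUPPORTED = (3, 9, 7)


def is_supported_python(version=None):
    """Return True if (major, minor, micro) falls inside the listed range.

    Hypothetical helper for illustration; graph-notebook does not ship it.
    """
    if version is None:
        version = tuple(sys.version_info[:3])
    return MIN_SUPPORTED <= tuple(version) <= MAX_SUPPORTED


current = ".".join(str(part) for part in sys.version_info[:3])
status = "inside" if is_supported_python() else "outside"
print(f"Python {current} is {status} the range listed above")
```

Tuples compare element by element, so no version-parsing library is needed for this check.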

Installation

# pin specific versions of required dependencies
pip install rdflib==5.0.0

# install the package
pip install graph-notebook

# install and enable the visualization widget
jupyter nbextension install --py --sys-prefix graph_notebook.widgets
jupyter nbextension enable  --py --sys-prefix graph_notebook.widgets

# copy static html resources
python -m graph_notebook.static_resources.install
python -m graph_notebook.nbextensions.install

# copy premade starter notebooks
python -m graph_notebook.notebooks.install --destination ~/notebook/destination/dir  

# start jupyter
python -m graph_notebook.start_notebook --notebooks-dir ~/notebook/destination/dir

Connecting to a graph database

Gremlin Server

In a new cell in the Jupyter notebook, change the configuration using %%graph_notebook_config and modify the fields for host, port, and ssl. Optionally, modify traversal_source if your graph traversal source name differs from the default value. For a local Gremlin server (HTTP or WebSockets), you can use the following command:

%%graph_notebook_config
{
  "host": "localhost",
  "port": 8182,
  "ssl": false,
  "gremlin": {
    "traversal_source": "g"
  }
}
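If you prefer to generate this payload rather than type it, the sketch below builds the same JSON with Python's standard `json` module. The function name `gremlin_config_payload` is an assumption made for illustration and is not part of graph-notebook; the keys and default values are the ones shown in the configuration above.

```python
import json


def gremlin_config_payload(host="localhost", port=8182, ssl=False, traversal_source="g"):
    """Build JSON text for the body of a %%graph_notebook_config cell.

    Hypothetical helper; only the keys shown in the README example are emitted.
    """
    config = {
        "host": host,
        "port": port,
        "ssl": ssl,
        "gremlin": {"traversal_source": traversal_source},
    }
    # json.dumps writes Python's False as JSON's lowercase false.
    return json.dumps(config, indent=2)


# Print text that can be pasted into a notebook cell.
print("%%graph_notebook_config")
print(gremlin_config_payload())
```

Generating the payload this way avoids hand-written JSON mistakes such as a missing comma or a capitalized `False`.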

To set up a new local Gremlin Server for use with the graph notebook, check out additional-databases/gremlin server.

Blazegraph

Change the configuration using %%graph_notebook_config and modify the fields for host, port, and ssl. For a local Blazegraph database, you can use the following command:

%%graph_notebook_config
{
  "host": "localhost",
  "port": 9999,
  "ssl": false,
  "sparql": {
    "path": "sparql"
  }
}

You can also use Blazegraph namespaces by specifying the path that graph-notebook should use for SPARQL queries, as in the following configuration:

%%graph_notebook_config
{
  "host": "localhost",
  "port": 9999,
  "ssl": false,
  "sparql": {
    "path": "blazegraph/namespace/foo/sparql"
  }
}

This will result in the url localhost:9999/blazegraph/namespace/foo/sparql being used when executing any %%sparql magic commands.
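To make the relationship between the configuration fields and the resulting URL concrete, here is an illustrative sketch. It is not the package's actual implementation: the function `sparql_endpoint_url` is hypothetical, and the mapping of `ssl` to an `http` or `https` scheme is an assumption.

```python
def sparql_endpoint_url(config):
    """Join host, port and sparql.path into a URL.

    Illustrative only; graph-notebook's own URL handling may differ.
    """
    scheme = "https" if config.get("ssl") else "http"  # assumed meaning of the ssl flag
    path = config.get("sparql", {}).get("path", "").strip("/")
    base = f"{scheme}://{config['host']}:{config['port']}"
    return f"{base}/{path}" if path else base


namespace_config = {
    "host": "localhost",
    "port": 9999,
    "ssl": False,
    "sparql": {"path": "blazegraph/namespace/foo/sparql"},
}
print(sparql_endpoint_url(namespace_config))
```

Changing only the `path` value switches the namespace while the host and port stay the same.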

To set up a new local Blazegraph database for use with the graph notebook, check out the Quick Start from Blazegraph.

Amazon Neptune

Change the configuration using %%graph_notebook_config and modify the defaults as they apply to your Neptune cluster:

%%graph_notebook_config
{
  "host": "your-neptune-endpoint",
  "port": 8182,
  "auth_mode": "DEFAULT",
  "load_from_s3_arn": "",
  "ssl": true,
  "aws_region": "your-neptune-region"
}

To set up a new Amazon Neptune cluster, check out the Amazon Web Services documentation.

When connecting the graph notebook to Neptune, make sure your network is set up to communicate with the VPC that Neptune runs in. If it is not, you can follow this guide.

Authentication (Amazon Neptune)

If you are running a SigV4 authenticated endpoint, ensure that your configuration has auth_mode set to IAM:

%%graph_notebook_config
{
  "host": "your-neptune-endpoint",
  "port": 8182,
  "auth_mode": "IAM",
  "load_from_s3_arn": "",
  "ssl": true,
  "aws_region": "your-neptune-region"
}

Additionally, you should have the following Amazon Web Services credentials available in a location accessible to Boto3:

Access Key ID
Secret Access Key
Default Region
Session Token (OPTIONAL. Use if you are using temporary credentials)

These variables must follow a specific naming convention, as listed in the Boto3 documentation.

A list of all locations checked for Amazon Web Services credentials can also be found here.
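When the credentials are supplied through environment variables, a small check like the one below can report which names are unset. This is a sketch under assumptions: `missing_credential_vars` is a hypothetical helper, and it only inspects environment variables, so an empty environment does not by itself mean authentication will fail, because Boto3 also checks other locations such as shared credential files.

```python
import os

# Environment variable names Boto3 reads for the credentials listed above.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")
OPTIONAL_VARS = ("AWS_SESSION_TOKEN",)  # only needed for temporary credentials


def missing_credential_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper; it looks at environment variables only.
    """
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_credential_vars()
if missing:
    print("Not set in the environment:", ", ".join(missing))
else:
    print("All required credential variables are set in the environment")
```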

Contributing Guidelines

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

Comments

[BUG] Neptune_ML widget error in 2.0.9
Describe the bug Starting in version 2.0.9 the neptune_ml widget is having an issue where the json values being passed in are getting the following error

{'error': JSONDecodeError('Expecting value: line 1 column 1 (char 0)',)}

To Reproduce Steps to reproduce the behavior:

Run through the 01-Introduction-to-Node-Classification-Gremlin notebook

When you get to the export step the error occurs

Additional context This is not a problem in version 2.0.7
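For context, the message in this report is the one Python's `json` module raises when the text it is asked to parse does not start with a JSON value, for example an empty string. The sketch below only reproduces that message; it does not identify where the widget produces such input, and `try_parse` is a hypothetical helper.

```python
import json


def try_parse(text):
    """Return (value, None) on success or (None, error message) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        return None, str(err)


# An empty string is not valid JSON and yields the message quoted in the report.
print(try_parse(""))
# A well-formed payload parses normally.
print(try_parse('{"ok": true}'))
```

Logging the raw text before it is parsed is one way to see whether an empty or non-JSON value is being passed in.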
bug
opened by bechbd 22
Cannot install: No module named 'graph_notebook'
Describe the bug I cannot install your graph notebook. I receive the error No module named 'graph_notebook' when following your installation steps.

To Reproduce Steps to reproduce the behavior:

Create virtual environment with venv: python -m venv .env

Activate the virtual environment (e.g., source .env/bin/activate).

Upgrade pip to 20.3.1: pip install -U pip

Install requirements: pip install -r requirements.txt

Per instructions, install and enable the visualization widget: jupyter nbextension install --py --sys-prefix graph_notebook.widgets I receive the ERROR: ModuleNotFoundError: No module named 'graph_notebook'

I have tried running the jupyter nbextension install --py --sys-prefix graph_notebook.widgets command from the src directory and I received the same error.

Desktop (please complete the following information):

OS: OS X 11 (Big Sur)

Browser Safari

terminal: running zsh

bug
opened by wdduncan 13

Documentation for specifying sparql paths on Blazegraph

Does PR https://github.com/aws/graph-notebook/pull/49 fix issues #39 and #45 ? If so, can you please post documentation? I've tried:

%%graph_notebook_config
{
  "host": "http://kg-hub-rdf.berkeleybop.io",
  "port": 80,
  "auth_mode": "DEFAULT",
  "iam_credentials_provider_type": "ROLE",
  "load_from_s3_arn": "",
  "ssl": false,
  "aws_region": "us-east-1"
  "sparql": {
       "blazegraph/sparql"
   }
}

and

%%graph_notebook_config
{
  "host": "http://kg-hub-rdf.berkeleybop.io",
  "port": 80,
  "auth_mode": "DEFAULT",
  "iam_credentials_provider_type": "ROLE",
  "load_from_s3_arn": "",
  "ssl": false,
  "aws_region": "us-east-1"
  "sparql_path": "blazegraph/sparql"
}

But I receive syntax errors.
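One way to locate this kind of syntax error is to run the cell body through Python's `json` module before pasting it into the notebook. The sketch below uses the second payload from this report; `json_error` is a hypothetical helper. It only checks JSON syntax and says nothing about which keys graph-notebook accepts.

```python
import json

# Second payload from the report, without the %%graph_notebook_config line.
reported = """{
  "host": "http://kg-hub-rdf.berkeleybop.io",
  "port": 80,
  "auth_mode": "DEFAULT",
  "iam_credentials_provider_type": "ROLE",
  "load_from_s3_arn": "",
  "ssl": false,
  "aws_region": "us-east-1"
  "sparql_path": "blazegraph/sparql"
}"""


def json_error(text):
    """Return a description of the first JSON syntax error, or None if valid."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"{err.msg} at line {err.lineno}, column {err.colno}"
    return None


print(json_error(reported))

# Adding the comma that is missing after the aws_region entry makes it valid JSON.
fixed = reported.replace('"us-east-1"\n', '"us-east-1",\n')
print(json_error(fixed))
```

Both payloads in the report lack a comma after the `aws_region` entry, and the first also places a bare string inside the `sparql` object, where JSON requires a key and a value. The README's Blazegraph example nests the path as `"sparql": {"path": ...}`.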

opened by wdduncan 12

[BUG] Some gremlin queries not generating graphs in Air-Routes-Gremlin.ipynb
Describe the bug Several of the cells in the Air-Routes-Gremlin.ipynb do not generate results in the graph tab.

To Reproduce Steps to reproduce the behavior:

Go to Air-Routes-Gremlin.ipynb

Scroll down to the text "The next query also produces a result that is fun to explore using the Graph tab"

Run the "my_node_labels" cell

Run the gremlin query cell

Only the Console and Query Metadata tabs are shown.

Expected behavior A graph tab with interesting results.

Screenshots

Desktop (please complete the following information):

OS: Ubuntu 20.04

Browser: Chrome

Version: 97.0.4692.99

Additional context Latest version of graph-notebook 3.1.1 Backend gremlin-server using instructions here. Seeded with %seed in notebook.
bug
opened by holleyism 10
[BUG] SPARQL load error due to lack of escaping apostrophe '
Describe the bug On a Jupyter notebook in the SageMaker notebook instance launched from the Neptune console UI, in notebook /Neptune/02-Visualization/Air-Routes-SPARQ.ipynb, section "Let's load some RDF Data": after choosing SPARQL and Airport from the drop-down list and clicking the Submit button, I got the following error:

Loading data set airports with language sparql
1/3: 0_nodes.txt
{
  "requestId": "0cbc40eb-51e1-eb7a-2b7f-d3544414d259",
  "code": "MalformedQueryException",
  "detailedMessage": "Malformed query: Lexical error at line 261, column 38. Encountered: \" \" (32), after : \"Hare\""
}

I suspect that the processing of "O'Hare Airport" causes the error because the data loader can't handle an unescaped apostrophe ('). The same error also occurred in the EPL-SPARQL notebook when loading the Football data set, where "St. Mary's Park" and "St. James Park" caused the same error message. After modifying the query to triple-quote the value ("""St. Mary's Park"""), the load worked.
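As an alternative to triple quoting, the SPARQL grammar allows an apostrophe inside a single-quoted literal when it is preceded by a backslash. The sketch below shows that escaping in Python. The function `sparql_string_literal` is hypothetical and is not how %seed or the sample data files are implemented.

```python
def sparql_string_literal(value):
    """Quote `value` as a single-quoted SPARQL literal, escaping special characters.

    Hypothetical helper for illustration only.
    """
    escaped = (
        value.replace("\\", "\\\\")  # backslash first, so later escapes are not doubled
        .replace("'", "\\'")         # apostrophe
        .replace("\n", "\\n")        # line feed
        .replace("\r", "\\r")        # carriage return
    )
    return f"'{escaped}'"


print(sparql_string_literal("O'Hare Airport"))
```

Escaping the backslash before the other characters matters; doing it last would double the backslashes added for the apostrophe.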
bug
opened by xiaokunx 8
[BUG] Graph visualization does not support multivalue properties
Graph visualization does not support multivalue properties

Steps to reproduce the behavior:

Set up a graph where vertices have multivalue properties (i.e. have set cardinality)

query g.V().outE().inV().path().by(elementMap()). This works but you cannot see all values of the multivalue properties.

If you change the query to use valueMap g.V().outE().inV().path().by(valueMap()), the visualization does not render properly. Some vertices are drawn but they do not represent the graph.

Expected behavior Graph is visualized correctly even when valueMap() is used and multivalue properties can be viewed in the visualization "Details" box
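To illustrate the difference in result shape, valueMap() returns a list of values per property key, while elementMap() returns a single value per key. The sketch below shows one way a details view could render both shapes as text. The sample dictionaries are hand-written rather than captured from a server, and `describe_properties` is a hypothetical helper, not graph-notebook code.

```python
def describe_properties(property_map):
    """Render each property as text, joining multi-value lists with commas.

    Hypothetical helper; not how graph-notebook builds its "Details" box.
    """
    rendered = {}
    for key, value in property_map.items():
        if isinstance(value, (list, tuple)):
            rendered[str(key)] = ", ".join(str(item) for item in value)
        else:
            rendered[str(key)] = str(value)
    return rendered


# Hand-written sample shapes, assumed for illustration.
element_map_style = {"name": "alice"}          # one value per key
value_map_style = {"name": ["alice", "ally"]}  # list of values per key

print(describe_properties(element_map_style))
print(describe_properties(value_map_style))
```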

Screenshots This is how the graph looks when using elementMap()

This is how the graph looks when I use valueMap()

Desktop (please complete the following information):

OS: macOS 12.6

Browser: Chrome 105.0.5195.125

Version: graph-notebook 3.6.0

bug
opened by FsecureSamiTikka 6
Identity Graph ETL notebook
Issue #, if available: N/A

Description of changes:

Add a new Identity Graph sample notebook demonstrating how to set up an AWS Glue based ETL pipeline.

Added utility library to setup the ETL pipeline

Added AWS Glue ETL scripts

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
opened by abhishekpradeepmishra 6
[BUG] Cannot issue Gremlin queries
Describe the bug Using the steps described in the setup results in an error:

{ 'error': GremlinServerError ( '597: No signature of method: org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph.addV() is applicable for argument types: (String) values: [person] Possible solutions: any(), any(groovy.lang.Closure), wait(), open(), tx(), find()' ) }

To Reproduce Steps to reproduce the behavior:

Download the Gremlin Server from https://tinkerpop.apache.org/ and unzip it. The remaining steps in this section assume you have made your working directory the place where you performed the unzip.

In conf/tinkergraph-empty.properties, change the ID manager from LONG to ANY to enable IDs that include text strings.
gremlin.tinkergraph.vertexIdManager=ANY

Optionally add another line doing the same for edge IDs.
gremlin.tinkergraph.edgeIdManager=ANY

To enable HTTP as well as Web Socket connections to the Gremlin Server, edit the file /conf/gremlin-server.yaml and change
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer

to

channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer

This will allow you to access the Gremlin Server from Jupyter using commands like curl as well as using the %%gremlin cell magic. This step is optional if you do not need HTTP connectivity to the server.

Start the Gremlin server bin/gremlin-server.sh start

Connecting to a local Gremlin Server from Jupyter:

In the Jupyter Notebook disable SSL using %%graph_notebook_config and change the host to localhost. Keep the other defaults even though they are not used for configuring the Gremlin Server.

%%graph_notebook_config
{
  "host": "localhost",
  "port": 8182,
  "ssl": false,
  "gremlin": {
    "traversal_source": "g",
    "username": "",
    "password": "",
    "message_serializer": "graphsonv3"
  }
}

If the Gremlin Server you wish to connect to is remote, replacing localhost with the IP address or DNS of the remote server should work. This assumes you have access to that server from your local machine.

Validate connection.

%status

Issue query (from here)

%%gremlin
g.addV('person').property('name', 'dan')
 .addV('person').property('name', 'mike')
 .addV('person').property('name', 'saikiran')

Expected behavior The vertices should be added with no error messages.

Desktop (please complete the following information):

Gremlin Server version: 3.6.1

JupyterLab version: 3.5.1

OS: Windows 10 / Debian Bullseye

Browser: Firefox, Chrome

bug
opened by whitehorsesoft 5
[BUG] Cell magic not found
Cell magic not found Cell magic such as %%graph_notebook_config and %%status not found.

To Reproduce Steps to reproduce the behavior:

Run docker image jupyter/minimal-notebook.

Open in browser

Verify normal notebook functionality works without graph-notebook.

Install graph-notebook using the commands found here:

# pin specific versions of required dependencies
pip install rdflib==5.0.0

# install the package
pip install graph-notebook

Attempt to change configuration according to directions here:

%%graph_notebook_config
{
  "host": "localhost",
  "port": 8182,
  "ssl": false,
  "gremlin": {
    "traversal_source": "g",
    "username": "",
    "password": "",
    "message_serializer": "graphsonv3"
  }
}

Error occurs: "UsageError: Cell magic %%graph_notebook_config not found."

Verify error happens even after restarting kernel and restarting docker container. Expected behavior The cell magic should work as expected.

Desktop (please complete the following information):

OS: Win10/WSL/Ubuntu

Browser Chrome

Version latest

bug
opened by whitehorsesoft 5
Support for virtuoso sparql endpoint

Is your feature request related to a problem? Please describe. I've been trying to have graph-notebook connect to a virtuoso sparql endpoint, without success.

Describe the solution you'd like Support for virtuoso sparql endpoints. Or, if already possible, documentation about connection setup.
question

opened by Jefwillems 5
Error displaying widget

Describe the bug When I run tutorial notebooks and the result should be visualized, it gives me an error: "Error displaying widget". For queries without a path, it gives tabs "Console" and "Query Metadata" without any problems.

To Reproduce I run it on JupyterLab 3.2.8 (I tried it with 3.4.2 but the result was the same). Python version 3.9.7

Expected behavior I would love to see the results from the queries. Any idea of what could help will be appreciated.

Screenshot
bug needs information

opened by anezkakot 4
Sizing of edges based on a property

Is your feature request related to a problem? Please describe. I am building a graph with AWS Neptune whose vertices are geolocated points. One property of the edges is the distance between endpoint vertices.

Describe the solution you'd like It would be great if the edges used this property to set their length proportionally, so that the vertices would be distributed as on a map.

opened by AlbertoRodriguezSerrano 1
Truncate query request time in metadata
Currently the query metadata reports a query request time in ms with a large number of decimal places such as:

Request execution time (ms) | 189.0205078125

Given the resolution of statement execution, network delays, etc., I'm not sure that number of decimal places does anything but make the results harder to read :)

Would suggest we truncate it down to a reasonable number or round it to the closest ms?
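A minimal sketch of the suggested rounding, using Python's format specification. The function `format_request_time` is hypothetical and is not the code that graph-notebook uses to build the metadata table.

```python
def format_request_time(ms, decimals=0):
    """Format a duration in milliseconds with a fixed number of decimal places.

    Hypothetical helper for illustration only.
    """
    return f"{ms:.{decimals}f}"


print(format_request_time(189.0205078125))     # nearest millisecond
print(format_request_time(189.0205078125, 2))  # two decimal places
```

Keeping the number of decimals as a parameter would leave room for a more precise display if someone needs it.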
opened by jklap 0
[BUG] Full screen Visualization does not work in Safari
Describe the bug The full screen Visualization button does not work in Safari but does work just fine in Chrome.

To Reproduce Steps to reproduce the behavior:

Start Safari & open JupyterLab w/graph-notebook installed

Run a query that would produce a Visualization such as "A simple example" in 02-Visualization/Air-Routes-Gremlin.ipynb

Click the "Fullscreen" button

Observe nothing happening

Expected behavior The Visualization expands to full screen

Desktop (please complete the following information):

OS: Mac Ventura 13.0.1

Browser: Safari

Version: 16.1

Additional context Using latest graph-notebook release w/JupyterLab 3.5.2

The issue looks to be in widgets/src/force_widget.ts in toggleExpand(). It uses document.fullscreenElement which exists in Chrome but not Safari as Safari uses document.webkitFullscreenElement.

https://github.com/sindresorhus/screenfull shows a cross-browser approach but it should also be pretty easy to implement directly using something like:

const elementFunc = document.documentElement as HTMLElement & {
  mozRequestFullScreen(): Promise<void>;
  webkitRequestFullscreen(): Promise<void>;
  msRequestFullscreen(): Promise<void>;
};
const docFunc = document as Document & {
  mozCancelFullScreen(): Promise<void>;
  webkitExitFullscreen(): Promise<void>;
  msExitFullscreen(): Promise<void>;
  webkitFullscreenElement(): Promise<void>;
};
const fullScreenElement = document.fullscreenElement
  || document.webkitFullscreenElement
  || document.mozFullScreenElement
  || document.msFullscreenElement;
const requestFullscreen = elementFunc.requestFullscreen
  || elementFunc.mozRequestFullScreen
  || elementFunc.webkitRequestFullscreen
  || elementFunc.msRequestFullscreen;
const exitFullscreen = docFunc.exitFullscreen
  || docFunc.webkitExitFullscreen
  || docFunc.msExitFullscreen
  || docFunc.mozCancelFullScreen;

And then replace:

document.fullscreenElement usage with fullScreenElement

elem.requestFullscreen with requestFullscreen

elem.requestFullscreen() with requestFullscreen.call(elem)

document.exitFullscreen with exitFullscreen

document.exitFullscreen() with exitFullscreen.call(document)

That is, it should end up looking something like this snippet:

...
if ( !fullScreenElement ) {
  if ( requestFullscreen ) {
    document.addEventListener("fullscreenchange", fullscreenchange);
    requestFullscreen.call(elem);
    this.canvasDiv.style.height = "100%";
  }
} else {
...

See here for the basis for the above code: https://stackoverflow.com/questions/54242775/angular-7-how-does-work-the-html5-fullscreen-api-ive-a-lot-of-errors
bug
opened by jklap 0
adding fraud detection with inductive inference notebook

Issue #, if available:

Description of changes:

Adding a notebook demonstrating real-time inductive inference on a fraud detection use case.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

opened by sojiadeshina 0
Add ECR publish workflow
Issue #, if available: N/A

Description of changes:

Adding GitHub action to build the graph-notebook Docker image and publish it to ECR on new commits. This workflow currently only publishes to a private ECR repository; official ECR Public repo will be made available at a later date.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
opened by michaelnchin 0

Releases(v3.7.0)

v3.7.0(Dec 7, 2022)
Added Neo4J section to %%graph_notebook_config (Link to PR)

Added custom Gremlin authentication and serializer support (Link to PR)

Added %statistics magic for Neptune DFE engine (Link to PR)

Added option to disable TLS certificate verification in %%graph_notebook_config (Link to PR)

Improved %load status output, fixed region option (Link to PR)

Updated 01-About-the-Neptune-Notebook for openCypher (Link to PR)

Fixed results not being displayed for SPARQL ASK queries (Link to PR)

Fixed %seed failing to load SPARQL EPL dataset (Link to PR)

Fixed %db_reset status output not displaying in JupyterLab (Link to PR)

Fixed %%gremlin throwing error for result sets with multiple datatypes (Link to PR)

Fixed edge label creation in 02-Using-Gremlin-to-Access-the-Graph (Link to PR)

Fixed igraph command error in 02-Logistics-Analysis-using-a-Transportation-Network (Link to PR)

Bumped typescript to 4.1.x in graph_notebook_widgets (Link to PR)

Pinned ipywidgets==7.7.2 and jupyterlab_widgets<3 (Link to PR)

Pinned nbclient<=0.7.0 (Link to PR)

Source code(tar.gz)
Source code(zip)
v3.6.2(Oct 18, 2022)
New Sample Applications - Security Graphs notebooks (Link to PR)

Path: 03-Sample-Applications > 04-Security-Graphs

Update sample notebooks with parallel, same-direction edges example (Link to PR)

Fixed a Gremlin widgets error caused by empty individual results (Link to PR)

Fixed %db_reset timeout handling, made timeout limit configurable (Link to PR)

Fixed Sparql visualizations occasionally failing with VisJS group assignment error (Link to PR)

Fixed start jupyterlab command in README (Link to PR)

Fixed interface rendering issue in classic notebooks (Link to PR)

Added --hide-index option for query results (Link to PR)

Added result media type selection for SPARQL queries (Link to PR)

v3.6.0(Sep 15, 2022)
New Language Tutorials - SPARQL Basics notebook (Link to PR)

Path: 06-Language-Tutorials > 01-SPARQL > 01-SPARQL-Basics

New Neptune ML - Text Encoding Tutorial notebook (Link to PR)

Path: 04-Machine-Learning > Sample-Applications > 02-Job-Recommendation-Text-Encoding.ipynb

Added --store-to option to %%graph_notebook_config (Link to PR)

Added loader status details options to %load_ids (Link to PR)

Added --all-in-queue option to %cancel_load (Link to PR)

Deprecated Python 3.6 support (Link to PR)

Added support for literal property values in Sparql visualization options (Link to PR)

Various results table improvements (Link to PR)

Disabled automatic collapsing of large explain results (Link to PR)

Fixed version-specific steps in SageMaker installation script (Link to PR)

Added new SageMaker installation script for China regions (Link to PR)

v3.5.3(Jul 26, 2022)
Docker support. Docker image can be built using the command docker build . and through Docker's buildx, this can support non-x86 CPU Architectures like ARM. (Link to PR)

Fix service.sh conditional checks, SSL parameter can now be changed. Fix permissions error on service.sh experienced by some users. (Link to PR)

Added %%neptune_config_allowlist magic (Link to PR)

Added check to remove whitespace in %graph_notebook_config host fields (Link to PR)

Added silent output option to additional magics (Link to PR)

Fixed %sparql_status magic to return query status without query ID (Link to PR)

Fixed incorrect Gremlin query --store-to output (Link to PR)

Fixed certain characters not displaying correctly in results table (Link to PR)

Fixed extra index column displaying in Gremlin results table on older Pandas versions (Link to PR)

Reverted Gremlin console tab to single results column (Link to PR)

Bumped jquery-ui from 1.13.1 to 1.13.2 (Link to PR)

v3.5.1(Jul 13, 2022)
Improved the %stream_viewer magic to show the commit timestamp and isLastOp information, if available. Also added additional hover (help) text to the stream viewer. (Link to PR)

Added --max-content-length option to %%gremlin (Link to PR)

Added proxy_host and proxy_port options to the %%graph_notebook_config options. (Link to PR)

This allows for proxied connections to your Neptune instance from outside your VPC. Supporting the patterns seen here.

Fixed results table formatting in JupyterLab (Link to PR)

Fixed several typos in the Neptune ML 00 notebook (Link to PR)

Renamed the Knowledge Graph application notebooks for clarity (Link to PR)

v3.4.1(Jun 7, 2022)
Identity Graph - ETL notebook (Link to PR)

Path: 03-Identity-Graphs>03-Jumpstart-Identity-Graphs-Using-Canonical-Model-and-ETL

Files: scripts/, glue_utils.py and 3-Identity-Graphs>03-Jumpstart-Identity-Graphs-Using-Canonical-Model-and-ETL notebook

Support variable injection in %%graph_notebook_config magic (Link to PR)

Added three notebooks to show data science workflows with Amazon Neptune (Link to PR)

Added JupyterLab startup script to auto-load magics extensions (Link to PR)

Added includeWaiting option to %oc_status, fix same for %gremlin_status (Link to PR)

Added --store-to option to %status (Link to PR)

Fixed handling of empty nodes returned from openCypher DELETE queries (Link to PR)

Fixed rendering of openCypher widgets for empty result sets (Link to PR)

Fixed graph search overriding physics setting (Link to PR)

Fixed browser-specific bug in results pagination options menu (Link to PR)

Fixed invalid queries in Gremlin sample notebooks (Link to PR)

Removed requests-aws4auth requirement (Link to PR)

v3.3.0(Mar 29, 2022)
Support rendering of widgets in JupyterLab (Link to PR)

Fixed ASCII encoding error in Profile/Explain generation (Link to PR)

Fixed inaccessible data URL in NeptuneML utils (Link to PR)

Fixed integration tests to address updated air routes data and other changes (Link to PR)

Bumped jinja2 from 2.10.1 to 3.0.3 (Link to PR)

Added documentation for JupyterLab installation (Link to PR)

v3.2.0(Feb 26, 2022)
Added new notebooks: guides for using SPARQL and RDF with Neptune ML (Link to PR)

Added the ability to run explain plans to openCypher queries via %%oc explain. (Link to PR)

Added the ability to download the explain/profile plans for openCypher/Gremlin/SPARQL. (Link to PR)

Changed the %stream_viewer magic to use PropertyGraph and RDF as the stream types. This better aligns with Gremlin and openCypher sharing the PropertyGraph stream. (Link to PR)

Updated the airports property graph seed files to the latest level and suffixed all doubles with 'd'. (Link to PR)

Added grouping by depth for Gremlin and openCypher queries (PR #1)(PR #2)

Added grouping by raw node results (Link to PR)

Added --no-scroll option for disabling truncation of query result pages (Link to PR)

Added --results-per-page option (Link to PR)

Added relaxed seed command error handling (Link to PR)

Renamed Gremlin profile query options for clarity (Link to PR)

Suppressed default root logger error output (Link to PR)

Fixed Gremlin visualizer bug with handling non-string node IDs (Link to PR)

Fixed error in openCypher Bolt query metadata output (Link to PR)

Fixed handling of Decimal type properties when rendering Gremlin query results (Link to PR)

v3.1.1(Dec 22, 2021)
Added new dataset for DiningByFriends, and associated notebook (Link to PR)

Added new Neptune ML Sample Application for People Analytics (Link to PR)

Added graph customization support for SPARQL queries (Link to PR)

Added graph reset and search refinement buttons to the graph output tab (Link to PR)

Added support for setting custom edge and node tooltips (Link to PR)

Added edge tooltips, and options for specifying edge label length (Link to PR)

Updated NeptuneML pre-trained model resources for CN regions (Link to PR)

Fixed inaccurate help message being displayed for certain GremlinServerErrors (Link to PR)

Fixed error causing query autocompletion to fail (Link to PR)

Fixed Jupyter start script for cases where the nbconfig directory is missing (Link to PR)

v3.0.8(Nov 4, 2021)
Added support for specifying the Gremlin traversal source (Link to PR)

Added edge tooltips, and options for specifying edge label length (Link to PR)

Fixed configuration options missing when using a CN region Neptune host (Link to PR)

Correct naming of ID parameter for NeptuneML Endpoint command (Link to PR)

v3.0.7(Oct 25, 2021)
Added full support for NeptuneML API command parameters to %neptune_ml (Link to PR)

Allow %%neptune_ml to accept JSON blob as parameter input for most phases (Link to PR)

Added --silent option for suppressing query output (PR #1) (PR #2)

Added all parserConfiguration options to %load (Link to PR)

Upgraded to Gremlin-Python 3.5 and Jupyter Notebook 6.x (Link to PR)

Resolved smart indent bug in openCypher magic cells (Link to PR)

Removed default /sparql path suffix from non-Neptune SPARQL requests (Link to PR)

v3.0.6(Sep 21, 2021)
Added a new %stream_viewer magic that allows interactive exploration of the Neptune CDC stream (if enabled). (Link to PR)

Added support for multi-property values in vertex and edge labels (Link to PR)

Added new visualization physics options, toggle button (Link to PR)

Fixed TypeError thrown for certain OC list type results (Link to PR)

Documentation fixes for additional databases (Link to PR)

v3.0.5(Aug 28, 2021)
Disabled SigV4 signing for non-IAM AWS requests (Link to PR)

Added new --nopoll option to %load to disable status polling (Link to PR)

Made Neptune specific parameters optional for %%graph_notebook_config (Link to PR)

Upgraded Jupyter Notebook dependency to 5.7.13 for security fix (Link to PR)

Improved usability of %load Edge IDs option (Link to PR)

v3.0.3(Aug 11, 2021)
Gremlin visualization bugfixes (PR #1) (PR #2) (PR #3)

Updated the airport data loadable via %seed to the latest version (Link to PR)

Added support for Gremlin Profile API parameters (Link to PR)

Improved %seed so that the progress bar is seen to complete (Link to PR)

Added helper functions to neptune_ml utils to get node embeddings, model predictions and performance metrics (Link to PR)

Changed visualization behavior to add all group-less nodes to a default group (Link to PR)

Fixed a bug causing ML Export requests to fail (Link to PR)

v3.0.2(Jul 30, 2021)
Added new Knowledge Graph use case notebook for openCypher usage (Link to PR)

Fixed incorrect visualizations of some Gremlin results returned by valueMap (Link to PR)

Pin RDFLib version in README (Link to PR)

Fixed inconsistent node tooltips in openCypher visualizations (Link to PR)

v3.0.1(Jul 29, 2021)
openCypher Support:

With the release of support for the openCypher query language in Amazon Neptune's lab mode, graph-notebook can now be used to execute and visualize openCypher queries with any compatible graph database.

Two new magic commands have been added:

%%oc/%%opencypher

%%oc_status/%%opencypher_status

These openCypher magic commands inherit the majority of the query and visualization customization features that are already available in the Gremlin and SPARQL magics.

For more detailed information and examples of how you can execute and visualize openCypher queries through graph-notebook, please refer to the new Air-Routes-openCypher and EPL-openCypher sample notebooks.

(Link to PR)

Other Updates:

Added visualization support for elementMap Gremlin step (Link to PR)

Added support for additional customization of edge node labels in Gremlin (Link to PR)

Refactored %load form display code for flexibility; fixes some descriptions being cut off (Link to PR)

Overhauled Gremlin visualization notebooks with example usage of new customization options and elementMap step (Link to PR)

Updated Neptune ML notebooks, utils, and pretrained models config (Link to PR)

Added support for modeltransform commands in %neptune_ml (Link to PR)

Included index operations metrics in metadata results tab for Gremlin Profile queries (Link to PR)

Added new notebook to explain Identity Graph data modeling (Link to PR)

Various bugfixes and documentation updates

v2.1.4(Jun 27, 2021)
Support for additional customization of graph node labels in Gremlin (Link to PR)

v2.1.3(Jun 22, 2021)
Support dictionary value access in variable injection (Link to PR)

v2.1.2(May 11, 2021)
Pin gremlinpython to <3.5.* (Link to PR)

Add support for notebook variables in SPARQL/Gremlin magic queries (Link to PR)

Add support for grouping by different properties per label in Gremlin (Link to PR)

Fix missing Boto3 dependency in setup.py (Link to PR)

Update %load execution time to HH:MM:SS format if over a minute (Link to PR)
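
The execution-time formatting described above can be illustrated with a small sketch. This is not the library's actual code: the function name format_elapsed and the wording used for durations of a minute or less are assumptions made for illustration only.

```python
def format_elapsed(seconds: float) -> str:
    """Render an elapsed time, switching to HH:MM:SS once it exceeds a minute.

    Hypothetical helper: the name and the sub-minute wording are assumptions,
    not taken from graph-notebook.
    """
    if seconds <= 60:
        # Short loads: keep a plain seconds readout.
        return f"{seconds:.2f} seconds"
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"
```

For example, format_elapsed(3725) gives "01:02:05", while a 12.5 second load is still reported in seconds.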

v2.1.1(Apr 23, 2021)
Fix bug in %neptune_ml export ... logic where the iam setting for the exporter endpoint wasn't getting picked up properly

v2.1.0(Apr 16, 2021)
Add support for Mode, queueRequest, and Dependencies parameters when running %load command (Link to PR)

Add support for list and dict as map keys in Python Gremlin (Link to PR)

Refactor modules that call to Neptune or other SPARQL/Gremlin endpoints to use a unified client object (Link to PR)

Added an additional notebook under 02-Visualization demonstrating how to use the visualization grouping and coloring options in Gremlin. (Link to PR)

Add metadata output tab for magic queries (Link to PR)

v2.0.12(Mar 25, 2021)
Add default parameters for get_load_status

Add ipython as a dependency in setup.py (Link to PR)

Add parameters in load_status for details, errors, page, and errorsPerPage

v2.0.10(Mar 18, 2021)
Print execution time when running %load command (Link to PR)

v2.0.9(Mar 3, 2021)
New datasets and notebooks for Sample applications in Gremlin including:

Fraud Graph

Knowledge Graph

Identity Graph

v2.0.7(Feb 1, 2021)
Added What’s Next sections to 01-Getting-Started notebooks to point users to next suggested notebook tutorials after finishing one notebook.

v2.0.6(Jan 28, 2021)
Add missing __init__ to notebook directories so they get installed correctly

Update list of available magics in notebook documentation

v2.0.5(Jan 8, 2021)
Gremlin Visualization

Enhanced Gremlin Visualization output to group vertices and color-code them based on groups. When no grouping is specified, vertices are grouped by label (if it exists). You can also specify the property to group by using the switch --groupby or -g followed by the property name.

Added the functionality to sort the values in the details box by key

Updated Air-Routes-Visualization notebook to discuss the group by functionality
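
The grouping rule described above (group by label unless a property is named) can be sketched in plain Python. This is only an illustration of the idea, not the visualizer's implementation: the function group_vertices, the dictionary shape of each vertex, and the fallback group name are all hypothetical.

```python
from collections import defaultdict


def group_vertices(vertices, group_by="label", default_group="DEFAULT_GROUP"):
    """Bucket vertex ids by the value of one property.

    Hypothetical sketch: `vertices` is assumed to be a list of dicts with an
    "id" key; vertices lacking the chosen property fall into `default_group`.
    """
    groups = defaultdict(list)
    for vertex in vertices:
        key = vertex.get(group_by, default_group)
        groups[str(key)].append(vertex["id"])
    return dict(groups)


sample = [
    {"id": "1", "label": "airport", "country": "US"},
    {"id": "2", "label": "airport", "country": "UK"},
    {"id": "3", "label": "country"},
]

by_label = group_vertices(sample)                        # default: group by label
by_country = group_vertices(sample, group_by="country")  # like passing -g country
```

Each resulting group would then be assigned its own color in the rendered graph.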

NeptuneML

Add tutorial notebooks for NeptuneML functionality

v2.0.3(Dec 29, 2020)
Integration with NeptuneML feature set in AWS Neptune

Add helper library to perform SigV4 signing for %neptune_ml export ...; we will move our other signing at a later date.

Swap how credentials are obtained for the ROLE iam credentials provider so that it now uses a botocore session instead of calling the EC2 metadata service. This should make the module more usable outside of SageMaker.

Add sub-configuration for sparql to allow specifying path to sparql endpoint

New Line magics:

%neptune_ml export status

%neptune_ml dataprocessing start

%neptune_ml dataprocessing status

%neptune_ml training start

%neptune_ml training status

%neptune_ml endpoint create

%neptune_ml endpoint status

New Cell magics:

%%neptune_ml export start

%%neptune_ml dataprocessing start

%%neptune_ml training start

%%neptune_ml endpoint create

NOTE: If a cell magic is used, line inputs that specify parts of the command (such as --job-id passed as a line parameter) will be ignored.

Inject variable as cell input: Currently this only works for the new cell magic commands detailed above. You can now specify a variable to use as the cell input received by our neptune_ml magics using the syntax ${var_name}. For example...

# in one notebook cell:
foo = {'foo', 'bar'}

# in another notebook cell:
%%neptune_ml export start
${foo}

NOTE: The above will only work if it is the sole content of the cell body. You cannot inline multiple variables at this time.
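
As a rough illustration of the sole-content restriction, the sketch below resolves a cell body only when it consists of a single ${var_name} reference. It is not the library's implementation: resolve_cell_body and its namespace argument are hypothetical names, with namespace standing in for the notebook's user variables.

```python
import re

# Matches a body that is exactly one ${name} reference, ignoring surrounding whitespace.
_SOLE_VARIABLE = re.compile(r"^\s*\$\{(\w+)\}\s*$")


def resolve_cell_body(cell_body: str, namespace: dict):
    """Return the referenced variable if the body is a lone ${name}, else the body unchanged.

    Hypothetical helper for illustration; inlining several variables is
    deliberately unsupported, mirroring the note above.
    """
    match = _SOLE_VARIABLE.match(cell_body)
    if match is None:
        return cell_body
    name = match.group(1)
    if name not in namespace:
        raise KeyError(f"notebook variable {name!r} is not defined")
    return namespace[name]


user_ns = {"foo": {"command": "export"}}
resolved = resolve_cell_body("${foo}\n", user_ns)
untouched = resolve_cell_body("prefix ${foo}", user_ns)
```

Here resolved is the object bound to foo, while untouched is returned verbatim because the reference is not the only content of the body.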
v2.0.1(Nov 24, 2020)
Fix bug in argparser for load_status and cancel_load line magics

Expand loader status values that terminate load line magic

v2.0.0(Nov 20, 2020)
Add support for storing query results to a variable for use in other notebook cells

Remove %query_mode magic in favor of query parameterization
