StellarGraph - Machine Learning on Graphs

Last update: Jan 5, 2023

Overview

StellarGraph Machine Learning Library

StellarGraph is a Python library for machine learning on graphs and networks.

Introduction
Getting Started
Getting Help
Example: GCN
Algorithms
Installation
Citing
References

Introduction

The StellarGraph library offers state-of-the-art algorithms for graph machine learning, making it easy to discover patterns and answer questions about graph-structured data. It can solve many machine learning tasks:

Representation learning for nodes and edges, to be used for visualisation and various downstream machine learning tasks;
Classification and attribute inference of nodes or edges;
Classification of whole graphs;
Link prediction;
Interpretation of node classification [8].

Graph-structured data represent entities as nodes (or vertices) and relationships between them as edges (or links), and can include data associated with either as attributes. For example, a graph can contain people as nodes and friendships between them as links, with data like a person's age and the date a friendship was established. StellarGraph supports analysis of many kinds of graphs:

homogeneous (with nodes and links of one type),
heterogeneous (with more than one type of nodes and/or links)
knowledge graphs (extreme heterogeneous graphs with thousands of types of edges)
graphs with or without data associated with nodes
graphs with edge weights
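
As a small illustration of the terminology above, the following sketch stores the people-and-friendships example as plain Python data and counts each person's friends. The names, ages and dates are made up for illustration, and this is not StellarGraph's own data format (the library takes Pandas DataFrames, as in the GCN example).

```python
from collections import defaultdict

# Hypothetical node attributes: one record per person.
nodes = {
    "alice": {"age": 34},
    "bob": {"age": 29},
    "carol": {"age": 41},
}

# Hypothetical edges: (source, target, attributes) for each friendship.
edges = [
    ("alice", "bob", {"since": "2015-03-01"}),
    ("bob", "carol", {"since": "2019-07-15"}),
]


def degrees(edges):
    """Count how many friendships each person takes part in (undirected)."""
    counts = defaultdict(int)
    for source, target, _attrs in edges:
        counts[source] += 1
        counts[target] += 1
    return dict(counts)


friend_counts = degrees(edges)
print(friend_counts)
```

This graph is homogeneous: it has one node type (person) and one link type (friendship), with attributes on both.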

StellarGraph is built on TensorFlow 2 and its Keras high-level API, as well as Pandas and NumPy. It is thus user-friendly, modular and extensible. It interoperates smoothly with code that builds on these, such as the standard Keras layers and scikit-learn, so it is easy to augment the core graph machine learning algorithms provided by StellarGraph. It is also easy to install with pip or Anaconda.

Getting Started

The numerous detailed and narrated examples are a good way to get started with StellarGraph. There is likely to be one that is similar to your data or your problem (if not, let us know).

You can start working with the examples immediately in Google Colab or Binder by clicking the Colab and Binder badges within each Jupyter notebook.

Alternatively, you can download a local copy of the demos and run them using Jupyter. The demos can be downloaded by cloning the master branch of this repository, or by using the curl command below:

curl -L https://github.com/stellargraph/stellargraph/archive/master.zip | tar -xz --strip=1 stellargraph-master/demos

The dependencies required to run most of our demo notebooks locally can be installed using one of the following:

Using pip: pip install stellargraph[demos]
Using conda: conda install -c stellargraph stellargraph

(See Installation section for more details and more options.)

Getting Help

If you get stuck or have a problem, there are many ways to make progress and get help or support:

Read the documentation
Consult the examples
Contact us:
- Ask questions and discuss problems on the StellarGraph Discussions forum
- File an issue
- Send us an email at [email protected]

Example: GCN

One of the earliest deep machine learning algorithms for graphs is a Graph Convolution Network (GCN) [6]. The following example uses it for node classification: predicting the class from which a node comes. It shows how easy it is to apply using StellarGraph, and shows how StellarGraph integrates smoothly with Pandas and TensorFlow and libraries built on them.

Data preparation

Data for StellarGraph can be prepared using common libraries like Pandas and scikit-learn.

import pandas as pd
from sklearn import model_selection

def load_my_data():
    # your own code to load data into Pandas DataFrames, e.g. from CSV files or a database
    ...

nodes, edges, targets = load_my_data()

# Use scikit-learn to compute training and test sets
train_targets, test_targets = model_selection.train_test_split(targets, train_size=0.5)

Graph machine learning model

This is the only part that is specific to StellarGraph. The machine learning model consists of some graph convolution layers followed by a layer to compute the actual predictions as a TensorFlow tensor. StellarGraph makes it easy to construct all of these layers via the GCN model class. It also makes it easy to get input data in the right format via the StellarGraph graph data type and a data generator.

import stellargraph as sg
import tensorflow as tf

# convert the raw data into StellarGraph's graph format for faster operations
graph = sg.StellarGraph(nodes, edges)

generator = sg.mapper.FullBatchNodeGenerator(graph, method="gcn")

# two layers of GCN, each with hidden dimension 16
gcn = sg.layer.GCN(layer_sizes=[16, 16], generator=generator)
x_inp, x_out = gcn.in_out_tensors() # create the input and output TensorFlow tensors

# use TensorFlow Keras to add a layer to compute the (one-hot) predictions
predictions = tf.keras.layers.Dense(units=len(train_targets.columns), activation="softmax")(x_out)

# use the input and output tensors to create a TensorFlow Keras model
model = tf.keras.Model(inputs=x_inp, outputs=predictions)
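
To make the "graph convolution layers" less abstract, here is a minimal pure-Python sketch of the propagation rule from the GCN paper [6]: ReLU(D^-1/2 (A + I) D^-1/2 X W). It is an illustration only and not how StellarGraph implements its layers; the function names and the toy inputs are assumptions made for this example.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adjacency, features, weights):
    """One GCN propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_hat = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    # Symmetric normalisation by the degree of each node (including the self-loop).
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    a_norm = [
        [inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
        for i in range(n)
    ]
    # Aggregate neighbour features, apply the weights, then the ReLU non-linearity.
    transformed = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, value) for value in row] for row in transformed]


# Toy graph: two nodes joined by one edge, one feature per node.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weights = [[1.0]]
hidden = gcn_layer(adjacency, features, weights)
print(hidden)
```

With identity-like weights, each node's output is the normalised average of its own feature and its neighbour's, which is the smoothing effect that stacked GCN layers build on.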

Training and evaluation

The model is a conventional TensorFlow Keras model, and so tasks such as training and evaluation can use the functions offered by Keras. StellarGraph's data generators make it simple to construct the required Keras Sequences for input data.

# prepare the model for training with the Adam optimiser and an appropriate loss function
model.compile("adam", loss="categorical_crossentropy", metrics=["accuracy"])

# train the model on the train set
model.fit(generator.flow(train_targets.index, train_targets), epochs=5)

# check model generalisation on the test set
(loss, accuracy) = model.evaluate(generator.flow(test_targets.index, test_targets))
print(f"Test set: loss = {loss}, accuracy = {accuracy}")
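
For readers unfamiliar with the loss and metric named in `model.compile`, the sketch below shows roughly what softmax, categorical cross-entropy and accuracy compute for one-hot targets. It is a simplified pure-Python illustration; Keras' own implementations differ in details such as batching and numerical clipping, and the function names here are chosen for this example.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(value - largest) for value in logits]  # shift for stability
    total = sum(exps)
    return [value / total for value in exps]


def categorical_crossentropy(one_hot_target, probabilities):
    """Negative log-probability assigned to the true class."""
    return -sum(
        t * math.log(p) for t, p in zip(one_hot_target, probabilities) if t > 0
    )


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def accuracy(one_hot_targets, predicted_probabilities):
    """Fraction of rows where the most probable class is the true class."""
    correct = sum(
        argmax(t) == argmax(p)
        for t, p in zip(one_hot_targets, predicted_probabilities)
    )
    return correct / len(one_hot_targets)


probabilities = softmax([0.0, 0.0])
loss = categorical_crossentropy([1, 0], probabilities)
acc = accuracy([[1, 0], [0, 1]], [[0.9, 0.1], [0.8, 0.2]])
print(probabilities, loss, acc)
```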

This algorithm is spelled out in more detail in its extended narrated notebook. We provide many more algorithms, each with a detailed example.

Algorithms

The StellarGraph library currently includes the following algorithms for graph machine learning:

Algorithm	Description
GraphSAGE [1]	Supports supervised as well as unsupervised representation learning, node classification/regression, and link prediction for homogeneous networks. The current implementation supports multiple aggregation methods, including mean, maxpool, meanpool, and attentional aggregators.
HinSAGE	Extension of GraphSAGE algorithm to heterogeneous networks. Supports representation learning, node classification/regression, and link prediction/regression for heterogeneous graphs. The current implementation supports mean aggregation of neighbour nodes, taking into account their types and the types of links between them.
attri2vec [4]	Supports node representation learning, node classification, and out-of-sample node link prediction for homogeneous graphs with node attributes.
Graph ATtention Network (GAT) [5]	The GAT algorithm supports representation learning and node classification for homogeneous graphs. There are versions of the graph attention layer that support both sparse and dense adjacency matrices.
Graph Convolutional Network (GCN) [6]	The GCN algorithm supports representation learning and node classification for homogeneous graphs. There are versions of the graph convolutional layer that support both sparse and dense adjacency matrices.
Cluster Graph Convolutional Network (Cluster-GCN) [10]	An extension of the GCN algorithm supporting representation learning and node classification for homogeneous graphs. Cluster-GCN scales to larger graphs and can be used to train deeper GCN models using Stochastic Gradient Descent.
Simplified Graph Convolutional network (SGC) [7]	The SGC network algorithm supports representation learning and node classification for homogeneous graphs. It is an extension of the GCN algorithm that smooths the graph to bring in more distant neighbours of nodes without using multiple layers.
(Approximate) Personalized Propagation of Neural Predictions (PPNP/APPNP) [9]	The (A)PPNP algorithm supports fast and scalable representation learning and node classification for attributed homogeneous graphs. In a semi-supervised setting, first a multilayer neural network is trained using the node attributes as input. The predictions from the latter network are then diffused across the graph using a method based on Personalized PageRank.
Node2Vec [2]	The Node2Vec and Deepwalk algorithms perform unsupervised representation learning for homogeneous networks, taking into account network structure while ignoring node attributes. The node2vec algorithm is implemented by combining StellarGraph's random walk generator with the word2vec algorithm from Gensim. Learned node representations can be used in downstream machine learning models implemented using Scikit-learn, Keras, TensorFlow or any other Python machine learning library.
Metapath2Vec [3]	The metapath2vec algorithm performs unsupervised, metapath-guided representation learning for heterogeneous networks, taking into account network structure while ignoring node attributes. The implementation combines StellarGraph's metapath-guided random walk generator and Gensim word2vec algorithm. As with node2vec, the learned node representations (node embeddings) can be used in downstream machine learning models to solve tasks such as node classification, link prediction, etc, for heterogeneous networks.
Relational Graph Convolutional Network [11]	The RGCN algorithm performs semi-supervised learning for node representation and node classification on knowledge graphs. RGCN extends GCN to directed graphs with multiple edge types and works with both sparse and dense adjacency matrices.
ComplEx [12]	The ComplEx algorithm computes embeddings for nodes (entities) and edge types (relations) in knowledge graphs, and can use these for link prediction.
GraphWave [13]	GraphWave calculates unsupervised structural embeddings via wavelet diffusion through the graph.
Supervised Graph Classification	A model for supervised graph classification based on GCN [6] layers and mean pooling readout.
Watch Your Step [14]	The Watch Your Step algorithm computes node embeddings by using adjacency powers to simulate expected random walks.
Deep Graph Infomax [15]	Deep Graph Infomax trains unsupervised GNNs to maximize the shared information between node level and graph level features.
Continuous-Time Dynamic Network Embeddings (CTDNE) [16]	Supports time-respecting random walks which can be used in a similar way as in Node2Vec for unsupervised representation learning.
DistMult [17]	The DistMult algorithm computes embeddings for nodes (entities) and edge types (relations) in knowledge graphs, and can use these for link prediction.
DGCNN [18]	The Deep Graph Convolutional Neural Network (DGCNN) algorithm for supervised graph classification.
TGCN [19]	The GCN_LSTM model in StellarGraph follows the Temporal Graph Convolutional Network architecture proposed in the TGCN paper with a few enhancements in the layers architecture.
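
Several of the algorithms above (Node2Vec, Metapath2Vec, CTDNE) start by sampling random walks and then feed the walks to word2vec as if they were sentences. The following is a minimal sketch of the walk-sampling step only, using uniform (DeepWalk-style) walks over a plain adjacency dictionary. It is not StellarGraph's `BiasedRandomWalk` class; node2vec additionally biases each step with its return and in-out parameters, which this sketch leaves out. The function name and the toy graph are assumptions for illustration.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=None):
    """Sample uniform random walks starting from every node."""
    rng = random.Random(seed)  # private generator, so the walks are repeatable
    walks = []
    for start in adjacency:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency[walk[-1]]
                if not neighbours:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Toy undirected graph: "a" is connected to "b" and "c".
adjacency = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
walks = random_walks(adjacency, walk_length=4, walks_per_node=2, seed=0)
for walk in walks:
    print(" -> ".join(walk))
```

Each walk can then be treated as a "sentence" of node IDs for an embedding model such as Gensim's word2vec.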

Installation

StellarGraph is a Python 3 library and we recommend using Python version 3.6. The required Python version can be downloaded and installed from python.org. Alternatively, use the Anaconda Python environment, available from anaconda.com.

The StellarGraph library can be installed from PyPI, from Anaconda Cloud, or directly from GitHub, as described below.

Install StellarGraph using PyPI:

To install StellarGraph library from PyPI using pip, execute the following command:

pip install stellargraph

Some of the examples require installing additional dependencies as well as stellargraph. To install these dependencies as well as StellarGraph using pip execute the following command:

pip install stellargraph[demos]

The community detection demos require python-igraph which is only available on some platforms. To install this in addition to the other demo requirements:

pip install stellargraph[demos,igraph]

Install StellarGraph in Anaconda Python:

The StellarGraph library is available on Anaconda Cloud and can be installed in Anaconda Python using the command-line conda tool. Execute the following command:

conda install -c stellargraph stellargraph

Install StellarGraph from GitHub source:

First, clone the StellarGraph repository using git:

git clone https://github.com/stellargraph/stellargraph.git

Then, cd to the StellarGraph folder, and install the library by executing the following commands:

cd stellargraph
pip install .

Some of the examples in the demos directory require installing additional dependencies as well as stellargraph. To install these dependencies as well as StellarGraph using pip execute the following command:

pip install .[demos]
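
After any of the installation routes above, a quick way to confirm that the package is visible to your Python environment is to query its installed version. This is a small sketch using only the standard library; it assumes Python 3.8 or later, where `importlib.metadata` is available (older versions would need the `importlib_metadata` backport). The helper name is chosen for this example.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if StellarGraph is installed, otherwise None.
print(installed_version("stellargraph"))
```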

Citing

StellarGraph is designed, developed and supported by CSIRO's Data61. If you use any part of this library in your research, please cite it using the following BibTeX entry:

@misc{StellarGraph,
  author = {CSIRO's Data61},
  title = {StellarGraph Machine Learning Library},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub Repository},
  howpublished = {\url{https://github.com/stellargraph/stellargraph}},
}

References

1. Inductive Representation Learning on Large Graphs. W. L. Hamilton, R. Ying, and J. Leskovec. Neural Information Processing Systems (NIPS), 2017.
2. Node2Vec: Scalable Feature Learning for Networks. A. Grover and J. Leskovec. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2016.
3. Metapath2Vec: Scalable Representation Learning for Heterogeneous Networks. Y. Dong, N. V. Chawla, and A. Swami. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 135–144, 2017.
4. Attributed Network Embedding via Subspace Discovery. D. Zhang, J. Yin, X. Zhu, and C. Zhang. Data Mining and Knowledge Discovery, 2019.
5. Graph Attention Networks. P. Veličković et al. International Conference on Learning Representations (ICLR), 2018.
6. Graph Convolutional Networks (GCN): Semi-Supervised Classification with Graph Convolutional Networks. T. N. Kipf and M. Welling. International Conference on Learning Representations (ICLR), 2017.
7. Simplifying Graph Convolutional Networks. F. Wu, T. Zhang, A. H. de Souza, C. Fifty, T. Yu, and K. Q. Weinberger. International Conference on Machine Learning (ICML), 2019.
8. Adversarial Examples on Graph Data: Deep Insights into Attack and Defense. H. Wu, C. Wang, Y. Tyshetskiy, A. Docherty, K. Lu, and L. Zhu. IJCAI, 2019.
9. Predict then Propagate: Graph Neural Networks Meet Personalized PageRank. J. Klicpera, A. Bojchevski, and S. Günnemann. ICLR, 2019. arXiv:1810.05997.
10. Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks. W. Chiang, X. Liu, S. Si, Y. Li, S. Bengio, and C. Hsieh. KDD, 2019. arXiv:1905.07953.
11. Modeling Relational Data with Graph Convolutional Networks. M. Schlichtkrull, T. N. Kipf, P. Bloem, R. van den Berg, I. Titov, and M. Welling. European Semantic Web Conference, 2018. arXiv:1703.06103.
12. Complex Embeddings for Simple Link Prediction. T. Trouillon, J. Welbl, S. Riedel, É. Gaussier, and G. Bouchard. ICML, 2016.
13. Learning Structural Node Embeddings via Diffusion Wavelets. C. Donnat, M. Zitnik, D. Hallac, and J. Leskovec. SIGKDD, 2018. arXiv:1710.10321.
14. Watch Your Step: Learning Node Embeddings via Graph Attention. S. Abu-El-Haija, B. Perozzi, R. Al-Rfou, and A. Alemi. NIPS, 2018. arXiv:1710.09599.
15. Deep Graph Infomax. P. Veličković, W. Fedus, W. L. Hamilton, P. Lio, Y. Bengio, and R. D. Hjelm. International Conference on Learning Representations (ICLR), 2019. arXiv:1809.10341.
16. Continuous-Time Dynamic Network Embeddings. G. H. Nguyen, J. B. Lee, R. A. Rossi, N. K. Ahmed, E. Koh, and S. Kim. Proceedings of the 3rd International Workshop on Learning Representations for Big Networks (WWW BigNet), 2018.
17. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. B. Yang, W. Yih, X. He, J. Gao, and L. Deng. ICLR, 2015. arXiv:1412.6575.
18. An End-to-End Deep Learning Architecture for Graph Classification. M. Zhang, Z. Cui, M. Neumann, and Y. Chen. AAAI, 2018.
19. T-GCN: A Temporal Graph Convolutional Network for Traffic Prediction. L. Zhao, Y. Song, C. Zhang, Y. Liu, P. Wang, T. Lin, M. Deng, and H. Li. IEEE Transactions on Intelligent Transportation Systems, 2019.

Comments

Move tensorflow to extras

Part of #546

I'm not sure if we would land this as is, since this feels like a big breaking change, so I'm open to suggestions for how to make migration easier.

I'm hoping to figure out a way to print a warning if tensorflow is not installed with stellargraph as @adocherty has suggested in #546 but let me know if you have ideas.

I also didn't add a stellargraph[gpu] option for now since we're not verifying it in CI currently, although the user would have the option to try use their own gpu tensorflow installation regardless.

opened by kjun9 35
Feature/node2vec for issue Word2Vec in StellarGraph #255
This is the pull request for adding Keras Node2Vec layer to StellarGraph library.

I mainly made the following changes:

Add the Node2Vec layer.

Add the Node2VecNodeGenerator and Node2VecLinkGenerator mapper.

Change the interface for the __init__ and run function of BiasedRandomWalk to make UnSupervisedSampler support BiasedRandomWalk

Write the Keras-node2vec notebook examples for embedding learning and node classification

Add the unit test for the added node2vec layer and mappers
opened by daokunzhang 26
Replace inheritance of NetworkX by encapsulation

Under this revision, StellarGraph is no longer also a NetworkX - it now has its own interface. The NetworkXStellarGraph implementation class (formerly StellarGraphBase) now wraps the supplied NetworkX object.
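
The difference between inheriting from a graph class and encapsulating one can be sketched as follows. Both classes here are hypothetical stand-ins written for this illustration: `SimpleGraph` plays the role of a third-party graph type such as a NetworkX graph, and `WrappedGraph` plays the role of a wrapper with its own interface. Neither reflects the actual code in the pull request.

```python
class SimpleGraph:
    """Hypothetical stand-in for a third-party graph class."""

    def __init__(self):
        self._adj = {}

    def add_edge(self, u, v):
        self._adj.setdefault(u, set()).add(v)
        self._adj.setdefault(v, set()).add(u)

    def neighbors(self, node):
        return iter(self._adj[node])


class WrappedGraph:
    """Holds a graph privately and exposes only a narrow, stable interface."""

    def __init__(self, graph):
        self._graph = graph  # encapsulated, not inherited

    def neighbours(self, node):
        # Delegate to the wrapped object, returning a predictable type.
        return sorted(self._graph.neighbors(node))


raw = SimpleGraph()
raw.add_edge("a", "b")
raw.add_edge("a", "c")
wrapped = WrappedGraph(raw)
print(wrapped.neighbours("a"))
```

Because the wrapper does not inherit from the underlying class, mutating methods such as `add_edge` are not part of its public interface, and the backing implementation can be swapped later without changing callers.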

opened by geoffj-d61 20
Feature/144 new stellargraph
Notes:

The code in this branch does not yet comprise a complete replacement for all uses of StellarGraph.

The particular aim was to sufficiently capture the efficient handling of node and edge data in the form of Pandas data-frames and/or type-specific maps.

The short-term goal was to document and implement enough of the StellarGraph interface to permit the node classification of the Cora dataset using GraphSAGE.

Further work is required to finish capturing and implementing the full StellarGraph interface.

However, the current work should serve as an example for review of the techniques used, in order to see if this approach is viable in regards to memory usage and speed of access.

sg-library
opened by geoffj-d61 17
How to train node classification using custom dataset

Hi, thanks for the great library. I have seen the "loading pandas" notebook for generating a node classification dataset, but I am confused by the training examples for node classification (GAT and GraphSAGE). As I understand it, the CORA dataset is a single input graph, so how can I train on multiple input graphs?

Let's say I have 10 invoice or ID card images. Using OCR I am able to get the bounding boxes of the text. Now I want to classify each bounding box as a company name, address, etc. based on its location in the document structure, which is why I planned to use a GCN technique. This is what my dataset looks like:

invoice_1:

source | target | distance feature | isalpha feature | label
0 | 1 | xxx | 1 | header
0 | 2 | xxx | 0 | bill no
0 | 3 | xxx | 1 | company_name
3 | 7 | xxx | 0 | date

invoice_2:

source | target | distance feature | isalpha feature | label
0 | 1 | xxx | 1 | header
0 | 2 | xxx | 0 | bill no
0 | 4 | xxx | 1 | company_name
4 | 5 | xxx | 1 | company_name

and so on. The first two invoices are the same bill, but the OCR bounding boxes differ when the invoice image varies: the first photo was taken close up, so the OCR detects "company service" as a single bounding box, while the second photo was taken from farther away, so the OCR detects "company" and "service" as separate bounding boxes. So how can I train GAT or GraphSAGE with this dataset, which has a separate CSV file for each invoice image?

I was inspired by this GitHub repository, https://github.com/dhavalpotdar/Graph-Convolution-on-Structured-Documents, to write code for invoices and ID cards for my use case, and I used your "loading pandas" homogeneous graph example to generate the CSV files above. I hope the explanation above is understandable. Thanks @huonw
external sg-library

opened by vigneshgig 16

Creating embedding is not reproducible

I am using StellarGraph to create embeddings for a particular graph/feature set. Unfortunately, the embeddings are different each time I create/train the graph despite providing identical information each time.

Is this a bug, or am I using StellarGraph incorrectly?

Below is the code that demonstrates the issue:

import networkx as nx
import random
import numpy as np
import pandas as pd
import keras
import stellargraph as sg
from stellargraph.mapper import GraphSAGELinkGenerator, GraphSAGENodeGenerator
from stellargraph.layer import GraphSAGE, link_classification
from stellargraph.data import UnsupervisedSampler

# Establish random seed
RANDOM_SEED = 42
random.seed(RANDOM_SEED)

# Create a graph from well-known karate club data
print(f"Creating graph")
graph = nx.karate_club_graph()

# Create features for each node
print(f"Creating features")
features = []
nodes = list(graph.nodes)
columns = ["c-" + str(x) for x in range(10)]
nodes.sort()
for node in nodes:
    f = {c: random.random() for c in columns}
    features.append(f)

features_df = pd.DataFrame(features)
print(f"features_df: \n{features_df}")

for i in range(2):
    print(f"----- Iteration: {i} -----")

    # Create the model and generators
    print(f"Creating the model and generators")
    Gs = sg.StellarGraph(graph, node_features=features_df)
    unsupervisedSamples = UnsupervisedSampler(Gs, nodes=graph.nodes(), length=5, number_of_walks=3, seed=RANDOM_SEED)
    train_gen = GraphSAGELinkGenerator(Gs, 50, [5, 5], seed=RANDOM_SEED).flow(unsupervisedSamples)
    graphsage = GraphSAGE(layer_sizes=[100, 100], generator=train_gen, bias=True, dropout=0.0, normalize="l2")
    x_inp_src, x_out_src = graphsage.node_model(flatten_output=False)
    x_inp_dst, x_out_dst = graphsage.node_model(flatten_output=False)

    x_inp = [x for ab in zip(x_inp_src, x_inp_dst) for x in ab]
    x_out = [x_out_src, x_out_dst]
    edge_embedding_method = "l2"
    prediction = link_classification(output_dim=1, output_act="sigmoid", edge_embedding_method=edge_embedding_method)(x_out)

    # Create and train the Keras model
    model = keras.Model(inputs=x_inp, outputs=prediction)
    learning_rate = 1e-2
    model.compile(
        optimizer=keras.optimizers.Adam(lr=learning_rate),
        loss=keras.losses.binary_crossentropy,
        metrics=[keras.metrics.binary_accuracy])

    _ = model.fit_generator(train_gen, epochs=5, verbose=2, use_multiprocessing=False, workers=1, shuffle=False)

    # Create the embeddings
    print(f"Creating the embeddings")
    nodes = list(graph.nodes)
    nodes.sort()
    print(f"Nodes: {nodes}")

    # Create a generator that serves up nodes for use in embedding prediction / creation
    node_gen = GraphSAGENodeGenerator(Gs, 50, [5, 5], seed=RANDOM_SEED).flow(nodes)

    embedding_model = keras.Model(inputs=x_inp_src, outputs=x_out_src)
    embeddings = embedding_model.predict_generator(node_gen, workers=4, verbose=1)
    embeddings = embeddings[:, 0, :]

    np.set_printoptions(threshold=10)
    print(f"embeddings: {embeddings.shape} \n{embeddings}")

The code prints a number of debug statements when executed (sample output is shown below). Note that the embeddings are different despite the identical inputs, graph configuration, model configuration, and random seed values.

----- Iteration: 0 -----
:
:
1/1 [==============================] - 0s 58ms/step
embeddings: (34, 100) 
[[-0.10566715  0.02253576 -0.18743701 ... -0.1028127   0.03689012
  -0.02482301]
 [-0.03171733  0.01606975 -0.08616363 ... -0.11775644  0.0429472
  -0.02371055]
 [-0.05802531  0.03910012 -0.10229243 ... -0.15050544  0.06637941
  -0.01950052]
 ...
 [ 0.03011296  0.08852117 -0.01836969 ... -0.154132    0.03844732
  -0.08643046]
 [ 0.01052345 -0.0123206   0.08913474 ... -0.11741614  0.03202919
  -0.04432516]
 [ 0.01951274  0.06263477  0.07959272 ... -0.10350229  0.05735112
  -0.0368157 ]]
:
:
----- Iteration: 1 -----
embeddings: (34, 100) 
[[ 0.11182436 -0.02642134  0.01168384 ...  0.10322241 -0.01680471
  -0.03918815]
 [ 0.02391489  0.02674667 -0.00091334 ...  0.12946768 -0.02389602
  -0.01414653]
 [ 0.08718258 -0.01711811 -0.05704292 ...  0.13477756 -0.00658288
  -0.05889895]
 ...
 [ 0.06843725 -0.13134597 -0.10870655 ...  0.11091235 -0.05146989
  -0.06138216]
 [-0.00593233 -0.05901312 -0.02113489 ... -0.01590953 -0.02516254
  -0.02280537]
 [ 0.00871993 -0.04059998 -0.07237951 ... -0.01590569 -0.00954109
  -0.01116194]]
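
One possible contributing factor, offered here as an assumption rather than a confirmed diagnosis, is that the script seeds only Python's `random` module, while model weight initialisation and other steps may draw from NumPy's or TensorFlow's generators. The sketch below seeds all three when they are installed. It assumes TensorFlow 2 (`tf.random.set_seed`); the helper name is chosen for this example, and even with every seed set, some GPU operations and multi-worker prediction can still introduce run-to-run differences.

```python
import random


def seed_everything(seed):
    """Seed the random number generators this process is likely to use."""
    random.seed(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed: nothing to seed
    try:
        import tensorflow as tf

        tf.random.set_seed(seed)  # TensorFlow 2 API (assumption)
    except ImportError:
        pass  # TensorFlow not installed: nothing to seed


# Call this at the start of each iteration, before building the model.
seed_everything(42)
first = random.random()
seed_everything(42)
second = random.random()
print(first == second)
```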

bug external sg-library

opened by ericbroda 16

Problem in GCN link prediction example

Describe the bug

It seems that the GCN link prediction example does not reach the performance shown in the documentation. The GCN does not seem to learn anything from the data, and I am not sure why this happens.

To Reproduce

Steps to reproduce the behavior:

Go to the colab example code https://colab.research.google.com/github/stellargraph/stellargraph/blob/master/demos/link-prediction/gcn-link-prediction.ipynb#scrollTo=cW8U-jvPKdob.
Just run all the commands.
The performance shown in https://stellargraph.readthedocs.io/en/stable/demos/link-prediction/gcn-link-prediction.html cannot be reproduced: the accuracy is close to 0 when re-running the example.

I ran the code without any modification.

Environment

Operating system: 1. Ubuntu 2. colab

Python version: 1. python3.6.9 2. colab example default

Package versions: 1. stellargraph==1.2.1, tensorflow==2.2.0 2. colab example default

<IPython.core.display.HTML object>
StellarGraph: Undirected multigraph
 Nodes: 2708, Edges: 5429

 Node types:
  paper: [2708]
    Features: float32 vector, length 1440
    Edge types: paper-cites->paper

 Edge types:
    paper-cites->paper: [5429]
        Weights: all 1 (default)
        Features: none
** Sampled 542 positive and 542 negative edges. **
** Sampled 488 positive and 488 negative edges. **
Using GCN (local pooling) filters...
Using GCN (local pooling) filters...
2020-07-08 02:02:51.125100: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
1/1 [==============================] - 0s 261us/step - loss: 1.7441 - acc: 0.0000e+00
1/1 [==============================] - 0s 372us/step - loss: 1.7190 - acc: 0.0000e+00

Train Set Metrics of the initial (untrained) model:
        loss: 1.7441
        acc: 0.0000

Test Set Metrics of the initial (untrained) model:
        loss: 1.7190
        acc: 0.0000
Epoch 1/50
1/1 - 0s - loss: 1.7230 - acc: 0.0000e+00 - val_loss: 1.6174 - val_acc: 0.0000e+00
Epoch 2/50
1/1 - 0s - loss: 1.8744 - acc: 0.0000e+00 - val_loss: 0.6929 - val_acc: 0.0000e+00
Epoch 3/50
1/1 - 0s - loss: 0.7559 - acc: 0.0000e+00 - val_loss: 0.8678 - val_acc: 0.0000e+00
Epoch 4/50
1/1 - 0s - loss: 0.8746 - acc: 0.0000e+00 - val_loss: 0.9703 - val_acc: 0.0000e+00
Epoch 5/50
1/1 - 0s - loss: 0.9689 - acc: 0.0000e+00 - val_loss: 0.9082 - val_acc: 0.0000e+00
Epoch 6/50
1/1 - 0s - loss: 0.8899 - acc: 0.0000e+00 - val_loss: 0.7689 - val_acc: 0.0000e+00
Epoch 7/50
1/1 - 0s - loss: 0.7362 - acc: 0.0000e+00 - val_loss: 0.6734 - val_acc: 0.0000e+00
Epoch 8/50
1/1 - 0s - loss: 0.6950 - acc: 0.0000e+00 - val_loss: 0.7262 - val_acc: 0.0000e+00
Epoch 9/50
1/1 - 0s - loss: 0.7597 - acc: 0.0000e+00 - val_loss: 0.9009 - val_acc: 0.0000e+00
Epoch 10/50
1/1 - 0s - loss: 1.0559 - acc: 0.0000e+00 - val_loss: 0.8252 - val_acc: 0.0000e+00
Epoch 11/50
1/1 - 0s - loss: 0.9358 - acc: 0.0000e+00 - val_loss: 0.6823 - val_acc: 0.0000e+00
Epoch 12/50
1/1 - 0s - loss: 0.7335 - acc: 0.0000e+00 - val_loss: 0.6319 - val_acc: 0.0000e+00
Epoch 13/50
1/1 - 0s - loss: 0.5645 - acc: 0.0000e+00 - val_loss: 0.6511 - val_acc: 0.0000e+00
Epoch 14/50
1/1 - 0s - loss: 0.6124 - acc: 0.0000e+00 - val_loss: 0.6859 - val_acc: 0.0000e+00
Epoch 15/50
1/1 - 0s - loss: 0.6148 - acc: 0.0000e+00 - val_loss: 0.7000 - val_acc: 0.0000e+00
Epoch 16/50
1/1 - 0s - loss: 0.6342 - acc: 0.0000e+00 - val_loss: 0.6994 - val_acc: 0.0000e+00
Epoch 17/50
1/1 - 0s - loss: 0.6002 - acc: 0.0000e+00 - val_loss: 0.6820 - val_acc: 0.0000e+00
Epoch 18/50
1/1 - 0s - loss: 0.5741 - acc: 0.0000e+00 - val_loss: 0.6816 - val_acc: 0.0000e+00
Epoch 19/50
1/1 - 0s - loss: 0.5270 - acc: 0.0000e+00 - val_loss: 0.6888 - val_acc: 0.0000e+00
Epoch 20/50
1/1 - 0s - loss: 0.5358 - acc: 0.0000e+00 - val_loss: 0.7120 - val_acc: 0.0000e+00
Epoch 21/50
1/1 - 0s - loss: 0.5582 - acc: 0.0000e+00 - val_loss: 0.7397 - val_acc: 0.0000e+00
Epoch 22/50
1/1 - 0s - loss: 0.6449 - acc: 0.0000e+00 - val_loss: 0.7800 - val_acc: 0.0000e+00
Epoch 23/50
1/1 - 0s - loss: 0.5558 - acc: 0.0000e+00 - val_loss: 0.6514 - val_acc: 0.0000e+00
Epoch 24/50
1/1 - 0s - loss: 0.5180 - acc: 0.0000e+00 - val_loss: 0.6666 - val_acc: 0.0000e+00
Epoch 25/50
1/1 - 0s - loss: 0.5070 - acc: 0.0000e+00 - val_loss: 0.6780 - val_acc: 0.0000e+00
Epoch 26/50
1/1 - 0s - loss: 0.5492 - acc: 0.0000e+00 - val_loss: 0.7167 - val_acc: 0.0000e+00
Epoch 27/50
1/1 - 0s - loss: 0.5637 - acc: 0.0000e+00 - val_loss: 0.7343 - val_acc: 0.0000e+00
Epoch 28/50
1/1 - 0s - loss: 0.6122 - acc: 0.0000e+00 - val_loss: 0.7493 - val_acc: 0.0000e+00
Epoch 29/50
1/1 - 0s - loss: 0.6122 - acc: 0.0000e+00 - val_loss: 0.7420 - val_acc: 0.0000e+00
Epoch 30/50
1/1 - 0s - loss: 0.5376 - acc: 0.0000e+00 - val_loss: 0.7160 - val_acc: 0.0000e+00
Epoch 31/50
1/1 - 0s - loss: 0.5866 - acc: 0.0000e+00 - val_loss: 0.6874 - val_acc: 0.0000e+00
Epoch 32/50
1/1 - 0s - loss: 0.5429 - acc: 0.0000e+00 - val_loss: 0.6786 - val_acc: 0.0000e+00
Epoch 33/50
1/1 - 0s - loss: 0.6190 - acc: 0.0000e+00 - val_loss: 0.6995 - val_acc: 0.0000e+00
Epoch 34/50
1/1 - 0s - loss: 0.5032 - acc: 0.0000e+00 - val_loss: 0.7124 - val_acc: 0.0000e+00
Epoch 35/50
1/1 - 0s - loss: 0.5199 - acc: 0.0000e+00 - val_loss: 0.7479 - val_acc: 0.0000e+00
Epoch 36/50
1/1 - 0s - loss: 0.5283 - acc: 0.0000e+00 - val_loss: 0.7146 - val_acc: 0.0000e+00
Epoch 37/50
1/1 - 0s - loss: 0.5227 - acc: 0.0000e+00 - val_loss: 0.6946 - val_acc: 0.0000e+00
Epoch 38/50
1/1 - 0s - loss: 0.4818 - acc: 0.0000e+00 - val_loss: 0.6404 - val_acc: 0.0000e+00
Epoch 39/50
1/1 - 0s - loss: 0.4635 - acc: 0.0000e+00 - val_loss: 0.6060 - val_acc: 0.0000e+00
Epoch 40/50
1/1 - 0s - loss: 0.4405 - acc: 0.0000e+00 - val_loss: 0.6063 - val_acc: 0.0000e+00
Epoch 41/50
1/1 - 0s - loss: 0.4395 - acc: 0.0000e+00 - val_loss: 0.6093 - val_acc: 0.0000e+00
Epoch 42/50
1/1 - 0s - loss: 0.4369 - acc: 0.0000e+00 - val_loss: 0.6352 - val_acc: 0.0000e+00
Epoch 43/50
1/1 - 0s - loss: 0.4165 - acc: 0.0000e+00 - val_loss: 0.6561 - val_acc: 0.0000e+00
Epoch 44/50
1/1 - 0s - loss: 0.3748 - acc: 0.0000e+00 - val_loss: 0.6558 - val_acc: 0.0000e+00
Epoch 45/50
1/1 - 0s - loss: 0.3929 - acc: 0.0000e+00 - val_loss: 0.6668 - val_acc: 0.0000e+00
Epoch 46/50
1/1 - 0s - loss: 0.3927 - acc: 0.0000e+00 - val_loss: 0.6897 - val_acc: 0.0000e+00
Epoch 47/50
1/1 - 0s - loss: 0.3800 - acc: 0.0000e+00 - val_loss: 0.7172 - val_acc: 0.0000e+00
Epoch 48/50
1/1 - 0s - loss: 0.3637 - acc: 0.0000e+00 - val_loss: 0.7382 - val_acc: 0.0000e+00
Epoch 49/50
1/1 - 0s - loss: 0.3720 - acc: 0.0000e+00 - val_loss: 0.7558 - val_acc: 0.0000e+00
Epoch 50/50
1/1 - 0s - loss: 0.3790 - acc: 0.0000e+00 - val_loss: 0.8127 - val_acc: 0.0000e+00
1/1 [==============================] - 0s 487us/step - loss: 0.3245 - acc: 0.0000e+00
1/1 [==============================] - 0s 266us/step - loss: 0.8127 - acc: 0.0000e+00

Train Set Metrics of the trained model:
        loss: 0.3245
        acc: 0.0000

Test Set Metrics of the trained model:
        loss: 0.8127
        acc: 0.0000

bug sg-library doc

opened by davidho27941 14

Link prediction comparison demo between Node2Vec, Attri2Vec, GraphSAGE and GCN

Following @huonw's advice, I added a link prediction demo to compare link prediction performance on the Cora dataset under the same edge train-test-split setting.

The results show that Attri2Vec performs best, while GraphSAGE and Node2Vec perform comparably.
sg-library

opened by daokunzhang 14

Store adj dictionaries as one contiguous np array

This PR stores adj lists in one contiguous NumPy array.

For each adj dict, this PR instead stores the edge ilocs in one contiguous numpy array sorted by the index (source/target node for in/out dicts). The edge ilocs associated with a node are then accessed by:

mapping a node_iloc -> start_index, stop_index
looking up the flat array: flat[start_index:stop_index]
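The two-step lookup described above can be sketched in plain Python. This is only an illustration of the layout, not the code from the PR: the function names, the `offsets` list and the use of Python lists instead of NumPy arrays are assumptions made to keep the example self-contained.

```python
def build_flat_adjacency(num_nodes, edge_sources):
    """Group edge ilocs by source node into one flat list plus per-node offsets."""
    # Count how many edges start at each node.
    counts = [0] * num_nodes
    for src in edge_sources:
        counts[src] += 1
    # offsets[n]:offsets[n + 1] is the slice of `flat` that belongs to node n.
    offsets = [0] * (num_nodes + 1)
    for node in range(num_nodes):
        offsets[node + 1] = offsets[node] + counts[node]
    # Place each edge iloc into its node's slice (a counting sort by source node).
    flat = [0] * len(edge_sources)
    cursor = offsets[:-1].copy()
    for edge_iloc, src in enumerate(edge_sources):
        flat[cursor[src]] = edge_iloc
        cursor[src] += 1
    return flat, offsets


def edge_ilocs_for(node_iloc, flat, offsets):
    """Map node_iloc -> (start_index, stop_index), then slice the flat array."""
    start_index, stop_index = offsets[node_iloc], offsets[node_iloc + 1]
    return flat[start_index:stop_index]


# Hypothetical toy graph: edge i starts at node edge_sources[i].
edge_sources = [2, 0, 2, 1, 0]
flat, offsets = build_flat_adjacency(3, edge_sources)
```

With this layout every node's edges sit next to each other in memory, and a lookup is two index reads and one slice rather than a dictionary access.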

====== Memory

Measuring the memory of FB15k in MiB for develop and this PR for different adj caches:

| Adj Lists  | Develop | PR   |
| ---------- |:-------:| ----:|
| None       | 7.3     | 7.3  |
| Directed   | 14.2    | 11.9 |
| Undirected | 16.7    | 12.1 |

Going by these totals, memory use is about 1.2x smaller with directed adj lists and about 1.4x smaller with undirected adj lists.

A nice property of this PR is that undirected adj lists, directed adj lists, and the edge lists now all use approximately the same amount of memory. (This doesn't apply if the node_ilocs and edge_ilocs are different types.)

====== Neighbour lookup

PR benchmark

--------------------------------------------------------------------------------------- benchmark 'StellarGraph neighbours': 2 tests --------------------------------------------------------------------------------------
Name (time in us)                               Min                   Max                  Mean              StdDev                Median                 IQR            Outliers         OPS            Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_get_neighbours[True]        649.1660 (1.0)      2,333.2520 (1.0)        727.6934 (1.0)       86.5430 (1.0)        692.0295 (1.0)       41.2615 (1.0)       198;236  1,374.2051 (1.0)        1352           1
test_benchmark_get_neighbours[False]     7,630.3470 (11.75)    8,945.7340 (3.83)     8,079.2179 (11.10)    256.9341 (2.97)     8,024.1330 (11.60)    275.9956 (6.69)         26;8    123.7744 (0.09)        116           1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Develop benchmark

--------------------------------------------------------------------------------------- benchmark 'StellarGraph neighbours': 2 tests ---------------------------------------------------------------------------------------
Name (time in us)                               Min                    Max                  Mean              StdDev                Median                 IQR            Outliers         OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_get_neighbours[True]        694.8380 (1.0)       2,825.8630 (1.0)        746.4933 (1.0)      113.9017 (1.0)        716.6200 (1.0)       28.2870 (1.0)        84;182  1,339.5968 (1.0)        1058           1
test_benchmark_get_neighbours[False]     6,867.9860 (9.88)     10,783.6280 (3.82)     7,280.8026 (9.75)     521.6676 (4.58)     7,174.2410 (10.01)    251.7290 (8.90)          7;9    137.3475 (0.10)        129           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

The lack of difference in the get_neighbours benchmarks is unsurprising because the adj list lookup only accounts for a small fraction of the time in .neighbors.

====== Adj list lookup

Running:

import stellargraph as sg

graph, *_ = sg.datasets.FB15k().load()
_ = graph.neighbors(0, use_ilocs=True)
%timeit -n 5 -r 5 _ = [graph._edges._edges_dict[i] for i in range(len(graph.nodes()))]

Yields:

19.6 ms ± 1.06 ms per loop (mean ± std. dev. of 5 runs, 5 loops each) on develop
12.2 ms ± 853 µs per loop (mean ± std. dev. of 5 runs, 5 loops each) on this PR

Surprisingly (to me at least!), this PR is about 1.6x as fast for adj list lookups.

====== Construction Time

Comparing the construction time of test_benchmark_creation[both-100-1000-5000] shows that this PR is ~1.7x as fast (the first row below is this PR).

test_benchmark_creation[both-100-1000-5000]            11.9162 (1.76)     16.6076 (1.75)     13.3288 (1.70)     0.8721 (2.24)     13.2245 (1.73)     0.9719 (3.23)         20;3   75.0255 (0.59)         69           1

Develop

test_benchmark_creation[both-100-1000-5000]            19.6726 (2.79)     25.4567 (2.70)     22.1017 (2.82)     1.6570 (4.12)     22.2158 (2.87)     2.8661 (8.46)         20;0   45.2454 (0.35)         49           1

For reference, on this PR the minimal construction of the above graph takes:

test_benchmark_creation[None-100-1000-5000]             9.8109 (1.41)     12.3400 (1.34)     10.4754 (1.43)     0.5610 (1.76)     10.2614 (1.42)     0.5045 (2.31)         13;6   95.4618 (0.70)         80           1

This suggests that the average time for creating all 3 adj caches went from ~12 ms to ~3 ms, a ~4x speedup.
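The arithmetic behind these figures can be reproduced from the mean column of the benchmark rows. The sketch below makes two assumptions: the times are in milliseconds (the OPS column is consistent with 1/mean in seconds), and the minimal [None-100-1000-5000] run on this PR is a fair estimate of the non-cache cost on both branches.

```python
# Mean times copied from the benchmark rows above (assumed to be milliseconds).
pr_both = 13.3288       # this PR, all adj caches built
develop_both = 22.1017  # develop, all adj caches built
pr_none = 10.4754       # this PR, no adj caches built

# Overall construction speedup of this PR over develop.
overall_speedup = develop_both / pr_both

# Time attributable to building the adj caches alone, assuming the
# cache-free construction cost is the same on both branches.
pr_cache_time = pr_both - pr_none
develop_cache_time = develop_both - pr_none
cache_speedup = develop_cache_time / pr_cache_time

print(f"overall: {overall_speedup:.2f}x")
print(f"adj caches: {develop_cache_time:.1f} -> {pr_cache_time:.1f}, {cache_speedup:.1f}x")
```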

opened by kieranricardo 14

Create an explicit API for handling node ilocs
This PR makes a number of changes to support iloc usage:

the nodes are stored as ilocs in StellarGraph._edges instead of node ids

where possible, StellarGraph functions now have a use_ilocs argument (default False); when it is True, any nodes passed to the function are treated as ilocs and node ilocs are returned.

StellarGraph_get_index_for_nodes will be made public

tests for use_ilocs=True
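The idea behind the list above can be illustrated with a toy graph. This is a sketch of the concept only: NodeIndex and ToyGraph are hypothetical classes written for this example and are not part of StellarGraph; only the use_ilocs keyword mirrors the argument described in the PR.

```python
class NodeIndex:
    """Hypothetical helper mapping external node IDs to integer locations (ilocs)."""

    def __init__(self, node_ids):
        self._ids = list(node_ids)
        self._iloc_of = {node_id: iloc for iloc, node_id in enumerate(self._ids)}

    def to_ilocs(self, node_ids):
        return [self._iloc_of[node_id] for node_id in node_ids]

    def to_ids(self, ilocs):
        return [self._ids[iloc] for iloc in ilocs]


class ToyGraph:
    """Hypothetical undirected graph that stores edges as ilocs internally."""

    def __init__(self, node_ids, edges):
        self._index = NodeIndex(node_ids)
        self._adj = {iloc: [] for iloc in range(len(self._index._ids))}
        for source, target in edges:
            source_iloc, target_iloc = self._index.to_ilocs([source, target])
            self._adj[source_iloc].append(target_iloc)
            self._adj[target_iloc].append(source_iloc)

    def neighbors(self, node, use_ilocs=False):
        # With use_ilocs=True the input is already an iloc and ilocs are returned,
        # so no ID <-> iloc conversion is needed in either direction.
        node_iloc = node if use_ilocs else self._index.to_ilocs([node])[0]
        neighbour_ilocs = list(self._adj[node_iloc])
        return neighbour_ilocs if use_ilocs else self._index.to_ids(neighbour_ilocs)


graph = ToyGraph(["a", "b", "c"], [("a", "b"), ("a", "c")])
by_id = graph.neighbors("a")
by_iloc = graph.neighbors(0, use_ilocs=True)
```

Callers that already work with ilocs, such as samplers, can skip both conversions by passing use_ilocs=True.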

To keep the scope of this PR small, this PR does not change any samplers/generators. Many of these could be updated to natively use ilocs by passing use_ilocs=True to StellarGraph functions.

See #870
opened by kieranricardo 14
Store features as TensorFlow tensors, not NumPy arrays
This PR stores features as a dictionary of Tensors instead of a dictionary of np.arrays, and uses tf.nn.embedding_lookup to access the features.

Testing with GraphSAGE on Cora on my machine (macOS + 9th gen quad-core i7) with the following hyperparameters, the training time decreases from 10s to 3s per epoch.

batch_size = 200
num_samples = [20, 20]
generator = GraphSAGENodeGenerator(G, batch_size, num_samples)
train_gen = generator.flow(node_subjects.index, targets, shuffle=True)
graphsage = GraphSAGE(
    generator=generator,
    layer_sizes=[32, 32],
    activations=["relu", "relu"],
)

~See #1168~ See: #1228
opened by kieranricardo 14
Bump setuptools from 46.1.3 to 65.5.1 in /docs
Bumps setuptools from 46.1.3 to 65.5.1.

Release notes

Sourced from setuptools's releases.

No release notes provided for v65.5.1, v65.5.0, v65.4.1, v65.4.0, v65.3.0, v65.2.0, v65.1.1, v65.1.0, v65.0.2, v65.0.1, v65.0.0, v64.0.3, v64.0.2, v64.0.1, v64.0.0, v63.4.3 or v63.4.2.

... (truncated)

Changelog

Sourced from setuptools's changelog.

v65.5.1

Misc

#3638: Drop a test dependency on the mock package, always use unittest.mock (by hroncok)

#3659: Fixed REDoS vector in package_index.

v65.5.0

Changes

#3624: Fixed editable install for multi-module/no-package src-layout projects.

#3626: Minor refactorings to support distutils using stdlib logging module.

Documentation changes

#3419: Updated the example version numbers to be compliant with PEP-440 on the "Specifying Your Project’s Version" page of the user guide.

Misc

#3569: Improved information about conflicting entries in the current working directory and editable install (in documentation and as an informational warning).

#3576: Updated version of validate_pyproject.

v65.4.1

Misc

#3613: Fixed encoding errors in expand.StaticModule when system default encoding doesn't match expectations for source files.

#3617: Merge with pypa/distutils@6852b20 including fix for pypa/distutils#181.

v65.4.0

Changes

#3609: Merge with pypa/distutils@d82d926 including support for DIST_EXTRA_CONFIG in pypa/distutils#177.

v65.3.0

... (truncated)

Commits

a462cb5 Bump version: 65.5.0 → 65.5.1

de35d8b Merge pull request #3656 from bmorris3/typos

58e23de Update changelog. Ref #3659.

43a9c9b Limit the amount of whitespace to search/backtrack. Fixes #3659.

5791343 Add test capturing failed expectation. Ref #3659.

1f97905 ⚫ Fade to black.

6254567 Remove workaround for emacs.

729b180 ⚫ Fade to black.

c068081 Typo corrections

f777a40 Suppress deprecation warning in --rsyncdir. Workaround for #3655.

Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR

@dependabot recreate will recreate this PR, overwriting any edits that have been made to it

@dependabot merge will merge this PR after your CI passes on it

@dependabot squash and merge will squash and merge this PR after your CI passes on it

@dependabot cancel merge will cancel a previously requested merge and block automerging

@dependabot reopen will reopen this PR if it is closed

@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually

@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

@dependabot use these labels will set the current labels as the default for future PRs for this repo and language

@dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language

@dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language

@dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the Security Alerts page.

dependencies
opened by dependabot[bot] 2
The first half of the embeddings learned have small variances using the unsupervised GraphSAGE model

Hi everyone,

I followed the excellent tutorial "Node representation learning with GraphSAGE and UnsupervisedSampler" on a couple of my own data sets to learn embeddings for all the nodes in the graph.

When I used the MeanAggregator, I noticed that the variances of the first half of the embeddings obtained from the model were really low compared to the rest of the embeddings.

I'm wondering if anyone else has observed this (I observed the same thing across multiple datasets) and whether this is expected behaviour for the model.

Thanks all, Ruqian
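For anyone wanting to reproduce the observation, one way to check it is to compute the variance of each embedding dimension and compare the two halves. The sketch below assumes that "first half" refers to the first half of the embedding dimensions; the function names and the toy embedding values are hypothetical and are not output from a real model.

```python
from statistics import pvariance


def per_dimension_variance(embeddings):
    """Population variance of each embedding dimension across all nodes."""
    # zip(*rows) transposes the node-by-dimension matrix into columns.
    return [pvariance(column) for column in zip(*embeddings)]


def half_variance_summary(embeddings):
    """Mean per-dimension variance of the first and second half of the dimensions."""
    variances = per_dimension_variance(embeddings)
    half = len(variances) // 2
    first, second = variances[:half], variances[half:]
    return sum(first) / len(first), sum(second) / len(second)


# Hypothetical embeddings: 3 nodes, 4 dimensions.
embeddings = [
    [0.10, 0.11, 1.0, -2.0],
    [0.11, 0.10, -1.0, 2.0],
    [0.10, 0.10, 3.0, 0.0],
]
first_half_mean, second_half_mean = half_variance_summary(embeddings)
```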

opened by ruqianl 0
On stable version a documentation page is broken
Describe the bug

The stable version of documentation does not provide StellarGraph API information.

To Reproduce

Steps to reproduce the behavior:

Go to StellarGraph API

The page provides no information

Observed behavior

No information provided on API page

Expected behavior

API information should be shown, as on any other version (e.g., on the dev version https://stellargraph.readthedocs.io/en/latest/api.html)

Environment

Operating system: Windows 10

Additional context

Screenshots: https://ctrl.vi/i/h0bDdPUig
bug sg-library
opened by AmadeusZhang 0
Interpretability options, supervised graph classification
Hi, thanks for the constantly expanding stellargraph library!

Description

I am looking for a feature to obtain node/link/subgraph importance for stellargraph-based supervised graph classification tasks. stellargraph.interpretability.saliency_maps does not seem to be compatible with the PaddedGraphGenerator yet. Also, GNNExplainer and SubgraphX are available for PyTorch models only.

User Story

I am working in the biomedical field and I trained a DeepGraphCNN model on molecular graph structures of two different classes. I achieve promising prediction results but have no way to further interpret the classification of disease-relevant molecular structures. I would hence greatly appreciate it if an interpretability feature could be implemented in the stellargraph library for supervised graph classification models. Thanks!

Done Checklist

[ ] Produced code for required functionality

[ ] Tests written and coverage checked

[ ] Code review performed

[ ] Documentation on Google Docs (if applicable)

[ ] Documentation in repo

[ ] Version number reflects new status

[ ] CHANGELOG.md updated

[ ] Team demo

enhancement sg-library
opened by mmpust 0
Bump pillow from 7.1.2 to 9.3.0 in /docs
Bumps pillow from 7.1.2 to 9.3.0.

Release notes

Sourced from pillow's releases.

9.3.0

https://pillow.readthedocs.io/en/stable/releasenotes/9.3.0.html

Changes

Initialize libtiff buffer when saving #6699 [@radarhere]

Limit SAMPLESPERPIXEL to avoid runtime DOS #6700 [@wiredfool]

Inline fname2char to fix memory leak #6329 [@nulano]

Fix memory leaks related to text features #6330 [@nulano]

Use double quotes for version check on old CPython on Windows #6695 [@hugovk]

GHA: replace deprecated set-output command with GITHUB_OUTPUT file #6697 [@nulano]

Remove backup implementation of Round for Windows platforms #6693 [@cgohlke]

Upload fribidi.dll to GitHub Actions #6532 [@nulano]

Fixed set_variation_by_name offset #6445 [@radarhere]

Windows build improvements #6562 [@nulano]

Fix malloc in _imagingft.c:font_setvaraxes #6690 [@cgohlke]

Only use ASCII characters in C source file #6691 [@cgohlke]

Release Python GIL when converting images using matrix operations #6418 [@hmaarrfk]

Added ExifTags enums #6630 [@radarhere]

Do not modify previous frame when calculating delta in PNG #6683 [@radarhere]

Added support for reading BMP images with RLE4 compression #6674 [@npjg]

Decode JPEG compressed BLP1 data in original mode #6678 [@radarhere]

pylint warnings #6659 [@marksmayo]

Added GPS TIFF tag info #6661 [@radarhere]

Added conversion between RGB/RGBA/RGBX and LAB #6647 [@radarhere]

Do not attempt normalization if mode is already normal #6644 [@radarhere]

Fixed seeking to an L frame in a GIF #6576 [@radarhere]

Consider all frames when selecting mode for PNG save_all #6610 [@radarhere]

Don't reassign crc on ChunkStream close #6627 [@radarhere]

Raise a warning if NumPy failed to raise an error during conversion #6594 [@radarhere]

Only read a maximum of 100 bytes at a time in IMT header #6623 [@radarhere]

Show all frames in ImageShow #6611 [@radarhere]

Allow FLI palette chunk to not be first #6626 [@radarhere]

If first GIF frame has transparency for RGB_ALWAYS loading strategy, use RGBA mode #6592 [@radarhere]

Round box position to integer when pasting embedded color #6517 [@radarhere]

Removed EXIF prefix when saving WebP #6582 [@radarhere]

Pad IM palette to 768 bytes when saving #6579 [@radarhere]

Added DDS BC6H reading #6449 [@ShadelessFox]

Added support for opening WhiteIsZero 16-bit integer TIFF images #6642 [@JayWiz]

Raise an error when allocating translucent color to RGB palette #6654 [@jsbueno]

Moved mode check outside of loops #6650 [@radarhere]

Added reading of TIFF child images #6569 [@radarhere]

Improved ImageOps palette handling #6596 [@PososikTeam]

Defer parsing of palette into colors #6567 [@radarhere]

Apply transparency to P images in ImageTk.PhotoImage #6559 [@radarhere]

Use rounding in ImageOps contain() and pad() #6522 [@bibinhashley]

Fixed GIF remapping to palette with duplicate entries #6548 [@radarhere]

Allow remap_palette() to return an image with less than 256 palette entries #6543 [@radarhere]

Corrected BMP and TGA palette size when saving #6500 [@radarhere]

... (truncated)

Changelog

Sourced from pillow's changelog.

9.3.0 (2022-10-29)

Limit SAMPLESPERPIXEL to avoid runtime DOS #6700 [wiredfool]

Initialize libtiff buffer when saving #6699 [radarhere]

Inline fname2char to fix memory leak #6329 [nulano]

Fix memory leaks related to text features #6330 [nulano]

Use double quotes for version check on old CPython on Windows #6695 [hugovk]

Remove backup implementation of Round for Windows platforms #6693 [cgohlke]

Fixed set_variation_by_name offset #6445 [radarhere]

Fix malloc in _imagingft.c:font_setvaraxes #6690 [cgohlke]

Release Python GIL when converting images using matrix operations #6418 [hmaarrfk]

Added ExifTags enums #6630 [radarhere]

Do not modify previous frame when calculating delta in PNG #6683 [radarhere]

Added support for reading BMP images with RLE4 compression #6674 [npjg, radarhere]

Decode JPEG compressed BLP1 data in original mode #6678 [radarhere]

Added GPS TIFF tag info #6661 [radarhere]

Added conversion between RGB/RGBA/RGBX and LAB #6647 [radarhere]

Do not attempt normalization if mode is already normal #6644 [radarhere]

... (truncated)

Commits

d594f4c Update CHANGES.rst [ci skip]

909dc64 9.3.0 version bump

1a51ce7 Merge pull request #6699 from hugovk/security-libtiff_buffer

2444cdd Merge pull request #6700 from hugovk/security-samples_per_pixel-sec

744f455 Added release notes

0846bfa Add to release notes

799a6a0 Fix linting

00b25fd Hide UserWarning in logs

05b175e Tighter test case

13f2c5a Prevent DOS with large SAMPLESPERPIXEL in Tiff IFD

Additional commits viewable in compare view

dependencies
opened by dependabot[bot] 2
Pandas object type cannot be represented as a node feature
Describe the bug

I created a DataFrame in which one series has object type. I cannot use the values of this object-type series as a node feature. I'm initialising StellarGraph with the following code, using my custom dataset (attached as a screenshot below).

square_node_data = pd.DataFrame(
    {'of': nf_obj, 'nf': nf_num}, index=from_node_ids
)
square_node_features = StellarGraph(square_node_data, square_edges)

StellarGraph initialisation gives the following error:

ValueError: could not convert string to float: 'object_feature_A'

It seems that features of NumPy object type are not supported. I want to confirm this because I couldn't find the answer in the documentation. Maybe there is another way of providing that feature type 🤔
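The error message suggests the node features are converted to floats, so one possible workaround (an assumption, not something confirmed in this thread) is to encode the string column numerically before building the graph. The sketch below one-hot encodes the values in plain Python; one_hot_encode and the sample values are hypothetical, and on a DataFrame pandas.get_dummies offers a comparable encoding.

```python
def one_hot_encode(values):
    """Return the sorted categories and one 0.0/1.0 row per input value."""
    categories = sorted(set(values))
    position = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0.0] * len(categories)
        row[position[value]] = 1.0
        rows.append(row)
    return categories, rows


# Hypothetical stand-ins for the nf_obj and nf_num columns in the report.
nf_obj = ["object_feature_A", "object_feature_B", "object_feature_A"]
nf_num = [0.5, 1.5, 2.5]

categories, encoded = one_hot_encode(nf_obj)
# Each node now has a purely numeric feature vector: one-hot columns plus nf_num.
numeric_features = [one_hot + [number] for one_hot, number in zip(encoded, nf_num)]
```

The resulting rows contain only floats, which is the shape of data the float conversion in the error message expects.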
bug sg-library
opened by nevzatseferoglu 0

Releases (v1.2.1)

v1.2.1 (Jun 30, 2020)
StellarGraph is a Python library for machine learning on graphs and networks. It offers state-of-the-art algorithms for graph machine learning, making it easy to discover patterns and answer questions about graph-structured data.

Get started with StellarGraph's newest graph machine learning features with pip install stellargraph.

This release is a small bug fix release on top of 1.2.0.

Bug fixes and other changes

Update the URLs of some datasets (Cora, PubMedDiabetes, CiteSeer) for upstream changes #1738, #1759

Add two missing layers to the stellargraph.custom_keras_layers dictionary #1757

Experimental changes: rename RotHEScoring to RotHEScore #1756

DevOps:

Automated testing on macOS #1752

Automated testing against Neo4j 4.1, in addition to Neo4j 3.5 and 4.0 #1754

Other CI: #1732, #1740, #1741, #1744, #1745, #1747, #1750, #1751, #1753

Full Changelog
Source code(tar.gz)
Source code(zip)
v1.2.0 (Jun 25, 2020)
Jump in to this release, with the new and improved demos and examples:

Comparison of link prediction with random walks based node embedding

Unsupervised training of a Cluster-GCN model with Deep Graph Infomax

Major features and improvements

Better Windows support: StellarGraph's existing ability to run on Windows has been improved, with all tests running on CI (#1696) and several small fixes (#1671, #1704, #1705).

Edge weights are supported in GraphSAGE (#1667) and Watch Your Step (#1604). This is in addition to the existing support for edge weights in GCN, GAT, APPNP, PPNP, RGCN, GCN graph classification, DeepGraphCNN and Node2Vec sampling.

Better and more demonstration notebooks and documentation to make the library more accessible to new and existing users:

A demo notebook for a comparison of link prediction with random walks based node embedding, showing Node2Vec, Attri2Vec, GraphSAGE and GCN #1658

The demo notebook for unsupervised training with Deep Graph Infomax has been expanded with more explanation and links #1257

The documentation for models, generators and other elements now has many more links to other relevant items in a "See also" box, making it easier to fit pieces together (examples: GraphSAGE, GraphSAGENodeGenerator, BiasedRandomWalk) #1718

The Cluster-GCN training procedure supports unsupervised training via Deep Graph Infomax; this allows for scalable training of GCN, APPNP and GAT models, and includes connecting to Neo4j for large graphs demo (#1257)

KGTripleGenerator now supports the self-adversarial negative sampling training procedure for knowledge graph algorithms (from RotatE), via generator.flow(..., sample_strategy="self-adversarial") docs

Deprecations

The ClusterGCN model has been replaced with the GCN class. In the previous 1.1.0 release, GCN, APPNP and GAT were generalised to support the Cluster-GCN training procedure via ClusterNodeGenerator (which includes Neo4j support). The ClusterGCN model is now redundant and thus is deprecated: however, it still works without behaviour change.

Experimental features

Some new algorithms and features are still under active development, and are available as an experimental preview. However, they may not be easy to use: their documentation or testing may be incomplete, and they may change dramatically from release to release. The experimental status is noted in the documentation and at runtime via prominent warnings.

RotE, RotH: knowledge graph link prediction algorithms that combine TransE and RotatE in Euclidean or hyperbolic space, respectively #1539

Bug fixes and other changes

There are now tests for saving and loading a Keras Model constructed from every model in StellarGraph #1676. This includes fixes for some models (#1677, #1682). Known issues: sparse models such as GCN and RGCN (see #1251 for more info and a work-around using tf-nightly), experimental GCN-LSTM (#1681).

Various documentation, demo and error message fixes and improvements: better internal linking #1404, automated spell checking #1583, #1663, #1665, #1684, improved rendering #1722 including a better sidebar #1512, #1729, #1730

DevOps changes:

CI has been moved from Buildkite to GitHub Actions (tracking issue: #1687; pull requests: #1688, #1692, #1690, #1691, #1693, #1694, #1701, #1707, #1712, #1714, #1715, #1717, #1719)

CI: #1655, #1672, #1673, #1679, #1710, #1721, #1724

v1.1.0 (Jun 2, 2020)
Jump in to this release, with the new and improved demos and examples:

Neo4j graph database support: Cluster-GCN, GraphSAGE, all demos

Semi-supervised node classification via GCN, Deep Graph Infomax and fine-tuning

Loading data into StellarGraph from NumPy

Link prediction with Metapath2Vec

Unsupervised graph classification/representation learning via distances

RGCN section of Node representation learning with Deep Graph Infomax

Node2Vec with StellarGraph components: representation learning, node classification

Expanded Attri2Vec explanation: representation learning, node classification, link prediction

Major features and improvements

Support for the Neo4j graph database has been significantly improved:

There is now a Neo4jStellarGraph class that packages up a connection to a Neo4j instance, and allows it to be used for machine learning algorithms including the existing Neo4j and GraphSAGE functionality demo, #1595, #1598.

The ClusterNodeGenerator class now supports Neo4jStellarGraph in addition to the in-memory StellarGraph class, allowing it to be used to train models like GCN and GAT with data stored entirely in Neo4j demo (#1561, #1594, #1613)

Better and more demonstration notebooks and documentation to make the library more accessible to new and existing users:

There is now a glossary that explains some terms specific to graphs, machine learning and graph machine learning #1570

A new demo notebook for semi-supervised node classification using Deep Graph Infomax and GCN #1587

A new demo notebook for link prediction using the Metapath2Vec algorithm #1614

New algorithms:

Unsupervised graph representation learning demo (#1626)

Unsupervised RGCN with Deep Graph Infomax demo (#1258)

Native Node2Vec using TensorFlow Keras, not the gensim library, demo of representation learning, demo of node classification (#536, #1566)

The ClusterNodeGenerator class can be used to train GCN, GAT, APPNP and PPNP models in addition to the ClusterGCN model #1585

The StellarGraph class continues to get smaller, faster and more flexible:

Node features can now be specified as NumPy arrays or the newly added thin IndexedArray wrapper, which does no copies and has minimal runtime overhead demo (#1535, #1556, #1599). They can also now be multidimensional for each node #1561.

Edges can now have features, taken as any extra/unused columns in the input DataFrames demo (#1574)

Adjacency lists used for random walks and GraphSAGE/HinSAGE are constructed with NumPy and stored as contiguous arrays instead of dictionaries, cutting the time and memory of construction by an order of magnitude #1296

The peak memory usage of construction and adjacency list building is now monitored to ensure that there are not large spikes for large graphs, that exceed available memory #1546. This peak usage has thus been optimised: #1551,

Other optimisations: the edge_arrays, neighbor_arrays, in_node_arrays and out_node_arrays methods have been added, reducing time and memory overhead by leaving data as its underlying NumPy array #1253; the node_type method now supports multiple nodes as input, making algorithms like HinSAGE and Metapath2Vec much faster #1452; the default edge weight of 1 no longer consumes significant memory #1610.

Overall performance and memory usage improvements since 1.0.0, in numbers:

A reddit graph has 233 thousand nodes and 11.6 million edges:

construction without node features is now 2.3× faster, uses 31% less memory and has a memory peak 57% smaller.

construction with node features from NumPy arrays is 6.8× faster, uses 6.5% less memory overall and 85% less new memory (the majority of the memory is shared with the original NumPy arrays), and has a memory peak (above the raw data set) 70% smaller, compared to Pandas DataFrames in 1.0.0.

adjacency lists are 4.7-5.0× faster to construct, use 28% less memory and have a memory peak 60% smaller.

Various random walkers are faster: BiasedRandomWalk is up to 30× faster with weights and 5× faster without weights on MovieLens and up to 100× faster on some synthetic datasets, UniformRandomMetapathWalk is up to 17× faster (on MovieLens), UniformRandomWalk is up to 1.4× faster (on MovieLens).

TensorFlow 2.2 and thus Python 3.8 are now supported #1278

Experimental features

Some new algorithms and features are still under active development, and are available as an experimental preview. However, they may not be easy to use: their documentation or testing may be incomplete, and they may change dramatically from release to release. The experimental status is noted in the documentation and at runtime via prominent warnings.

RotatE: a knowledge graph link prediction algorithm that uses complex rotations (|z| = 1) to encode relations #1522

GCN_LSTM (renamed from GraphConvolutionLSTM): time series prediction on spatio-temporal data. It is still experimental, but has been improved since last release:

the SlidingFeaturesNodeGenerator class has been added to yield data appropriate for the model, straight from a StellarGraph instance containing time series data as node features #1564

the hidden graph convolution layers can now have a custom output size #1555

the model now supports multivariate input and output, including via the SlidingFeaturesNodeGenerator class (with multidimensional node features) #1580

unit tests have been added #1560

Neo4j support: some classes have been renamed from Neo4J... (uppercase J) to Neo4j... (lowercase j).
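To make the RotatE entry above more concrete: the relation embedding is a vector of unit-modulus complex numbers, so multiplying by it rotates each component of the subject embedding without changing its length. The sketch below is a minimal standard-library illustration with made-up two-dimensional embeddings. The function name `rotate_score` and the choice of summing complex moduli as the distance are assumptions for illustration, not StellarGraph's implementation.

```python
import cmath
import math


def rotate_score(subject, relation_phases, obj):
    """Negative distance between the rotated subject and the object.

    Hypothetical helper: each relation component is exp(i * phase), so
    |z| = 1 and multiplying by it is a pure rotation.
    """
    rotations = [cmath.exp(1j * phase) for phase in relation_phases]
    rotated = [s * r for s, r in zip(subject, rotations)]
    return -sum(abs(a - b) for a, b in zip(rotated, obj))


# Made-up 2-dimensional complex embeddings (illustration only).
subject = [1 + 0j, 0 + 2j]
phases = [math.pi / 2, math.pi]      # rotate by 90 and 180 degrees
matching_object = [0 + 1j, 0 - 2j]   # exactly the rotated subject
other_object = [1 + 0j, 0 + 2j]      # the unrotated subject

good = rotate_score(subject, phases, matching_object)
bad = rotate_score(subject, phases, other_object)
```

A triple whose object coincides with the rotated subject scores close to zero (the best possible value here), while any other object scores strictly lower.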

Bug fixes and other changes

Edge weights are supported in methods using FullBatchNodeGenerator (GCN, GAT, APPNP, PPNP), RelationalFullBatchNodeGenerator (RGCN) and PaddedGraphGenerator (GCN graph classification, DeepGraphCNN), via the weighted=True parameter #1600

The StellarGraph class now supports conversion between node type and edge type names and equivalent ilocs #1366, which allows optimising some algorithms (#1367 optimises ranking with the DistMult algorithm from 42.6s to 20.7s on the FB15k dataset)

EdgeSplitter no longer prints progress updates #1619

The info method now merges edge type triples like A-[r]->B and B-[r]->A in undirected graphs #1650

There is now a notebook capturing time and memory resource usage on non-synthetic datasets, designed to help StellarGraph contributors understand and optimise the StellarGraph class #1547

Various documentation, demo and error message fixes and improvements: #1516 (thanks @thatlittleboy), #1519, #1520, #1537, #1541, #1542, #1577, #1605, #1606, #1608, #1624, #1628, #1632, #1634, #1636, #1643, #1645, #1649, #1652

DevOps changes:

CI: #1518, tests are run regularly on a GPU #1249, #1647, #1653

Other: #1558

v1.0.0 (May 5, 2020)
This 1.0 release of StellarGraph is the culmination of three years of active research and engineering to deliver an open-source, user-friendly library for machine learning (ML) on graphs and networks.

Jump in to this release, with the new demos and examples:

More helpful indexing and guidance for demos in our API documentation

Loading from Neo4j

More explanatory Node2Vec link prediction

Unsupervised GraphSAGE and HinSAGE via DeepGraphInfomax

Graph classification with GCNSupervisedGraphClassification and with DeepGraphCNN

Time series prediction using spatial information, using GraphConvolutionLSTM (experimental)

Major features and improvements

Better demonstration notebooks and documentation to make the library more accessible to new and existing users:

Notebooks are now published in the API documentation, for better & faster rendering and more convenient access #1279 #1433 #1448

The demos indices and READMEs now contain more guidance and explanation to make it easier to find a relevant example #1200

Several demos have been added or rewritten: loading data from Neo4j #1184, link prediction using Node2Vec #1190, graph classification with GCN, graph classification with DGCNN

Notebooks now detect if they're being used with an incorrect version of the StellarGraph library, eliminating confusion about version mismatches #1242

Notebooks are easier to download, both individually via a button on each in the API documentation #1460 and in bulk #1377 #1459

Notebooks have been re-arranged and renamed to be more consistent and easier to find #1471

New algorithms:

GCNSupervisedGraphClassification: supervised graph classification model based on Graph Convolutional layers (GCN) #929, demo.

DeepGraphCNN (DGCNN): supervised graph classification using a stack of graph convolutional layers followed by SortPooling, and standard convolutional and pooling layers (such as Conv1D and MaxPool1D) #1212 #1265, demo

SortPooling layer: the node pooling layer introduced in Zhang et al #1210

DeepGraphInfomax can be used to train almost any model in an unsupervised way, via the corrupt_index_groups parameter to CorruptedGenerator #1243, demo. Additionally, many algorithms provide defaults and so can be used with DeepGraphInfomax without specifying this parameter:

any model using FullBatchNodeGenerator, including models supported in StellarGraph 0.11: GCN, GAT, PPNP and APPNP

GraphSAGE #1162

HinSAGE for heterogeneous graphs with node features #1254

UnsupervisedSampler supports a walker parameter to use other random walking algorithms such as BiasedRandomWalk, in addition to the default UniformRandomWalk. #1187
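As background for the DeepGraphInfomax entry above: Deep Graph Infomax trains a model to tell real node representations apart from those computed on a corrupted input, and the usual corruption is to shuffle which node gets which feature vector while leaving the graph structure alone. The following standard-library sketch shows only that shuffling step. The helper name `corrupt_features` is hypothetical, and the grouping behaviour of corrupt_index_groups is not modelled here.

```python
import random


def corrupt_features(features, rng=random):
    """Return the feature rows in shuffled order (hypothetical helper).

    The input list is not modified; only the assignment of feature
    vectors to node positions changes.
    """
    shuffled = list(features)
    rng.shuffle(shuffled)
    return shuffled


# Made-up node features, one row per node (illustration only).
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
corrupted = corrupt_features(features, rng=random.Random(1))
```

The corrupted input contains exactly the same rows as the original, so any difference the model picks up comes from how features line up with the graph structure.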

The StellarGraph class is now smaller, faster and easier to construct and use:

The StellarGraph(..., edge_type_column=...) parameter can be used to construct a heterogeneous graph from a single flat DataFrame, containing a column of the edge types #1284. This avoids the need to build separate DataFrames for each type, and is significantly faster when there are many types. Using edge_type_column gives a 2.6× speedup for loading the stellargraph.datasets.FB15k dataset (with almost 600 thousand edges across 1345 types).

StellarGraph's internal cache of node adjacencies is now computed lazily #1291 and takes into account whether the graph is directed or not #1463, and they now use the smallest integer type they can #1289

StellarGraph's internal lists of source and target nodes are now stored using integer "ilocs" #1267, reducing memory use and making some functionality significantly faster #1444 #1446

Functions like graph.node_features() no longer need node_type specified if the graph has only one node type (this includes classes like HinSAGENodeGenerator, which no longer needs head_node_type if there is only one node type) #1375
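The two storage ideas in the list above (integer "ilocs" for source and target nodes, and using the smallest integer type that fits) can be sketched with the standard library alone. This is an illustration of the general technique, not StellarGraph's code: the helper names are hypothetical, and the standard `array` module stands in for the NumPy arrays the library builds on.

```python
from array import array


def build_ilocs(node_ids):
    """Map each external node ID to its integer location (position)."""
    return {node_id: iloc for iloc, node_id in enumerate(node_ids)}


def smallest_unsigned_typecode(max_value):
    """Pick the narrowest unsigned `array` typecode that can hold max_value."""
    for typecode in ("B", "H", "I", "Q"):
        if max_value < 2 ** (8 * array(typecode).itemsize):
            return typecode
    raise OverflowError("value does not fit in 64 bits")


# Made-up graph (illustration only).
node_ids = ["alice", "bob", "carol"]
edges = [("alice", "bob"), ("bob", "carol"), ("carol", "alice")]

ilocs = build_ilocs(node_ids)
typecode = smallest_unsigned_typecode(len(node_ids) - 1)

# Edge endpoints stored as compact integer arrays instead of Python objects.
sources = array(typecode, (ilocs[s] for s, _ in edges))
targets = array(typecode, (ilocs[t] for _, t in edges))
```

With three nodes every iloc fits in a single byte, so each endpoint array uses one byte per edge; a graph with more than 255 nodes would move up to the next wider type.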

Overall performance and memory usage improvements since 0.11, in numbers:

The FB15k graph has 15 thousand nodes and 483 thousand edges: it is now 7× faster and 4× smaller to construct (without adjacency lists). It is still about 2× smaller when directed or undirected adjacency lists are computed.

Directed adjacency matrix construction is up to 2× faster

Various samplers and random walkers are faster: HinSAGENodeGenerator is 3× faster (on MovieLens), Attri2VecNodeGenerator is 4× faster (on CiteSeer), weighted BiasedRandomWalk is up to 3× faster, UniformRandomMetapathWalk is up to 7× faster

Breaking changes

The stellargraph/stellargraph docker image wasn't being published in an optimal way, so we have stopped updating it for now #1455

Edge weights are now validated to be numeric when creating a StellarGraph. Previously edge weights could be any type, but all algorithms that use them would fail with non-numeric types. #1191

Full batch layers no longer support an "output indices" tensor to filter the output rows to a selected set of nodes #1204 (this does not affect models like GCN, only the layers within them: APPNPPropagationLayer, ClusterGraphConvolution, GraphConvolution, GraphAttention, GraphAttentionSparse, PPNPPropagationLayer, RelationalGraphConvolution). Migration: post-process the output using tf.gather manually or the new sg.layer.misc.GatherIndices layer.

GraphConvolution has been generalised to work with batch size > 1, subsuming the functionality of the now-deprecated ClusterGraphConvolution (and GraphClassificationConvolution) #1205. Migration: replace stellargraph.layer.ClusterGraphConvolution with stellargraph.layer.GraphConvolution.

BiasedRandomWalk now takes multi-edges into consideration instead of collapsing them when traversing the graph. It previously required all edges in a multi-edge to have the same weight and only counted one of them when considering where to walk, but now a multi-edge is equivalent to having an edge whose weight is the sum of the weights of all edges in the multi-edge #1444
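The multi-edge behaviour described for BiasedRandomWalk above can be checked with a small calculation. The sketch below assumes a walker that steps to a neighbour with probability proportional to edge weight and deliberately ignores the p and q bias parameters; the function name is hypothetical and this is not the library's implementation.

```python
from collections import defaultdict


def neighbour_probabilities(weighted_edges):
    """Probability of stepping to each neighbour, proportional to edge weight.

    Parallel edges to the same neighbour each contribute their own weight,
    so together they act like a single edge with the summed weight.
    """
    totals = defaultdict(float)
    for neighbour, weight in weighted_edges:
        totals[neighbour] += weight
    norm = sum(totals.values())
    return {neighbour: weight / norm for neighbour, weight in totals.items()}


# Edges leaving one node: two parallel edges to "b", one edge to "c".
multi = [("b", 1.0), ("b", 2.0), ("c", 1.0)]
# The same neighbourhood with the multi-edge collapsed to its summed weight.
collapsed = [("b", 3.0), ("c", 1.0)]

probs = neighbour_probabilities(multi)
```

Both neighbourhoods give the same step probabilities (0.75 for "b" and 0.25 for "c"), which is the equivalence the changelog entry states.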

Experimental features

Some new algorithms and features are still under active development, and are available as an experimental preview. However, they may not be easy to use: their documentation or testing may be incomplete, and they may change dramatically from release to release. The experimental status is noted in the documentation and at runtime via prominent warnings.

GraphConvolutionLSTM: time series prediction on spatio-temporal data, combining GCN with an LSTM model to augment the conventional time-series model with information from nearby data points #1085, demo

Bug fixes and other changes

Random walk classes like UniformRandomWalk and BiasedRandomWalk can have their hyperparameters set on construction, in addition to in each call to run #1179

Node feature sampling was made ~4× faster by ensuring a better data layout. This makes some configurations of GraphSAGE (and HinSAGE) noticeably faster #1225

The PROTEINS dataset has been added to stellargraph.datasets, for graph classification #1282

The BlogCatalog3 dataset can now be successfully downloaded again #1283

Knowledge graph model evaluation via rank_edges_against_all_nodes now defaults to the random strategy for breaking ties, and supports top (previous default) and bottom as alternatives #1223

Creating a RelationalFullBatchNodeGenerator is now significantly faster and requires much less memory (18× speedup and 560× smaller for the stellargraph.datasets.AIFB dataset) #1274

Creating a FullBatchNodeGenerator or FullBatchLinkGenerator is now significantly faster and requires much less memory (3× speedup and 480× smaller for the stellargraph.datasets.PubMedDiabetes dataset) #1277

StellarGraph.info now shows a summary of the edge weights for each edge type #1240

The plot_history function accepts a return_figure parameter to return the matplotlib.figure.Figure value, for further manipulation #1309 (Thanks @LarsNeR)

Tests now pass against the TensorFlow 2.2.0 release candidates, in preparation for the full 2.2.0 release #1175

Some functions no longer fail for some particular cases of empty graphs: StellarGraph.to_adjacency_matrix #1378, StellarGraph.from_networkx #1401

CorruptedGenerator on a FullBatchNodeGenerator can be used to train DeepGraphInfomax on a subset of the nodes in a graph, instead of all of them #1415

The stellargraph.custom_keras_layers dictionary for use when loading a Keras model now includes all of StellarGraph's layers #1280

PaddedGraphGenerator.flow now also accepts a list of StellarGraph objects as input #1458

Supervised Graph Classification demo now prints progress update messages during training #1485

Explicit contributors file has been removed to avoid inconsistent acknowledgement #1484. Please refer to the GitHub display for contributors instead.

Various documentation, demo and error message fixes and improvements: #1141, #1219, #1246, #1260, #1266, #1361, #1362, #1385, #1386, #1363, #1376, #1405 (thanks @thatlittleboy), #1408, #1393, #1403, #1401, #1397, #1396, #1391, #1394, #1434 (thanks @thatlittleboy), #1442, #1438 (thanks @thatlittleboy), #1413, #1450, #1440, #1453, #1447, #1467, #1465 (thanks @thatlittleboy), #1470, #1475, #1480, #1468, #1472, #1474

DevOps changes:

CI: #1161, #1189, #1230, #1122, #1421

Other: #1197, #1322, #1407
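The tie-breaking change to rank_edges_against_all_nodes listed above (#1223) matters when a model gives several candidate edges the same score. The sketch below is a hypothetical standalone helper rather than the library function, and it assumes that "top" places the true edge first among its ties and "bottom" places it last.

```python
import random


def rank_with_ties(scores, true_index, tie_breaking="random", rng=random):
    """1-based rank of scores[true_index], with higher scores ranking better.

    Assumed semantics: "top" is the best rank among ties, "bottom" the
    worst, and "random" picks uniformly between the two.
    """
    true_score = scores[true_index]
    strictly_better = sum(1 for s in scores if s > true_score)
    tied = sum(1 for s in scores if s == true_score) - 1  # exclude itself
    if tie_breaking == "top":
        return strictly_better + 1
    if tie_breaking == "bottom":
        return strictly_better + tied + 1
    if tie_breaking == "random":
        return strictly_better + rng.randint(0, tied) + 1
    raise ValueError(f"unknown tie_breaking: {tie_breaking!r}")


# Made-up scores; the true edge is at index 1 and ties with two others.
scores = [0.9, 0.5, 0.5, 0.5, 0.1]
top = rank_with_ties(scores, 1, "top")
bottom = rank_with_ties(scores, 1, "bottom")
rand = rank_with_ties(scores, 1, "random", rng=random.Random(0))
```

Under these assumptions "top" gives rank 2 and "bottom" gives rank 4, with "random" falling somewhere in between. A model that scores every edge identically would get a perfect rank under "top", which is one reason a random default is the more conservative choice.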

v1.0.0rc1 (Apr 22, 2020)
This is the first release candidate for StellarGraph 1.0. The 1.0 release will be the culmination of two years of active development, and this release candidate is the first milestone for that release.

Jump in to this release, with the new demos and examples:

More helpful indexing and guidance in demo READMEs

Loading from Neo4j

More explanatory Node2Vec link prediction

Unsupervised GraphSAGE and HinSAGE via DeepGraphInfomax

Graph classification with GCNSupervisedGraphClassification

Time series prediction using spatial information, using GraphConvolutionLSTM (experimental)

Major features and improvements

Better demonstration notebooks and documentation to make the library more accessible to new and existing users:

The demos READMEs now contain more guidance and explanation to make it easier to find a relevant example #1200

A demo for loading data from Neo4j has been added #1184

The demo for link prediction using Node2Vec has been rewritten to be clearer #1190

Notebooks are now included in the API documentation, for more convenient access #1279

Notebooks now detect if they're being used with an incorrect version of the StellarGraph library, eliminating confusion about version mismatches #1242

New algorithms:

GCNSupervisedGraphClassification: supervised graph classification model based on Graph Convolutional layers (GCN) #929, demo.

DeepGraphInfomax can be used to train almost any model in an unsupervised way, via the corrupt_index_groups parameter to CorruptedGenerator #1243, demo. Additionally, many algorithms provide defaults and so can be used with DeepGraphInfomax without specifying this parameter:

any model using FullBatchNodeGenerator, including models supported in StellarGraph 0.11: GCN, GAT, PPNP and APPNP

GraphSAGE #1162

HinSAGE for heterogeneous graphs with node features #1254

UnsupervisedSampler supports a walker parameter to use other random walking algorithms such as BiasedRandomWalk, in addition to the default UniformRandomWalk. #1187

The StellarGraph class is now smaller, faster and easier to construct:

The StellarGraph(..., edge_type_column=...) parameter can be used to construct a heterogeneous graph from a single flat DataFrame, containing a column of the edge types #1284. This avoids the need to build separate DataFrames for each type, and is significantly faster when there are many types. Using edge_type_column gives a 2.6× speedup for loading the stellargraph.datasets.FB15k dataset (with almost 600 thousand edges across 1345 types).

StellarGraph's internal cache of node adjacencies now uses the smallest integer type it can #1289. This reduces memory use by 31% on the FB15k dataset, and 36% on a reddit dataset (with 11.6 million edges).

Breaking changes

Edge weights are now validated to be numeric when creating a StellarGraph. Previously edge weights could be any type, but all algorithms that use them would fail with non-numeric types. #1191

Full batch layers no longer support an "output indices" tensor to filter the output rows to a selected set of nodes #1204 (this does not affect models like GCN, only the layers within them: APPNPPropagationLayer, ClusterGraphConvolution, GraphConvolution, GraphAttention, GraphAttentionSparse, PPNPPropagationLayer, RelationalGraphConvolution). Migration: post-process the output using tf.gather manually or the new sg.layer.misc.GatherIndices layer.

GraphConvolution has been generalised to work with batch size > 1, subsuming the functionality of the now-deprecated ClusterGraphConvolution (and GraphClassificationConvolution) #1205. Migration: replace stellargraph.layer.ClusterGraphConvolution with stellargraph.layer.GraphConvolution.

Experimental features

Some new algorithms and features are still under active development, and are available as an experimental preview. However, they may not be easy to use: their documentation or testing may be incomplete, and they may change dramatically from release to release. The experimental status is noted in the documentation and at runtime via prominent warnings.

SortPooling layer: the node pooling layer introduced in Zhang et al #1210

DeepGraphConvolutionalNeuralNetwork (DGCNN): supervised graph classification using a stack of graph convolutional layers followed by SortPooling, and standard convolutional and pooling layers (such as Conv1D and MaxPool1D) #1212 #1265

GraphConvolutionLSTM: time series prediction on spatio-temporal data, combining GCN with an LSTM model to augment the conventional time-series model with information from nearby data points #1085, demo

Bug fixes and other changes

Random walk classes like UniformRandomWalk and BiasedRandomWalk can have their hyperparameters set on construction, in addition to in each call to run #1179

Node feature sampling was made ~4× faster by ensuring a better data layout. This makes some configurations of GraphSAGE (and HinSAGE) noticeably faster #1225

The PROTEINS dataset has been added to stellargraph.datasets, for graph classification #1282

The BlogCatalog3 dataset can now be successfully downloaded again #1283

Knowledge graph model evaluation via rank_edges_against_all_nodes now defaults to the random strategy for breaking ties, and supports top (previous default) and bottom as alternatives #1223

Creating a RelationalFullBatchNodeGenerator is now significantly faster and requires much less memory (18× speedup and 560× smaller for the stellargraph.datasets.AIFB dataset) #1274

StellarGraph.info now shows a summary of the edge weights for each edge type #1240

Various documentation, demo and error message fixes and improvements: #1141, #1219, #1246, #1260, #1266

DevOps changes:

CI: #1161, #1189, #1230, #1122

Other: #1197

v0.11.1 (Mar 31, 2020)
StellarGraph is a Python library for machine learning on graphs and networks. It offers state-of-the-art algorithms for graph machine learning, making it easy to discover patterns and answer questions about graph-structured data.

Get started with StellarGraph's newest graph machine learning features with pip install stellargraph.

This bugfix release contains the same code as 0.11.0, and just fixes the metadata in the Anaconda package so that it can be installed successfully.

Bug fixes and other changes

The Conda package for StellarGraph has been updated to require TensorFlow 2.1, as TensorFlow 2.0 is no longer supported. As a result, StellarGraph will currently install via Conda on Linux and Windows. Mac support is waiting on the TensorFlow 2.1 osx-64 release to Conda. #1165

v0.11.0 (Mar 25, 2020)
StellarGraph is a Python library for machine learning on graphs and networks. It offers state-of-the-art algorithms for graph machine learning, making it easy to discover patterns and answer questions about graph-structured data.

Get started with StellarGraph's newest graph machine learning features with pip install stellargraph.

Major features and improvements

The onboarding/getting-started process has been optimised and improved:

The README has been rewritten to highlight our numerous demos, and how to get help #1081

Example Jupyter notebooks can now be run directly in Google Colab and Binder, providing an easy way to get started with StellarGraph: simply click the Colab and Binder badges within each notebook #1119.

The new demos/basics directory contains two notebooks demonstrating how to construct a StellarGraph object from Pandas, and from NetworkX #1074

The GCN node classification demo now has more explanation, to serve as an introduction to graph machine learning using StellarGraph #1125

New algorithms:

Watch Your Step: computes node embeddings by simulating the effect of random walks, rather than doing them. #750.

Deep Graph Infomax: performs unsupervised node representation learning #978.

Temporal Random Walks (Continuous-Time Dynamic Network Embeddings): random walks that respect the time that each edge occurred (stored as edge weights) #1120.

ComplEx: computes multiplicative complex-number embeddings for entities and relationships (edge types) in knowledge graphs, which can be used for link prediction. #901 #1080

DistMult: computes multiplicative real-number embeddings for entities and relationships (edge types) in knowledge graphs, which can be used for link prediction. #755 #865 #1136
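The "multiplicative" scoring that the ComplEx and DistMult entries above refer to can be written out in a few lines. The sketch below uses made-up two-dimensional embeddings and hypothetical function names, and it illustrates the scoring formulas from the respective papers rather than StellarGraph's layers.

```python
def distmult_score(subject, relation, obj):
    """Sum over dimensions of subject * relation * object (all real)."""
    return sum(s * r * o for s, r, o in zip(subject, relation, obj))


def complex_score(subject, relation, obj):
    """Real part of the sum of subject * relation * conjugate(object)."""
    total = sum(s * r * o.conjugate() for s, r, o in zip(subject, relation, obj))
    return total.real


# Made-up real embeddings for DistMult (illustration only).
e_a, e_b = [1.0, 2.0], [3.0, -1.0]
rel = [0.5, 2.0]
d_ab = distmult_score(e_a, rel, e_b)
d_ba = distmult_score(e_b, rel, e_a)

# Made-up complex embeddings for ComplEx (illustration only).
c_a, c_b = [1 + 1j, 2 + 0j], [0 + 1j, 1 - 1j]
c_rel = [0 + 1j, 1 + 1j]
c_ab = complex_score(c_a, c_rel, c_b)
c_ba = complex_score(c_b, c_rel, c_a)
```

Swapping subject and object leaves the DistMult score unchanged, so it treats every relation as symmetric. The conjugate in ComplEx breaks that symmetry, which lets it give different scores to the two directions of an edge.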

Breaking changes

StellarGraph now requires TensorFlow 2.1 or greater; TensorFlow 2.0 is no longer supported #1008

The legacy constructor using NetworkX graphs has been deprecated #1027. Migration: replace StellarGraph(some_networkx_graph) with StellarGraph.from_networkx(some_networkx_graph), and similarly for StellarDiGraph.

The build method on model classes (such as GCN) has been renamed to in_out_tensors #1140. Migration: replace model.build() with model.in_out_tensors().

The node_model and link_model methods on model classes have been replaced by in_out_tensors #1140 (see that PR for the exact list of types). Migration: replace model.node_model() with model.in_out_tensors() or model.in_out_tensors(multiplicity=1), and model.link_model() with model.in_out_tensors() or model.in_out_tensors(multiplicity=2).

Re-exports of calibration and ensembling functionality from the top-level of the stellargraph module were deprecated, in favour of importing from the stellargraph.calibration or stellargraph.ensemble submodules directly #1107. Migration: replace uses of stellargraph.Ensemble with stellargraph.ensemble.Ensemble, and similarly for the other names (see #1107 for all replacements).

StellarGraph.to_networkx parameters now use attr to refer to NetworkX attributes, not name or label #973. Migration: for any named parameters in graph.to_networkx(...), change node_type_name=... to node_type_attr=... and similarly edge_type_name to edge_type_attr, edge_weight_label to edge_weight_attr, feature_name to feature_attr.

StellarGraph.nodes_of_type is deprecated in favour of the nodes method #1111. Migration: replace some_graph.nodes_of_type(some_type) with some_graph.nodes(node_type=some_type).

StellarGraph.info parameters show_attributes and sample were deprecated #1110

Some more layers and models had many parameters move from **kwargs to real arguments: Attri2Vec (#1128), ClusterGCN (#1129), GraphAttention & GAT (#1130), GraphSAGE & its aggregators (#1142), HinSAGE & its aggregators (#1143), RelationalGraphConvolution & RGCN (#1148). Invalid (e.g. incorrectly spelled) arguments would have been ignored previously, but now may fail with a TypeError; to fix, remove or correct the arguments.

The method="chebyshev" option to FullBatchNodeGenerator, FullBatchLinkGenerator and GCN_Aadj_feats_op has been removed for now, because it needed significant revision to be correctly implemented #1028

The fit_generator, evaluate_generator and predict_generator methods on Ensemble and BaggingEnsemble have been renamed to fit, evaluate and predict, to match the deprecation in TensorFlow 2.1 of the tensorflow.keras.Model methods of the same name #1065. Migration: remove the _generator suffix on these methods.

The default_model method on Attri2Vec, GraphSAGE and HinSAGE has been deprecated, in favour of in_out_tensors #1145. Migration: replace model.default_model() with model.in_out_tensors().

Experimental features

Some new algorithms and features are still under active development, and are available as an experimental preview. However, they may not be easy to use: their documentation or testing may be incomplete, and they may change dramatically from release to release. The experimental status is noted in the documentation and at runtime via prominent warnings.

GCNSupervisedGraphClassification: supervised graph classification model based on Graph Convolutional layers (GCN) #929.

Bug fixes and other changes

StellarGraph.to_adjacency_matrix is at least 15× faster on undirected graphs #932

ClusterNodeGenerator is now noticeably faster, which makes training and predicting with a ClusterGCN model faster #1095. On a random graph with 1000 nodes, 5000 edges and 10 clusters, iterating over an epoch with q=1 (each cluster individually) is 2× faster, and is even faster for larger q. The model in the Cluster-GCN demo notebook using Cora trains 2× faster overall.

The node_features=... parameter to StellarGraph.from_networkx now only needs to mention the node types that have features, when passing a dictionary of Pandas DataFrames. Node types that aren't mentioned will automatically have no features (zero-length feature vectors). #1082

A subgraph method was added to StellarGraph for computing a node-induced subgraph #958

A connected_components method was added to StellarGraph for computing the nodes involved in each connected component in a StellarGraph #958

The info method on StellarGraph now shows only 20 node and edge types by default to be more useful for graphs with many types #993. This behaviour can be customized with the truncate=... parameter.

The info method on StellarGraph now shows information about the size and type of each node type's feature vectors #979

The EdgeSplitter class supports StellarGraph input (and will output StellarGraphs in this case), in addition to NetworkX graphs #1032

The Attri2Vec model class stores its weights statefully, so they are shared between all tensors computed by build #1101

The GCN model defaults for some parameters now match the GraphConvolution layer's defaults: specifically kernel_initializer (glorot_uniform) and bias_initializer (zeros) #1147

The datasets submodule is now accessible as stellargraph.datasets, after just import stellargraph #1113

All datasets in stellargraph.datasets now support a load method to create a StellarGraph object (and other information): AIFB (#982), CiteSeer (#989), Cora (#913), MovieLens (#947), PubMedDiabetes (#986). The demo notebooks using these datasets are now cleaner.

Some new datasets were added to stellargraph.datasets:

MUTAG: a collection of graphs representing chemical compounds #960

WN18, WN18RR: knowledge graphs based on the WordNet linguistics data #977

FB15k, FB15k_237: knowledge graphs based on the FreeBase knowledge base #977

IAEnronEmployees: a small set of employees of Enron, and the many emails between them #1058

Warnings now point to the call site of the function causing the warning, not the warnings.warn call inside StellarGraph; this means DeprecationWarnings will be visible in Jupyter notebooks and scripts run with Python 3.7 #1144

Some code that triggered warnings from other libraries was fixed or removed #995 #1008, #1051, #1064, #1066

Some demo notebooks have been updated or fixed: demos/use-cases/hateful-twitters.ipynb (#1019), rgcn-aifb-node-classification-example.ipynb (#983)

The documentation "quick start" guide duplicated a lot of the information in the README, and so has been replaced with the latter #1096

API documentation now lists items under their recommended import path, not their definition. For instance, stellargraph.StellarGraph instead of stellargraph.core.StellarGraph (#1127), stellargraph.layer.GCN instead of stellargraph.layer.gcn.GCN (#1150) and stellargraph.datasets.Cora instead of stellargraph.datasets.datasets.Cora (#1157)

Some API documentation is now formatted better #1061, #1068, #1070, #1071

DevOps changes:

Neo4j functionality is now tested on CI, and so will continue working #1046 #1050

CI: #967, #968, #1036, #1067, #1097

Other: #956, #962, #974
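The warning change listed above (#1144) relies on a standard Python mechanism: the `stacklevel` argument to `warnings.warn` controls which stack frame a warning is attributed to. The sketch below uses only the standard library; `old_api` and `caller` are hypothetical names, not StellarGraph functions.

```python
import warnings


def old_api():
    # stacklevel=2 attributes the warning to whoever called old_api(),
    # rather than to this warnings.warn line.
    warnings.warn("old_api is deprecated", DeprecationWarning, stacklevel=2)


def caller():
    old_api()  # the warning is reported against this line


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    caller()

reported_line = caught[0].lineno
# The call to old_api() is the first line of caller's body.
call_line = caller.__code__.co_firstlineno + 1
```

Since Python 3.7, a DeprecationWarning is shown by default only when it is attributed to code running in `__main__`, so pointing the warning at the user's call site is what makes it visible in notebooks and scripts.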

v0.10.0 (Feb 26, 2020)
Major features and improvements

The StellarGraph and StellarDiGraph classes are now backed by NumPy and Pandas #752. The StellarGraph(...) and StellarDiGraph(...) constructors now consume Pandas DataFrames representing node features and the edge list. This significantly reduces the memory use and construction time for these StellarGraph objects.

The following table shows some measurements of the memory use of g = StellarGraph(...), and the time required for that constructor call, for several real-world datasets of different sizes, for both the old form backed by NetworkX code and the new form backed by NumPy and Pandas (both old and new store node features similarly, using 2D NumPy arrays, so the measurements in this table include only graph structure: the edges and nodes themselves):

| dataset | nodes | edges | size old (MiB) | size new (MiB) | size change | time old (s) | time new (s) | time change |
|---------|-------:|---------:|---------------:|---------------:|------------:|-------------:|-------------:|------------:|
| Cora | 2708 | 5429 | 4.1 | 1.3 | -69% | 0.069 | 0.034 | -50% |
| FB15k | 14951 | 592213 | 148 | 28 | -81% | 5.5 | 1.2 | -77% |
| Reddit | 231443 | 11606919 | 6611 | 493 | -93% | 154 | 33 | -82% |

The old backend has been removed, and conversion from a NetworkX graph should be performed via the StellarGraph.from_networkx function (the existing form StellarGraph(networkx_graph) is supported in this release but is deprecated, and may be removed in a future release).

More detailed information about Heterogeneous GraphSAGE (HinSAGE) has been added to StellarGraph's readthedocs documentation #839.

New algorithms:

Link prediction with directed GraphSAGE, via DirectedGraphSAGELinkGenerator #871

GraphWave: computes structural node embeddings by using wavelet transforms on the graph Laplacian #822

Breaking changes

Some layers and models had many parameters move from **kwargs to real arguments: GraphConvolution, GCN. #801 Invalid (e.g. incorrectly spelled) arguments would have been ignored previously, but now may fail with a TypeError; to fix, remove or correct the arguments.

The stellargraph.data.load_dataset_BlogCatalog3 function has been replaced by the load method on stellargraph.datasets.BlogCatalog3 #888. Migration: replace load_dataset_BlogCatalog3(location) with BlogCatalog3().load(); code required to find the location or download the dataset can be removed, as load now does this automatically.

stellargraph.data.train_test_val_split and stellargraph.data.NodeSplitter have been removed. #887 Migration: this functionality should be replaced with pandas and sklearn (for instance, sklearn.model_selection.train_test_split).

Most of the submodules in stellargraph.utils have been moved to top-level modules: stellargraph.calibration, stellargraph.ensemble, stellargraph.losses and stellargraph.interpretability #938. Imports from the old location are now deprecated, and may stop working in future releases. See the linked issue for the full list of changes.

Experimental features

Some new algorithms and features are still under active development, and are available as an experimental preview. However, they may not be easy to use: their documentation or testing may be incomplete, and they may change dramatically from release to release. The experimental status is noted in the documentation and at runtime via prominent warnings.

Temporal Random Walks: random walks that respect the time that each edge occurred (stored as edge weights) #787. The implementation does not have an example or thorough testing and documentation.

Watch Your Step: computes node embeddings by simulating the effect of random walks, rather than doing them. #750. The implementation is not fully tested.

ComplEx: computes embeddings for nodes and edge types in knowledge graphs, and uses these to perform link prediction #756. The implementation hasn't been validated to match the paper.

Neo4j connector: the GraphSAGE algorithm can perform its neighbourhood sampling in a Neo4j database, so that the edges of a graph do not have to fit entirely into memory #799. The implementation is not automatically tested, and doesn't support functionality like loading node feature vectors from Neo4j.
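The Temporal Random Walks entry above describes walks that respect the time at which each edge occurred. A minimal sketch of that idea follows. It assumes directed edges and non-decreasing times along a walk; the function name `temporal_walk` is hypothetical, and this is not the library's implementation (which, as noted, stores times as edge weights).

```python
import random


def temporal_walk(edges, start, length, rng=random):
    """Walk along edges whose times never decrease (hypothetical helper).

    `edges` is a list of (source, target, time) tuples, treated as directed.
    """
    walk, node, now = [start], start, float("-inf")
    while len(walk) < length:
        options = [(t, when) for s, t, when in edges if s == node and when >= now]
        if not options:
            break  # no edge at or after the current time: stop early
        node, now = rng.choice(options)
        walk.append(node)
    return walk


# Made-up temporal edges (illustration only).
edges = [("a", "b", 1), ("b", "c", 2), ("b", "d", 0), ("c", "a", 3)]
walk = temporal_walk(edges, "a", length=4, rng=random.Random(0))
```

From "b" the walk cannot continue to "d", because that edge occurred at time 0, before the edge used to arrive at "b". An ordinary random walk would treat both of "b"'s outgoing edges as valid.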

Bug fixes and other changes

StellarGraph now supports TensorFlow 2.1, which includes GPU support by default #875

Demos now focus on Jupyter notebooks, and demo scripts that duplicate notebooks have been removed #889

The following algorithms are now reproducible:

Supervised GraphSAGE Node Attribute Inference #844

Randomness can be more easily controlled using stellargraph.random.set_seed #806

StellarGraph.edges() can return edge weights as a separate NumPy array with include_edge_weights=True #754

StellarGraph.to_networkx supports ignoring node features (and thus being a little more efficient) with feature_name=None #841

StellarGraph.to_adjacency_matrix now ignores edge weights (that is, defaults every weight to 1) by default, unless weighted=True is specified #857

stellargraph.utils.plot_history visualises the model training history as a plot for each metric (such as loss) #902

The saliency maps/interpretability code has been refactored to share more code and to be cleaner and easier to extend #855

DevOps changes:

Most demo notebooks are now tested on CI using Papermill, and so won't become out of date #575

CI: #698, #760, #788, #817, #860, #874, #877, #878, #906, #908, #915, #916, #918

Other: #708, #746, #791

v0.9.0 (Jan 29, 2020)
Major features and improvements

StellarGraph is now available as a conda package on Anaconda Cloud #516

New algorithms:

Cluster-GCN: an extension of GCN that can be trained using SGD, with demo #487

Relational-GCN (RGCN): a generalisation of GCN to relational/multi edge type graphs, with demo #490

Link prediction for full-batch models: FullBatchLinkGenerator allows doing link prediction with algorithms like GCN, GAT, APPNP and PPNP #543

Unsupervised GraphSAGE has now been updated and tested for reproducibility. Provided all seeds are set, running the same pipeline should give reproducible embeddings. #620

A datasets subpackage provides easier access to sample datasets with inbuilt downloading. #690
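The reproducibility note above for Unsupervised GraphSAGE depends on every source of randomness being seeded. As a rough illustration in plain Python (not StellarGraph code), the sketch below runs a uniform random walk with an explicitly seeded generator; the uniform_random_walk function and the toy adjacency dictionary are hypothetical.

```python
import random


def uniform_random_walk(adjacency, start, length, seed=None):
    """Walk up to `length` nodes, choosing each next node uniformly at random."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    walk = [start]
    while len(walk) < length:
        neighbours = adjacency[walk[-1]]
        if not neighbours:
            break  # dead end: stop early
        walk.append(rng.choice(neighbours))
    return walk


adjacency = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
first = uniform_random_walk(adjacency, "a", length=5, seed=42)
second = uniform_random_walk(adjacency, "a", length=5, seed=42)
```

With the same seed the two walks are identical. A real pipeline has more sources of randomness than this (for example weight initialisation and data shuffling), and each of them would need a seed as well.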

Breaking changes

The stellargraph library now only supports tensorflow version 2.0 #518, #732. Backward compatibility with earlier versions of tensorflow is not guaranteed.

The stellargraph library now only supports Python versions 3.6 and above #641. Backward compatibility with earlier versions of Python is not guaranteed.

The StellarGraph class no longer exposes NetworkX internals, only required functionality. In particular, calls like list(G) will no longer return a list of nodes; use G.nodes() instead. #297 If NetworkX functionality is required, use the new .to_networkx() method to convert to a normal networkx.MultiGraph or networkx.MultiDiGraph.

Passing a NodeSequence or LinkSequence object to GraphSAGE and HinSAGE classes is now deprecated and no longer supported #498. Users might need to update their calls of GraphSAGE and HinSAGE classes by passing generator objects instead of generator.flow() objects.

Various methods on StellarGraph have been renamed to be more succinct and uniform:

get_feature_for_nodes is now node_features

type_for_node is now node_type

Neighbourhood methods in StellarGraph class (neighbors, in_nodes, out_nodes) now return a list of neighbours instead of a set. This addresses #653. This means multi-edges are no longer collapsed into one in the return value. There will be an implicit change in behaviour for explorer classes used for algorithms like GraphSAGE, Node2Vec, since a neighbour connected via multiple edges will now be more likely to be sampled. If this doesn't sound like the desired behaviour, consider pruning the graph of multi-edges before running the algorithm.

GraphSchema has been simplified to remove type look-ups for individual nodes and edges #702 #703. Migration: for nodes, use StellarGraph.node_type; for edges, use the triple argument to the edges method, or filter when doing neighbour queries using the edge_types argument.

NodeAttributeSpecification and the supporting Converter classes have been removed #707. Migration: use the more powerful and flexible preprocessing tools from pandas and sklearn (see the linked PR for specifics)
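The neighbourhood change described above (lists instead of sets) affects sampling probabilities. A small plain-Python sketch shows why a neighbour reached through multiple edges becomes more likely to be drawn under uniform sampling. The neighbour lists are made up for illustration, sampling_probabilities is a hypothetical helper, and no StellarGraph API is called.

```python
from collections import Counter
from fractions import Fraction


def sampling_probabilities(neighbours):
    """Probability of drawing each neighbour when sampling uniformly from `neighbours`."""
    counts = Counter(neighbours)
    total = len(neighbours)
    return {node: Fraction(count, total) for node, count in counts.items()}


# Suppose a node has two parallel edges to "b" and one edge to "c".
as_list = ["b", "b", "c"]   # a list keeps one entry per edge
as_set = set(as_list)       # a set collapses the multi-edge

list_probs = sampling_probabilities(as_list)
set_probs = sampling_probabilities(sorted(as_set))
```

In this toy case "b" is drawn with probability 2/3 from the list but 1/2 from the set, which is the behavioural shift the note warns about. Pruning multi-edges beforehand restores the set-like probabilities.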

Experimental features

Some new algorithms and features are still under active development, and are available as an experimental preview. However, they may not be easy to use: their documentation or testing may be incomplete, and they may change dramatically from release to release. The experimental status is noted in the documentation and at runtime via prominent warnings.

The StellarGraph and StellarDiGraph classes support using a backend based on NumPy and Pandas that uses dramatically less memory for large graphs than the existing NetworkX-based backend #668. The new backend can be enabled by constructing with StellarGraph(nodes=..., edges=...) using Pandas DataFrames, instead of a NetworkX graph.

Bug fixes and other changes

Documentation for every released version is published under a permanent URL, in addition to the stable alias for the latest release, e.g. https://stellargraph.readthedocs.io/en/v0.8.4/ for v0.8.4 #612

Neighbourhood methods in StellarGraph class (neighbors, in_nodes, out_nodes) now support additional parameters to include edge weights in the results or filter by a set of edge types. #646

Changed GraphSAGE and HinSAGE class API to accept generator objects the same as GCN/GAT models. Passing a NodeSequence or LinkSequence object is now deprecated. #498

SampledBreadthFirstWalk, SampledHeterogeneousBreadthFirstWalk and DirectedBreadthFirstNeighbours have been made 1.2-1.5× faster #628

UniformRandomWalk has been made 2× faster #625

FullBatchNodeGenerator.flow has been reduced from O(n^2) quadratic complexity to O(n), where n is the number of nodes in the graph, making it orders of magnitude faster for large graphs #513

The dependencies required for demos and testing have been included as "extras" in the main package: demos and igraph for demos, and test for testing. For example, pip install stellargraph[demos,igraph] will install the dependencies required to run every demo. #661

The StellarGraph and StellarDiGraph constructors now list their arguments explicitly for clearer documentation (rather than using *arg and **kwargs splats) #659

sys.exit(0) is no longer called on failure in load_dataset_BlogCatalog3 #648

Warnings are printed using the Python warnings module #583
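Because warnings now go through the standard warnings module, they can be filtered or captured with the usual Python tools. The sketch below uses only the standard library; experimental_feature is a hypothetical function standing in for any library call that warns, and UserWarning is an assumed category, since the category StellarGraph uses is not stated here.

```python
import warnings


def experimental_feature():
    """Hypothetical function that emits a warning before doing its work."""
    warnings.warn("this feature is experimental", UserWarning, stacklevel=2)
    return "result"


# Capture warnings in a list instead of printing them to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure nothing is suppressed
    value = experimental_feature()

# Alternatively, silence this category for a block of code.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    silent_value = experimental_feature()
```

Both context managers restore the previous filter settings on exit, so the change stays local to the block.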

Numerous DevOps changes:

CI results are now publicly viewable: https://buildkite.com/stellar/stellargraph-public

CI: #524, #534, #544, #550, #551, #557, #562, #574, #578, #579, #587, #592, #595, #596, #602, #609, #613, #615, #631, #637, #639, #640, #652, #656, #663, #675

Git and Github configuration: #516, #588, #624, #672, #682, #683

Other: #523, #582, #590, #654

v0.8.4(Jan 20, 2020)
Fixed bugs:

Fix DirectedGraphSAGENodeGenerator always hitting TypeError exception. #695

v0.8.3(Dec 12, 2019)
Fixed bugs:

Fixed the issue in the APPNP class that causes appnp to propagate excessive dropout layers. #525

Added a fix into the PPNP node classification demo so that the softmax layer is no longer propagated. #525

v0.8.2(Nov 8, 2019)
Fixed bugs:

Updated requirements to Tensorflow>=1.14, as tensorflow with lower versions causes errors with sparse full batch node methods: GCN, APPNP, and GAT. #519

v0.8.1(Oct 29, 2019)
Fixed bugs:

Reverted erroneous demo notebooks.

v0.8.0(Oct 25, 2019)
We are excited to announce the 0.8.0 release of the library. This release extends stellargraph by adding new algorithms and demos, enhancing interpretability via saliency maps for GAT, and further simplifying graph ML workflows through standardised model APIs and arguments. More details on new features and enhancements are listed below.

New algorithms:

Directed GraphSAGE algorithm (a generalisation of GraphSAGE to directed graphs) + demo #479

Attri2vec algorithm + demo #470 #455

PPNP and APPNP algorithms + demos #485

GAT saliency maps for interpreting node classification with Graph Attention Networks + demo #435

Implemented enhancements:

New demo of node classification on Twitter hateful users #430

New demo of graph saliency on Twitter hateful users #448

Added Directed SampledBFS walks on directed graphs #464

Unified API of GCN, GAT, GraphSAGE, and HinSAGE classes by adding build() method to GCN and GAT classes #439

Added activations argument to GraphSAGE and HinSAGE classes #381

Unified activations for GraphSAGE, HinSAGE, GCN and GAT #493 #381

Added optional regularisation on the weights for GCN, GraphSAGE, and HinSAGE #172 #469

Unified regularisation of GraphSAGE, HinSAGE, GCN and GAT #494 (geoffj-d61)

Unsupervised GraphSAGE speed-up via multithreading #474 #477

Support of sparse generators in the GCN saliency map implementation. #432

Refactoring:

Refactored Ensemble class into Ensemble and BaggingEnsemble. The former implements naive ensembles and the latter bagging ensembles. #459

Changed from using keras to use tensorflow.keras #471

Removed flatten_output arguments for all models #447
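The distinction between the Ensemble and BaggingEnsemble classes mentioned above comes down to which training data each model sees. As a library-independent sketch under the usual textbook definitions (the function names are hypothetical and this is not the StellarGraph implementation), a naive ensemble gives every model the full training set, whereas bagging gives each model a bootstrap sample drawn with replacement.

```python
import random


def naive_training_sets(train_ids, n_models):
    """Every model in a naive ensemble trains on the same full set."""
    return [list(train_ids) for _ in range(n_models)]


def bagging_training_sets(train_ids, n_models, seed=None):
    """Each model trains on a bootstrap sample: same size, drawn with replacement."""
    rng = random.Random(seed)
    return [
        [rng.choice(train_ids) for _ in range(len(train_ids))]
        for _ in range(n_models)
    ]


train_ids = ["n0", "n1", "n2", "n3", "n4"]
naive = naive_training_sets(train_ids, n_models=3)
bagged = bagging_training_sets(train_ids, n_models=3, seed=0)
```

In the naive case the models differ only through their own randomness, such as weight initialisation. In the bagged case they also differ because some training nodes are repeated and others left out of each sample.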

Fixed bugs:

Updated Yelp example to support new dataset version #442

Fixed bug where some nodes and edges did not get a default type #451

Inconsistency in Ensemble.fit_generator() argument #461

Fixed source--target node designations for code using Cora dataset #444

IndexError: index 1 is out of bounds for axis 1 with size 1 in: demos/node-classification/hinsage #434

GraphSAGE and GAT/GCN predictions have different shapes #425

v0.7.3(Oct 18, 2019)

Limited NetworkX version to <2.4 and Tensorflow version to <1.15 in installation requirements, to avoid errors due to API changes in the recent versions of NetworkX and Tensorflow.
v0.7.2(Sep 20, 2019)

Limited Keras version to <2.2.5 and Tensorflow version to <2.0 in installation requirements, to avoid errors due to API changes in the recent versions of Keras and Tensorflow.
v0.7.1(Jun 25, 2019)
Fixed bugs:

Removed igraph and mplleaflet from demos requirements in setup.py. Python-igraph doesn't install on many systems and is only required for the clustering notebook. See the README.md in that directory for requirements and installation directions.

Updated GCN interpretability notebook to work with new FullBatchGenerator API #429

v0.7.0(Jun 25, 2019)
New features and enhancements:

SGC Implementation #361

Updated to support Python 3.7 #348

FullBatchNodeGenerator now supports a simpler interface to apply different adjacency matrix pre-processing options #405

Full-batch models (GCN, GAT, and SGC) now return predictions for only those nodes provided to the generator in the same order #417

GAT now supports using a sparse adjacency matrix making execution faster #420

Added interpretability of GCN models and a demo of finding important edges for a node prediction #383

Added a demo showing inductive classification with the PubMed dataset #372

Refactoring:

Added build() method for GraphSAGE and HinSAGE model classes #385. This replaces the node_model() and link_model() methods, which will be deprecated in future versions (deprecation warnings added).

Changed the FullBatchNodeGenerator to accept simpler method and transform arguments #405

Fixed bugs:

Removed label from features for pubmed dataset. #362

Python igraph requirement fixed #392

Simplified random walks to not require passing a graph #408

v0.6.1(Apr 1, 2019)
Fixed bugs:

a bug in passing graph adjacency matrix to the optional func_opt function in FullBatchNodeGenerator class

a bug in demos/node-classification/gcn/gcn-cora-example.py:144: incorrect argument was used to pass the optional function to the generator for GCN

Enhancements:

separate treatment of gcn and gat models in demos/ensembles/ensemble-node-classification-example.ipynb

v0.6.0(Mar 14, 2019)
New features and enhancements:

Graph Attention (GAT) layer and model (stack of GAT layers), with demos #216, #315

Unsupervised GraphSAGE #331 with a demo #335

Model Ensembles #343

Community detection based on unsupervised graph representation learning #354

Saliency maps and integrated gradients for model interpretability #345

Shuffling of head nodes/edges in node and link generators at each epoch #298

Fixed bugs:

a bug where seed was not passed to sampler in GraphSAGELinkGenerator constructor #337

UniformRandomMetaPathWalk doesn't update the current node neighbors #340

seed value for link mapper #336

v0.5.0(Feb 11, 2019)
Implemented new features and enhancements:

Added model calibration #326

Added GraphConvolution layer, GCN class for a stack of GraphConvolution layers, and FullBatchNodeGenerator class for feeding data into GCN models #318

Added GraphSAGE attention aggregator #317

Added GraphSAGE MaxPoolAggregator and MeanPoolAggregator #278

Added shuffle option to all flow methods for GraphSAGE and HinSAGE generators #328

GraphSAGE and HinSAGE: ensure that an MLP can be created by using zero samples #301

Handle isolated nodes in GraphSAGE #294

Ensure isolated nodes are handled correctly by GraphSAGENodeMapper and GraphSAGELinkMapper #182

EdgeSplitter: introduce a switch for keeping the reduced graph connected #285

Node2vec for weighted graphs #241

Fix edge types in demos #237

Add docstrings to StellarGraphBase class #175

Make L2-normalisation of the final embeddings in GraphSAGE and HinSAGE optional #115

Check/change the GraphSAGE mapper's behaviour for isolated nodes #100

Added GraphSAGE node embedding extraction and visualisation #290

Fixed bugs:

Fixed the bug in running demos when no options given #271

Fixed the bug in LinkSequence that threw an error when no link targets were given #273

Refactoring:

Refactored link inference classes to use edge_embedding_method instead of edge_feature_method #327

0.4.1(Oct 4, 2018)
What's new:

BiasedRandomWalk sampler is improved and optimized

Typos fixed

Some broken links in documentation are fixed

v0.4.0(Sep 14, 2018)
Features of this release:

The library is refactored for better structure and more intuitive use;

/demos populated with examples covering representation learning, node attribute inference, and link prediction/inference for both homogeneous and heterogeneous networks;

Usage of StellarGraph class is simplified;

Documentation improved;

Various bugs fixed.

v0.3.0(Jul 16, 2018)
What's new in this release:

UniformRandomMetaPathWalk class added to /stellar/data/explorer.py, enabling metapath-driven uniform random walks on heterogeneous graphs with multiple node types (but currently limited to one edge type per pair of nodes)

Added GraphSAGELinkMapper class to /stellar/mapper/link_mapper.py, for feeding minibatches of links into the GraphSAGE layer, for link prediction/classification/attribute inference workflows.

A demo of link prediction using GraphSAGE is added in /demos/link-prediction_graphsage/, showing how to use GraphSAGELinkMapper to build an end-to-end link prediction/classification workflow with GraphSAGE.

StellarGraph base class is introduced in /stellar/data/stellargraph.py. This class is intended to be the default class for graph objects in this library.

A convenience graph loader function, from_epgm(), is added to /stellar/data/loader.py

Some bugs are fixed

Code documentation is improved
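A metapath-driven uniform random walk, as described for UniformRandomMetaPathWalk above, can be sketched in plain Python. This is an illustrative reading of the idea, not the library's code: the metapath_walk function, the toy graph and the node-type labels are all hypothetical. At each step the walker moves to a uniformly chosen neighbour whose type matches the next entry in the metapath.

```python
import random


def metapath_walk(adjacency, node_types, start, metapath, length, seed=None):
    """Random walk that only follows neighbours of the type the metapath dictates.

    `metapath` is a list of node types whose first and last entries match, so it
    can be repeated; the start node must have type `metapath[0]`.
    """
    if node_types[start] != metapath[0]:
        raise ValueError("start node type must match the first metapath entry")
    rng = random.Random(seed)
    cycle = metapath[:-1]  # drop the repeated last type so the pattern loops
    walk = [start]
    while len(walk) < length:
        wanted = cycle[len(walk) % len(cycle)]
        candidates = [n for n in adjacency[walk[-1]] if node_types[n] == wanted]
        if not candidates:
            break  # no neighbour of the required type: stop early
        walk.append(rng.choice(candidates))
    return walk


node_types = {"u1": "user", "u2": "user", "m1": "movie", "m2": "movie"}
adjacency = {
    "u1": ["m1", "m2", "u2"],
    "u2": ["m1", "u1"],
    "m1": ["u1", "u2"],
    "m2": ["u1"],
}
walk = metapath_walk(
    adjacency, node_types, "u1", ["user", "movie", "user"], length=5, seed=1
)
```

In this toy graph the direct u1 to u2 edge is never followed, because the user, movie, user metapath never allows a user-to-user step.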

v0.2.0(Jul 4, 2018)
What's new in this release:

UniformRandomWalk class added to /stellar/data/explorer.py

SampledBreadthFirstWalk class added, to be used with GraphSAGENodeMapper class

GraphSAGENodeMapper class added, to feed minibatches of sampled subgraphs to GraphSAGE layer

A demo added in /demos/node-classification/epgm-example.py, showing how to use GraphSAGENodeMapper with the SampledBreadthFirstWalk sampler to build an end-to-end node classification workflow with GraphSAGE.
