Learn to build a Graph Convolutional Network that can handle heterogeneous graph data for link prediction
This article is a detailed technical deep dive into how to build a powerful model for anomaly detection with graph data containing entities of different types (heterogeneous graph data).
The model you will learn about is based on the paper titled “Interaction-Focused Anomaly Detection on Bipartite Node-and-Edge-Attributed Graphs” presented by Grab, an Asian tech company, at the 2023 International Joint Conference on Neural Networks (IJCNN).
This Graph Convolutional Network (GCN) model can handle heterogeneous graph data, meaning that nodes and edges are of different types. These graphs are structurally complex as they represent relationships between different types of entities or nodes.
GCNs that can handle heterogeneous graph data are an active area of research. The convolutional operations in this model have been adapted to handle the different node types in a heterogeneous graph and the relationships between them.
In contrast, homogeneous graphs involve nodes and edges of the same type, which makes them structurally simpler. An example of a homogeneous graph is the network of LinkedIn connections, where every node represents an individual and an edge exists between two individuals if they are connected.
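To make the contrast concrete, here is a minimal sketch of the two graph shapes in plain Python. The class names, the provider/patient pairing, and the `amount` attribute are all hypothetical and chosen only for illustration; this is not how GraphBEAN stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class HomogeneousGraph:
    """One node type and one edge type, e.g. person-to-person connections."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)

    def add_edge(self, a, b):
        self.nodes.update((a, b))
        self.edges.add(frozenset((a, b)))  # undirected edge


@dataclass
class BipartiteGraph:
    """Two node types; every edge joins a U node to a V node."""
    u_nodes: dict = field(default_factory=dict)  # node id -> node attributes
    v_nodes: dict = field(default_factory=dict)  # node id -> node attributes
    edges: dict = field(default_factory=dict)    # (u_id, v_id) -> edge attributes

    def add_edge(self, u, v, **attrs):
        self.u_nodes.setdefault(u, {})
        self.v_nodes.setdefault(v, {})
        self.edges[(u, v)] = attrs


# Homogeneous: every node is a person.
social = HomogeneousGraph()
social.add_edge("alice", "bob")
social.add_edge("bob", "carol")

# Heterogeneous (bipartite): hypothetical providers on one side,
# hypothetical patients on the other, with attributes on each edge.
claims = BipartiteGraph()
claims.add_edge("provider_1", "patient_1", amount=1200.0)
claims.add_edge("provider_1", "patient_2", amount=300.0)
claims.add_edge("provider_2", "patient_2", amount=950.0)
```

Note that the bipartite structure needs separate containers for each node type, and its edges carry their own attributes. That extra bookkeeping is what a standard GCN layer does not account for.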
The example you will see here applies Grab’s GraphBEAN model (Bipartite Node-and-Edge-Attributed Networks) to a Kaggle dataset on healthcare provider fraud. (This dataset is currently licensed CC0: Public Domain on Kaggle. Please note that this dataset might not be accurate, and it is used in this article for demonstration purposes only.) The dataset contains multiple CSV files with claims and insights on inpatient data, outpatient data, and beneficiary data.
I will demonstrate how to build a GCN to predict healthcare provider fraud using the inpatient dataset and the train set, which contains ProviderID and a label column (PotentialFraud).
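As a rough sketch of how those two files could be turned into graph inputs, the snippet below aggregates claim rows into one edge per provider-beneficiary pair and reads the fraud label per provider. Only ProviderID and PotentialFraud come from the description above. The BeneID and ClaimAmount columns, the "Yes"/"No" label values, the toy rows, and the helper functions are assumptions made for illustration, so check them against the real files before reusing this.

```python
import csv
import io

# Toy stand-ins for the two CSV files. BeneID, ClaimAmount, and the
# "Yes"/"No" label values are assumed, not taken from the real dataset.
INPATIENT_CSV = """ProviderID,BeneID,ClaimAmount
PRV1,BENE1,1200
PRV1,BENE2,300
PRV2,BENE2,950
"""

TRAIN_CSV = """ProviderID,PotentialFraud
PRV1,Yes
PRV2,No
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def build_edges(claim_rows):
    """Aggregate claims into one edge per (provider, beneficiary) pair."""
    edges = {}
    for row in claim_rows:
        key = (row["ProviderID"], row["BeneID"])
        stats = edges.setdefault(key, {"n_claims": 0, "total_amount": 0.0})
        stats["n_claims"] += 1
        stats["total_amount"] += float(row["ClaimAmount"])
    return edges


def build_labels(train_rows):
    """Map each provider to 1 (potential fraud) or 0."""
    return {r["ProviderID"]: int(r["PotentialFraud"] == "Yes") for r in train_rows}


edges = build_edges(load_rows(INPATIENT_CSV))
labels = build_labels(load_rows(TRAIN_CSV))
```

With real files you would replace the inline strings with `open(path, newline="")` and pass the file object to `csv.DictReader`. The edge dictionary gives you edge attributes, and the label dictionary gives you one target per provider node.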
While graph data can be difficult to visualize in tabular form, like the csv files, you can make interesting…