seongjunyun/Graph_Transformer_Networks

When your graph has too many node types for plain GNNs

GTN learns which meta-paths matter instead of making you hand-engineer them.

1.1k stars Jupyter Notebook Other AI

What it does

Graph Transformer Networks (GTN) tackle heterogeneous graphs, in which nodes and edges come in multiple types: academic papers, authors, and venues all in one graph, for example. Instead of requiring you to manually define meta-paths (think “paper → author → paper”), GTN learns which composite paths are useful and how to weight them. The repo implements both the original NeurIPS 2019 GTN and the 2022 FastGTN follow-up, which adds non-local operations and better scalability.
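To make the meta-path idea concrete, here is a minimal sketch in plain Python of what a hand-defined “paper → author → paper” path amounts to: a product of adjacency matrices. The toy graph and the helper names (`paper_author`, `transpose`, `matmul`) are illustrative assumptions, not taken from the repo.

```python
# Toy heterogeneous graph (hypothetical data): 3 papers, 2 authors.
# paper_author[p][a] == 1 means paper p was written by author a.
paper_author = [
    [1, 0],  # paper 0: author 0
    [1, 1],  # paper 1: authors 0 and 1
    [0, 1],  # paper 2: author 1
]


def transpose(m):
    """Return the transpose of a list-of-lists matrix."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Multiply two list-of-lists matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


# Reversing the edge direction gives the author -> paper relation.
author_paper = transpose(paper_author)

# Composing the two relations gives the paper -> author -> paper meta-path.
# Entry [i][j] counts the authors shared by paper i and paper j.
pap = matmul(paper_author, author_paper)

for row in pap:
    print(row)
```

Papers 0 and 2 share no author, so their entry in `pap` is zero, while paper 1 connects to both. With many node and edge types the number of candidate products grows quickly, which is the hand-engineering burden GTN aims to remove.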

The interesting bit

The core trick is a soft selection over edge types: each layer learns a softmax-weighted combination of the per-edge-type adjacency matrices, and multiplying the combinations from successive layers yields adjacency matrices for composite multi-hop meta-paths across node types. FastGTN then layers in non-local operations, which let distant nodes exchange information directly without traversing every intermediate step; the authors note that this variant needs >24 GB of GPU memory for some DBLP configurations.
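The meta-path learning described above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the repo's PyTorch code: the three-node graph is invented, the helper names (`softmax`, `combine`, `matmul`) are hypothetical, and the selection weights are set by hand here, whereas in GTN they are trained by gradient descent.

```python
import math


def softmax(ws):
    """Numerically stable softmax over a list of floats."""
    m = max(ws)
    exps = [math.exp(w - m) for w in ws]
    total = sum(exps)
    return [e / total for e in exps]


def combine(mats, weights):
    """Softmax-weighted (convex) combination of equally sized matrices."""
    alphas = softmax(weights)
    rows, cols = len(mats[0]), len(mats[0][0])
    return [
        [sum(a * mat[i][j] for a, mat in zip(alphas, mats)) for j in range(cols)]
        for i in range(rows)
    ]


def matmul(a, b):
    """Multiply two list-of-lists matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


# Hypothetical graph over a shared node index: papers 0 and 1, author 2.
PA = [[0, 0, 1], [0, 0, 1], [0, 0, 0]]  # paper -> author edges
AP = [[0, 0, 0], [0, 0, 0], [1, 1, 0]]  # author -> paper edges
edge_types = [PA, AP]

# Hand-picked selection weights (assumption; these would be learned):
# the first selection strongly prefers PA, the second strongly prefers AP.
q1 = combine(edge_types, [10.0, -10.0])
q2 = combine(edge_types, [-10.0, 10.0])

# The product approximates the paper -> author -> paper meta-path.
meta = matmul(q1, q2)

# Equal weights instead blend every length-2 path the edge types allow.
blend = matmul(combine(edge_types, [0.0, 0.0]), combine(edge_types, [0.0, 0.0]))
```

Because the selection is soft, every candidate path contributes a little, and the weights decide which ones dominate. That is what makes the choice of meta-path differentiable and therefore trainable.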

Key highlights

  • Implements both GTN (NeurIPS 2019) and FastGTN (Neural Networks 2022) in PyTorch
  • Works with standard heterogeneous benchmarks: DBLP, ACM, IMDB
  • Includes preprocessing code for ACM data, with other datasets borrowed from the HAN repo
  • Updated in 2022 to work around torch_geometric’s removal of backward support for sparse matrix multiplication
  • Authors recommend the DGL-based OpenHGNN implementation for best GTN results on DBLP/ACM

Caveats

  • The current torch.sparse.mm implementation is memory-hungry; as a result, the best-performing DBLP/ACM configuration (num_layers > 1) won’t run in this repo
  • FastGTN with non-local ops demands serious GPU memory (>24 GB for DBLP)
  • You’ll need to manually download datasets from Google Drive and extract them

Verdict

Worth a look if you’re working with heterogeneous graphs and tired of hand-crafting meta-paths. Skip it if your graphs are homogeneous or if you’re GPU-poor: the memory requirements are real, and the authors themselves point to OpenHGNN for the best GTN results on DBLP/ACM.
