Graph Attention Networks
An attention-based method for computing attention coefficients between neighboring nodes in a graph.
About
- PDF Link: here
- Authors: Petar Veličković, Guillem Cucurull, et al.
- Institute: University of Cambridge
Innovative Designs
- Compute an attention coefficient between a node and each of its neighboring nodes, then take the weighted linear combination of the neighbors' features as the node's feature for the next layer.
- A shared parametrized matrix \(W\) maps each node feature to a high-level feature suitable for attention (much like \(W_{query}\) and \(W_{value}\) in the Transformer). A parametrized vector maps the concatenated features of two nodes to the edge weight between them.
- Use multi-head attention (multiple \(W\)) and simply concatenate the heads' features (in hidden layers) or average them (in the final layer) to form the next layer's features.
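The three designs above can be sketched in NumPy. This is a hypothetical illustration under my own naming (`gat_layer`, `multi_head_gat`, the toy 4-node graph, and all weight shapes are assumptions), not the authors' implementation:

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_layer(H, adj, W, a):
    """One single-head GAT layer (sketch).

    H:   (N, F)   input node features
    adj: (N, N)   binary adjacency matrix, self-loops included
    W:   (F, Fp)  shared linear transformation to high-level features
    a:   (2*Fp,)  attention vector applied to [Wh_i || Wh_j]
    """
    Wh = H @ W                                   # (N, Fp) high-level features
    Fp = Wh.shape[1]
    # a^T [Wh_i || Wh_j] splits into a term for node i plus a term for node j
    src = Wh @ a[:Fp]                            # (N,)
    dst = Wh @ a[Fp:]                            # (N,)
    e = leaky_relu(src[:, None] + dst[None, :])  # (N, N) raw coefficients
    e = np.where(adj > 0, e, -np.inf)            # attend only to neighbors
    e -= e.max(axis=1, keepdims=True)            # numerical stability
    alpha = np.exp(e)
    alpha /= alpha.sum(axis=1, keepdims=True)    # softmax over each neighborhood
    return alpha @ Wh                            # weighted combination of neighbors

def multi_head_gat(H, adj, Ws, As, average=False):
    """Multiple heads: concatenate (hidden layers) or average (final layer)."""
    heads = [gat_layer(H, adj, W, a) for W, a in zip(Ws, As)]
    return np.mean(heads, axis=0) if average else np.concatenate(heads, axis=1)

# Toy example: 4 nodes on a path graph, 2 heads, random weights (assumed values)
rng = np.random.default_rng(0)
adj = np.array([[1, 1, 0, 0],
                [1, 1, 1, 0],
                [0, 1, 1, 1],
                [0, 0, 1, 1]])
H = rng.normal(size=(4, 5))
Ws = [rng.normal(size=(5, 3)) for _ in range(2)]
As = [rng.normal(size=(6,)) for _ in range(2)]
out = multi_head_gat(H, adj, Ws, As)  # shape (4, 6): two heads of 3 features each
```

Masking non-neighbors with \(-\infty\) before the softmax makes their attention weights exactly zero, so each node aggregates only over its neighborhood (self-loops keep every row well-defined).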
Diagram

Diagram of Graph Attention Networks
Reference
@inproceedings{velivckovic2018graph,
title={Graph Attention Networks},
author={Veli{\v{c}}kovi{\'c}, Petar and Cucurull, Guillem and Casanova, Arantxa and Romero, Adriana and Li{\`o}, Pietro and Bengio, Yoshua},
booktitle={International Conference on Learning Representations},
year={2018}
}