Graph Attention Networks

Attention-like method to compute correlation between nodes.

About

  • PDF Link: here
  • Authors: Petar Veličković, Guillem Cucurull, et al.
  • Institute: University of Cambridge

Innovative Designs

  • Simply calculate the attention coefficient between a node and each of its neighboring nodes, then take the weighted linear combination of the neighbors' features as the node's feature for the next layer.

  • Use a parametrized matrix \(W\) to project each node's features into a higher-level, attention-ready space (much like the \(W_{query}\) and \(W_{value}\) projections in the Transformer). Use a parametrized vector \(a\) to map the concatenated projected features of two nodes to the attention coefficient on the edge between them.

  • Use multi-head attention (multiple \(W\)) and simply concatenate or average the per-head outputs to form the features of the next layer.
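The designs above can be sketched in NumPy. This is a minimal, non-batched illustration, not the authors' implementation: the coefficient \(e_{ij} = \mathrm{LeakyReLU}(a^\top [Wh_i \,\|\, Wh_j])\) is softmax-normalized over each node's neighborhood, and multiple heads are concatenated or averaged. Function names, shapes, and the omission of the final nonlinearity are choices made here for brevity.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    # LeakyReLU as used for the GAT attention score
    return np.where(x > 0, x, slope * x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_layer(H, adj, W, a):
    """One single-head GAT layer (sketch).

    H:   (N, F)  node features
    adj: (N, N)  binary adjacency, self-loops included
    W:   (F, F') shared projection matrix
    a:   (2F',)  attention vector applied to [Wh_i || Wh_j]
    Returns (N, F') updated node features (output nonlinearity omitted).
    """
    Wh = H @ W                                   # project every node: (N, F')
    H_out = np.zeros((H.shape[0], W.shape[1]))
    for i in range(H.shape[0]):
        nbrs = np.flatnonzero(adj[i])            # neighborhood of i (incl. i)
        # e_ij = LeakyReLU(a^T [Wh_i || Wh_j]) for each neighbor j
        e = np.array([leaky_relu(a @ np.concatenate([Wh[i], Wh[j]]))
                      for j in nbrs])
        alpha = softmax(e)                       # normalize over the neighborhood
        H_out[i] = alpha @ Wh[nbrs]              # weighted combination of neighbors
    return H_out

def multi_head_gat(H, adj, Ws, As, concat=True):
    """Run one GAT layer per (W, a) pair; concat heads (hidden layers)
    or average them (typically the final layer)."""
    outs = [gat_layer(H, adj, W, a) for W, a in zip(Ws, As)]
    return np.concatenate(outs, axis=1) if concat else np.mean(outs, axis=0)
```

With `K` heads and per-head output size `F'`, concatenation yields `K * F'` features per node while averaging keeps `F'`, which is why averaging is the usual choice on the prediction layer.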

Diagram

Diagram of Graph Attention Networks

Reference

@inproceedings{velivckovic2018graph,
  title={Graph Attention Networks},
  author={Veli{\v{c}}kovi{\'c}, Petar and Cucurull, Guillem and Casanova, Arantxa and Romero, Adriana and Li{\`o}, Pietro and Bengio, Yoshua},
  booktitle={International Conference on Learning Representations},
  year={2018}
}