Graph Hopper Kernel

Given two graphs, the GraphHopper kernel compares shortest paths between pairs of vertices from the two graphs [FKP+13]. The kernel takes into account both path lengths and the vertices encountered while ``hopping’’ along shortest paths. The kernel is equivalent to a weighted sum of node kernels.

Let \(G=(V,E)\) be a graph. The graph contains either discrete node labels or continuous node attributes. Let \(\ell : \mathcal{V} \rightarrow \mathcal{L}\) be a labeling function that assigns either discrete labels or continuous attributes to vertices. The kernel compares node labels/attributes using a kernel \(k_n\) (eg delta kernel in the case of node labels, and linear or gaussian kernel in the case of node attributes). Given two vertices \(v,u \in V\), a path \(\pi\) from \(v\) to \(u\) in \(G\) is defined as a sequence of vertices

\[\pi = [v_1, v_2, v_3, \ldots, v_l]\]

where \(v_1 = v\), \(v_l = u\) and \((v_i, v_{i+1}) \in E\) for all \(i=1,\ldots,l-1\). Let \(\pi(i) = v_i\) denote the \(i^{th}\) vertex encountered when ``hopping’’ along the path. Denote by \(l(\pi)\) the weighted length of \(\pi\) and by \(|\pi|\) its discrete length, defined as the number of vertices in \(\pi\). The shortest path \(\pi_{ij}\) from \(v_i\) to \(v_j\) is defined in terms of weighted length. The diameter \(\delta(G)\) of \(G\) is the maximal number of nodes in a shortest path in \(G\), with respect to weighted path length.

The GraphHopper kernel is defined as a sum of path kernels \(k_p\) over the families \(P, P'\) of shortest paths in \(G,G'\)

\[k(G,G') = \sum_{\pi \in P} \sum_{\pi' \in P'} k_p(\pi, \pi')\]

The path kernel \(k_p(\pi, \pi')\) is a sum of node kernels \(k_n\) on vertices simultaneously encountered while simultaneously hopping along paths \(\pi\) and \(\pi'\) of equal discrete length, that is

\[\begin{split}k_p(\pi, \pi') = \begin{cases} \sum_{j=1}^{|\pi|} k_n(\pi(j), \pi'(j)), & \text{if $|\pi| = |\pi'|$},\\ 0, & \text{otherwise.} \end{cases}\end{split}\]

The \(k(G,G')\) kernel can be decomposed into a weighted sum of node kernels

\[k(G,G') = \sum_{v \in V} \sum_{v' \in V'} w(v,v') k_n(v, v')\]

where \(w(v,v')\) counts the number of times \(v\) and \(v'\) appear at the same hop, or coordinate, \(i\) of shortest paths \(\pi,\pi'\) of equal discrete length \(|\pi| = |\pi'|\). We can decompose the weight \(w(v,v')\) as

\[w(v,v') = \sum_{j=1}^\delta \sum_{i=1}^\delta | \{ (\pi,\pi') : \pi(i)=v, \pi'(i)=v', |\pi|=|\pi'|=j \} | = \sum_{j=1}^\delta \sum_{i=1}^\delta [\mathbf{M_v}]_{ij} [\mathbf{M_{v'}}]_{ij}\]

where \(\mathbf{M_v}\) is a \(\delta \times \delta\) matrix whose entry \([\mathbf{M_v}]_{ij}\) counts how many times \(v\) appears at the \(i^{th}\) coordinate of a shortest path in \(G\) of discrete length \(j\), and \(\delta = \max(\delta(G), \delta(G'))\). The components of these matrices can be computed efficiently using recursive message-passing algorithms. The total complexity of computing \(k(G,G')\) is \(\mathcal{O}(n^2(m + \log n + d + \delta^2))\) where \(n\) is the number of vertices, \(m\) is the number of edges and \(d\) is the dimensionality of the node attributes (\(d=1\) in the case of discrete node labels).

The implementation of the neighborhood hash kernel can be found below

GraphHopper([n_jobs, normalize, verbose, …])

Graph Hopper Histogram kernel as found in [FKP+13].

Bibliography

FKP+13(1,2)

Aasa Feragen, Niklas Kasenburg, Jens Petersen, Marleen de Bruijne, and Karsten Borgwardt. Scalable kernels for graphs with continuous attributes. In Advances in Neural Information Processing Systems, 216–224. 2013.